After months of running a production pipeline that uses AI daily for image generation, video creation, and voiceover, here's what actually survived the "day 100 test."
The survivors:
- Gemini 3 Pro for image generation. Not the flashiest in demos, but the most consistent when you need dozens of images that all match a style. The instruction-following is what keeps me here.
- Kling 2.6 for video. Handles motion and physics better than anything else I've tested at this price point. Not perfect, but predictable.
- ElevenLabs for voice. Latency is low, quality is high, and the timestamp API makes automated subtitle sync actually work.
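To make the subtitle-sync point concrete, here's a minimal sketch of the pattern. The exact response shape varies by provider, so this assumes you've already parsed per-word timestamps into simple (text, start, end) records; the `Word` class, `max_words` cutoff, and grouping rule are all illustrative, not any vendor's API.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds

def to_srt_time(t: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(t * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def words_to_srt(words: list[Word], max_words: int = 7) -> str:
    """Group timestamped words into numbered SRT cues.

    Each cue spans from the first word's start to the last word's end,
    so the subtitles stay locked to the generated audio.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        cues.append(
            f"{len(cues) + 1}\n"
            f"{to_srt_time(chunk[0].start)} --> {to_srt_time(chunk[-1].end)}\n"
            + " ".join(w.text for w in chunk)
        )
    return "\n\n".join(cues)
```

Because the cue boundaries come straight from the voiceover timestamps rather than from guessing durations, the sync survives re-renders without manual adjustment.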
What I dropped:
- Models that looked incredible in curated demos but produced wildly inconsistent results at scale. The gap between "cherry-picked showcase" and "Tuesday afternoon batch run" is massive with some tools.
- Any tool that requires custom prompt engineering for each generation. If it can't follow a structured template reliably, it doesn't survive in a pipeline.
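What "structured template" means here, sketched minimally: every generation in a batch fills the same slots, so only the content varies, never the phrasing. The slot names below are illustrative, not from any particular tool.

```python
# A fixed template: every image in a batch uses the same slots,
# so outputs stay stylistically comparable across the whole run.
TEMPLATE = (
    "Style: {style}. "
    "Subject: {subject}. "
    "Composition: {composition}. "
    "Avoid: {negative}."
)

def build_prompt(style: str, subject: str, composition: str, negative: str) -> str:
    """Fill the shared template; only slot values change per generation."""
    return TEMPLATE.format(
        style=style,
        subject=subject,
        composition=composition,
        negative=negative,
    )
```

A model that "survives" is one where this kind of template yields consistent results across dozens of fills; a model that needs hand-tuned wording per image breaks the batch workflow.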
The meta-lesson: the best AI tool isn't the one that produces the single best output. It's the one that produces acceptable-to-good output 95% of the time without babysitting.
Curious about your experience — do you choose your AI tools based on peak demo quality or on day-100 reliability?