112M tokens. The bill outside Claude Max was about $20.
My Claude Code dashboard: - 430 sessions - 78,537 messages - 112M tokens - 22-day streak - "Favorite model": MiniMax-M2.7 That looks like a small fortune. It wasn't. On top of a Claude Max 20x plan, third-party API spend was probably $10 to $30. The reason it's that low is advisor-driven development. The pattern Opus stays in the orchestration chair. It plans, makes the calls, reviews the work. It does not type out the diff, run the test loop, or grep through 400 files looking for one symbol. That's grunt work. Grunt work goes down the model stack. - Opus for architecture, judgment, debugging the hard ones, anything where my voice is on the line - Sonnet for focused multi-step execution where reasoning still matters - Haiku for mechanical edits, file scaffolding, log parsing, small refactors - MiniMax for long bulk passes where context is huge but the thinking is shallow (asset audits, content sweeps, batch transforms) The bulk of those 112M tokens went through MiniMax and Haiku. That's why the bill is small. The dashboard literally tells you: favourite model, MiniMax. Opus is the rarest model in the mix and it's the one I pay a flat subscription for. Why quality holds Cheap models don't degrade your output if they're never the ones making the call. They're hands. Opus is the brain. If the brain decides what to build and reviews what came back, the hands can be as cheap as you want. The failure mode is asking Haiku to architect, or asking MiniMax to make a taste call. Don't. Route by what kind of thinking the task requires, not by what's available or what's cheapest. The principle Preserve your expensive horsepower for the moments it actually matters. Most of what an AI does in a coding session is mechanical. Move the mechanical work down the model stack. Keep the judgment seat sacred. The bill drops to a fraction of what people assume, and the work doesn't get worse. The power truly comes from when you and Opus are collaborators and co-authors of plans, then dispatch the work to models that can genuinely do this in their sleep.