Tokens = Money. Most people burn 3–5× more than they need on Claude without realizing it.
I put together a 2-page cheat sheet covering everything I wish someone had handed me when I started:
Page 1 — the essentials
- Pricing per 1M tokens (Opus, Sonnet, Haiku)
- The Savings Stack (these multiply, not add — 80–95% cuts are normal)
- The 5 core tactics: right-sizing the model, prompt caching, shorter prompts, output control, context trimming
- Common token wasters to stop doing today
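To make the "multiply, not add" point concrete, here's a toy calculation. The discount factors below are illustrative assumptions for this sketch, not figures from the sheet:

```python
# Toy illustration of the multiplicative savings stack.
# All three factors are illustrative assumptions, not real pricing.
baseline = 1.0  # normalized cost of the naive setup

# Each tactic leaves a *fraction* of the previous cost standing:
model_rightsizing = 0.20  # e.g. swap to a smaller model at ~1/5 the price
prompt_caching    = 0.50  # e.g. half the input tokens become cheap cache reads
shorter_prompts   = 0.70  # e.g. trim 30% of the prompt

remaining = baseline * model_rightsizing * prompt_caching * shorter_prompts
savings = 1 - remaining
print(f"Remaining cost: {remaining:.2f}x -> {savings:.0%} saved")
# The factors multiply: 0.20 * 0.50 * 0.70 = 0.07x, a 93% cut,
# far more than the ~27% you'd get if the discounts merely added up.
```

That's why stacking three modest tactics routinely lands in the 80–95% range.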
Page 2 — the deep dive
- Claude Code in the terminal: slash commands, CLI flags, how to slim your CLAUDE.md
- Prompt caching: a full Python example (what to cache, what the response tells you)
- Prefill tricks that kill the "Sure, here you go..." preamble
- Model decision tree (when to reach for Haiku, Sonnet, or Opus)
- Real-world recipes: cheap classifier, long-doc Q&A, cost-capped agent loops, batch evals
- A pre-ship checklist to run through before you deploy
📎 Attached: Token_Saving_Cheat_Sheet
Drop a comment with the one tip that surprised you most, or share your own token-saving trick.
Vincent