A simple prompting trick is getting a lot of attention right now.
A developer shared that by prompting Claude for shorter, stripped-down responses (no preamble, no fluff), they cut output token usage by up to 75%.
If this is true, it’s a big deal.
Because with LLM APIs, tokens drive both the bill and the latency: fewer output tokens means lower cost and faster responses, since the model generates them one at a time.
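To put rough numbers on it (illustrative pricing, not any provider's actual rate): at $15 per million output tokens, a 2,000-token reply costs $0.03 per call, while the same answer trimmed to 500 tokens costs $0.0075. A 75% cut, multiplied across thousands of calls a day, adds up fast.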
The idea is simple:
- Don’t let the model over-explain.
- Guide it to be direct and concise (a minimal sketch of how to test this follows below).
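Here’s roughly what measuring this looks like. This is a minimal sketch using the Anthropic Python SDK; the model name, system prompt wording, and test question are illustrative choices, not the original developer’s exact setup:

```python
# Compare output token usage with and without a conciseness system prompt.
# Minimal sketch: model name and prompt wording are illustrative assumptions.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

CONCISE_SYSTEM = (
    "Answer directly and concisely. "
    "No preamble, no recap, no extra explanation unless asked."
)

def output_tokens(question: str, system: str | None = None) -> int:
    """Send one question and return how many output tokens the reply used."""
    kwargs = {
        "model": "claude-sonnet-4-20250514",  # illustrative model name
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": question}],
    }
    if system:
        kwargs["system"] = system
    message = client.messages.create(**kwargs)
    return message.usage.output_tokens

question = "How do I reverse a list in Python?"
baseline = output_tokens(question)
concise = output_tokens(question, system=CONCISE_SYSTEM)
print(f"baseline: {baseline} tokens | concise: {concise} tokens")
print(f"reduction: {100 * (1 - concise / baseline):.0f}%")
```

A low max_tokens is a complementary hard cap, but the system prompt is what keeps answers short and still useful rather than cut off mid-sentence.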
This also highlights something important:
Sometimes it’s not about better models, it’s about better prompts.
Small changes → big impact.
Curious to see how this evolves, and how many people start optimizing prompts for cost, not just output quality.