facing high token burn challenges
Hey folks, quick question for anyone running Clawdbot: what do you consider a healthy "always-on" context size per session from a token + cost perspective?

I'm currently tightening things (or at least trying to, lol) to:

- ~15K tokens for active context
- Last 3–5 messages only
- Aggressive summarization + compaction beyond that
- A cheaper model for non-thinking tasks (summaries, formatting, validation)

(Rough sketch of what that trimming looks like at the bottom of the post.)

Curious:

- Where do you cap context in practice?
- Do you rely on auto-compaction or manual summaries? (Maybe the gateway helps compact, but the bot keeps adding everything to the context, which keeps pushing the size up.)
- Any gotchas you've hit with session memory blowing up costs?

Would love to hear real-world numbers vs. theory.
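For reference, here's roughly the shape of the trimming logic I'm describing. It's a minimal sketch, not Clawdbot's actual API: `estimate_tokens`, `summarize_with_cheap_model`, and the 15K / last-5 constants are all placeholders for whatever your stack actually uses.

```python
# Minimal sketch of the context-trimming approach above. Token counts are
# estimated as chars/4, and summarize_with_cheap_model stands in for a call
# to whatever cheaper model you route summaries to. Not real Clawdbot APIs.

MAX_ACTIVE_TOKENS = 15_000   # ~15K token cap for active context
KEEP_LAST_N = 5              # last 3-5 messages kept verbatim

def estimate_tokens(text: str) -> int:
    """Very rough token estimate (~4 chars per token)."""
    return max(1, len(text) // 4)

def summarize_with_cheap_model(messages: list[dict]) -> str:
    """Stand-in for a call to a cheaper model that compacts older turns."""
    # In practice this would send `messages` to the cheap model with a
    # "summarize this conversation" prompt; here we just truncate.
    joined = " ".join(m["content"] for m in messages)
    return joined[:2000]

def build_active_context(history: list[dict]) -> list[dict]:
    """Keep the last N messages verbatim, compact everything older into one
    summary message, then enforce the hard token cap from the oldest end."""
    recent = history[-KEEP_LAST_N:]
    older = history[:-KEEP_LAST_N]

    context = list(recent)
    if older:
        summary = summarize_with_cheap_model(older)
        context = [{"role": "system",
                    "content": f"Summary of earlier turns: {summary}"}] + context

    # If we're still over budget, drop the oldest entries first.
    while len(context) > 1 and \
            sum(estimate_tokens(m["content"]) for m in context) > MAX_ACTIVE_TOKENS:
        context.pop(0)
    return context

if __name__ == "__main__":
    history = [{"role": "user", "content": f"message {i} " * 50} for i in range(40)]
    active = build_active_context(history)
    print(len(active), sum(estimate_tokens(m["content"]) for m in active))
```

The point of the sketch is just the ordering: summarize the old turns once with the cheap model, keep the recent turns verbatim, and only then enforce the hard cap, so the expensive model never sees more than the budget allows.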