Running ICM in production right now, daily. Real talk up front: ICM by itself won't make the model faster or smarter. What it actually does is control what the model sees at every step, and that's where the accuracy and hallucination wins come from. That distinction matters for what you're evaluating.

Here's what ours actually looks like on disk. Every automation is its own pipeline of numbered stages, and every stage has a CONTEXT.md that says exactly what goes in, what happens, and what comes out:

    Automations/Weekly-Rollup/
        CONTEXT.md                    pipeline routing
        references/                   stable reference docs
        stages/
            01_gather-week/
                CONTEXT.md            the contract: inputs / process / outputs
                input/
                output/<run-date>/    per-run state
            02_reflect-synthesize/
            03_draft-digest/
            04_finalize/

The whole thing is basically a state machine on disk. Numbered folders = execution order, files in output/ = state. Nothing magic about it, and that's the point: I can open any stage and see exactly what the model saw and what it produced.

To your questions:

Can ICM handle it alone at enterprise scale?

No, and honestly that's the wrong framing. ICM is the control layer, not the retrieval layer. With a huge collection of databases you still need some kind of index over your schemas (catalog metadata, embeddings, whatever works). ICM's job is making sure the SQL generation step only ever sees the 5 relevant tables instead of 5,000. Pair it with a schema catalog and it scales fine. Throw one giant prompt at everything and no architecture saves you.

Where it kills hallucinations:

Three things did the heavy lifting for us.

1. Small scoped context per stage. Hallucinated columns come from the model guessing when the schema isn't in context, or drowning when there's too much of it. The discovery stage narrows things down to candidate tables; the generation stage gets ONLY that shortlist plus the real column definitions. Small correct context beats big noisy context every single time.
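To make the scoped-context idea concrete, here's a rough sketch of how a generation stage's context could be assembled from a discovery shortlist. Everything in it is an assumption for illustration: the in-memory `catalog` dict stands in for whatever schema index you use, and `build_generation_context` and the table names are hypothetical, not part of ICM.

```python
from typing import Dict, List, Tuple

# Hypothetical catalog shape: table name -> [(column, type), ...].
# In a real setup this would be loaded from your schema catalog / metadata index.
Catalog = Dict[str, List[Tuple[str, str]]]


def build_generation_context(shortlist: List[str], catalog: Catalog) -> str:
    """Render ONLY the shortlisted tables, with their real column definitions."""
    missing = [t for t in shortlist if t not in catalog]
    if missing:
        # Fail loudly rather than let the model guess at a schema it never saw.
        raise KeyError(f"tables not in catalog: {missing}")

    blocks = []
    for table in shortlist:
        cols = "\n".join(f"  {name} {ctype}" for name, ctype in catalog[table])
        blocks.append(f"TABLE {table}\n{cols}")
    return "\n\n".join(blocks)


# Made-up example data: three tables in the catalog, two on the shortlist.
catalog: Catalog = {
    "orders": [("id", "BIGINT"), ("customer_id", "BIGINT"), ("total", "NUMERIC")],
    "customers": [("id", "BIGINT"), ("email", "TEXT")],
    "audit_log": [("id", "BIGINT"), ("payload", "JSONB")],
}

context = build_generation_context(["orders", "customers"], catalog)
print(context)
```

The resulting text is the kind of thing you could drop into the generation stage's input/ folder: the model sees two tables with real columns and nothing about `audit_log`, and a shortlist entry that isn't in the catalog stops the run instead of reaching the prompt.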