The ICM paper put words to something my code already enforces.

https://www.skool.com/cliefnotes/icm-paper-published-to-pre-print-today?p=5d686529

I almost didn't write this one. I read the Interpretable Context Methodology paper last week. If you haven't already, it's worth your time.

@Jake Van Clief

and

@David McDermott

I hope you don't mind but I did what any interested, semi decent AI enthusiast would do, I ran the paper against my own markdown and .rs crate stack and here's the result...

Folder structure as agent architecture, the filesystem instead of an orchestration framework — and my first instinct was just to comment "great paper" and move on. But it kept nagging at me, so here we are.

What got me wasn't that it was new. It was the opposite. I've been heading down this verification-first build path for months now and the paper kept describing things I'd already built.

The difference is except it framed them as good ideas to aim for, and I'd been treating them as rules the system isn't allowed to break. That's what I wanted to talk about. Three places it overlapped, and where I ended up going a step further, not because I'm clever, mostly because it felt natural to go down that route.

The paper's big idea is observability: every intermediate state is a readable file, so the system was never opaque, so there's nothing to explain. I love that. But "readable" wasn't enough for me and it took me a while to realize why. Readable still means I have to read it, and at some point there's too much. So I leaned on formal proofs. I used a model checker for the core properties. Not "I looked and it seemed fine," but "this holds for every input, proven." Same instinct as ICM, just pushed until "looks right" isn't in the loop anymore.

I resonated with the Cross-Stage Trace Verification portion of 6.2 (2). The paper's "Toward Semantic Debugging" section proposes that a workflow stage should get a "Verify" section, a check that its output lines up with earlier stages. Honestly that came from trying to articulate everything manually in the beginning. I'd review something carefully, signed off, and missed a case anyway when the implementation was completed.

As it turns out, that's not future for me, that's how my code is written. I already have verification gates sitting between stages, pure deterministic checks. This isn't something for me to implement down the road, I'm doing this now... It means the direction's probably right.

6.3 Source Integrity and the Edit-Source Principle. Describes "Editing the output fixes this run. Editing the source fixes every future run". My whole improvement loop runs on exactly that, nothing improves by patching output, it improves by a human authoring a change to the source, recorded. The agent never edits itself. ICM got there from content production, I got there from being unaccepting about drift. Same place I think.

Okay, the part I can actually show instead of just claim.

My system is built on a small fixed set of primitives. Closed. Never extended. The bet is that every capability, however big the thing gets, is just a composition of that fixed set. And a bet like that is worthless if you don't check it — so I wrote a test that checks it. It walks every meaningful capability in the codebase and forces each one into a bin: it's a composition of the primitives, or it's a pure check over them, or it's plumbing. No fourth bin. Anything that fits nowhere is the alarm — logic that escaped the vocabulary.

I ran it. 73 capabilities. Zero escapes. Everything classified.

The test prints, every single run, the list of stuff it hasn't looked at yet. So the honest result is "this holds across 73 capabilities," not "this holds, done." I think this is inline with ICM. To me, the paper's best premise is making structure visible so you can trust it. The audit did the same thing with one turn added, make it visible and prove it didn't quietly drift, and make it admit what it hasn't seen.

So I guess my one addition to ICM would be: make the structure visible, then write the test that proves it stayed that way at least in my case. Visibility is trust you can see. A passing audit is trust you can run.

Question for the room. Is anyone else building on a fixed set of primitives or a constrained vocabulary? And if you are, do you have something that proves nothing's escaped it? I'm really interested to know.

2 comments

The ICM paper put words to something my code already enforces.