I hit a display bug on one of my plugins this morning.
Felt sure about the fix.
Told the Sonnet worker to swap the logic to kCFNull and push it.
Walked away.
Worker came back and said no.
Flagged that my fix was a regression against working Phase 2 code.
Reverted it.
Added diagnostics.
Proved the original empty-dict approach was right all along.
I was the junior in that exchange. The advisor had the wrong mental model.
The worker was closer to the code and called it.
This only works if you have three things in place:
- The worker can disagree. If the system prompt says "execute what the advisor says," the worker agrees and ships the regression. My workers are briefed to push back when the instruction conflicts with what the code is actually doing.
- The worker can revert. Full commit access, not suggestion-only. It rolled back my change, added diagnostics, and proved the case with evidence before I saw any of it.
- The advisor can hear it. I had to clock the pushback as a feature, not an annoyance. First instinct was frustration. Second instinct was "oh, I was wrong." The gap between those two instincts is where most people break their own system.
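A minimal sketch of those three conditions in Python. This is not the actual setup from the story: `Worker`, `Verdict`, `check`, and `advisor_next_step` are hypothetical names, the history list is a stand-in for real commit access, and the check is a toy stand-in for real diagnostics.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    accepted: bool
    evidence: List[str] = field(default_factory=list)


@dataclass
class Worker:
    # Hypothetical diagnostic: True when the code state behaves correctly.
    check: Callable[[str], bool]
    can_disagree: bool = True   # condition 1: briefed to push back
    can_revert: bool = True     # condition 2: commit access, not suggestion-only
    history: List[str] = field(default_factory=list)  # stand-in for commits

    def apply(self, current: str, proposed: str) -> Tuple[str, Verdict]:
        """Apply the advisor's change, then test it against the diagnostic."""
        self.history.append(current)
        state = proposed
        if self.check(state):
            return state, Verdict(True, [f"{proposed!r} passes the check"])
        if not self.can_disagree:
            # "Execute what the advisor says": the regression ships.
            return state, Verdict(True)
        evidence = [f"{proposed!r} fails the check"]
        if self.can_revert:
            state = self.history.pop()
            evidence.append(
                f"reverted to {state!r}, check passes: {self.check(state)}"
            )
        return state, Verdict(False, evidence)


def advisor_next_step(verdict: Verdict) -> str:
    # Condition 3: a rejection is read as evidence, not reissued as an order.
    return "proceed" if verdict.accepted else "review evidence"
```

With a toy check such as `lambda s: s == "empty-dict"`, calling `Worker(check).apply("empty-dict", "kCFNull")` hands back the original state plus a rejected verdict carrying the evidence. The same call with `can_disagree=False` returns `"kCFNull"` marked as accepted, which is the mirror case.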
This is the hands-in-the-shadows setup I've been building for months.
Opus in the advisor seat making judgment calls. Sonnet and Haiku workers executing in the background, with tools and permissions. Today was the first time a worker straight up told me I was wrong and had receipts.
If your AI always agrees with you, it's not a worker. It's a mirror.
The willingness of the system to correct its own operator is the part most people don't build in.