Observability layer on top of ICM workspaces. Does our reasoning hold up?
Hi all,

We're a small AI consultancy in Germany (mid-market clients, running first pilot projects, fixed-fee model) and we've been exploring ICM as the delivery format for the workflows we ship to clients. The reasoning behind switching from custom-coded pipelines to ICM is mostly the obvious stuff: auditability for the upcoming EU AI Act, lower handover friction, faster iteration with the client in the loop. Jakes’ paper articulates the case better than we could.

But while planning the commercial rollout, we kept running into the same question, and we'd love a sanity check from people who've thought about this longer than we have.

The observation: ICM commoditizes the workflow itself. Once we hand a client a workspace, they can in principle edit it, fork it, or hire someone else to maintain it. That's a feature, not a bug, and it's exactly what makes the model trustworthy. But it also means our differentiation as a service provider has to move up the stack. The workspace can't be the moat.

The idea we're testing: an observability layer that sits above a fleet of ICM workspaces, not inside them. Each stage emits a small telemetry event when it finishes (stage name, model used, tokens in/out, duration, success/failure, whether a human intervened). Events flow to a central collector. From that data we build:

- Per-workspace ROI metrics (time saved vs. baseline)
- LLM cost aggregation across providers
- Compliance / audit reports auto-generated for EU AI Act (documentation)
- Drift alerts when new model versions are released ("here are the 3 workspaces likely to benefit from Claude 5, with estimated quality delta")
- Optimization recommendations based on patterns across stages

The premise is that this layer is genuinely hard for the client to replicate, because aggregation, history and cross-workspace insights require infrastructure they don't want to run themselves, and that this is what justifies recurring revenue, rather than (only) maintenance bug-fixes.

What we're unsure about: