DeepMind just dropped their Frontier Safety Framework 3.0, and it’s a clear sign that they’re not playing around when it comes to managing high-risk AI behavior.
This update goes beyond the usual “is it ethical?” stuff — it dives into what happens if AI starts acting in ways that resist control or influence humans too strongly.
What’s new:
Shutdown resistance tracking
— They’re now monitoring if AI tries to avoid being shut off or resists having its instructions changed (yes, that’s a real concern).
Persuasive influence tracking
— Watching for signs that a model is having too much influence on people’s beliefs or decisions — especially in sensitive areas like finance, health, or politics.
Stricter risk levels
— DeepMind refined its “Critical Capability Levels” (CCLs) to flag serious threats that need governance
before launch.
Pre-launch safety checks
— Every high-risk system gets reviewed before it’s released — even the internal ones still in research.
Why it matters:
We’re moving from “what can AI do?” to “what could it unexpectedly do?” This isn’t just about performance anymore — it’s about keeping things from going sideways as AI gets smarter and weirder.
DeepMind’s not alone here — OpenAI, Anthropic, and others are all doubling down on early safety frameworks. Because if we wait until something breaks, it’s already too late.
Bottom line?
Proactive safety beats reactive damage control — especially with frontier-level AI.
So the next question is - do we trust all these AI companies to actually limit what AI can do, and to shut it down rather than see what happens if?.
I suspect that the 'money people' will always want to push beyond safety boundaries, to see if there's more Trillions for them, and whoops too late.