Okay, this one made me do a double take: 12 million tokens in one context window is wild.
Subquadratic just launched a new AI model architecture that it claims handles massive context far more efficiently than dense attention, which could change how we think about RAG, coding agents, research tools, and long-running automations.
- 12-million-token context window available through an API
- Subquadratic Selective Attention, built to avoid quadratic attention costs (see the toy sketch after this list)
- Claimed linear scaling in compute and memory
- 52x faster than dense attention at 1M tokens
- 92.1% needle-in-a-haystack retrieval at 12M tokens
- MRCR v2 score of 83, reportedly beating GPT-5.5
- Potentially less chunking, routing, and context stitching
- Bigger working memory for AI agents and automation flows
- Coding agent and deep research tool launching in beta
- Still early, with big claims that need real-world testing
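
Subquadratic hasn't said how Selective Attention actually works, so take this as a generic illustration of the arithmetic rather than their method: a toy NumPy sketch where each query keeps only its top-k keys. Everything here (the function names, the top_k parameter, the selection trick) is my own stand-in, not their architecture.

```python
# Toy comparison: dense attention vs. a generic top-k "selective" variant.
# NOT Subquadratic's published method; the company hasn't detailed how
# Selective Attention works, so this only illustrates the shape of the saving.
import numpy as np

def dense_attention(q, k, v):
    """Standard attention: every query scores every key -> O(n^2) work and memory."""
    scores = q @ k.T / np.sqrt(q.shape[-1])               # (n, n) score matrix
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def topk_selective_attention(q, k, v, top_k=32):
    """Each query attends only to its top_k keys -> O(n * top_k) mixing work.
    (Caveat: this sketch still builds the full score matrix to pick the top-k;
    a real subquadratic method has to make the selection itself cheap too.)"""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    idx = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:]   # (n, top_k)
    selected = np.take_along_axis(scores, idx, axis=-1)
    weights = np.exp(selected - selected.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return np.einsum("nk,nkd->nd", weights, v[idx])              # gather + mix

if __name__ == "__main__":
    n, d = 512, 64
    rng = np.random.default_rng(0)
    q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
    dense = dense_attention(q, k, v)
    sparse = topk_selective_attention(q, k, v, top_k=32)
    print("mean output diff (dense vs top-k):", float(np.abs(dense - sparse).mean()))
    # Back-of-envelope: the dense score matrix grows with the square of context length,
    # while a fixed per-query budget keeps the mixing cost linear in n.
    for n_ctx in (1_000_000, 12_000_000):
        print(f"{n_ctx:>12,} tokens: dense pairs ~{n_ctx**2:.1e}, "
              f"selective (k=32) ~{n_ctx*32:.1e}")
```

Even if the real mechanism looks nothing like top-k selection, those printed pair counts are the problem any subquadratic scheme has to dodge: roughly 10^12 query-key pairs at 1M tokens and about 1.4 × 10^14 at 12M.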
What I’m watching here is whether huge context windows actually simplify automation, or whether we still need smart retrieval, memory, and workflow design around the model.
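
To make that question concrete, here's the kind of scaffolding I mean: a deliberately crude chunk-and-retrieve loop in plain Python, with keyword overlap standing in for the embedding search and routing a real pipeline would use. None of this is tied to Subquadratic's product; it's just the layer a giant context window promises to shrink.

```python
# A simple sketch of the chunking/retrieval/stitching work that a 12M-token
# window is supposed to make unnecessary. The keyword-overlap "retriever" is
# my own stand-in; real pipelines use embeddings, rerankers, and routing logic,
# which is exactly the complexity in question.
from collections import Counter

def chunk(text: str, size: int = 400) -> list[str]:
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(query: str, passage: str) -> int:
    """Crude relevance: how many query words also appear in the passage."""
    q = Counter(query.lower().split())
    p = Counter(passage.lower().split())
    return sum(min(q[w], p[w]) for w in q)

def retrieve(query: str, chunks: list[str], top_n: int = 3) -> list[str]:
    """Pick the top_n chunks to stitch into a small context window."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:top_n]

if __name__ == "__main__":
    corpus = "Imagine a few million tokens of product docs, tickets, and policies here."
    context = "\n---\n".join(retrieve("refund policy for annual plans", chunk(corpus)))
    print(context)
    # With a context window larger than the corpus, the three functions above
    # (and the failure modes they introduce) could in principle be dropped and
    # the whole corpus passed directly. That's the simplification being claimed.
```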