Karpathy's LLM Wiki vs RAG - Here's Why He Abandoned Vector DBs
There's a reason Andrej Karpathy doesn't use RAG for his personal knowledge system.
He tried it. It's brittle. The retrieval layer fails on nuanced queries. The embedding pipeline is another moving part. The cost compounds at scale.
His replacement method has 3 components (rough layout sketched below):
1. Raw sources - untouched markdown. Never let AI modify ground truth.
2. Wiki layer - AI-authored pages that cross-reference the raw sources. Grows every session.
3. Claude.md - a schema file that loads before every LLM call. This is the memory layer.
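For concreteness, here's roughly how I lay the vault out on disk. The directory names are my own, not Karpathy's:

```
vault/
├── sources/          # raw markdown, read-only ground truth
│   └── 2024-01-attention-notes.md
├── wiki/             # AI-authored pages, cross-linked back to sources/
│   └── attention.md  # cites [[sources/2024-01-attention-notes]]
└── Claude.md         # schema + page index, loaded before every call
```

Claude.md holds the navigation rules and the page index, so every session starts from the same map.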
The insight: the LLM doesn't need vector search to recall things. It needs a well-structured index and the discipline to keep it clean. For human-scale knowledge bases (a few thousand files), full context loading beats retrieval every time.
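A quick back-of-envelope on why this works at that scale (my numbers, not from the post):

```python
# Rough token budget for index-first loading, assuming ~15 tokens
# per index entry and a ~2k-token average wiki page.
files = 3000
index_tokens = files * 15   # ~45k tokens for the full index
page_tokens = 2000          # one page loaded on demand
print(index_tokens + page_tokens)  # ~47k -- comfortably inside a 200k context window
```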
We built this live using Claude Code as the vault operator - autonomous ingest, link resolution, and a self-healing lint loop that keeps the wiki consistent.
The interesting engineering problem was the index design. Claude needs to be able to navigate the vault in 2 reads: index first, then the relevant page. If the index is dirty, everything downstream breaks.
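Here's a minimal sketch of the lint pass that runs after each session. The file layout and [[wikilink]] syntax are assumptions; the real loop hands failures back to Claude Code to repair, then re-runs until clean:

```python
import re
from pathlib import Path

VAULT = Path("vault")
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")  # matches [[page]] or [[page|alias]]

def lint(vault: Path) -> list[str]:
    """Report wiki links that don't resolve to an existing page or source."""
    known: set[str] = set()
    for p in vault.rglob("*.md"):
        rel = p.relative_to(vault).with_suffix("")
        known.add(rel.as_posix())  # path-style links, e.g. sources/foo
        known.add(rel.name)        # bare links, e.g. [[foo]]
    errors = []
    for page in (vault / "wiki").glob("*.md"):
        for target in WIKI_LINK.findall(page.read_text()):
            if target.strip() not in known:
                errors.append(f"{page.name}: broken link -> {target}")
    return errors

if __name__ == "__main__":
    for err in lint(VAULT):
        print(err)
    # In the self-healing loop, these errors go back to the LLM with
    # instructions to fix the offending pages, then lint runs again.
```

The same pass can assert that every wiki page appears in the Claude.md index, which is what keeps the 2-read guarantee intact.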
Anyone else running a similar architecture? Curious how others are handling vault maintenance at scale.