What LLM Wiki Looks Like After Six Months
Karpathy published a gist this week describing a pattern he calls LLM Wiki — instead of RAG (retrieval-augmented generation), have the LLM incrementally build and maintain a persistent wiki of interlinked markdown files. 5000+ stars in five days. The pattern clearly resonates.
He’s right about the core insight: the expensive part of a knowledge base isn’t the reading or thinking — it’s the bookkeeping. Cross-references, consistency, keeping summaries current. LLMs make that cost near zero.
I’ve been running a system like this for about six months. Here’s what I’ve learned about where the 101 version breaks down and what you actually need.
Convention doesn’t survive context loss. Karpathy’s architecture has a “schema” — a CLAUDE.md that tells the LLM how to maintain the wiki. This works in a single session. Across sessions, the LLM drifts. It forgets conventions, invents new ones, contradicts its own prior work. The fix isn’t a better schema — it’s enforcement. Pre-commit hooks that reject malformed entries. Validators that check frontmatter. Gates that run before the LLM can write. Anything that can be deterministic should be deterministic. The schema becomes a thin judgment layer on top of programmatic enforcement, not the primary mechanism.
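A minimal sketch of what "deterministic enforcement" means in practice — a validator you'd wire into a pre-commit hook so malformed entries never land. The required fields and frontmatter convention here are my own illustration, not from Karpathy's gist:

```python
import re

# Assumed schema: every wiki entry opens with YAML frontmatter carrying
# a `title` and a `durability` field. Field names are illustrative.
REQUIRED_FIELDS = {"title", "durability"}

def validate_entry(text: str) -> list[str]:
    """Return a list of problems; an empty list means the entry passes."""
    match = re.match(r"\A---\n(.*?)\n---\n", text, re.DOTALL)
    if not match:
        return ["missing frontmatter block"]
    # Collect the keys present in the frontmatter (one `key: value` per line).
    fields = {
        line.split(":", 1)[0].strip()
        for line in match.group(1).splitlines()
        if ":" in line
    }
    return [f"missing required field: {f}" for f in sorted(REQUIRED_FIELDS - fields)]
```

A pre-commit hook then just runs this over staged `.md` files and exits non-zero on any error — the LLM physically cannot commit a malformed entry, no matter how far it has drifted from the schema prose.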
Knowledge that doesn’t fire is dead weight. A wiki page that sits waiting to be queried is useful. A knowledge entry that automatically gets prepended to every work dispatch is a force multiplier. The difference: passive knowledge requires the right question at the right time; active knowledge shapes behavior without being asked. Coaching notes that correct recurring agent mistakes, feedback entries that adjust response style, epistemics that get grepped and loaded when you enter a matching context — these aren’t pages in a wiki. They’re executable knowledge.
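The "grepped and loaded" step can be sketched concretely. Assuming a convention (mine, not the gist's) where each coaching note declares its contexts on its first line, a dispatch builder prepends every matching note to the task prompt:

```python
from pathlib import Path

def build_dispatch(task: str, context: str, coaching_dir: Path) -> str:
    """Prepend coaching notes whose declared contexts match, then the task."""
    notes = []
    for note in sorted(coaching_dir.glob("*.md")):
        text = note.read_text()
        # Assumed convention: first line is e.g. "contexts: python, ci"
        first = text.splitlines()[0] if text else ""
        if first.startswith("contexts:") and context in first:
            notes.append(text)
    preamble = "\n\n".join(notes)
    return f"{preamble}\n\n{task}" if preamble else task
```

The point is that no one has to remember the note exists — entering a matching context is what fires it.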
You need decay, not just growth. Karpathy’s wiki grows and updates. It never shrinks. In practice, knowledge has a half-life. A finding about a tool’s behavior becomes stale when the tool updates. A project note loses relevance when the project ships. Without decay, your wiki becomes a swamp — technically accurate entries that dilute the signal. Tag entries with durability. Archive aggressively. Set budgets (“max 80 lines for the index”) that force you to prioritize.
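One way to make durability tags operational — map each tag to a half-life and flag anything past it for the archive sweep. The tag names and windows below are my convention, not something the gist prescribes:

```python
from datetime import date, timedelta

# Assumed durability tags. `None` means the entry never auto-expires.
HALF_LIFE = {
    "days": timedelta(days=7),      # e.g. a finding about a fast-moving tool
    "weeks": timedelta(days=30),    # e.g. an active project note
    "durable": None,                # e.g. a hard-won epistemics entry
}

def is_stale(durability: str, last_touched: date, today: date) -> bool:
    """True when an entry has outlived its half-life and should be archived."""
    window = HALF_LIFE.get(durability)
    if window is None:
        return False
    return today - last_touched > window
```

A weekly cron over the wiki that moves stale entries into an `archive/` directory keeps the live index inside its line budget without deleting anything.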
One agent isn’t enough. The LLM Wiki pattern assumes one human and one LLM in a single conversation. The interesting problems start when you have multiple agents — an architect that plans, workers that execute, reviewers that check. Now your knowledge layer isn’t just a wiki — it’s coordination infrastructure. Specs flow from architect to worker. Coaching notes correct worker behavior across dispatches. Review findings flow back into coaching. The knowledge base becomes the message bus between agents with different capabilities and costs.
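"Knowledge base as message bus" can be as plain as channel directories of JSON records: the architect posts to `specs/`, workers drain it, reviewers post findings to `coaching/`. A rough sketch — directory layout and record shape are assumptions for illustration:

```python
from pathlib import Path
import itertools
import json
import time

_seq = itertools.count()  # tiebreaker so same-nanosecond posts don't collide

def post(root: Path, channel: str, record: dict) -> Path:
    """Append a record to a channel; filenames sort in arrival order."""
    channel_dir = root / channel
    channel_dir.mkdir(parents=True, exist_ok=True)
    path = channel_dir / f"{time.time_ns()}-{next(_seq)}.json"
    path.write_text(json.dumps(record))
    return path

def drain(root: Path, channel: str) -> list[dict]:
    """Read and remove all pending records from a channel, oldest first."""
    channel_dir = root / channel
    if not channel_dir.exists():
        return []
    paths = sorted(channel_dir.glob("*.json"))
    records = [json.loads(p.read_text()) for p in paths]
    for p in paths:
        p.unlink()
    return records
```

The payoff of using files rather than a queue service: every message is also a wiki artifact — greppable, diffable, and readable by the next session.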
The query-to-artifact loop needs to be explicit. Karpathy mentions that “good answers can be filed back into the wiki as new pages.” This is the most underrated line in the gist. In practice, the best synthesis happens in conversation — a comparison you asked for, a connection you spotted, a research answer that took real work. If filing it back is a manual decision, you’ll forget half the time. Make it a first-class operation with low friction.
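"First-class operation with low friction" means one call, not a decision. A sketch of what that might look like — slug scheme and frontmatter fields are my own assumptions:

```python
from datetime import date
from pathlib import Path
import re

def file_answer(wiki: Path, title: str, body: str, durability: str = "weeks") -> Path:
    """Turn a conversation answer into a wiki page with minimal frontmatter."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    page = wiki / f"{slug}.md"
    page.write_text(
        f"---\ntitle: {title}\ndurability: {durability}\n"
        f"filed: {date.today().isoformat()}\n---\n{body}\n"
    )
    return page
```

Bind it to a slash command or a one-word instruction and filing stops being a memory test.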
None of this invalidates the original pattern. The core is right: persistent, compiled, LLM-maintained markdown beats RAG for personal knowledge. But the 101 version is a starting point, not a destination. The interesting work is in the enforcement, the decay model, the multi-agent coordination, and turning passive pages into active knowledge that shapes behavior without being asked.