coding-agents
5 essays on this topic.
- Assume the LLM never ran
A 208 MB log, 59,356 retries, and zero LLM calls. A debugging story about what happens when the symptom lies about the cause.
- Test-first dispatch for AI coding agents
The architect writes the tests. The implementer makes them pass. No prose specs, no circular validation.
- I made my coding agent dispatch system improve itself
I dispatched a 952-line monolithic CLI through my own coding-agent dispatch system to be refactored into seven modules. It worked. Notes on what self-bootstrap reveals about agent harness design.
- What 16,000 Simon Willison posts reveal about the state of AI coding agents
I scraped 16,181 of Simon Willison's posts and analysed the 395 from 2026. An inflection in November 2025, GLM-5 closing the gap, and why the harness — not the model — is the competitive moat.
- Correctness is model-determined
I benchmarked four AI coding harnesses on 12 tasks using the same model. The harness barely matters for correctness — it's all about the model.