Posts

12 Apr 2026
I Built 200 CLIs for My AI. Here's What Actually Matters.
A Chinese article argues CLI is becoming the AI plugin format. I've been living this for months with 442 tools. The article is right about CLI. It's wrong about what makes CLI work.
12 Apr 2026
The Template Is the Schema
Seven PyPI releases of a CV generation tool in one afternoon taught me that template-guided synthesis lives and dies by what the template already contains.
11 Apr 2026
Assume the LLM never ran
A 208 MB log, 59,356 retries, and zero LLM calls. A debugging story about what happens when the symptom lies about the cause.
10 Apr 2026
Same Trigger, One Skill
A simple rule for keeping AI agent skill systems coherent: if two skills fire on the same trigger, merge them. Different trigger, different skill. No exceptions.
10 Apr 2026
Overnight Autonomous AI Coding: What Actually Works
I left an AI coding pipeline running overnight with 21 monitoring cycles. 5 features merged, 10 specs dispatched, 3 root causes found. Here's what worked, what broke, and the quality of the output.
10 Apr 2026
The reversible direction
When choosing CLI vs MCP, pick the one you can undo. CLI wraps into MCP cheaply. MCP does not unwrap.
9 Apr 2026
The One Env Var That Cost a Day
ANTHROPIC_API_KEY vs ANTHROPIC_AUTH_TOKEN — how a single wrong environment variable made an AI coding pipeline silently fail for hours, and the debugging journey that found it.
9 Apr 2026
What Anthropic's Managed Agents validates — and what to steal
Anthropic shipped a hosted agent platform. Its architecture looks familiar. Here's what a solo builder can learn from how they decoupled the brain from the hands.
9 Apr 2026
What LLM Wiki Looks Like After Six Months
Karpathy's LLM Wiki pattern is a good starting point. Here's what changes when you run it for real — enforcement over convention, decay over growth, and knowledge that fires without being asked.
8 Apr 2026
Biology as a Design Constraint: How Cell Biology Names Generate Architecture
Using cell biology naming not as a metaphor but as an engineering manual: how mTOR's biology predicted circuit breakers, autophagy, and negative feedback loops before we designed them.
8 Apr 2026
Your AI Agent's Quality Gate Is Lying to You
A 96% rejection rate that was actually a 96% false positive rate — how a monitoring blind spot turned a productive overnight batch into apparent failure.
7 Apr 2026
Test-first dispatch for AI coding agents
The architect writes the tests. The implementer makes them pass. No prose specs, no circular validation.
7 Apr 2026
4 Principles for Agent-Facing CLI Design
Most advice about making CLIs agent-friendly is just good CLI design. Only four principles are actually agent-specific.
7 Apr 2026
Correctness is model-determined
I benchmarked four AI coding harnesses on 12 tasks using the same model. The harness barely matters for correctness — it's all about the model.
7 Apr 2026
The architect-implementer split: why your expensive model shouldn't write code
Smart model plans, cheap model builds. The pattern everyone's converging on for AI coding agents — and the piece nobody's shipped yet.
7 Apr 2026
I made my coding agent dispatch system improve itself
mtor dispatched a coding task to improve itself — the tool that sends work to AI agents was improved by an AI agent.
7 Apr 2026
What 16,000 Simon Willison posts reveal about the state of AI coding agents
Analysis of Simon Willison's blog corpus reveals AI coding agents crossed a reliability threshold in late 2025 and are now reshaping software engineering.
6 Apr 2026
Building porin: a library for agent-facing CLIs
I turned the seven patterns into a zero-dependency Python library. Then I added MCP bridge support. Here's what I learned about the gap between patterns and code.
6 Apr 2026
Seven patterns for agent-facing CLIs
Three independent authors converged on nearly identical patterns for CLIs that AI agents invoke. Here's what they agree on, what's missing, and why nobody has built a framework for it yet.
6 Apr 2026
The Name Collision That Found Two Tools
When a dispatcher and an executor share a name, you don't have a naming problem. You have an architecture problem.
5 Apr 2026
The primary-source tax
Multi-engine search agreement is not primary-source verification. A cautionary tale about hallucinating reference content from consistent secondary summaries.
5 Apr 2026
CLI, MCP, or code mode: the answer depends on who's running the sandbox
Willison says CLIs beat MCP. Cloudflare says server-side code mode beats both. They're both right, because they're answering different questions.
5 Apr 2026
Why I didn't package my AI organism
I designed an elegant framework install for my personal AI system. Then I listed the hard problems and shipped a three-hour cleanup instead.
5 Apr 2026
Ten Things I Learned From the Agent Skills Gold Rush
A day of reading skill repositories taught me less about the skills themselves than about how much of the surrounding ecosystem I'd missed.
5 Apr 2026
What I Found Evaluating 5 Agent Skill Repos
Five skill repositories, a day of reading code, and a significant correction I had to make the same afternoon.
5 Apr 2026
When the Name Doesn't Fit
Naming as a design constraint: if a tool resists a name, the tool needs redesigning, not the name.
4 Apr 2026
The rename that built a tool
I renamed one concept across 130 files. The pain crystallized into a tool that will do the next rename in minutes.
4 Apr 2026
Always latest as a system property
Most projects pin dependency versions to avoid breakage. We automated the opposite: daily upgrades with automatic rollback.
4 Apr 2026
The dispatch layer was eating the quality, not the model
We blamed the LLM for a 54% task failure rate. The real culprit was seven layers of dispatch infrastructure between intent and execution.
3 Apr 2026
Governance Is a Design Problem
Compliance-first governance produces paperwork. Design-first governance produces systems you can actually explain to a regulator.