Posts about engineering

10 Apr 2026
The reversible direction
When choosing CLI vs MCP, pick the one you can undo. CLI wraps into MCP cheaply. MCP does not unwrap.
9 Apr 2026
What Anthropic's Managed Agents validates — and what to steal
Anthropic shipped a hosted agent platform. Its architecture looks familiar. Here's what a solo builder can learn from how they decoupled the brain from the hands.
7 Apr 2026
The architect-implementer split: why your expensive model shouldn't write code
Smart model plans, cheap model builds. The pattern everyone's converging on for AI coding agents — and the piece nobody's shipped yet.
6 Apr 2026
Building porin: a library for agent-facing CLIs
I turned the seven patterns into a zero-dependency Python library. Then I added MCP bridge support. Here's what I learned about the gap between patterns and code.
6 Apr 2026
Seven patterns for agent-facing CLIs
Three independent authors converged on nearly identical patterns for CLIs that AI agents invoke. Here's what they agree on, what's missing, and why nobody has built a framework for it yet.
5 Apr 2026
CLI, MCP, or code mode: the answer depends on who's running the sandbox
Willison says CLIs beat MCP. Cloudflare says server-side code mode beats both. They're both right, because they're answering different questions.
5 Apr 2026
Why I didn't package my AI organism
I designed an elegant framework install for my personal AI system. Then I listed the hard problems and shipped a three-hour cleanup instead.
5 Apr 2026
Ten Things I Learned From the Agent Skills Gold Rush
A day of reading skill repositories taught me less about the skills themselves than about how much I'd missed of the surrounding ecosystem.
4 Apr 2026
The rename that built a tool
I renamed one concept across 130 files. The pain crystallized into a tool that will do the next rename in minutes.
4 Apr 2026
Always latest as a system property
Most projects pin dependency versions to avoid breakage. We automated the opposite: daily upgrades with automatic rollback.
4 Apr 2026
The dispatch layer was eating the quality, not the model
We blamed the LLM for a 54% task failure rate. The real culprit was seven layers of dispatch infrastructure between intent and execution.
25 Mar 2026
Split on Access Control, Not Abstraction
Repo boundaries enforce access control, not abstraction. Directories handle abstraction. If two things have the same visibility requirement, they belong in the same repo.
14 Mar 2026
Planning Needs Eyes
A 3-pass AI planning pipeline caught 0 out of 6 design issues. The same planning done in-session with tool access caught 2.5. Planning isn't a prompt problem — it's a tools problem.
14 Mar 2026
Progressive Trust: How to Give AI Agents Autonomy Without Gambling
The debate about AI agent autonomy is wrong. It's not a binary choice — it's a graduated trust system with observability.
12 Mar 2026
Cross-Cutting Is Just Another Word for Optional
In AI agent architecture, calling something a 'cross-cutting concern' without naming an owner and a gate is just a polite way of saying nobody owns it.
8 Mar 2026
When Better Is Worse
Upgrading to a more capable model made my tool sixty times slower. The lesson isn't about models — it's about the difference between capability and fit.
7 Mar 2026
The Problem With Clever Browser Automation
The most sophisticated solution to a problem is usually a sign you haven't found the right abstraction yet.
7 Mar 2026
Expansion, Not Speedup
The real ROI of AI coding isn't doing the same work faster. It's doing work that wasn't worth doing before.
7 Mar 2026
The Trust Spectrum
Peter Steinberger stopped reviewing AI-generated code entirely. That works for indie software. In regulated environments, it can't. Here's how to think about where you sit.
7 Mar 2026
Traces Are the New Debugger
When behaviour emerges from both code and model responses, reading source files isn't enough. You debug by examining execution traces.
6 Mar 2026
Agentic Engineering: Why Less Is More
Tool enthusiasm is often net-negative. Context pollution degrades performance faster than features improve it. The principles that actually work.
6 Mar 2026
AI Evals: Why Teams Build Metrics Before They've Read a Trace
Most teams build evaluators before reading a single trace. The sequence that actually works is the opposite: observe, categorise, then measure.
6 Mar 2026
The Kutta Condition of AI: Engineering Ships Before Theory Catches Up
Aeronautics flew for decades before anyone could explain why wings worked. AI is in the same position. The engineering is ahead of the theory.
6 Mar 2026
Agent-First CLI Design: TTY Detection as Philosophy
The primary user of my CLI tools isn't me anymore. Designing for that changes everything about how output should work.
6 Mar 2026
Three Crates Before Lunch
I published three Rust CLI tools to crates.io before noon — none existed at breakfast. The interesting part isn't the speed. It's that the bottleneck moved.
6 Mar 2026
When to Build vs. When to Wait: The Recurrence Rule for AI Tooling
Most AI tooling debates are actually recurrence debates. The question isn't whether to build — it's how many times you'll need it.
6 Mar 2026
The Contract Pattern: Hard Gates for AI Agents
AI agents know how to start a task. They don't always know when to stop. The contract pattern is the architectural fix.

Posts about engineering

The reversible direction When choosing CLI vs MCP, pick the one you can undo. CLI wraps into MCP cheaply. MCP does not unwrap.

What Anthropic's Managed Agents validates — and what to steal Anthropic shipped a hosted agent platform. Its architecture looks familiar. Here's what a solo builder can learn from how they decoupled the brain from the hands.

The architect-implementer split: why your expensive model shouldn't write code Smart model plans, cheap model builds. The pattern everyone's converging on for AI coding agents — and the piece nobody's shipped yet.

Building porin: a library for agent-facing CLIs I turned the seven patterns into a zero-dependency Python library. Then I added MCP bridge support. Here's what I learned about the gap between patterns and code.

Seven patterns for agent-facing CLIs Three independent authors converged on nearly identical patterns for CLIs that AI agents invoke. Here's what they agree on, what's missing, and why nobody has built a framework for it yet.

CLI, MCP, or code mode: the answer depends on who's running the sandbox Willison says CLIs beat MCP. Cloudflare says server-side code mode beats both. They're both right, because they're answering different questions.

Why I didn't package my AI organism I designed an elegant framework install for my personal AI system. Then I listed the hard problems and shipped a three-hour cleanup instead.

Ten Things I Learned From the Agent Skills Gold Rush A day of reading skill repositories taught me less about the skills themselves than about how much I'd missed of the surrounding ecosystem.

The rename that built a tool I renamed one concept across 130 files. The pain crystallized into a tool that will do the next rename in minutes.

Always latest as a system property Most projects pin dependency versions to avoid breakage. We automated the opposite: daily upgrades with automatic rollback.

The dispatch layer was eating the quality, not the model We blamed the LLM for a 54% task failure rate. The real culprit was seven layers of dispatch infrastructure between intent and execution.

Split on Access Control, Not Abstraction Repo boundaries enforce access control, not abstraction. Directories handle abstraction. If two things have the same visibility requirement, they belong in the same repo.

Planning Needs Eyes A 3-pass AI planning pipeline caught 0 out of 6 design issues. The same planning done in-session with tool access caught 2.5. Planning isn't a prompt problem — it's a tools problem.

Progressive Trust: How to Give AI Agents Autonomy Without Gambling The debate about AI agent autonomy is wrong. It's not a binary choice — it's a graduated trust system with observability.

Cross-Cutting Is Just Another Word for Optional In AI agent architecture, calling something a 'cross-cutting concern' without naming an owner and a gate is just a polite way of saying nobody owns it.

When Better Is Worse Upgrading to a more capable model made my tool sixty times slower. The lesson isn't about models — it's about the difference between capability and fit.

The Problem With Clever Browser Automation The most sophisticated solution to a problem is usually a sign you haven't found the right abstraction yet.

Expansion, Not Speedup The real ROI of AI coding isn't doing the same work faster. It's doing work that wasn't worth doing before.

The Trust Spectrum Peter Steinberger stopped reviewing AI-generated code entirely. That works for indie software. In regulated environments, it can't. Here's how to think about where you sit.

Traces Are the New Debugger When behaviour emerges from both code and model responses, reading source files isn't enough. You debug by examining execution traces.

Agentic Engineering: Why Less Is More Tool enthusiasm is often net-negative. Context pollution degrades performance faster than features improve it. The principles that actually work.

AI Evals: Why Teams Build Metrics Before They've Read a Trace Most teams build evaluators before reading a single trace. The sequence that actually works is the opposite: observe, categorise, then measure.

The Kutta Condition of AI: Engineering Ships Before Theory Catches Up Aeronautics flew for decades before anyone could explain why wings worked. AI is in the same position. The engineering is ahead of the theory.

Agent-First CLI Design: TTY Detection as Philosophy The primary user of my CLI tools isn't me anymore. Designing for that changes everything about how output should work.

Three Crates Before Lunch I published three Rust CLI tools to crates.io before noon — none existed at breakfast. The interesting part isn't the speed. It's that the bottleneck moved.

When to Build vs. When to Wait: The Recurrence Rule for AI Tooling Most AI tooling debates are actually recurrence debates. The question isn't whether to build — it's how many times you'll need it.

The Contract Pattern: Hard Gates for AI Agents AI agents know how to start a task. They don't always know when to stop. The contract pattern is the architectural fix.

The reversible direction
When choosing CLI vs MCP, pick the one you can undo. CLI wraps into MCP cheaply. MCP does not unwrap.

What Anthropic's Managed Agents validates — and what to steal
Anthropic shipped a hosted agent platform. Its architecture looks familiar. Here's what a solo builder can learn from how they decoupled the brain from the hands.

The architect-implementer split: why your expensive model shouldn't write code
Smart model plans, cheap model builds. The pattern everyone's converging on for AI coding agents — and the piece nobody's shipped yet.

Building porin: a library for agent-facing CLIs
I turned the seven patterns into a zero-dependency Python library. Then I added MCP bridge support. Here's what I learned about the gap between patterns and code.

Seven patterns for agent-facing CLIs
Three independent authors converged on nearly identical patterns for CLIs that AI agents invoke. Here's what they agree on, what's missing, and why nobody has built a framework for it yet.

CLI, MCP, or code mode: the answer depends on who's running the sandbox
Willison says CLIs beat MCP. Cloudflare says server-side code mode beats both. They're both right, because they're answering different questions.

Why I didn't package my AI organism
I designed an elegant framework install for my personal AI system. Then I listed the hard problems and shipped a three-hour cleanup instead.

Ten Things I Learned From the Agent Skills Gold Rush
A day of reading skill repositories taught me less about the skills themselves than about how much I'd missed of the surrounding ecosystem.

The rename that built a tool
I renamed one concept across 130 files. The pain crystallized into a tool that will do the next rename in minutes.

Always latest as a system property
Most projects pin dependency versions to avoid breakage. We automated the opposite: daily upgrades with automatic rollback.

The dispatch layer was eating the quality, not the model
We blamed the LLM for a 54% task failure rate. The real culprit was seven layers of dispatch infrastructure between intent and execution.

Split on Access Control, Not Abstraction
Repo boundaries enforce access control, not abstraction. Directories handle abstraction. If two things have the same visibility requirement, they belong in the same repo.

Planning Needs Eyes
A 3-pass AI planning pipeline caught 0 out of 6 design issues. The same planning done in-session with tool access caught 2.5. Planning isn't a prompt problem — it's a tools problem.

Progressive Trust: How to Give AI Agents Autonomy Without Gambling
The debate about AI agent autonomy is wrong. It's not a binary choice — it's a graduated trust system with observability.

Cross-Cutting Is Just Another Word for Optional
In AI agent architecture, calling something a 'cross-cutting concern' without naming an owner and a gate is just a polite way of saying nobody owns it.

When Better Is Worse
Upgrading to a more capable model made my tool sixty times slower. The lesson isn't about models — it's about the difference between capability and fit.

The Problem With Clever Browser Automation
The most sophisticated solution to a problem is usually a sign you haven't found the right abstraction yet.

Expansion, Not Speedup
The real ROI of AI coding isn't doing the same work faster. It's doing work that wasn't worth doing before.

The Trust Spectrum
Peter Steinberger stopped reviewing AI-generated code entirely. That works for indie software. In regulated environments, it can't. Here's how to think about where you sit.

Traces Are the New Debugger
When behaviour emerges from both code and model responses, reading source files isn't enough. You debug by examining execution traces.

Agentic Engineering: Why Less Is More
Tool enthusiasm is often net-negative. Context pollution degrades performance faster than features improve it. The principles that actually work.

AI Evals: Why Teams Build Metrics Before They've Read a Trace
Most teams build evaluators before reading a single trace. The sequence that actually works is the opposite: observe, categorise, then measure.

The Kutta Condition of AI: Engineering Ships Before Theory Catches Up
Aeronautics flew for decades before anyone could explain why wings worked. AI is in the same position. The engineering is ahead of the theory.

Agent-First CLI Design: TTY Detection as Philosophy
The primary user of my CLI tools isn't me anymore. Designing for that changes everything about how output should work.

Three Crates Before Lunch
I published three Rust CLI tools to crates.io before noon — none existed at breakfast. The interesting part isn't the speed. It's that the bottleneck moved.

When to Build vs. When to Wait: The Recurrence Rule for AI Tooling
Most AI tooling debates are actually recurrence debates. The question isn't whether to build — it's how many times you'll need it.

The Contract Pattern: Hard Gates for AI Agents
AI agents know how to start a task. They don't always know when to stop. The contract pattern is the architectural fix.