Posts about ai-agents
-
Why I didn't package my AI organism
I designed an elegant framework installer for my personal AI system. Then I listed the hard problems and shipped a three-hour cleanup instead.
-
The rename that built a tool
I renamed one concept across 130 files. The pain crystallized into a tool that will do the next rename in minutes.
-
The Boundary Is an Assessment
The tool/skill distinction isn't a property of the capability. It's a property of the context it operates in.
-
The Test Before the Output
The line between tool and skill is whether you can write the test before seeing the result.
-
Judgment Is a Moving Boundary
The line between tool and skill isn't a property of the task. It's a property of how well you understand the task.
-
Skills Should Die
Every AI skill should be trying to make itself unnecessary. The ones that survive are the ones that haven't been understood yet.
-
The LLM Is the Tool
When the transformation is predictable, the LLM is just a runtime. A cheaper, more flexible runtime than custom code.
-
270 Agents While I Slept
I ran an autonomous agent loop overnight — 43 waves, ~270 dispatches, ~250 vault files produced. Here's what I learned about building systems that work while you sleep.
-
The Unexplainable Alpha
In AI agent systems, execution commoditizes. Research commoditizes. Coordination commoditizes. Taste — the ability to forecast what will matter — is the bottleneck that doesn't automate away.
-
The Navigation Problem in Agent Flywheels
Your agent system shouldn't stop when the task list is empty. The real bottleneck isn't execution — it's discovering what's worth doing next.
-
Programs Over Prompts
The temptation in agent systems is to make everything a prompt. But most of the work is deterministic — and deterministic work deserves code, not suggestions.
-
Taste Is the Bottleneck
When you can run 60 agents overnight, knowing what to build matters more than building it.
-
Meta-Skills Are the Multiplier
We cut from 181 skills to 35 and added a 15-row routing table. Behavior improved across the board. The lesson: meta-skills compound, tool wrappers just add.
-
Optimize for Routing, Not Tokens
With 1M context windows, token savings are rounding error. The real metric is P(right tool | user intent) — does your agent reach for the right tool at the right moment?
-
The Reliability Hierarchy: Hooks, Rules, Skills
In AI agent systems, use the most reliable trigger mechanism that fits — yet most builders reach for skills for everything, making the weakest mechanism their default.
-
Skills as Prototype, MCP as Production
Skills and MCP servers aren't competitors. They're different stages of the same lifecycle. Build the procedure as a skill first. Graduate the tool parts to MCP when they stabilize.
-
The Three Paradigms of Agent Knowledge
Agent knowledge systems have three fundamental paradigms: static context, dynamic tools, and retrieval. Most stop at two. The third is the biggest unexploited opportunity.
-
Match Form to Access Pattern
The governing principle for structuring knowledge in AI agent systems isn't 'always atomic' — it's matching how knowledge is stored to how it's accessed.
-
Play Within the Design
Every AI coding platform has mechanisms designed for specific purposes. Using them as intended beats clever hacks — and the reason is deeper than cleanliness.
-
Stealing from Peers: A Truth-Seeking Discipline
Most people scan competitors for positioning. I scan them for transferable patterns — and route each steal to every domain it applies to.
-
The Boring Future of AI Agents
The real arrival of AI agents isn't spectacular. It's when you stop noticing.
-
The Treadmill and the Loop
Getting ahead of AI best practices is a treadmill. The durable skill is testing assumptions faster than they expire.
-
Personas Exploit a Blind Spot in LLM-as-Judge Evaluation
Persona prompting generates the exact type of hallucination that automated LLM judges reward as 'depth.' Two experiments, blind evaluation, and a fact-check that flipped the finding.
-
The Persona Paradox in AI Agent Teams
Personas hurt for structured tasks, help for judgment-heavy tasks. Two experiments, blind evaluation, frontier models. The distinction is task-dependent, not binary.
-
The Debate Round Is Where Value Lives
Independent parallel reviews produce overlapping findings. The cross-critique round produces resolution. That's where multi-agent value actually emerges.
-
Planning Needs Eyes
A 3-pass AI planning pipeline caught 0 of 6 design issues. The same planning done in-session with tool access caught 2.5. Planning isn't a prompt problem — it's a tools problem.
-
Put the Rule Where It Fires
Documenting a rule is half a loop. The rule only works when it fires at the moment of decision — not when it sits in a file nobody reads.
-
What Human Memory Teaches AI Agents (and What It Doesn't)
A calculator doesn't simulate forgetting — it manages its context budget. What to cherry-pick from cognitive science for AI agent memory, and what to leave behind.
-
The MTEB Leader Barely Beats a Free Model on Agent Memory
I benchmarked 10 memory backends and multiple embedding models on actual agent memory retrieval. The results challenge common assumptions about what matters.
-
The Agent Governance Gap Is Already Here
Agentic AI isn't a future governance problem — it arrived ungoverned, and this week saw the first enforcement action.