Posts about workflow

17 Mar 2026
The Confidence Stack
Not all knowledge is equally trustworthy. Three tiers of validation — from 'a model said it' to 'it survived reality' — and why tracking the difference matters.
15 Mar 2026
Mining Your LLM
Your AI already knows things that would make it better at helping you. The trick is extracting that knowledge and making it permanent.
14 Mar 2026
The CLI Boundary
Which parts of an AI dev workflow can be wrapped in a CLI, and which can't — learned the hard way by building the wrong thing and measuring it.
13 Mar 2026
The Black Box That Responds to Role Play
An LLM can't feel accountability pressure. But structured role-play — simulated rejection, persona assignment, adversarial review — produces measurably better output. The mechanism is opaque; the effect is real.
13 Mar 2026
The 'Are You Sure?' Loop
AI's first 'I'm done' is almost never its best work. Simulated accountability pressure — just asking 'are you sure?' — surfaces blind spots that self-review misses.
13 Mar 2026
Guardrails Beat Guidance
Prompt instructions are suggestions. Hooks are constraints. One survives a model swap.
12 Mar 2026
The Wrong Metric: Why I Stopped Switching AI Models Mid-Session
Per-task model routing optimises cost per token. But at personal assistant scale, friction is the real cost.
12 Mar 2026
Stop Asking Which AI Model Is Better. Ask Which Phase.
The planning/execution split is more useful than any benchmark comparison.
12 Mar 2026
The second pass finds more
When red-teaming a document with multiple AI models, the second review — run on the edited version — consistently finds more than the first. Here's why, and what it means for how many rounds to run.
11 Mar 2026
From Chatbots to Event Loops
The shift from agents you summon to agents that watch. Enterprise AI workflows are becoming continuous loops — and the failure modes are different.
11 Mar 2026
LLMs Are Better at Editing Than Writing
Ask an AI to write from scratch and you get the average of the corpus. Give it something rough and it amplifies what's already there. The workflow implications are significant.
11 Mar 2026
What It Actually Feels Like to Use AI for 80% of Your Work
Not productivity. Something stranger — the cognitive texture of days when the bottleneck shifts from execution to articulation.
9 Mar 2026
Benchmark Your Research Stack
Running 10 real queries through 5 tools revealed that theoretical routing rules have systematic gaps — and the surprises were more useful than the confirmations.
9 Mar 2026
The Queue Should Live Where Your Thoughts Live
AI agent results should be push, not pull. The feedback loop should close on mobile. Most tools miss all three — not from ignorance, but because dashboards photograph better.
9 Mar 2026
The Infra Trap
Building tools to support your work can quietly become a substitute for the work itself.
9 Mar 2026
The Queue That Texts You Back
Personal AI infrastructure should report results to you, not wait for you to go looking. A small architecture shift changes the whole dynamic.
9 Mar 2026
Eliminate the Reminder, Don't Schedule It
When you catch yourself setting a reminder to check something later, that's usually a signal that a tool is failing to report what it should.
7 Mar 2026
Your Tool Shouldn't Know What to Ignore
Configuration that belongs to the data shouldn't live in the tool. .gitignore figured this out thirty years ago.
7 Mar 2026
Rules Decay, Hooks Don't
The difference between writing down a rule and making the system enforce it — illustrated by a 15-line hook.
6 Mar 2026
I Made the AI Remind Me of My Own Blind Spots
I kept missing things at the end of AI sessions. So I stopped relying on willpower and systematised the nudge instead.
6 Mar 2026
CLIs Enforce Structure and Save Tokens — Not Just Discipline
The instinct to add a rule to a skill file is usually the wrong abstraction. A CLI wrapper enforces at the tool level: zero deliberation, zero token cost.
6 Mar 2026
Building My Own Consulting Toolkit Before Day One
Most consultants arrive at a new firm and learn their tools from colleagues. I tried something different.
6 Mar 2026
Skills as Behavioral Nudges: The Lightweight Alternative to Fine-Tuning
We fine-tune models with gradient descent. We nudge agents with skill files. Same goal, radically different cost.
7 Aug 2025
Claude Code Mobile is Better Than Desktop
Walking meetings, voice input, and location changes unlock cognitive advantages desktop workflows can't access.

Posts about workflow

The Confidence Stack Not all knowledge is equally trustworthy. Three tiers of validation — from 'a model said it' to 'it survived reality' — and why tracking the difference matters.

Mining Your LLM Your AI already knows things that would make it better at helping you. The trick is extracting that knowledge and making it permanent.

The CLI Boundary Which parts of an AI dev workflow can be wrapped in a CLI, and which can't — learned the hard way by building the wrong thing and measuring it.

The Black Box That Responds to Role Play An LLM can't feel accountability pressure. But structured role-play — simulated rejection, persona assignment, adversarial review — produces measurably better output. The mechanism is opaque; the effect is real.

The 'Are You Sure?' Loop AI's first 'I'm done' is almost never its best work. Simulated accountability pressure — just asking 'are you sure?' — surfaces blind spots that self-review misses.

Guardrails Beat Guidance Prompt instructions are suggestions. Hooks are constraints. One survives a model swap.

The Wrong Metric: Why I Stopped Switching AI Models Mid-Session Per-task model routing optimises cost per token. But at personal assistant scale, friction is the real cost.

Stop Asking Which AI Model Is Better. Ask Which Phase. The planning/execution split is more useful than any benchmark comparison.

The second pass finds more When red-teaming a document with multiple AI models, the second review — run on the edited version — consistently finds more than the first. Here's why, and what it means for how many rounds to run.

From Chatbots to Event Loops The shift from agents you summon to agents that watch. Enterprise AI workflows are becoming continuous loops — and the failure modes are different.

LLMs Are Better at Editing Than Writing Ask an AI to write from scratch and you get the average of the corpus. Give it something rough and it amplifies what's already there. The workflow implications are significant.

What It Actually Feels Like to Use AI for 80% of Your Work Not productivity. Something stranger — the cognitive texture of days when the bottleneck shifts from execution to articulation.

Benchmark Your Research Stack Running 10 real queries through 5 tools revealed that theoretical routing rules have systematic gaps — and the surprises were more useful than the confirmations.

The Queue Should Live Where Your Thoughts Live AI agent results should be push, not pull. The feedback loop should close on mobile. Most tools miss all three — not from ignorance, but because dashboards photograph better.

The Infra Trap Building tools to support your work can quietly become a substitute for the work itself.

The Queue That Texts You Back Personal AI infrastructure should report results to you, not wait for you to go looking. A small architecture shift changes the whole dynamic.

Eliminate the Reminder, Don't Schedule It When you catch yourself setting a reminder to check something later, that's usually a signal that a tool is failing to report what it should.

Your Tool Shouldn't Know What to Ignore Configuration that belongs to the data shouldn't live in the tool. .gitignore figured this out thirty years ago.

Rules Decay, Hooks Don't The difference between writing down a rule and making the system enforce it — illustrated by a 15-line hook.

I Made the AI Remind Me of My Own Blind Spots I kept missing things at the end of AI sessions. So I stopped relying on willpower and systematised the nudge instead.

CLIs Enforce Structure and Save Tokens — Not Just Discipline The instinct to add a rule to a skill file is usually the wrong abstraction. A CLI wrapper enforces at the tool level: zero deliberation, zero token cost.

Building My Own Consulting Toolkit Before Day One Most consultants arrive at a new firm and learn their tools from colleagues. I tried something different.

Skills as Behavioral Nudges: The Lightweight Alternative to Fine-Tuning We fine-tune models with gradient descent. We nudge agents with skill files. Same goal, radically different cost.

Claude Code Mobile is Better Than Desktop Walking meetings, voice input, and location changes unlock cognitive advantages desktop workflows can't access.

The Confidence Stack
Not all knowledge is equally trustworthy. Three tiers of validation — from 'a model said it' to 'it survived reality' — and why tracking the difference matters.

Mining Your LLM
Your AI already knows things that would make it better at helping you. The trick is extracting that knowledge and making it permanent.

The CLI Boundary
Which parts of an AI dev workflow can be wrapped in a CLI, and which can't — learned the hard way by building the wrong thing and measuring it.

The Black Box That Responds to Role Play
An LLM can't feel accountability pressure. But structured role-play — simulated rejection, persona assignment, adversarial review — produces measurably better output. The mechanism is opaque; the effect is real.

The 'Are You Sure?' Loop
AI's first 'I'm done' is almost never its best work. Simulated accountability pressure — just asking 'are you sure?' — surfaces blind spots that self-review misses.

Guardrails Beat Guidance
Prompt instructions are suggestions. Hooks are constraints. One survives a model swap.

The Wrong Metric: Why I Stopped Switching AI Models Mid-Session
Per-task model routing optimises cost per token. But at personal assistant scale, friction is the real cost.

Stop Asking Which AI Model Is Better. Ask Which Phase.
The planning/execution split is more useful than any benchmark comparison.

The second pass finds more
When red-teaming a document with multiple AI models, the second review — run on the edited version — consistently finds more than the first. Here's why, and what it means for how many rounds to run.

From Chatbots to Event Loops
The shift from agents you summon to agents that watch. Enterprise AI workflows are becoming continuous loops — and the failure modes are different.

LLMs Are Better at Editing Than Writing
Ask an AI to write from scratch and you get the average of the corpus. Give it something rough and it amplifies what's already there. The workflow implications are significant.

What It Actually Feels Like to Use AI for 80% of Your Work
Not productivity. Something stranger — the cognitive texture of days when the bottleneck shifts from execution to articulation.

Benchmark Your Research Stack
Running 10 real queries through 5 tools revealed that theoretical routing rules have systematic gaps — and the surprises were more useful than the confirmations.

The Queue Should Live Where Your Thoughts Live
AI agent results should be push, not pull. The feedback loop should close on mobile. Most tools miss all three — not from ignorance, but because dashboards photograph better.

The Infra Trap
Building tools to support your work can quietly become a substitute for the work itself.

The Queue That Texts You Back
Personal AI infrastructure should report results to you, not wait for you to go looking. A small architecture shift changes the whole dynamic.

Eliminate the Reminder, Don't Schedule It
When you catch yourself setting a reminder to check something later, that's usually a signal that a tool is failing to report what it should.

Your Tool Shouldn't Know What to Ignore
Configuration that belongs to the data shouldn't live in the tool. .gitignore figured this out thirty years ago.

Rules Decay, Hooks Don't
The difference between writing down a rule and making the system enforce it — illustrated by a 15-line hook.

I Made the AI Remind Me of My Own Blind Spots
I kept missing things at the end of AI sessions. So I stopped relying on willpower and systematised the nudge instead.

CLIs Enforce Structure and Save Tokens — Not Just Discipline
The instinct to add a rule to a skill file is usually the wrong abstraction. A CLI wrapper enforces at the tool level: zero deliberation, zero token cost.

Building My Own Consulting Toolkit Before Day One
Most consultants arrive at a new firm and learn their tools from colleagues. I tried something different.

Skills as Behavioral Nudges: The Lightweight Alternative to Fine-Tuning
We fine-tune models with gradient descent. We nudge agents with skill files. Same goal, radically different cost.

Claude Code Mobile is Better Than Desktop
Walking meetings, voice input, and location changes unlock cognitive advantages desktop workflows can't access.