Posts
RSS feed- The Skill Is Knowing What Matters
The bottleneck in a world of AI tools isn't crafting the output — it's knowing which output is worth crafting.
- Act-on-Receipt: The Third Task Class
Most task systems are binary, but a third class exists — tasks triggered by external notifications — and managing them like a backlog item is the wrong move entirely.
- Push Not Pull
AI agents that require you to go looking for their results aren't agents — they're automation with better UX. The loop closes when results arrive, not when you remember to check.
- The Human Bus Problem
Adding more AI tools doesn't make you faster if you're still the junction between every agent step.
- The Identification Problem
Having great AI delegation tools and not using them isn't a tool problem — it's a pattern recognition problem, and that distinction changes everything.
- The Last 10% Is the Feedback Loop
The execution layer of an AI system is only half the infrastructure — the reporting layer is what determines whether anyone acts on the results.
- The Session Boundary Is Why You Still Don't Have AI Agents
The gap between AI assistants and AI agents isn't about reasoning capability — it's about whether the thing can survive your laptop closing.
- Agentic AI in Production Looks Like a Workflow
The gap between 'agentic AI' hype and what actually ships in production turns out to be a workflow — and that's a feature, not a failure.
- Shifting Priors Is Not Finding Truth
An experiment with AI deliberation revealed something uncomfortable: accumulating confident opinions feels like convergence on truth, but isn't.
- The Deliberation Format Is the Product
I ran an experiment to find where multi-model deliberation adds value. The answer surprised me: it's the structured format, not the model diversity.
- The System for Checking Is Not the Checking
On the difference between eliminating friction and eliminating anxiety — and how to know when you've crossed the line.
- The Wrong Metric: Why I Stopped Switching AI Models Mid-Session
Per-task model routing optimises cost per token. But at personal assistant scale, friction is the real cost.
- Cross-Cutting Is Just Another Word for Optional
In AI agent architecture, calling something a 'cross-cutting concern' without naming an owner and a gate is just a polite way of saying nobody owns it.
- Stop Asking Which AI Model Is Better. Ask Which Phase.
The planning/execution split is more useful than any benchmark comparison.
- The second pass finds more
When red-teaming a document with multiple AI models, the second review — run on the edited version — consistently finds more than the first. Here's why, and what it means for how many rounds to run.
- LLM evals aren't data science
Evaluating LLM systems requires judgment, not statistics. That shifts who's qualified to do it — and where the gap is in most organisations.
- Progressive disclosure in MCP tools
When building MCP servers, search should return scannable summaries — not full content. Let the model decide what to read.
- RAG Solved the Wrong Problem
The retrieval pipeline was built for systems that couldn't reason about their own information needs. Agents can.
- This Year's DeepSeek
An open-source AI agent framework became the fastest-growing project in GitHub history — mostly in China. The pattern is the same as last year. So is the security panic.
- Enterprise AI Has a Plumbing Problem, Not a Model Problem
Most enterprises are optimising the wrong variable. The gap between 5% and 40% agent adoption won't be closed by better models.
- The Accidental Life OS
I spent an afternoon researching AI tools for personal life management. The conclusion was that I should stop looking.
- The $1 Billion Bet Against LLMs
One of the architects of modern deep learning just raised $1B on the thesis that token prediction can't reach real reasoning. Here's what he's proposing instead — and why it matters even if he's wrong.
- The First Datapoint
An AI agent ran unsupervised for two days and found twenty improvements to another model's training. Not an AGI claim. A rate claim.
- From Chatbots to Event Loops
The shift from agents you summon to agents that watch. Enterprise AI workflows are becoming continuous loops — and the failure modes are different.
- What MCP Actually Changes for Enterprise AI
Not better function calling — decoupling. When tools expose MCP servers, any agent can compose any system freely. The heterogeneity problem becomes a configuration problem.
- Language Is the Medium, Not the Purpose
We called them language models and spent years confused about why they could reason. The name stuck to the interface, not the mechanism.
- LLMs Are Better at Editing Than Writing
Ask an AI to write from scratch and you get the average of the corpus. Give it something rough and it amplifies what's already there. The workflow implications are significant.
- Consulting Is Mostly About Reducing Uncertainty
Clients hire consultants to solve problems. What they're actually paying for is the reduction of a particular feeling. The distinction matters.
- The Case Against Knowledge Management Systems
Most PKM tools are procrastination with better aesthetics. The problem isn't the software — it's that filing a note feels like understanding it.
- What It Actually Feels Like to Use AI for 80% of Your Work
Not productivity. Something stranger — the cognitive texture of days when the bottleneck shifts from execution to articulation.