Posts about AI
-
The $1 Billion Bet Against LLMs
One of the architects of modern deep learning just raised $1B on the thesis that token prediction can't reach real reasoning. Here's what he's proposing instead — and why it matters even if he's wrong.
-
The First Datapoint
An AI agent ran unsupervised for two days and found twenty improvements to another model's training. Not an AGI claim. A rate claim.
-
From Chatbots to Event Loops
The shift from agents you summon to agents that watch. Enterprise AI workflows are becoming continuous loops — and the failure modes are different.
-
What MCP Actually Changes for Enterprise AI
Not better function calling — decoupling. When tools expose MCP servers, any agent can compose any system freely. The heterogeneity problem becomes a configuration problem.
-
Language Is the Medium, Not the Purpose
We called them language models and spent years confused about why they could reason. The name stuck to the interface, not the mechanism.
-
LLMs Are Better at Editing Than Writing
Ask an AI to write from scratch and you get the average of the corpus. Give it something rough and it amplifies what's already there. The workflow implications are significant.
-
What It Actually Feels Like to Use AI for 80% of Your Work
Not productivity. Something stranger — the cognitive texture of days when the bottleneck shifts from execution to articulation.
-
The Calibration Trap
The comfort trap is about effort. This one is about epistemics — and it's harder to see.
-
The Comfort Trap
The right test for any AI interaction isn't 'did it help me?' but 'am I more capable after it?'
-
The Personalised System Era
AI coding agents didn't just make developers faster. They changed who gets to have a bespoke system.
-
Let the OS Schedule, Let Your Tool Dispatch
The moment I stopped building scheduling into my tools, everything got simpler.
-
Benchmark Your Research Stack
Running 10 real queries through 5 tools revealed that theoretical routing rules have systematic gaps — and the surprises were more useful than the confirmations.
-
Eliminate the Reminder, Don't Schedule It
When you catch yourself setting a reminder to check something later, that's usually a signal that a tool is failing to report what it should.
-
You Are the Bottleneck in Your Own Agentic Workflow
Adding more AI tools doesn't help if you're still the bus between them.
-
Where Rules Live
The difference between a rule that works and a rule that doesn't is usually not the content of the rule — it's where it lives.
-
When Better Is Worse
Upgrading to a more capable model made my tool sixty times slower. The lesson isn't about models — it's about the difference between capability and fit.
-
The Experiment Loop Without the GPU
Andrej Karpathy's autoresearch project is being read as a demo of what H100s can do overnight. It's actually a discipline for doing rigorous work on anything measurable.
-
Instructions Don't Enforce Behavior. Templates Do.
Why the structure of an output matters more than the instructions that produce it.
-
I Didn't Mean to Kill My Todo App
A coding assistant quietly made three productivity apps redundant. Not by replacing them — by making context collapse the boundaries between them.
-
What It Actually Takes to Run an AI Agent in a Bank
The resistance to AI agents in banking isn't mostly cultural. It's infrastructure — and the gap is more interesting than the politics.
-
AI Fixed My Perfectionism (Sort Of)
On why the blank page stopped being the hard part.
-
I Made the AI Remind Me of My Own Blind Spots
I kept missing things at the end of AI sessions. So I stopped relying on willpower and systematised the nudge instead.
-
AI Evals: Why Teams Build Metrics Before They've Read a Trace
Most teams build evaluators before reading a single trace. The sequence that actually works is the opposite: observe, categorise, then measure.
-
Backtest vs Operational Validation: The Control You Think You Have
A model control that's never fired in production isn't a control — it's a hypothesis. The gap between backtest and operational validation is invisible until someone asks.
-
AI Succeeds, Economy Breaks: The Displacement Loop Nobody Models
The standard AI economic models assume wage effects and retraining timelines. They don't model the feedback loop where successful AI deployment reduces the customer base that purchases AI-enabled products.
-
Why AI Assistants Make Us Dumber (And What Governance Should Do About It)
The cognitive offloading problem is real. The governance response mostly isn't. There's a specific mechanism at work, and it has a specific fix.
-
The Kutta Condition of AI: Engineering Ships Before Theory Catches Up
Aeronautics flew for decades before anyone could explain why wings worked. AI is in the same position. The engineering is ahead of the theory.
-
The Failure Mode of AI Advice Isn't Hallucination
The real failure mode is that the AI agrees with you. Here's the architecture that fixes it.
-
Building My Own Consulting Toolkit Before Day One
Most consultants arrive at a new firm and learn their tools from colleagues. I tried something different.
-
Three Crates Before Lunch
I published three Rust CLI tools to crates.io before noon — none existed at breakfast. The interesting part isn't the speed. It's that the bottleneck moved.