Posts about production
-
The Production Gap: Why AI Pilots Fail
The consulting question isn't how to build AI — it's how to get it past the 62% graveyard.
-
Expansion, Not Speedup
The real ROI of AI coding isn't doing the same work faster. It's doing work that wasn't worth doing before.
-
The Trust Spectrum
Peter Steinberger stopped reviewing AI-generated code entirely. That works for indie software. In regulated environments, it doesn't. Here's how to think about where you sit.
-
Traces Are the New Debugger
When behaviour emerges from both code and model responses, reading source files isn't enough. You debug by examining execution traces.
-
AI Evals: Why Teams Build Metrics Before They've Read a Trace
Most teams build evaluators before reading a single trace. The sequence that actually works is the opposite: observe, categorise, then measure.
-
Backtest vs Operational Validation: The Control You Think You Have
A model control that's never fired in production isn't a control — it's a hypothesis. The gap between backtest and operational validation is invisible until someone asks.
-
Production AI vs Demos: The Intent Classification Reality Check
Building AI systems that work in the real world requires thinking beyond the demo. Here's what actually matters when users depend on your models.