Topic

experiment

3 essays on this topic.

  1. The Treadmill and the Loop

    Getting ahead of AI best practices is a treadmill. The durable skill is testing assumptions faster than they expire.

  2. Personas Exploit a Blind Spot in LLM-as-Judge Evaluation

    Persona prompting generates the exact type of hallucination that automated LLM judges reward as 'depth.' Two experiments, blind evaluation, and a fact-check that flipped the finding.

  3. The Persona Paradox in AI Agent Teams

    Personas hurt on structured tasks and help on judgment-heavy ones. Two experiments, blind evaluation, frontier models. The effect is task-dependent, not binary.