Terry Li

I run an autonomous AI system that works overnight — dispatching work waves, managing its own budget, and reviewing its own output quality using independent model families. Over the past few months, the governance architecture converged on something familiar: it looks exactly like managing a team.

The three layers

A good manager doesn’t review every piece of work. They operate at three layers:

Cadence reviews — regular check-ins at a fixed rhythm. Daily standups, weekly 1:1s, monthly retrospectives. These catch normal drift: are we on track? Are priorities right? Any blockers? The key property of cadence reviews is that they’re predictable. Everyone knows when they happen, and the cost of waiting for the next one is low.

Escalation channels — for things that can’t wait. The on-call page, the “urgent” Slack, the knock on the door. These exist specifically for silent failures and urgent anomalies — situations where a 7-day delay until the next cadence review would cause real damage.

Tiebreaking — the manager’s judgment on genuinely ambiguous decisions. Not “should we do this task?” (the team can decide) but “these two valid approaches conflict — which one?” The manager’s attention is spent only where it has maximum information value.

The same architecture, for AI agents

My autonomous system landed on exactly this structure:

Cadence reviews = skills. Daily taste calibration reviews, weekly system health checks, monthly retrospectives. User-initiated, predictable rhythm. The AI prepares a summary; I review at the scheduled time. Most governance happens here.
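The rhythm above can be sketched as a simple schedule check. This is a hypothetical illustration, not the system's actual implementation — the review names and intervals are assumptions drawn from the description:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CadenceReview:
    name: str
    interval_days: int  # predictable rhythm: the cost of waiting is low

# Hypothetical schedule mirroring the daily/weekly/monthly cadence above.
SCHEDULE = [
    CadenceReview("taste calibration", 1),
    CadenceReview("system health check", 7),
    CadenceReview("retrospective", 30),
]

def due_reviews(days_since_start: int) -> list[str]:
    """Return the cadence reviews scheduled for a given day."""
    return [
        r.name
        for r in SCHEDULE
        if days_since_start > 0 and days_since_start % r.interval_days == 0
    ]
```

On day 7 this yields both the daily calibration and the weekly health check; the point is that everything here is user-initiated and fires on a known clock, never as a surprise.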

Escalation channels = intelligent monitors. Background processes that watch structured logs and only alert on genuine anomalies. Not “your budget is at 15%” (I’ll see that at the next check-in) but “your budget is climbing at 5x normal rate” (that can’t wait). The test for whether something needs a monitor: would I notice if this went wrong without being told? If no, and it’s urgent, it needs a monitor.
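A rate-based monitor like the budget example can be sketched in a few lines. This is a minimal illustration under assumed inputs (cumulative spend sampled per period), not the monitor the system actually runs; the window and multiplier are placeholder thresholds:

```python
def spend_rate_alert(spend_history: list[float], window: int = 3,
                     multiplier: float = 5.0) -> bool:
    """Alert when the recent spend rate exceeds `multiplier` x the baseline rate.

    spend_history: cumulative spend per period, oldest first (assumed format).
    Absolute level is ignored on purpose — that's cadence-review territory.
    """
    if len(spend_history) < window + 2:
        return False  # not enough data to establish a baseline
    deltas = [b - a for a, b in zip(spend_history, spend_history[1:])]
    baseline = sum(deltas[:-window]) / len(deltas[:-window])
    recent = sum(deltas[-window:]) / window
    return baseline > 0 and recent > multiplier * baseline
```

The design choice matches the test above: a steadily depleting budget never pages anyone, while a sudden 5x acceleration does.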

Tiebreaking = cross-model review disagreements. Multiple AI models independently review the same output. When they agree, no human input needed. When they disagree, that disagreement surfaces to me as a decision. I’m the tiebreaker, not the reviewer.
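The routing rule is simple enough to state as code. A hedged sketch — the verdict labels and model names are invented for illustration, and real cross-model review would compare richer structured critiques than single labels:

```python
def route_review(verdicts: dict[str, str]) -> str:
    """Route cross-model review verdicts.

    verdicts: maps a (hypothetical) reviewer model name to its verdict label.
    Unanimous agreement resolves automatically; any disagreement escalates
    to the human as a tiebreak decision, never as a re-review.
    """
    distinct = set(verdicts.values())
    if len(distinct) == 1:
        return f"auto:{distinct.pop()}"
    return "escalate:human-tiebreak"
```

So `{"model_a": "pass", "model_b": "pass"}` closes without human input, while any split surfaces as exactly one decision to make.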

The management insight that transfers

The progression I went through with my AI system mirrors how trust develops on any team:

Week 1: Review everything. Read every output. Check every decision. This is onboarding — you don’t know the team’s judgment yet.

Month 1: Review at boundaries. The work is consistently good, so you stop reviewing drafts and start reviewing outputs. Hold at the send gate, not at every edit.

Month 3: Review only disagreements. The team’s judgment is calibrated. You intervene when there’s a genuine conflict, not when things are running smoothly.

Steady state: You see nothing when things are normal. Cadence reviews confirm health. Monitors catch failures. Your attention goes to strategy, not supervision.

This is the trajectory every enterprise will need to follow with AI agents. The question “how do we govern AI at scale?” has the same answer as “how do we manage a growing team?” — not by reviewing everything, and not by trusting blindly, but by building the right layers of oversight and earning trust through calibration.

The practical test

When a client asks “how should we govern our AI agents?”, ask them how they manage their best team:

  • Do they review every email that team sends? (No — they review at boundaries.)
  • Do they attend every meeting that team has? (No — they have cadence check-ins.)
  • Do they ignore the team entirely? (No — they have escalation channels.)
  • Do they make every decision for the team? (No — they tiebreak on genuinely hard calls.)

The governance architecture for AI agents is the one good managers already know. The tools are different — structured logs instead of standups, anomaly detectors instead of 1:1s, cross-model review instead of peer review — but the pattern is identical.

Where the analogy breaks

Human teams need monitoring for motivation — are people showing up, putting in effort, not cutting corners? AI agents don’t shirk. They either execute or fail. There’s no laziness, no office politics, no quiet quitting.

This makes AI governance structurally simpler. You don’t need incentive design, performance reviews, or culture-building. You need accurate specifications, reliable infrastructure, and quality gates. The failure modes are technical (wrong output, stale data, budget overrun), not behavioural (disengagement, misaligned incentives, interpersonal friction).

The governance patterns transfer from management. The motivational layer doesn’t.

The companies that will govern AI agents well are the ones that already manage teams well. The ones that will struggle are the ones that either review everything (and can’t scale) or trust blindly (and get burned). The middle path — cadence reviews, intelligent escalation, human tiebreaking — is the one that works for humans and will work for agents.