The frontier is no longer the back office

24 May 2026 · 5 min read

When banks first designed AI controls, the implicit target was the back office. The vendor demos, the proofs of concept, the early governance papers all assumed the model was replacing a junior. The reviewer was a senior. The risk was that a fast clerk would make a wrong decision a person would have caught. That framing is now wrong, and Ken Griffin said so out loud last week.

Speaking at the Stanford Graduate School of Business in May, the Citadel founder described work that used to take teams of master’s and PhD finance professionals over weeks or months being done by AI agents in hours or days. He was careful to specify: “These are not mid-tier white collar jobs. These are extraordinarily high skilled jobs being automated by agentic AI.” He went home one Friday, he said, fairly depressed. Griffin has spent most of the last two years calling AI garbage. The reversal is the signal, not the quote.

What changes when the frontier moves up the org chart is not the cost curve. It is where the controls have to sit. Controls designed for the back office assume a senior reviewer above the work, a procedure document around it, and a regulator who can read the procedure and tell whether the bank knew what it was doing. The senior reviewer is the load-bearing piece. Their judgment is what makes the whole thing safe enough. The model is permitted to draft, classify, or process; the human signs.

That whole architecture rests on the senior reviewer being the smarter party. When the agent is doing work that the senior reviewer used to do, sometimes faster and on a larger sample, the relationship inverts. The reviewer is no longer pattern-matching the output against a model in their head; the agent’s model is the bigger one. What is left for the reviewer is acceptance criteria, frame, and judgment about whether the question was the right question. That is real work, but it is not the work the existing rulebook measures.

This is the part most bank AI programmes have not yet absorbed. The rulebook still reads like it was written for the back-office substitution case. Tiering models by impact, requiring evidence of training data lineage, demanding bias testing, asking for an explainability statement: these are all useful, and none of them touch the actual failure mode of an agent that produces a strategy memo a PhD would have written in three weeks. The failure mode for that kind of work is not bias or hallucination in the narrow sense. It is a plausible answer to a question that should not have been answered that way.

Griffin’s depression is, I think, where the conversation about agentic governance has to start. Not because his emotional state matters, but because his observational position is rare. He is a sober capital allocator who is on the public record as a skeptic, watching the work happen inside his own four walls. That is the kind of evidence that survives in a board paper. It is also the kind of evidence that makes the existing way of governing AI look small. That approach is busy measuring whether the model can be trusted to do the task. The interesting question is whether the institution can be trusted to set the task.

If that sounds like a governance question, it is. But it is also a product question, and that is the part banks underweight. Where the controls sit is where the work feels them. A workflow that requires the analyst to name the standard their output must meet, the evidence that would make it acceptable, and the failure mode they are guarding against produces a different output than a workflow that just asks for a memo. The control is not a checkpoint added to the workflow. The control is the workflow’s shape.
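
To make that concrete, here is a minimal sketch of a workflow whose shape is the control. It is illustrative only: `TaskBrief`, its field names, and `submit` are hypothetical, invented for this example, and do not describe any bank's system or any real agent framework. The point is simply that the request cannot be built unless the analyst has named the standard, the evidence, and the failure mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskBrief:
    """Hypothetical request an analyst completes before an agent starts work."""

    question: str
    standard: str      # the standard the output must meet
    evidence: str      # the evidence that would make the output acceptable
    failure_mode: str  # the failure mode the analyst is guarding against

    def __post_init__(self) -> None:
        # Reject the brief if any field is blank, so the control is part of
        # the request itself rather than a checkpoint added afterwards.
        missing = [name for name, value in vars(self).items() if not value.strip()]
        if missing:
            raise ValueError("brief is incomplete: " + ", ".join(missing))


def submit(brief: TaskBrief) -> str:
    """Stand-in for handing work to an agent; here it only formats the brief."""
    return (
        f"Question: {brief.question}\n"
        f"Standard: {brief.standard}\n"
        f"Evidence: {brief.evidence}\n"
        f"Guarding against: {brief.failure_mode}"
    )
```

A workflow that “just asks for a memo” corresponds to a brief with only `question` filled in, and in this sketch that brief raises a `ValueError` before any work begins. Checking that a field is non-empty is of course the weakest possible test; whether the named standard is the right one remains the reviewer’s judgment.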

The practical implication is that the back-office-substitution rulebook is not wrong; it is just no longer where the action is. The banks that will look composed in three years are the ones whose AI governance treats the senior analyst’s job, the one Griffin watched compress from months to days, as a first-class subject of design. Not just for safety, though there is plenty of safety in it. For the simpler reason that the work itself has moved, and controls that stay at the old layer will end up regulating the part of the bank that no longer matters.

Griffin is right that this is dramatic. He is right that it is fast. He may also be right that it is depressing in the moment of seeing it. The question worth carrying out of his talk is not whether the agents are getting better. They are. The question is whether the institutions that use them are designing controls for the layer where the work now lives, or for the layer where the work used to live.