The Search-and-Replace Test for AI Governance
6 min read
Every large organisation building agentic AI is publishing a principles list. They tend to have ten to fifteen items. They cover inventory and registration, least privilege, observability, change management, data minimisation, resilience, and cost control. They read well. They feel comprehensive. And most of them have nothing to do with agents.
There is a simple test. Take any principle on the list and replace the word “agent” with “application.” If the principle still makes sense — if it is still true, still important, still something your organisation should do — then it is not an agentic AI principle. It is an enterprise IT governance principle that someone has relabelled.
“Register and inventory everything.” Yes. This is true for applications, APIs, databases, and cloud resources. It has been true for twenty years. Adding “agent” to the sentence does not make it a new idea.
“Least privilege and strong identity.” Yes. This is IDAM. It predates machine learning entirely. Giving an agent a unique identity with scoped permissions is exactly what you should do, and it is exactly what you should have been doing for service accounts since the early 2000s.
“Full observability.” Yes. Logging, tracing, alerting. This is what your SRE team already builds. The word “agent” in the requirement does not change the engineering.
“Governed change and lifecycle.” Yes. Version control, change approvals, rollback plans. This is ITIL. It is older than most of the people writing these principles lists.
Run the test across a typical twelve-principle list and nine or ten will survive the substitution unchanged. They are good principles. They are necessary. They are also table stakes that any well-run technology organisation should already have in place, and presenting them as novel agentic governance is misleading. It tells teams that the hard work is building registries and logging pipelines, when in fact the hard work has not started yet.
The principles that fail the search-and-replace test — the ones that stop making sense when you swap “agent” for “application” — are the ones that matter. There are roughly five of them, and they all trace back to the same root: agents produce non-deterministic output.
The first is that testing becomes statistical. You cannot unit-test a non-deterministic system the way you test a deterministic one. A traditional application either returns the correct value or it does not. An agent might return a defensible answer, a subtly wrong answer, or a confidently fabricated answer, and the only way to know the distribution is to run the evaluation hundreds of times and measure. Behavioural testing for agents is closer to clinical trials than to integration tests. Nobody writes “test your application” as a governance principle because the methodology is obvious. For agents, the methodology is genuinely unsolved, and pretending it is just “testing” hides the gap.
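The shape of this is easier to see in code. The sketch below is a minimal statistical harness under stated assumptions, not any particular eval framework: `run_agent`, `grade`, and the scenario string are all hypothetical placeholders. The only real content is that the pass rate comes back as a sample statistic with a confidence interval, not a boolean.

```python
# Minimal statistical eval sketch. run_agent() and grade() are hypothetical
# stand-ins for a real agent call and a real grader (rubric, reference
# check, or judge model) -- not an existing framework's API.
import math
import random


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for an observed pass rate."""
    if trials == 0:
        return (0.0, 1.0)
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    margin = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (centre - margin, centre + margin)


def run_agent(scenario: str) -> str:
    """Placeholder for the real agent call -- non-deterministic by design."""
    return random.choice(["defensible answer", "subtly wrong", "fabricated"])


def grade(answer: str) -> bool:
    """Placeholder grader for whether a single run was acceptable."""
    return answer == "defensible answer"


def evaluate(scenario: str, trials: int = 300) -> None:
    passes = sum(grade(run_agent(scenario)) for _ in range(trials))
    low, high = wilson_interval(passes, trials)
    print(f"{scenario}: {passes}/{trials} passed (95% CI {low:.2%} - {high:.2%})")


if __name__ == "__main__":
    evaluate("summarise contract and flag unusual clauses")
```

The output is a distribution you can set a threshold against, and rerun when the model or prompt changes, which is the clinical-trial framing in practice.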
The second is that controls must be probabilistic. A firewall is binary — traffic is allowed or blocked. A content filter for an LLM works most of the time. Risk acceptance for a traditional control means accepting the residual risk that the control might not be implemented correctly. Risk acceptance for a probabilistic control means accepting that the control is implemented correctly and will still fail at some rate. One guardrail is a suggestion. Three independent guardrails with measured effectiveness rates are a control. This is a different control design discipline, and you cannot learn it by renaming your existing control framework.
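A rough illustration of what "measured effectiveness rates" buys you, assuming each guardrail's miss rate has been measured offline against a labelled attack corpus. The guardrail names and rates are illustrative, not real products or benchmarks, and the independence assumption is exactly the thing you have to defend in the risk acceptance.

```python
# Sketch of layered probabilistic controls with measured miss rates.
from dataclasses import dataclass


@dataclass
class Guardrail:
    name: str
    measured_miss_rate: float  # fraction of known-bad inputs it fails to catch


def residual_miss_rate(guardrails: list[Guardrail]) -> float:
    """Residual rate if guardrails fail independently: the product of miss rates.
    Correlated failures make the true figure worse, so treat this as a floor."""
    rate = 1.0
    for g in guardrails:
        rate *= g.measured_miss_rate
    return rate


layers = [
    Guardrail("input classifier", measured_miss_rate=0.08),
    Guardrail("output policy filter", measured_miss_rate=0.12),
    Guardrail("human review on high-impact actions", measured_miss_rate=0.05),
]

print(f"residual miss rate: {residual_miss_rate(layers):.4%}")
# One layer missing 8% of attacks is a suggestion. Three independent layers
# take the residual rate to roughly 0.05% -- a number you can actually accept.
```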
The third is that the attack surface is natural language. Traditional systems separate code from data at the syntactic level, which is why parameterised queries eliminate SQL injection. Language models process instructions and data in the same channel. A prompt injection is semantically indistinguishable from a legitimate instruction. Simon Willison has been documenting this since 2022, and his conclusion after four years of work remains unchanged: there is no architectural pattern that resolves this completely. Delimiters will not save you. System prompts compete with injections for the model’s attention and lose when the injection is well-crafted. The most promising direction — deterministic control planes like CaMeL that treat the LLM as untrusted and mediate all tool calls externally — is still research, not production infrastructure. Any principle that says “secure by default” without confronting this is borrowing confidence from a domain where the separation exists and applying it to one where it does not.
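To make the "deterministic control plane" idea concrete, here is a toy sketch in the spirit of CaMeL-style designs, though it is not that system's actual architecture. The tool names and policy are hypothetical; the point is that the model only proposes tool calls, and the layer that decides what executes is deterministic code that never treats untrusted text as instructions.

```python
# Toy mediation layer: the model proposes, a deterministic policy disposes.
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    tool: str
    args: dict


@dataclass
class Policy:
    allowed_tools: set[str]
    egress_tools: set[str] = field(default_factory=set)  # tools that can move data out

    def authorise(self, call: ToolCall, touched_untrusted_content: bool) -> bool:
        if call.tool not in self.allowed_tools:
            return False
        # Structural defence: once untrusted content is in context, block any
        # tool that communicates externally, regardless of what the prompt says.
        if touched_untrusted_content and call.tool in self.egress_tools:
            return False
        return True


policy = Policy(
    allowed_tools={"read_document", "search_index", "send_email"},
    egress_tools={"send_email"},
)

proposed = ToolCall("send_email", {"to": "attacker@example.com", "body": "..."})
print(policy.authorise(proposed, touched_untrusted_content=True))  # False
```

Note that nothing in the policy inspects the prompt for injection patterns; it constrains what the agent can do once untrusted content is present, which is the only kind of guarantee that survives a well-crafted injection.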
The fourth is that observability must include reasoning. Logging the inputs and outputs of a traditional application gives you a complete audit trail. Logging the inputs and outputs of an agent gives you the start and end of a story with the middle missing. The agent selected tools, planned steps, revised its approach, and made judgment calls between the input and the output. If your observability does not capture the reasoning trace — the tool calls, the planning decisions, the confidence signals — you cannot audit the agent and you cannot debug it when it fails. This is not “monitoring.” It is a different kind of telemetry that most platforms do not yet provide natively.
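The sketch below shows roughly what that telemetry looks like. It is not a vendor schema or a standard, just a hypothetical minimum structure for reconstructing the middle of the story: planning steps, tool calls, revisions, and confidence signals, in order.

```python
# Sketch of a reasoning-trace record -- the events between input and output.
import json
import time
from dataclasses import dataclass, asdict, field


@dataclass
class TraceEvent:
    step: int
    kind: str            # "plan", "tool_call", "tool_result", "revision", "final"
    detail: dict
    timestamp: float = field(default_factory=time.time)


@dataclass
class AgentTrace:
    run_id: str
    events: list[TraceEvent] = field(default_factory=list)

    def record(self, kind: str, **detail) -> None:
        self.events.append(TraceEvent(len(self.events), kind, detail))

    def to_json(self) -> str:
        return json.dumps([asdict(e) for e in self.events], indent=2)


trace = AgentTrace(run_id="run-001")
trace.record("plan", goal="summarise quarterly report", steps=["fetch", "extract", "draft"])
trace.record("tool_call", tool="fetch_document", args={"doc_id": "Q3-report"})
trace.record("revision", reason="table extraction failed, retrying with OCR")
trace.record("final", confidence="medium", output_length=412)
print(trace.to_json())
```

Input/output logging gives you the first and last events of that trace; auditing and debugging need everything in between.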
The fifth is that autonomy tiering must be based on capability, not intent. Traditional risk tiering asks what the application is designed to do. Agentic risk tiering must ask what the agent is able to do — what tools it can access, what data it can read, what actions it can take — regardless of what its developers intended. Willison’s lethal trifecta names the structural version of this: an agent that simultaneously accesses private data, processes untrusted content, and can communicate externally is dangerous regardless of its stated purpose, because any two of those three are manageable but all three together create an exfiltration channel. An agent designed to summarise documents but granted write access to a production database is a high-risk agent, no matter what its use case description says. The risk follows the capability envelope, not the stated purpose. This is a fundamentally different approach to classification, and it is the one that most governance frameworks have not yet adopted.
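A capability-based classifier is small enough to sketch. The tier labels and capability flags below are illustrative rather than a real framework, but they show the shift: classification reads the capability envelope, and the use-case description never appears as an input.

```python
# Sketch of capability-based risk tiering, including the trifecta check.
from dataclasses import dataclass


@dataclass
class CapabilityEnvelope:
    reads_private_data: bool
    processes_untrusted_content: bool
    communicates_externally: bool
    writes_to_production: bool


def risk_tier(env: CapabilityEnvelope) -> str:
    trifecta = (env.reads_private_data
                and env.processes_untrusted_content
                and env.communicates_externally)
    if trifecta or env.writes_to_production:
        return "high"     # exfiltration channel or direct blast radius
    if env.reads_private_data or env.communicates_externally:
        return "medium"
    return "low"


# The "document summariser" with write access to production is high risk,
# whatever its stated purpose.
summariser = CapabilityEnvelope(
    reads_private_data=True,
    processes_untrusted_content=True,
    communicates_externally=False,
    writes_to_production=True,
)
print(risk_tier(summariser))  # high
```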
These five are hard precisely because they have no precedent in traditional IT governance. You cannot copy a control from your existing framework, change the label, and deploy it. You have to design new controls, test them against non-deterministic systems, and accept that some of them will be probabilistic rather than binary. That is uncomfortable for organisations accustomed to controls that either work or do not.
The search-and-replace test is not a criticism of anyone publishing principles lists. The table-stakes principles need to exist, and many organisations do not have them in place yet. But calling them “agentic AI governance” sets the wrong expectation. It implies that once you have checked twelve boxes, you have governed your agents. You have not. You have governed them the way you govern applications, which is necessary and insufficient. The genuinely agentic principles — the ones that fail the substitution test — are where the real design work begins.
Related: [AI Controls Architecture](AI Controls Architecture) · [The Risk Without an Engineering Solution](The Risk Without an Engineering Solution) · [Governance Is a Design Problem](Governance Is a Design Problem)