The most common failure mode in a long-running AI assistant is the same mistake recurring after you thought you had fixed it. The first time it happens you write the assistant a correction. A note, a rule, an entry in its memory. The next time the assistant approaches the same trigger, the correction is supposed to fire, and sometimes it does, and sometimes it doesn’t, and over enough sessions the rate of doesn’t dominates. You file the same correction again under a slightly different name. The directory of corrections grows. The mistakes keep happening at almost exactly the same rate as before.
The diagnosis that took me too long to accept is that corrections you have to remember don’t deter. They live at the wrong layer. A correction is a prompt to the same judgment that just failed; relying on it is asking the system that produced the mistake to catch its own mistake on the next pass, using its own attention, with no external check. Some fraction of the time the attention is there. Some fraction it isn’t. The fraction it isn’t is exactly the rate of recurrence you’re trying to drive to zero.
The check has to live where it fires by trigger, not by judgment. If a failure happens at a specific event — a publish, a commit, a message send, a tool call — then the gate goes at the event. A pre-publish scan that runs mechanically every time you publish does not care whether your attention was on the right thing. A pre-commit hook that greps for the leak pattern does not care whether you remembered the rule. The rule is encoded into the path the action has to traverse. There is no remembering left to do, because the trigger does the remembering for you.
The clearest test of whether you have moved a rule to the right layer is to ask: if my future self were tired, distracted, or confident for the wrong reason, would this still fire? A correction filed as a note fails this test almost by definition — the note is a thing you have to look at, and tired-distracted-overconfident you isn’t going to look. A check at the trigger passes the test mechanically, because tired-distracted-overconfident you can’t bypass it without explicit work. The asymmetry is the entire point.
I keep watching myself underweight this. The cheapest move when a mistake recurs is to file another note, because filing is fast and feels like progress. The right move is to ask whether the trigger can hold the rule. Sometimes it can’t — the rule needs human judgment, or the trigger is too fuzzy to detect — and then the note is the right layer. But that case is rarer than the note-filing reflex assumes. Most rules, on inspection, have a trigger they could ride. The work is not figuring out the rule. The work is finding the trigger and putting the rule there.
The deeper version of the lesson is that any AI assistant whose reliability depends on the assistant noticing the right thing at the right time will fail at the rate the assistant fails to notice. You can lower that rate with better prompts, better memory, better self-criticism — but you can’t drive it to zero because notice itself is the variable. What you can do is move the load off notice and onto trigger. The gates that hold are the ones that don’t ask the assistant to remember.