Teaching agents to learn from losing

Most agent setups make the same mistake twice — or twenty times. The most valuable thing I built into the dojo wasn't a skill. It was a loop that turns every correction into a rule the agent can't forget.

Every agent has a goldfish problem. You correct it, it fixes the issue, it thanks you — and then in the next session, freed of all memory of the correction, it does the exact same thing again. The intelligence is real. The amnesia is total. And amnesia, repeated across hundreds of sessions, is its own kind of incompetence.

This was the failure mode that bothered me most, because it’s the one humans don’t have. A junior engineer who breaks the build by forgetting to run migrations breaks it once. The embarrassment writes the lesson into them. By the third time, it’s reflex. The whole value of experience is that losses leave a mark.

Agents don’t get that for free. So I built it.

The loop that turns corrections into rules

The mechanism is a closed loop with five stages, and the discipline is in treating every correction as data rather than a one-off annoyance.

1. Observe

Every time the agent gets corrected, it logs a structured lesson — not a vague note, but a tagged entry recording the type of error, the root cause, the fix that worked, and the rule that would have prevented it. The structure matters. A pile of freeform regrets is useless; a queryable record of tagged failures is an asset.

2. Store

Those lessons accumulate in one place, tagged and countable. The system tracks metrics across them: how many lessons total, which patterns recur, how often a lesson actually leads to a change. The agent’s failures become a measurable dataset about its own weaknesses.

3. Amend

This is the part I’m proudest of. When the same kind of failure shows up three or more times, that’s no longer bad luck — it’s a pattern, and a pattern deserves a permanent rule. At that threshold the system proposes promoting the recurring lesson into the agent’s standing instructions. A repeated mistake gets written into the agent’s “muscle memory” so it physically can’t keep making it.

4. Evaluate

A rule isn’t sacred just because it exists. The loop tracks behavior before and after each new rule. If the rule is working, the corresponding failures stop appearing. If they don’t, the rule is wrong, and being wrong is information too.

5. Rollback

Failed fixes get reverted immediately. Rules that don’t earn their place get revised or removed. The system is allowed to un-learn things that turned out to be mistakes — which is what keeps it from calcifying around a bad early guess.

Why the threshold matters

The “three or more” rule is doing quiet but important work. The naive version of self-improvement is to turn every single correction into a permanent rule. That sounds rigorous and is actually a disaster — you end up with a bloated rulebook full of one-off overreactions, half of them contradicting each other, all of them eating context the agent needs for actual thinking.

The threshold is a filter against overfitting. One failure might be a fluke, a weird edge case, a bad prompt on my end. Three failures of the same kind is signal. By only promoting patterns that clear the bar, the system stays lean — it learns the lessons that recur and lets the noise wash out. It’s the difference between a wise engineer and a traumatized one. Both have been burned. Only one drew the right general lesson from it.

The goal isn’t an agent that never fails. It’s an agent that never fails the same way twice.

Memory as infrastructure

What I find genuinely interesting here is that the agent’s intelligence barely changes through any of this. The underlying model is the same on session one and session one hundred. What changes is the environment it operates in — the accumulated, curated, battle-tested set of rules that wrap around it. The agent gets more reliable not because it got smarter, but because its context got wiser.

That reframes where the value lives. We spend enormous energy chasing better models. But a meaningful share of real-world agent reliability isn’t in the model at all — it’s in the memory and governance layer we build around it. That layer is portable, inspectable, and improvable on a timescale you control, rather than one you wait on a vendor for.

This is the same principle I work on at enterprise scale, where it goes by heavier names — observability, governance, continuous evaluation. An enterprise AI platform that can’t observe its own failures and feed them back into its guardrails isn’t a platform; it’s a liability with good marketing. The mechanics differ. The conviction is identical: a system that can’t learn from its losses will keep paying for them.

The agents we have are already capable enough to be useful and forgetful enough to be frustrating. The fix isn’t waiting for a model that remembers. It’s building the memory yourself — a loop that catches every loss, finds the patterns, and writes them somewhere the agent can’t ignore. Train the agent to learn from losing, and it stops handing you the same defeat on repeat.

The self-improvement loop, in full

The lessons log, the pattern scanner, and the amendment workflow — open-source in the dojo.

View on GitHub →