The red thread problem: how skills, agents and governance rescue TOGAF traceability in agentic delivery

Agentic delivery generates plausible artifacts at every architecture layer with no enforced lineage. Here's how to keep the TOGAF red thread unbroken when agents are doing the work.

Every enterprise architect knows the red thread. It’s the traceable line that runs from a business driver, down through functional and non-functional requirements, into security, integration and technical controls, where every layer justifies the one below it and nothing exists without a parent reason. TOGAF’s entire value is that thread. It’s also the deliverable everyone quietly skips: the requirements traceability matrix, written at the end, backfilled with plausible-looking lineage that nobody actually verified.

Agentic delivery makes this worse before it makes it better. Point a fleet of AI agents at an architecture engagement and you get something seductive and dangerous: beautiful, fluent artifacts at every layer, generated in minutes, with no enforced link between them. A gorgeous technical design that no one can trace back to a business requirement. A security control that mitigates a threat nobody named. The agents are productive. The thread is gone.

So the question for anyone serious about letting agents into regulated delivery is not “can agents produce architecture artifacts?” — they obviously can. It’s: how do you keep the red thread unbroken when agents are doing the work?

The answer comes from a pattern I’ve been building toward in copilot-agents-dojo: three separate concerns — skills, agents, and governance — that map across the TOGAF cascade rather than onto it.

The mistake everyone makes first

The instinct is to treat the three concerns as the layers themselves. A business agent, a functional agent, a technical agent, stacked up. This collapses immediately, because governance then has nowhere to stand: it becomes just another agent with opinions, and one agent’s opinion can’t be the referee for another’s.

The trick is orthogonality. Skills, agents, and governance each do a different job at every layer:

Agents own the phases. Each TOGAF phase has a natural persona — a Business Architect for Phase B, a Solution and Data Architect for Phase C, a Platform Architect for Phase D, a Security Architect threading through all of them. The discipline of separating who elicits a requirement from who designs against it is what stops an agent from quietly inventing a requirement to justify a design it already wanted to build. (This is why, in the dojo, the Business Analyst role is decomposed across TPM and Architect rather than collapsed into one — TOGAF is the formal justification for that instinct.)
Skills own the derivation procedures. “How to elicit a non-functional requirement and express it as a measurable SLO” is a skill, not a personality. “How to derive a security requirement from a business risk appetite” is a skill. These are reusable across agents and engagements. A shared requirements-elicitation skill is the TOGAF Requirements Management discipline made executable — and it’s what stops elicitation from degrading into passive transcription.
Governance owns the thread itself. This is the part nobody else has, and it’s the whole point. The gate between phases — TOGAF’s own concept — is where governance lives. The non-negotiable rule: no artifact at layer N+1 may persist unless it carries a verified link to a parent at layer N. A technical requirement with no functional parent fails. A functional requirement with no business driver fails. A security control with no named threat fails. Traceability stops being a document and becomes a precondition for existence.
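
To make the non-negotiable rule concrete, here is a minimal sketch of a store whose only write path is the gate. Everything in it is an assumption for illustration: the names GatedStore, Artifact and GateError, the three-layer ordering, and the reading of "verified link" as "the parent is already persisted exactly one layer up". A real gate would check more than that.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Simplified layer order, assumed for illustration only.
LAYERS = ["business", "functional", "technical"]


class GateError(Exception):
    """Raised when an artifact fails the traceability gate."""


@dataclass(frozen=True)
class Artifact:
    id: str
    layer: str
    parent_id: Optional[str] = None


class GatedStore:
    """Artifacts can only enter through persist(), which applies the gate."""

    def __init__(self) -> None:
        self._items: Dict[str, Artifact] = {}

    def persist(self, artifact: Artifact) -> None:
        depth = LAYERS.index(artifact.layer)  # ValueError on an unknown layer
        if depth > 0:
            # Layer N+1 needs a parent that already passed the gate at layer N.
            parent = self._items.get(artifact.parent_id or "")
            if parent is None:
                raise GateError(f"{artifact.id}: parent is missing or never persisted")
            if LAYERS.index(parent.layer) != depth - 1:
                raise GateError(f"{artifact.id}: parent must sit exactly one layer up")
        self._items[artifact.id] = artifact

    def __contains__(self, artifact_id: object) -> bool:
        return artifact_id in self._items


store = GatedStore()
store.persist(Artifact("BR-1", "business"))
store.persist(Artifact("FR-1", "functional", parent_id="BR-1"))

try:
    store.persist(Artifact("TR-9", "technical"))  # orphan: no functional parent
except GateError as err:
    print(f"rejected: {err}")
```

The design choice that matters is that there is no second way in: an orphan is not stored and flagged later, it simply never exists in the store.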

How the thread stays unbroken, concretely

Walk a real cascade through it.

A business driver enters at Phase A/B — say, “reduce bilingual document-processing cost for a regulated mining client.” The Business Architect agent, running the elicitation skill, produces a business requirement with a unique ID and a named owner. The gate checks: is it measurable, is it owned, is it ratified? Only then does it persist.
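
The three checks at this first gate can be sketched as a plain function. This is an illustration under assumptions, not the dojo's implementation: the field names (metric, target, ratified_by), the owner value and the idea that "measurable" means "has a metric and a target" are all invented here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BusinessRequirement:
    id: str
    statement: str
    owner: Optional[str] = None        # named accountable person
    metric: Optional[str] = None       # what gets measured
    target: Optional[float] = None     # value the metric must reach
    ratified_by: Optional[str] = None  # human who signed it off


def gate_failures(req: BusinessRequirement) -> List[str]:
    """Return the reasons a requirement may not persist; empty means it passes."""
    failures = []
    if not req.metric or req.target is None:
        failures.append("not measurable: metric and target are required")
    if not req.owner:
        failures.append("not owned: no named owner")
    if not req.ratified_by:
        failures.append("not ratified: no human sign-off recorded")
    return failures


# A draft with an owner but no metric and no sign-off (hypothetical values).
draft = BusinessRequirement(
    id="BR-1",
    statement="Reduce bilingual document-processing cost",
    owner="client-operations-lead",
)
print(gate_failures(draft))
```

Returning the list of reasons, rather than a bare pass or fail, gives the agent something specific to go back to the human with.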

That requirement becomes the parent key for everything downstream. The Solution Architect deriving a functional requirement (“auto-classify EN/ZH documents”) cannot create one without referencing that parent ID — the skill enforces the link, the gate verifies it. Non-functional requirements (“classification latency under 2s at p95,” “99.5% extraction accuracy”) inherit the same parentage and add targets the gate can actually test. Security requirements (“mask PII in extracted fields before storage”) are derived against the data classification of the business requirement, and the gate rejects any control that doesn’t name the obligation or threat it satisfies. Integration and technical requirements sit at the bottom, each carrying full ancestry back up to the driver.
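
A rough sketch of what "full ancestry" and "names the obligation or threat" could look like in code follows. The Requirement shape, the IDs, the satisfies field and the obligation text are assumptions made for the example; the requirement texts are the ones from the cascade above, with the technical one invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Requirement:
    id: str
    kind: str                        # "business", "functional", "security", ...
    text: str
    parent_id: Optional[str] = None
    satisfies: Optional[str] = None  # threat or obligation a security control answers


def ancestry(req_id: str, registry: Dict[str, Requirement]) -> List[str]:
    """Walk parent links from a requirement up to its root, failing on any break."""
    chain: List[str] = []
    current_id: Optional[str] = req_id
    while current_id is not None:
        if current_id in chain:
            raise ValueError(f"cycle in lineage at {current_id}")
        if current_id not in registry:
            raise KeyError(f"broken thread: {current_id} is not registered")
        chain.append(current_id)
        current_id = registry[current_id].parent_id
    return chain


def names_obligation(req: Requirement) -> bool:
    """Security controls must say what they satisfy; other kinds pass through."""
    return req.kind != "security" or bool(req.satisfies)


registry = {
    r.id: r
    for r in [
        Requirement("BR-1", "business", "reduce bilingual document-processing cost"),
        Requirement("FR-1", "functional", "auto-classify EN/ZH documents", "BR-1"),
        Requirement("SEC-1", "security", "mask PII in extracted fields before storage",
                    "BR-1", satisfies="hypothetical data-protection obligation"),
        Requirement("TR-1", "technical", "hypothetical classifier service", "FR-1"),
    ]
}

print(ancestry("TR-1", registry))
```

The walk fails loudly on a missing parent instead of returning a partial chain, which is the behaviour a gate needs if lineage is to be trusted.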

The result: the requirements traceability matrix — the most tedious, most-skipped TOGAF deliverable — is a byproduct of the gate, not a document someone writes at the end and fills with lies.
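
"Byproduct" can be taken fairly literally. Assuming the gate has already stored each artifact with its parent link, the matrix is a short projection over those records. The record shape, the IDs and the column names below are hypothetical.

```python
import csv
import io

# Hypothetical persisted records: (id, layer, parent_id), as a gate might store them.
PERSISTED = [
    ("BR-1", "business", None),
    ("FR-1", "functional", "BR-1"),
    ("NFR-1", "non-functional", "BR-1"),
    ("TR-1", "technical", "FR-1"),
]


def traceability_matrix(records):
    """One row per artifact, with its direct parent and the root it traces to.

    Assumes the gate already guaranteed that every parent exists and was
    persisted before its child, so the walk below cannot break or loop.
    """
    parents = {rid: parent for rid, _layer, parent in records}
    rows = []
    for rid, layer, parent in records:
        root = rid
        while parents[root] is not None:
            root = parents[root]
        rows.append({"id": rid, "layer": layer, "parent": parent or "", "root": root})
    return rows


def to_csv(rows):
    """Render the matrix as CSV text for an auditor or a spreadsheet."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "layer", "parent", "root"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = traceability_matrix(PERSISTED)
print(to_csv(rows))
```

Nothing here is authored by hand: if a row is in the matrix, it is there because the artifact passed the gate.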

Governed self-improvement, and the tension it creates

Here’s where it gets genuinely new. A self-improving runtime can learn derivation patterns over an engagement (“business drivers of this type tend to generate these NFRs”) and propose them as new skills. The synthesis is that a learned skill persists only if it passes the same governance gate that governs live delivery. The agent gets better at building traceable architectures over time, and a skill that produces untraceable artifacts never makes it into the skill set, because the gate that governs runtime is the gate that governs learning.
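
One way to picture "the same gate governs learning" is to replay a candidate skill against known parents and push everything it derives through the runtime gate function. This is a sketch under assumptions: modelling a skill as a plain callable, the dict-based artifact shape, and the names gate and admit_skill are all invented for illustration.

```python
from typing import Callable, Dict, List, Set

Artifact = Dict[str, str]                      # e.g. {"id": ..., "parent_id": ...}
Skill = Callable[[Artifact], List[Artifact]]   # derives children from a parent


def gate(artifact: Artifact, ratified_ids: Set[str]) -> bool:
    """The single gate used at runtime: an artifact needs a ratified parent."""
    return artifact.get("parent_id") in ratified_ids


def admit_skill(candidate: Skill, replay_parents: List[Artifact],
                ratified_ids: Set[str]) -> bool:
    """Admit a learned skill only if everything it derives passes `gate`."""
    for parent in replay_parents:
        for derived in candidate(parent):
            if not gate(derived, ratified_ids):
                return False
    return True


def traced_nfr_skill(parent: Artifact) -> List[Artifact]:
    # Learned pattern that keeps the link to the parent it derived from.
    return [{"id": f"NFR-for-{parent['id']}", "parent_id": parent["id"]}]


def untraced_nfr_skill(parent: Artifact) -> List[Artifact]:
    # Learned pattern that emits an NFR out of habit, with no parent link.
    return [{"id": "NFR-from-habit"}]


replay = [{"id": "BR-1"}]
ratified = {"BR-1"}
print(admit_skill(traced_nfr_skill, replay, ratified))
print(admit_skill(untraced_nfr_skill, replay, ratified))
```

Note that a candidate which derives nothing at all passes vacuously in this sketch; a real admission check would probably also want evidence that the skill does something useful.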

But there’s an honest tension worth naming, because it’s the line between an augmented architect and an architect quietly replaced by the agent’s priors.

TOGAF traceability assumes requirements are relatively stable once ratified. The self-improving loop assumes the agent keeps learning new patterns. These can fight. If the agent learns “drivers of type X usually need NFR Y” and starts auto-suggesting Y, it has begun shaping requirements toward what it has seen before — which is the enemy of genuine elicitation.

The resolution is structural, and it’s the same separation again: the eliciting agent stays naive and driver-led; only the deriving agent gets to use learned patterns; and even then the gate flags any derivation whose parent the human never ratified. Learning is allowed to make derivation faster. It is never allowed to manufacture a requirement.
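
The separation can be sketched as two functions with different access to learned patterns, plus a flagging step. All names here (LEARNED_PATTERNS, elicit, derive, flag_unratified), the driver-type key and the ratifier value are hypothetical; the pattern text reuses the latency NFR from the cascade above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Requirement:
    id: str
    text: str
    parent_id: Optional[str] = None
    ratified_by: Optional[str] = None  # human sign-off, set outside the agents
    source: str = "elicited"           # "elicited" or "pattern"


# Hypothetical learned patterns: driver type -> NFR texts seen before.
LEARNED_PATTERNS: Dict[str, List[str]] = {
    "document-processing": ["classification latency under 2s at p95"],
}


def elicit(req_id: str, driver_text: str) -> Requirement:
    """Eliciting role: records what the driver says and never reads patterns."""
    return Requirement(id=req_id, text=driver_text)


def derive(parent: Requirement, driver_type: str) -> List[Requirement]:
    """Deriving role: may use patterns, but every output links to its parent."""
    texts = LEARNED_PATTERNS.get(driver_type, [])
    return [
        Requirement(id=f"{parent.id}.{i}", text=text,
                    parent_id=parent.id, source="pattern")
        for i, text in enumerate(texts, start=1)
    ]


def flag_unratified(derived: List[Requirement],
                    registry: Dict[str, Requirement]) -> List[str]:
    """Return ids of derivations whose parent no human has ratified."""
    flagged = []
    for req in derived:
        parent = registry.get(req.parent_id)
        if parent is None or parent.ratified_by is None:
            flagged.append(req.id)
    return flagged


parent = elicit("BR-1", "reduce bilingual document-processing cost")
registry = {parent.id: parent}
derived = derive(parent, "document-processing")
print(flag_unratified(derived, registry))   # parent not yet ratified

parent.ratified_by = "hypothetical-sponsor"
print(flag_unratified(derived, registry))   # parent ratified by a human
```

The pattern store speeds up derive, but it has no path into elicit, and ratification is a field only a human sets.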

Why this matters more than it sounds

To a developer, “behavioural governance for agents” sounds like linting. To a TOGAF practitioner, the exact same mechanism is the answer to the question their assurance board has been losing sleep over: if I let agents generate my architecture, how do I keep the requirements traceability my auditors demand?

Same gate. Completely different stakes.

The first generation of agentic delivery proved agents can produce artifacts. The frontier — the part that lets a regulated enterprise actually deploy this — is proving the artifacts are traceable. Agents own the phases. Skills own the derivation. Governance owns the thread. Keep them orthogonal, put the gate between every layer, and the red thread holds itself taut without anyone drawing it by hand.

That’s not a productivity story. It’s an assurance story. And in regulated delivery, assurance is the only story that closes the deal.