Secure generative & agentic AI
Agents are the new attack surface. The work isn't to slow them down — it's to ship them with the controls already in place, so they can move at the speed the business needs.
The shift from copilots to agents
For the last two years most enterprise AI work has been about copilots — humans in the loop, prompts in, completions out. That era is closing. The next wave is agentic AI: systems that plan, call tools, talk to other agents, and act on behalf of the business. They book travel, reconcile invoices, raise pull requests, file tickets, and increasingly take real-world actions across SaaS, data, and infrastructure.
That changes the threat model entirely. A copilot leaks data. An agent does things. The blast radius is no longer a bad answer — it’s a wrong action, executed at machine speed, against production systems.
What I architect for
Every agentic system I design has the same four pillars locked in before any model is wired up:
- Identity for every agent. Workload identities, scoped tokens, short-lived credentials. No shared keys, no broad service principals, no “the agent runs as admin.” Each agent is a first-class citizen in the identity plane, with its own audit trail.
- Tool-use as a contract. Every tool an agent can call is enumerated, its schema validated, and its side effects classified (read, write, irreversible). High-impact tools require a human approval step; low-impact tools are rate-limited and instrumented.
- Prompt-injection as a given. I assume every input — user, document, web fetch, upstream agent — is hostile. Inputs are sanitised, system prompts are isolated, content filters run before and after model calls, and the agent never sees credentials in its context window.
- Observability by default. Full trace of every reasoning step, tool call, and decision. Not for debugging — for forensic readiness. If something goes wrong, we need to reconstruct what the agent saw, what it decided, and why, in minutes.
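To make the tool-use contract concrete, here is a minimal Python sketch rather than a reference implementation. All names in it (ToolContract, ToolRegistry, SideEffect, the lookup_invoice and issue_refund tools) are hypothetical. It assumes a flat name-to-type schema, treats "irreversible" as the high-impact class that needs human approval, and leaves out rate limiting and real identity checks for brevity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, List, Optional


class SideEffect(Enum):
    READ = "read"
    WRITE = "write"
    IRREVERSIBLE = "irreversible"


@dataclass(frozen=True)
class ToolContract:
    name: str
    side_effect: SideEffect
    params: Dict[str, type]      # hypothetical flat schema: argument name -> expected type
    handler: Callable[..., Any]


class ToolRegistry:
    """Only enumerated tools can be called; every attempt is recorded."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolContract] = {}
        self.audit_log: List[Dict[str, str]] = []

    def register(self, contract: ToolContract) -> None:
        self._tools[contract.name] = contract

    def _check(self, name: str, args: Dict[str, Any],
               approved_by: Optional[str]) -> ToolContract:
        contract = self._tools.get(name)
        if contract is None:
            raise PermissionError(f"tool {name!r} is not enumerated")
        if set(args) != set(contract.params):
            raise ValueError(f"arguments for {name!r} do not match its schema")
        for key, expected in contract.params.items():
            if not isinstance(args[key], expected):
                raise ValueError(f"argument {key!r} must be {expected.__name__}")
        # Assumption: "high-impact" is modelled as the IRREVERSIBLE class.
        if contract.side_effect is SideEffect.IRREVERSIBLE and not approved_by:
            raise PermissionError(f"tool {name!r} requires human approval")
        return contract

    def call(self, agent_id: str, name: str, args: Dict[str, Any],
             approved_by: Optional[str] = None) -> Any:
        try:
            contract = self._check(name, args, approved_by)
        except (PermissionError, ValueError) as exc:
            # Denied attempts are logged too, so the trail shows what was tried.
            self.audit_log.append({"agent": agent_id, "tool": name,
                                   "outcome": "denied", "detail": str(exc)})
            raise
        self.audit_log.append({"agent": agent_id, "tool": name,
                               "outcome": "allowed", "detail": approved_by or ""})
        return contract.handler(**args)


registry = ToolRegistry()
registry.register(ToolContract(
    "lookup_invoice", SideEffect.READ, {"invoice_id": str},
    lambda invoice_id: {"id": invoice_id, "status": "open"}))
registry.register(ToolContract(
    "issue_refund", SideEffect.IRREVERSIBLE, {"invoice_id": str, "amount": float},
    lambda invoice_id, amount: f"refunded {amount} on {invoice_id}"))

# A read-only call goes straight through; the refund needs a named approver.
registry.call("agent-1", "lookup_invoice", {"invoice_id": "INV-1"})
registry.call("agent-1", "issue_refund",
              {"invoice_id": "INV-1", "amount": 10.0}, approved_by="alice")
```

The design choice worth noting is that the gate sits in the registry, outside the model: the agent can ask for any tool, but an unlisted tool, a malformed argument, or an unapproved irreversible action fails before the handler runs, and the attempt still lands in the audit log.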
The patterns that actually work
A few patterns have hardened over the last twelve months and now show up in almost every engagement:
- Layered guardrails. Content safety in front of the model, output filtering behind it, business-rule gates around tool calls. Defence in depth, not a single moderation API.
- Bounded autonomy. Agents start narrow — one workflow, one dataset, one tenant — and earn more scope only after they’ve proven they behave under adversarial load.
- Red-team continuously. Not a launch checklist. A weekly cadence of jailbreaks, indirect prompt injection, data exfiltration attempts, and tool-misuse simulations baked into the SDLC.
- Kill switches that actually work. Per-agent, per-tool, per-tenant. Tested. Owned by SecOps, not the AI team.
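The per-agent, per-tool, per-tenant kill switch can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production design: the KillSwitch class and its method names are hypothetical, state is held in memory, and a real deployment would presumably keep it in a shared store that SecOps controls and that every tool call consults.

```python
from typing import Dict, Set


class KillSwitch:
    """In-memory sketch: block by agent, by tool, or by tenant."""

    SCOPES = ("agent", "tool", "tenant")

    def __init__(self) -> None:
        self._tripped: Dict[str, Set[str]] = {scope: set() for scope in self.SCOPES}

    def _require_scope(self, scope: str) -> None:
        if scope not in self.SCOPES:
            raise ValueError(f"unknown scope {scope!r}")

    def trip(self, scope: str, target: str) -> None:
        self._require_scope(scope)
        self._tripped[scope].add(target)

    def reset(self, scope: str, target: str) -> None:
        self._require_scope(scope)
        self._tripped[scope].discard(target)

    def allows(self, agent_id: str, tool: str, tenant: str) -> bool:
        # A call proceeds only if none of its three dimensions is tripped.
        return not (
            agent_id in self._tripped["agent"]
            or tool in self._tripped["tool"]
            or tenant in self._tripped["tenant"]
        )


switch = KillSwitch()
switch.trip("tool", "issue_refund")

# Every agent and tenant loses the tripped tool; other tools keep working.
refund_allowed = switch.allows("agent-1", "issue_refund", "tenant-a")    # False
lookup_allowed = switch.allows("agent-1", "lookup_invoice", "tenant-a")  # True
```

Because the check is a plain function of (agent, tool, tenant), it is cheap to exercise in automated tests, which is one way to back up the "Tested" requirement above.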
What it unlocks
When the controls are real, the conversation with the CISO changes. Instead of “can we even do this?” it becomes “what do we ship next?” That’s the goal — not friction, not theatre, but a platform where the business can lean into agentic AI confidently because the worst-case scenarios have already been engineered out.
Secure agentic AI isn’t a constraint on ambition. It’s the only thing that lets ambition scale.