Defense in Depth Needs a Floor: Why Agent Security Layers Only Add Up With a Shared Decision Plane

TL;DR

Defense in depth is the right strategy for AI agent security. It is also the most-quoted, least-architected strategy in the industry right now. “Multiple layers” has become a slogan substituting for a building plan.
The structural reason layered agent controls drift is that each layer ends up implemented in many places, by many teams, with no shared substrate to keep them coherent. Identical policy intent, twenty different enforcement implementations. The variance is the vulnerability.
The fix is operational, not philosophical. A centralized, externalized policy decision plane — one engine consulted by every tool handler, orchestrator, MCP server, and sub-agent boundary — is the floor that defense in depth stands on. Without it, the layers don’t add up. They accumulate.

The Strategy Without an Architecture

Ask any security leader how they’re approaching AI agent risk and you will, somewhere in the first three minutes, hear the phrase defense in depth. It is the universal answer. Every vendor white paper, every conference keynote, every internal architecture document reaches for it. It is the right strategy.

It is also rarely an architecture. “Defense in depth” describes a posture — multiple controls, no single point of failure, each layer catching what the others miss — not a building plan. In the pre-agent era, the layers were familiar primitives that had been hardened by twenty years of practice: the network boundary, the WAF, the IAM role, the audit log. The strategy and the implementation had grown up together. You said “defense in depth” and meant something specific.

In the agent era, the implementation pattern has not caught up to the strategy. Teams are building agent systems with model guardrails on top, tool handlers in the middle, orchestrator logic somewhere, identity in another place, and audit downstream — each implemented separately, each enforcing a slightly different version of the same intent.

The structural problem: when “the application layer enforces authorization” means twenty handlers, three orchestration frameworks, six MCP servers, and a sub-agent topology that’s growing month over month, “the application layer” is a category, not a system. Each implementation drifts. The drift is the vulnerability.

Where Layered Agent Security Actually Fragments

The fragmentation is not theoretical. It shows up in four specific places, in roughly every enterprise that deploys agents:

“Narrow Scope” Becomes a Slogan

Every team agrees agents should have narrow, focused scope. Each team encodes that differently — some in tool descriptions, some in IAM, some in handler conditionals. Scope creep is invisible until the incident.

“Least Permissions” Granted Loosely at Design Time

Permissions handed out during integration sprints, when nobody is fully sure which APIs the agent will actually need, become exploitable surfaces at runtime. The runtime never gets a chance to reconsider them.

“Human-in-the-Loop” Means Different Things

One orchestrator pauses for approval on financial actions. Another asks the model to decide when to escalate. A third has no gate at all. The phrase covers all three, and the “deterministic” version exists only in the one that pauses.

“Agent Identity” Is a Name Tag Without a Consumer

You can mint a verifiable identity for every agent. If each downstream system consumes that identity through its own logic — if there’s no uniform consumer of agent attributes — the identity is decoration. Verifiable, and decorative.

The shape of the failure is always the same. The intent is universally agreed. The implementation lives in dozens of places. Each one is roughly right. None of them is the same. The variance compounds with every new tool, every new orchestrator, every new team that ships an integration.

That’s defense in piles, not defense in depth.

The Principle That Has to Generalize

One principle from earlier eras of platform security is worth restating in the agent context because it’s the principle that makes the floor inevitable:

If the actor enforcing a rule is the actor the rule is meant to constrain, the rule is advice, not enforcement.

In the model layer, this is why “don’t do destructive things” in a system prompt is not a control: the model is the actor the rule is meant to constrain, and it is also the actor evaluating compliance. A model that reasons its way past a constraint is doing exactly what the constraint was supposed to catch.

The same logic applies one level up. If each tool handler is the actor enforcing “agents only access their permitted resources,” and each handler is the actor whose calls the rule is meant to constrain, then the rule is advice. A handler that reimplements the policy in its own if statements — or skips the check on the path nobody remembered to wire it into — is exactly the failure the rule was supposed to prevent.

Enforcement has to live outside the thing being constrained. That holds for the model, and it holds for the handler. In both cases the check belongs outside the actor, in a shared substrate every layer consults.

What the Floor Actually Does

A decision plane — the floor — is one engine that every layer of the agent stack consults at the moment of a consequential action. Six characteristics distinguish it from a confidentiality control or a per-handler enforcement block wearing the same hat.

One Engine, Many Callers

Every tool handler, every MCP server, every sub-agent boundary, every orchestrator calls the same decision API. No handler reimplements the rule. When the rule changes, it changes once.
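A minimal sketch of that shape, assuming a single in-process `decide` function stands in for the decision API. The `Request` fields, the rule, and both caller functions are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    agent: str     # which agent is acting
    action: str    # e.g. "delete", "read"
    resource: str  # e.g. "prod-db"


def decide(req: Request) -> str:
    """The one place the rule lives (the rule itself is an invented example)."""
    if req.action == "delete" and req.resource.startswith("prod-"):
        return "deny"
    return "allow"


# Two different callers. Neither carries its own copy of the rule.
def tool_handler_delete(agent: str, resource: str) -> str:
    verdict = decide(Request(agent, "delete", resource))
    return f"deleted {resource}" if verdict == "allow" else f"blocked: {verdict}"


def mcp_server_call(agent: str, action: str, resource: str) -> str:
    verdict = decide(Request(agent, action, resource))
    return f"{action} ok on {resource}" if verdict == "allow" else f"blocked: {verdict}"
```

Changing the rule inside `decide` changes the behavior of both callers at once, which is the property the heading describes.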

Attribute-Aware, Not Role-Aware

The decision reads subject attributes (which agent, which owner), resource attributes (production, customer PII, regulated class), and context (recent actions, source surface, delegation chain). Static role tables can’t carry this much state.
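One way to picture an attribute-aware decision, as a sketch rather than a reference design. The three input types mirror the subject, resource, and context attributes named above; the field names and both rules are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    agent_id: str
    owner: str


@dataclass(frozen=True)
class Resource:
    name: str
    environment: str          # "production" or "sandbox"
    contains_pii: bool = False


@dataclass(frozen=True)
class Context:
    source_surface: str            # e.g. "chat", "cli"
    delegation_chain: tuple = ()   # agents that delegated to this one
    recent_actions: tuple = ()


def decide(subject: Subject, action: str, resource: Resource, context: Context) -> str:
    # Hypothetical rule 1: PII is never reachable through a delegated call.
    if resource.contains_pii and context.delegation_chain:
        return "deny"
    # Hypothetical rule 2: production writes require an agent owned by one team.
    if action == "write" and resource.environment == "production":
        return "allow" if subject.owner == "platform-team" else "deny"
    return "allow"
```

Neither rule could be expressed as a lookup in a role table: the first depends on how the call arrived, the second on the pairing of owner and environment.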

HITL Is a Decision Outcome

“Require human approval” isn’t a flowchart in one team’s code. It’s a possible return value from the decision plane — surfaced uniformly, satisfied through a workflow that the agent cannot self-authorize past.
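A small sketch of approval as a return value. The `Decision` enum, the refund threshold, and the `ApprovalStore` class are all hypothetical; the point is only that the executor reacts to an outcome and that the requester cannot record its own approval.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"


def decide(action: str, amount: float) -> Decision:
    # Invented threshold: large refunds need a human.
    if action == "refund" and amount > 500:
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW


class ApprovalStore:
    """Approvals are recorded here; an approver must differ from the requester."""

    def __init__(self):
        self._approved = set()

    def approve(self, request_id: str, approver: str, requester: str) -> None:
        if approver == requester:
            raise PermissionError("requester cannot approve its own request")
        self._approved.add(request_id)

    def is_approved(self, request_id: str) -> bool:
        return request_id in self._approved


def execute(request_id: str, action: str, amount: float, approvals: ApprovalStore) -> str:
    decision = decide(action, amount)
    if decision is Decision.DENY:
        return "denied"
    if decision is Decision.REQUIRE_APPROVAL and not approvals.is_approved(request_id):
        return "pending_approval"
    return "executed"
```

Any orchestrator that calls `execute` gets the same pause for the same action, because the pause is decided in one place.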

Evaluated Mid-Execution, Not Just Pre-Task

Long-running agent tasks generate dozens of decision points. The plane evaluates each tool call as it happens, not just when the task starts. Intent drift mid-task is caught at the next decision boundary.
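To make the timing concrete, here is a sketch in which a task is a list of planned tool calls and each call is checked at the moment it runs. The task goals, the allowed-tool table, and `run_task` are illustrative assumptions.

```python
# Hypothetical mapping from a task's stated goal to the tools it may use.
ALLOWED_TOOLS = {
    "summarize": {"read"},
    "cleanup": {"read", "delete"},
}


def decide(task_goal: str, tool: str, target: str) -> str:
    # This toy rule ignores the target; a real rule could inspect it too.
    return "allow" if tool in ALLOWED_TOOLS.get(task_goal, set()) else "deny"


def run_task(task_goal: str, planned_calls: list):
    """Check every call as it happens; stop at the first one that is not allowed."""
    completed = []
    for tool, target in planned_calls:
        if decide(task_goal, tool, target) != "allow":
            return completed, (tool, target)  # drift caught at this boundary
        completed.append((tool, target))
    return completed, None
```

A check made only at task start would have approved the "summarize" goal and never seen the third call in a sequence like read, read, send.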

Policies Are Versioned Artifacts

The rule the engine consulted on May 22 is a specific, versioned policy. Audits, simulations, and rollbacks all use the same artifact. The policy is treated like code, with code’s discipline.
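As a sketch of what "the policy in effect on a given date" can mean in code: the version labels, effective dates (including the year), and refund limits below are invented for illustration and are not taken from any real policy.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicyVersion:
    version: str
    effective_from: date
    max_refund: float


# Hypothetical history: the limit was tightened on June 1.
HISTORY = [
    PolicyVersion("v1", date(2025, 1, 1), max_refund=1000.0),
    PolicyVersion("v2", date(2025, 6, 1), max_refund=500.0),
]


def policy_in_effect(on: date) -> PolicyVersion:
    """Return the latest version whose effective date is not after `on`."""
    candidates = [p for p in HISTORY if p.effective_from <= on]
    return max(candidates, key=lambda p: p.effective_from)


def decide(policy: PolicyVersion, amount: float) -> str:
    return "allow" if amount <= policy.max_refund else "require_approval"
```

Replaying an old request against `policy_in_effect(date(2025, 5, 22))` evaluates it under the rule that applied that day, and passing a different version to `decide` is a simulation of a proposed change.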

Decision Stream Is the Audit

Every allow, deny, and require_approval is a structured event with full inputs and the rule that fired. That stream is the agent behavioral audit, the incident reconstruction surface, and the source of truth your safety layer reads from.
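A sketch of why a structured stream doubles as an incident timeline. The `DecisionEvent` fields and the sample events are assumptions; a real event would carry the full decision inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionEvent:
    seq: int
    subject: str
    action: str
    resource: str
    rule_id: str   # the rule that fired
    outcome: str   # "allow", "deny", or "require_approval"


# Hypothetical slice of a decision stream.
STREAM = [
    DecisionEvent(1, "agent-a", "read", "crm", "R-read", "allow"),
    DecisionEvent(2, "agent-b", "read", "crm", "R-read", "allow"),
    DecisionEvent(3, "agent-a", "export", "crm", "R-export", "require_approval"),
    DecisionEvent(4, "agent-a", "delete", "crm", "R-destructive", "deny"),
]


def timeline(stream: list, subject: str) -> list:
    """Reconstruct what one agent attempted, in order, and which rule answered."""
    return [(e.seq, e.action, e.outcome, e.rule_id) for e in stream if e.subject == subject]
```

Because every layer's decisions land in the same stream with the same fields, reconstructing one agent's behavior is a filter, not a cross-team log hunt.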

Notice what isn’t in this list: anything about the model, anything about specific handlers, anything tied to a single framework. The floor sits below all of them. It doesn’t ask the model to comply with rules. It doesn’t trust each handler to enforce them. It enforces them once, where every layer can reach it.

Building the Floor Under What You’re Already Shipping

01. Inventory Every Agent Action Surface

Walk every place an agent can take a consequential action — tool handlers, MCP servers, sub-agent boundaries, downstream API integrations. Each one is a decision point. Today, most of them are if statements scattered across application code. Count them. The number is almost always larger than expected.

02. Pick One High-Blast-Radius Path First

Don’t boil the ocean. Pick the single agent path with the largest blast radius — the destructive tool call, the privileged write, the financial action. Route that handler’s authorization through an external decision point. Measure decisions, denies, and latency. Prove the substrate works on one path before scaling it.
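One low-friction way to try this on a single handler is a decorator that calls the decision point and records the three measurements. This is a sketch: `decide` is a local stand-in for a call to an external service, and `guarded`, `METRICS`, and the `drop_table` handler are hypothetical names.

```python
import time
from collections import Counter
from functools import wraps

METRICS = Counter()   # counts of decisions and of each outcome
LATENCIES_MS = []     # one latency sample per decision


def decide(agent: str, action: str, resource: str) -> str:
    # Stand-in for the external decision point (invented rule).
    return "deny" if resource == "prod-ledger" else "allow"


def guarded(action: str):
    """Route a handler's authorization through decide() and measure it."""
    def wrap(handler):
        @wraps(handler)
        def inner(agent, resource, *args, **kwargs):
            start = time.perf_counter()
            outcome = decide(agent, action, resource)
            LATENCIES_MS.append((time.perf_counter() - start) * 1000)
            METRICS["decisions"] += 1
            METRICS[outcome] += 1
            if outcome != "allow":
                raise PermissionError(f"{action} on {resource}: {outcome}")
            return handler(agent, resource, *args, **kwargs)
        return inner
    return wrap


@guarded("drop_table")
def drop_table(agent: str, resource: str) -> str:
    return f"dropped {resource}"
```

The handler body stays untouched, which keeps the pilot reversible while the numbers come in.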

03. Make Agent Identity Verifiable Before Anything Else

Every agent gets a distinct non-person entity (NPE) record: owner, purpose, allowed tool classes, expiration. The decision plane needs a subject it can trust before its rules become enforceable. Identity is the prerequisite, not the destination.
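The record itself can be small. The sketch below models the four fields named above plus an identifier; the class name, field names, and `may_use` helper are assumptions, and cryptographic verification of the record (signing, attestation) is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    owner: str
    purpose: str
    allowed_tool_classes: frozenset
    expires_at: datetime


def may_use(identity: AgentIdentity, tool_class: str, now: datetime = None) -> bool:
    """True only if the identity is unexpired and the tool class is listed."""
    now = now or datetime.now(timezone.utc)
    return now < identity.expires_at and tool_class in identity.allowed_tool_classes
```

An expired record fails every check, so a forgotten agent stops being a usable subject without anyone having to remember to revoke it.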

04. Encode HITL as Policy, Not Flowcharts

“Require human approval” lives in the policy engine as a decision outcome triggered by resource class, action severity, and context. One encoding, every orchestrator reads it — not as a hardcoded branch in one team’s tool handler.
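As a sketch of what "one encoding" might look like, the approval triggers can be plain data that the engine evaluates. The rule IDs, resource classes, severity scale, and the `after_hours` context key are all hypothetical.

```python
# Hypothetical approval rules expressed as data, not as branches in handler code.
HITL_RULES = [
    {"id": "hitl-financial", "resource_class": "financial", "min_severity": 2},
    {
        "id": "hitl-prod-offhours",
        "resource_class": "production",
        "min_severity": 1,
        "context": {"after_hours": True},
    },
]


def decide(resource_class: str, severity: int, context: dict) -> dict:
    """Return require_approval if any rule matches class, severity, and context."""
    for rule in HITL_RULES:
        if rule["resource_class"] != resource_class:
            continue
        if severity < rule["min_severity"]:
            continue
        required = rule.get("context", {})
        if all(context.get(key) == value for key, value in required.items()):
            return {"outcome": "require_approval", "rule_id": rule["id"]}
    return {"outcome": "allow", "rule_id": None}
```

Adding or tightening a trigger is an edit to `HITL_RULES`, and every orchestrator that calls `decide` picks it up without a code change of its own.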

05. Push Resource Classification Into the Decision Engine

Tag the resources agents can touch — production vs. sandbox, regulated vs. routine, customer-scoped vs. org-wide. Policy evaluates the pair (agent + resource), not just the agent. This is what gives least permissions teeth at runtime.
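A sketch of evaluating the pair rather than the agent alone. The resource names, tag vocabulary, and clearance table are invented for the example, and failing closed on untagged resources is a design assumption, not something the text prescribes.

```python
# Hypothetical classification tags on resources.
RESOURCE_TAGS = {
    "orders-db": {"env": "production", "data": "regulated"},
    "orders-db-copy": {"env": "sandbox", "data": "routine"},
}

# Hypothetical clearances per agent, phrased in the same vocabulary as the tags.
AGENT_CLEARANCE = {
    "reporting-agent": {"envs": {"sandbox"}, "data": {"routine"}},
    "billing-agent": {"envs": {"sandbox", "production"}, "data": {"routine", "regulated"}},
}


def decide(agent: str, resource: str) -> str:
    tags = RESOURCE_TAGS.get(resource)
    clearance = AGENT_CLEARANCE.get(agent)
    if tags is None or clearance is None:
        return "deny"  # unclassified resource or unknown agent: fail closed
    ok = tags["env"] in clearance["envs"] and tags["data"] in clearance["data"]
    return "allow" if ok else "deny"
```

Reclassifying `orders-db-copy` as regulated would cut off the reporting agent at the next call, with no change to the agent's own grants.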

06. Standardize the MCP Boundary

Every MCP server consults the same decision plane on every call. No MCP server reimplements authorization; no MCP gateway carries its own policy logic. The boundary is one of the easiest places to enforce consistency — or to lose it, if each integration team writes its own.
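The sketch below shows the idea with a generic tool-server wrapper. It does not use any real MCP SDK: `guard`, the server names, and the tool tables are hypothetical, and `decide` is a local stand-in for the shared decision plane.

```python
def decide(agent: str, server: str, tool: str) -> str:
    # Invented rule: only one agent may call any tool named "delete".
    if tool == "delete" and agent != "ops-agent":
        return "deny"
    return "allow"


def guard(server_name: str, tools: dict):
    """Wrap a server's tool table so every call is checked by decide() first."""
    def call(agent: str, tool: str, **args):
        if tool not in tools:
            raise KeyError(tool)
        outcome = decide(agent, server_name, tool)
        if outcome != "allow":
            return {"error": outcome}
        return {"result": tools[tool](**args)}
    return call


# Two servers built by different teams, sharing one guard and zero policy logic.
files_server = guard("files", {
    "read": lambda path: f"contents of {path}",
    "delete": lambda path: f"deleted {path}",
})
tickets_server = guard("tickets", {
    "close": lambda ticket_id: f"closed {ticket_id}",
})
```

Each integration team supplies only its tool table; the authorization step is the same code path for all of them.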

07. Treat Sub-Agent Hops as Policy Boundaries

When agent A invokes agent B, the decision plane evaluates B’s call as B’s call — not as A’s continued authority. Each hop is a fresh decision with the current subject, current context, and current delegation chain.
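In code, the difference is which subject the decision is evaluated for. This sketch assumes a simple grants table and an invented limit on delegation depth; `invoke` and `delegate` are hypothetical helpers.

```python
# Hypothetical grants per agent.
GRANTS = {
    "planner": {"read_calendar", "send_email"},
    "researcher": {"web_search"},
}

MAX_CHAIN_DEPTH = 3  # invented limit on how long a delegation chain may grow


def decide(subject: str, action: str, delegation_chain: tuple) -> str:
    # Only the current subject's grants count; the caller's authority is not inherited.
    if action not in GRANTS.get(subject, set()):
        return "deny"
    if len(delegation_chain) > MAX_CHAIN_DEPTH:
        return "deny"
    return "allow"


def invoke(subject: str, action: str, delegation_chain: tuple = ()) -> str:
    return decide(subject, action, delegation_chain)


def delegate(parent: str, child: str, action: str, delegation_chain: tuple = ()) -> str:
    """The child's call is a fresh decision, with the parent appended to the chain."""
    return invoke(child, action, delegation_chain + (parent,))
```

The planner may send email directly, but routing the same action through the researcher does not launder the planner's permission into the researcher's call.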

08. Wire the Decision Stream Into Your Existing SOC Pipeline

The substrate becomes meaningful when its events flow into the systems your analysts already use. Stream every decision — subject, action, resource, attributes, rule ID, outcome — into the same SIEM that ingests the rest of your security telemetry. The floor doesn’t replace your safety layer; it gives it the structured signal it’s been missing.
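A sketch of the shipping side using only Python's standard `logging` module, emitting one JSON object per line with the fields listed above. The logger name, `JsonLineFormatter`, and `emit` are hypothetical; the stream handler here is a placeholder for whatever transport your SIEM actually ingests (the standard library's `logging.handlers.SysLogHandler` is one example).

```python
import io
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Serialize the dict passed as the log message into one JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(record.msg, sort_keys=True)


def build_decision_logger(stream) -> logging.Logger:
    logger = logging.getLogger("decision_plane")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)  # swap for your SIEM's transport
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger


def emit(logger, subject, action, resource, attributes, rule_id, outcome) -> None:
    logger.info({
        "subject": subject,
        "action": action,
        "resource": resource,
        "attributes": attributes,
        "rule_id": rule_id,
        "outcome": outcome,
    })
```

Keeping the event a flat, sorted JSON line makes it easy to parse with the same rules analysts already apply to other telemetry.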

How This Maps to ABAC-Enabled Security

The Decision Plane Is the Substrate

One engine, consulted by every tool handler, MCP server, orchestrator, and sub-agent boundary. That’s the architectural shape of consistent enforcement — the part that “layered security” assumes you already have and almost no team has built.

Attributes Carry State the Model Cannot

The decision needs identity attributes, resource classification, and context simultaneously. Static permissions stripped down to roles can’t carry that load. Attribute-based policy is the only architecture that fits.

Determinism Lives in the Substrate, Not in Each Handler

“Deterministic enforcement” is only as deterministic as the substrate that backs it. With a shared decision plane, the same input produces the same outcome regardless of which handler caught the call. Without one, “deterministic” degrades to “whoever wrote this handler last.”

Strategy Above, Floor Below

Defense in depth is the strategy. It has always been the right strategy. What’s missing in the agent era is the architectural pattern that turns it into a system instead of a slogan.

That pattern is a shared, externalized decision plane underneath every layer that claims to enforce policy. A model layer that defers to it. A handler layer that consults it. An orchestrator layer that respects its outputs. An audit layer that reads its event stream. Each layer above doing the work it was designed to do, on top of a floor it can stand on.

Without the floor, “defense in depth” is a list of recommendations whose consistency depends on whoever happened to be writing code that week. With it, the strategy has something underneath it — a place where the rule is the rule, regardless of which layer is asking.

Strategy above. Floor below. Build the floor first.
