Confidentiality Is Not Agent Security

Encrypted memory protects an agent’s bytes. It doesn’t decide whether the agent should be allowed to call the tool. That’s the Authorization Gap — and it’s where AI agent security is actually breaking.

May 18, 2026

TL;DR

  • The fashionable diagnosis — that AI agents are broken at runtime because we haven’t secured “data in use,” and the answer is encrypted memory, hardware enclaves, and attestation — is a category error. Mark O. Rogge calls it out directly: confidentiality is necessary; it is not sufficient.
  • The agent incidents actually happening don’t look like an attacker reading model weights out of RAM. They look like an over-privileged agent calling a tool it should never have been allowed to call — a sub-agent invoking a destructive API, an MCP server reaching a resource it was never scoped to touch. The decryption already happened. The attestation already passed. The failure is upstream of memory.
  • Rogge names it the Authorization Gap, and it’s where AI agent security is actually breaking. Non-human identities — the service accounts, tool tokens, and agent credentials that every agent spawns — now outnumber humans 82-to-1 in a typical enterprise, and almost none of them are governed by a runtime enforcement layer. They’re governed by hope.

The Comforting Story That’s About to Cost a Lot

A specific narrative has been gathering momentum in enterprise AI security: agents are unsafe at runtime because we haven’t secured the third state of data — data in use. Wrap agent workloads in encrypted memory, run inference in confidential-computing enclaves, attest the stack end-to-end, and the runtime problem is solved. Get the confidentiality model right and agent security follows.

It won’t. And in a recent piece worth reading in full, Mark O. Rogge spells out exactly why:

“An AI workload running in a perfectly attested, fully encrypted enclave will, with complete fidelity, execute whatever instruction reaches it — including the instruction to exfiltrate a customer database, mutate a production config, or wire money to an attacker’s account. The enclave protects the bytes. It does not ask whether the action should happen.”

That last sentence is the whole pivot. The agent incidents making headlines — the coding assistant that deleted a production database, the support agent that issued a refund nobody approved, the research agent that exfiltrated an SSH key it found in a config file — are not confidentiality failures. The data wasn’t leaked to an outside party. The credentials weren’t cracked. The enclave held. An authenticated agent, holding tokens it was technically permitted to hold, did something it should never have been allowed to do.

That’s a different problem with a different name: Rogge calls it the Authorization Gap. In the agent era it has received nowhere near the investment that the confidentiality stack around it has.

Two Different Primitives Doing Two Different Jobs

Confidentiality and authorization look adjacent because they both live in the “security” org chart. They are not the same primitive. In agent systems especially, they protect against different threats, run on different control loops, and fail in different ways.

Confidentiality

Asks: can the wrong audience see this? Implemented by encryption, tokenization, masking, redaction, secret managers, network segmentation, confidential-computing enclaves around the model. Evaluated at rest, in transit, and increasingly in use. Static-ish — the answer doesn’t change call-to-call.

Authorization

Asks: should this agent be allowed to perform this action on this resource under these conditions? Implemented by a policy engine reading attributes at the moment the agent decides to act — a tool call, a sub-agent invocation, an MCP request. Dynamic — the answer depends on which agent, which action, which resource, which context, and any of those can change between calls.

A vault that encrypts your customer database protects against a hostile reader. It does nothing to stop an authenticated support agent from issuing a $50,000 refund to the wrong account, or a coding agent from running volumeDelete against a production volume. The decryption happens, correctly, because the system genuinely cannot distinguish a routine agent action from a destructive one without an authorization layer that knows the agent’s identity, the resource’s classification, and the context of the call.
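To make the distinction concrete, here is a minimal sketch of what an authorization check for those two examples could look like. Everything in it is an assumption made for illustration: the `Request` fields, the `authorize` function, the $500 refund limit, and the action names are hypothetical and do not come from Rogge's piece or any specific product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    agent_id: str
    agent_purpose: str       # e.g. "support" or "coding"
    action: str              # e.g. "refund.issue" or "volumeDelete"
    resource_env: str        # "production" or "sandbox"
    amount_usd: float = 0.0  # only meaningful for financial actions

def authorize(req: Request) -> bool:
    """Return True only if this agent may take this action on this resource."""
    # Hypothetical rule: support agents may refund up to an illustrative $500.
    if req.action == "refund.issue":
        return req.agent_purpose == "support" and req.amount_usd <= 500
    # Hypothetical rule: volume deletion is never allowed in production.
    if req.action == "volumeDelete":
        return req.resource_env != "production"
    # Default deny: anything the policy does not name is refused.
    return False

big_refund = Request("support-bot-1", "support", "refund.issue", "production", 50_000)
small_refund = Request("support-bot-1", "support", "refund.issue", "production", 40)
prod_delete = Request("coder-7", "coding", "volumeDelete", "production")
print(authorize(big_refund), authorize(small_refund), authorize(prod_delete))
```

Note that nothing here touches encryption. All three requests come from authenticated agents that could decrypt the data; only the policy separates the routine call from the destructive ones.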

Two primitives. Two control planes. Two budget lines. The agent era needs both. The industry has been treating them as one and shipping only one.

Safety, Security, and Confidentiality Are Three Different Things

Rogge identifies a second illusion worth surfacing here, because it shows up in nearly every AI risk conversation: the assumption that AI safety — the alignment, guardrail, and content-filtering work that keeps a model from producing harmful outputs — is the same thing as AI security. It is not.

A model trained to be helpful, harmless, and honest will, by design, try to help. When an attacker frames a malicious request as a reasonable business need, the model’s safety training is precisely what makes it cooperate. Rogge calls this the Politeness Trap, and the distinction it forces is one sentence long:

“Safety governs what a model is willing to do. Security governs what it is permitted to do.”

Confidential computing addresses neither. It addresses what the underlying hardware will reveal. Three different problems, three different control planes, and the industry keeps conflating them in conversations about AI agents.

That conflation is most expensive at the agent runtime, where all three converge on a single moment — an agent deciding to call a tool, invoke a sub-agent, or hit an MCP server — and only one of them (authorization) has any real say in whether the action proceeds.

Why the Gap Exists

Three industry forces have widened this gap rather than closed it.

1. Confidentiality has crisp success criteria. “The data is encrypted with AES-256.” “PII fields are masked in retrieval responses.” “Secrets live in the vault, not the repo.” You can audit it, ship it, ship a dashboard for it, and put it on a compliance report. Authorization decisions are situational; they don’t fit into a checkbox the same way.

2. Compliance frameworks reward what they can measure. Encryption at rest is in every framework. Per-call attribute-based authorization decisions over arbitrary resources are in almost none. Budget follows audit.

3. Agents broke the assumption authorization was built on. Pre-agent applications had a small, declared set of actions; authorization could be a static role table. Agents pick actions at runtime from an open-ended tool surface, chain those actions across MCP servers, and spawn sub-agents that take further actions on their behalf. Every hop is a new authorization decision the static role table never had to answer. Architectures are still catching up.

The result is enterprises with mature confidentiality programs running agent systems whose authorization stack still says “authenticated agent → allowed.” The vault is excellent. The decision engine that controls what the agent does with the unlocked contents is missing.

What “Agent Runtime Authorization” Actually Looks Like

The phrase gets used loosely. Inside the platforms that have it, it’s specific. Six characteristics distinguish a real agent authorization layer from a confidentiality control wearing the same hat.

Before the list, one number that should reset every conversation about scope: non-human identities now outnumber human identities by 82 to 1 in a typical enterprise. Service accounts, API keys, agent credentials, tool tokens, MCP server identities, sub-agent credentials. Each one is a subject the authorization layer needs an opinion about. Almost none are governed by a runtime enforcement layer today. They’re governed by hope.

Evaluated on Every Tool Call

Every tool call, every MCP request, every sub-agent invocation triggers a decision. Not a token check at session start. Not a static role lookup baked into the agent framework. A structured evaluation of agent, action, resource, and context that returns allow or deny before the call leaves the agent’s process.
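One way to picture "evaluated on every tool call" is a wrapper that consults the policy before the tool body runs. The sketch below is illustrative only: `decide` stands in for a call to an external policy engine, and `guarded`, `Denied`, and the tool names are hypothetical.

```python
import functools

class Denied(PermissionError):
    """Raised when the policy refuses a tool call."""

def decide(agent: str, action: str, resource: str) -> bool:
    # Stand-in for an external policy engine; a real one would read attributes.
    allowed = {("report-bot", "read", "sandbox/db")}
    return (agent, action, resource) in allowed

def guarded(action: str):
    """Wrap a tool so a decision runs on every call, not once per session."""
    def wrap(tool):
        @functools.wraps(tool)
        def inner(agent: str, resource: str, *args, **kwargs):
            if not decide(agent, action, resource):
                raise Denied(f"{agent} may not {action} {resource}")
            return tool(agent, resource, *args, **kwargs)
        return inner
    return wrap

@guarded("read")
def read_table(agent: str, resource: str) -> str:
    return f"rows from {resource}"

@guarded("delete")
def drop_table(agent: str, resource: str) -> str:
    return f"dropped {resource}"
```

Calling `read_table("report-bot", "sandbox/db")` goes through, while `drop_table("report-bot", "sandbox/db")` raises `Denied` before the tool body ever executes.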

Agent Is the Subject, Not the User Behind It

The decision distinguishes this specific agent, a non-person entity (NPE) with its own identity, from the human who delegated to it. Agents are not human-profile clones. The policy reads the agent’s own attributes (purpose, owner, allowed tools, data classes), not the delegator’s blanket permissions.

Resources Carry Classification

The thing the agent is about to act on has its own attributes — production vs. sandbox, customer PII vs. internal docs, regulated vs. routine, this expense report vs. all expense reports. Policy evaluates the pair, not just the agent.

Context Is a First-Class Input

Source surface (which tool host, which MCP server), task lineage, recent agent activity, current delegation chain. Same agent + same action + same resource can produce different decisions depending on context, and that’s the point.
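As a rough sketch of context acting as an input, the example below holds the agent, action, and resource constant and varies only the context. The `Context` fields, the trusted-source allowlist, and the chain-depth limit are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    source: str                   # which tool host or MCP server sent the request
    delegation_chain: tuple = ()  # agents that delegated down to the caller

TRUSTED_SOURCES = {"internal-mcp"}  # hypothetical allowlist
MAX_CHAIN_DEPTH = 2                 # hypothetical limit on delegation hops

def decide(agent: str, action: str, resource: str, ctx: Context) -> str:
    """Same agent, action, and resource; the context decides the outcome."""
    if ctx.source not in TRUSTED_SOURCES:
        return "deny"
    if len(ctx.delegation_chain) > MAX_CHAIN_DEPTH:
        return "deny"
    return "allow"

call = ("research-bot", "read", "wiki/page-12")
direct = Context("internal-mcp", ("orchestrator",))
untrusted = Context("unknown-mcp")
deep = Context("internal-mcp", ("a", "b", "c"))
print(decide(*call, direct), decide(*call, untrusted), decide(*call, deep))
```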

One Engine Across Every Agent Surface

The same policy decision applies whether the agent is calling a local tool, hitting an MCP endpoint, invoking another agent, or chaining through an orchestration framework. The engine lives outside the agent runtime; every surface consults it. One source of truth, one audit stream, one place to change the rule.

Decision Stream Is the Agent Audit

Every allow and deny is a structured event with agent identity, action, resource, attributes, and the rule that fired. That stream is your agent behavioral audit and your incident reconstruction surface — not a separate logging story bolted on after the fact.
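A decision event of that shape might be serialized as one JSON line per decision. The field names and the `decision_event` and `reconstruct` helpers below are hypothetical; they only illustrate that the fields listed above are enough to replay what a given agent tried to do.

```python
import json
from datetime import datetime, timezone

def decision_event(agent_id, action, resource, attributes, effect, rule_id):
    """Serialize one policy decision as a single JSON line."""
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "action": action,
        "resource": resource,
        "attributes": attributes,
        "effect": effect,    # "allow" or "deny"
        "rule_id": rule_id,  # the rule that produced the effect
    }, sort_keys=True)

def reconstruct(lines, agent_id):
    """Return one agent's decisions, in stream order, for incident review."""
    events = (json.loads(line) for line in lines)
    return [e for e in events if e["agent_id"] == agent_id]

stream = [
    decision_event("coder-7", "volumeDelete", "vol-1",
                   {"env": "production"}, "deny", "no-prod-delete"),
    decision_event("support-1", "refund.issue", "order-9",
                   {"amount_usd": 40}, "allow", "small-refund"),
]
print([e["rule_id"] for e in reconstruct(stream, "coder-7")])
```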

None of this is exotic. The hard part is making it consistent across every surface an agent can touch — tool host, MCP server, sub-agent boundary, downstream API — not bolting it onto one critical handler and leaving the rest unprotected.

Keep the Confidentiality Program. Add the Agent Authorization Layer.

None of this argues against the controls you already ship. Encryption, masking, vaults, DLP, segmentation, even confidential-computing enclaves around inference — keep all of it. The point is that those controls solve one question, and the question agent systems are failing at is a different one. The solution isn’t to swap one for the other. It’s to recognize they’re complementary primitives and stand up the missing half.

01. Inventory Every Agent Action Surface

Walk every place an agent can take a consequential action — tool handlers, MCP servers, sub-agent boundaries, downstream API integrations. Each one is a decision point. Today, most of them are if statements scattered across application code. Count them. Most teams discover dozens to hundreds.

02. Externalize the Highest-Blast-Radius Agent Path

Pick the single agent path with the largest blast radius — the destructive tool call, the privileged write, the financial action. Route it through a policy engine. Measure decisions per minute, deny rate, decision latency. Prove the model on one agent before rolling it to the rest.
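The three measurements can be computed directly from a decision log. This sketch assumes a hypothetical record shape of `(timestamp_seconds, effect, latency_ms)`; a real engine's log format will differ.

```python
from statistics import mean

def summarize(records):
    """records: iterable of (timestamp_seconds, effect, latency_ms) tuples."""
    records = list(records)
    if not records:
        raise ValueError("no decisions recorded")
    times = [r[0] for r in records]
    window_min = (max(times) - min(times)) / 60.0
    denies = sum(1 for r in records if r[1] == "deny")
    return {
        # Undefined when every decision shares a single timestamp.
        "decisions_per_minute": len(records) / window_min if window_min > 0 else None,
        "deny_rate": denies / len(records),
        "mean_latency_ms": mean(r[2] for r in records),
    }

sample = [(0.0, "allow", 2.0), (10.0, "deny", 3.0),
          (30.0, "allow", 1.0), (60.0, "deny", 2.0)]
stats = summarize(sample)
print(stats)
```

A mean hides tail latency, so a percentile is probably worth tracking alongside it once the path carries real traffic.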

03. Reuse Your Data Classification as Authorization Attributes

You probably already classify data for confidentiality (public / internal / confidential / restricted). Reuse those tags as agent authorization attributes. The classification work is mostly done; what’s missing is the engine that reads it the moment an agent reaches for a resource.
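Reusing those four tags can be as simple as giving them an order and comparing. The sketch below assumes the tags arrive as strings on the resource and that each agent carries a maximum allowed class; `DataClass` and `may_access` are hypothetical names.

```python
from enum import IntEnum

class DataClass(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3

def may_access(agent_max: DataClass, resource_tag: str) -> bool:
    """Allow only if the resource's class is at or below the agent's ceiling."""
    try:
        resource_class = DataClass[resource_tag.upper()]
    except KeyError:
        return False  # untagged or unknown classification: deny by default
    return resource_class <= agent_max

print(may_access(DataClass.INTERNAL, "internal"),
      may_access(DataClass.INTERNAL, "restricted"),
      may_access(DataClass.INTERNAL, "untagged"))
```

Denying on an unknown tag is a design choice, not a requirement. It means an unclassified resource fails closed instead of being treated as public.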

04. Give Every Agent a Distinct NPE Identity

Stop letting agents impersonate users. Every agent gets a first-class NPE record with attributes — purpose, owner, allowed tools, allowed data classes, expiration. It’s the prerequisite for any policy that treats agents differently from the humans who delegated to them.
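A first-class agent record with those attributes might look like the following. The `AgentIdentity` class and its field names are illustrative assumptions, not a schema from any particular identity system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    purpose: str
    owner: str
    allowed_tools: frozenset
    allowed_data_classes: frozenset
    expires_at: datetime

    def is_active(self, now=None) -> bool:
        """An expired identity fails every check, whatever else it holds."""
        if now is None:
            now = datetime.now(timezone.utc)
        return now < self.expires_at

    def may_use(self, tool: str, data_class: str, now=None) -> bool:
        return (self.is_active(now)
                and tool in self.allowed_tools
                and data_class in self.allowed_data_classes)

issued = datetime(2026, 5, 18, tzinfo=timezone.utc)
bot = AgentIdentity("support-bot-1", "support", "team-cx",
                    frozenset({"refund.issue"}), frozenset({"internal"}),
                    issued + timedelta(days=30))
print(bot.may_use("refund.issue", "internal", issued))
```

The record describes the agent, not the person who launched it, so a policy can grant the agent less than its delegator holds.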

05. Encode Approval Gates as Agent Policy

For the highest-classification resources, “human-in-the-loop” is an attribute on the decision: this agent + this action + this resource → require approval. Encode it once in policy, apply it consistently across every agent surface — not as a flowchart in one team’s tool handler.
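Treating approval as a decision outcome means the policy returns one of three effects instead of two. In this sketch the `Effect` enum, the `WRITE_ACTIONS` set, and the `run` helper are hypothetical; the point is only that the gate lives in the decision, not in the tool handler.

```python
from enum import Enum
from typing import Optional

class Effect(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"

WRITE_ACTIONS = {"update", "delete", "pay"}  # hypothetical action names

def decide(allowed_tools: set, action: str, resource_class: str) -> Effect:
    if action not in allowed_tools:
        return Effect.DENY
    # Writes against the highest classification need a human sign-off.
    if resource_class == "restricted" and action in WRITE_ACTIONS:
        return Effect.REQUIRE_APPROVAL
    return Effect.ALLOW

def run(tool, effect: Effect, approved_by: Optional[str] = None):
    """Execute the tool only if the decision permits it."""
    if effect is Effect.DENY:
        raise PermissionError("denied by policy")
    if effect is Effect.REQUIRE_APPROVAL and not approved_by:
        raise PermissionError("human approval required")
    return tool()

effect = decide({"read", "pay"}, "pay", "restricted")
print(effect, run(lambda: "paid", effect, approved_by="alice"))
```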

06. Distinguish Agents from Humans in the Log Stream

If your SIEM can’t answer “was this a human or an agent?” for every consequential action, the SOC story doesn’t exist for the population of identities about to grow 500x. Make actor type a first-class log field, and feed the policy engine’s decision stream into the same pipeline.

07. Simulate Policy Against Real Agent Traces

Replay production agent traces against candidate policies before rollout. Catch over-permissive rules in a sandbox, not the next incident. Treat the agent policy itself as a regulated artifact — version it, review it, gate it the way you’d gate a schema migration.
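Replay can start as a plain diff between the current and candidate policies over recorded calls. The trace shape and both policies below are invented for illustration; production traces would carry far more attributes.

```python
def replay(traces, current, candidate):
    """Return the traces where the candidate policy decides differently."""
    changed = []
    for trace in traces:
        before, after = current(trace), candidate(trace)
        if before != after:
            changed.append((trace, before, after))
    return changed

def current_policy(t):
    # The "authenticated agent, therefore allowed" rule.
    return "allow" if t["authenticated"] else "deny"

def candidate_policy(t):
    if not t["authenticated"]:
        return "deny"
    if t["action"] == "delete" and t["env"] == "production":
        return "deny"
    return "allow"

traces = [
    {"agent": "coder-7", "action": "delete", "env": "production", "authenticated": True},
    {"agent": "coder-7", "action": "read", "env": "production", "authenticated": True},
    {"agent": "coder-7", "action": "delete", "env": "sandbox", "authenticated": True},
]
changed = replay(traces, current_policy, candidate_policy)
print(len(changed), changed[0][1], "->", changed[0][2])
```

The diff cuts both ways: it shows calls the candidate would newly block and calls it would newly permit, and the second list is the one to read first.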

08. Report “Agent Authorization Coverage” to the Board

If you report encryption coverage, report authorization coverage too. “Of all consequential agent actions in our environment, X% are evaluated by an external policy decision.” That number is the gap you’re actually closing — and the one the next incident report will measure you against.
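The coverage figure is a ratio over the action inventory. This sketch assumes each inventoried action is a record with two hypothetical flags, `consequential` and `externally_decided`; how those flags get set is the real work.

```python
def authorization_coverage(actions):
    """Percent of consequential actions evaluated by an external policy decision."""
    consequential = [a for a in actions if a["consequential"]]
    if not consequential:
        return None  # nothing to cover, so the percentage is undefined
    covered = sum(1 for a in consequential if a["externally_decided"])
    return 100.0 * covered / len(consequential)

inventory = [
    {"name": "refund.issue", "consequential": True, "externally_decided": True},
    {"name": "volumeDelete", "consequential": True, "externally_decided": False},
    {"name": "config.write", "consequential": True, "externally_decided": False},
    {"name": "wire.send", "consequential": True, "externally_decided": False},
    {"name": "docs.search", "consequential": False, "externally_decided": False},
]
print(authorization_coverage(inventory))
```

With one of four consequential actions routed through policy, this inventory reports 25.0 percent coverage.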

How This Maps to ABAC-Enabled Security

Attribute-Based Decisions at Runtime

Subject attributes, resource classification, and request context evaluated per call. This is the architectural shape of the second primitive — the one the industry has been underinvesting in while shipping more vaults.

Agents as First-Class NPEs

Distinct identity, distinct attributes, distinct policy. Authorization for agents only works when agents are addressable separately from the humans they delegate from.

Centralized Decisions, Distributed Enforcement

One policy engine across MCP servers, tool surfaces, orchestration layers, and APIs. The same shape of control plane the confidentiality program already has — it just answers a different question.

The Budget Question

Rogge offers the cleanest version of the budget framing we’ve seen: if you have one runtime security budget to spend in 2026, spend it on enforcement, not encryption. The data-in-use problem is real. The agent authorization-in-use problem is bigger, more common, and the one that will show up in agent incident reports.

When the next security review for an agent system comes around, watch the language carefully. If the conversation is dominated by enclaves, attestation, encrypted memory, masking, vault usage, and DLP — you are reviewing a confidentiality program. Important. Necessary. Insufficient. It does not answer the question the next incident report will be about.

That question is short and easy to ask, and almost no platform today can answer it consistently for every agent in production:

Should this agent be allowed to perform this action on this resource under these conditions — and where does that decision live?

If the answer is “in an if statement in the tool handler,” or “in the agent’s system prompt,” or “in the token scope we issued six months ago,” the runtime is the gap. Closing it doesn’t require giving up anything you’ve already built around the agent. It requires standing up the other primitive — the one the agent era needs and the confidentiality era never had to ship.

Confidentiality protects an agent’s bytes. Stratium decides what the agent is allowed to do with them.