Back to Blog

Indirect Prompt Injection Isn’t a Bug

It’s a Supply Chain Problem for Your AI

April 24, 2026

TL;DR

  • Google’s Workspace security team says indirect prompt injection (IPI) is an “always-on” threat: attackers hide malicious instructions inside the content your AI reads (docs, emails, web pages), not in the user’s prompt.
  • Their answer is continuous mitigation: constant discovery of new attack patterns, synthetic variant generation, and fast-turn policy updates plus ML/LLM retraining.
  • The takeaway: treat LLM tool use like production access — enforce least privilege, policy-as-code guardrails, and data-centric controls that limit exfil even when the model gets tricked.

What Happened

Google published a detailed look at how they defend Google Workspace with Gemini against indirect prompt injection.

Indirect prompt injection is the pattern where an attacker embeds instructions inside data that an LLM will later ingest — an email, a document, a shared file, or content fetched from a URL. When the user asks the assistant to “summarize this doc” or “draft a reply,” the model may follow the attacker’s embedded instructions instead of the user’s intent — potentially causing data leakage, unsafe actions, or workflow manipulation.

Google’s key framing: IPI isn’t the kind of security issue you “patch once.” It evolves alongside:

More agentic automation Models taking actions via tools, not just generating text
More heterogeneous data sources Docs, mail, chats, external URLs — each one a potential injection surface
Faster attacker iteration Prompt payloads mutate quickly; static defenses decay

So they describe a continuous program that combines (1) new attack discovery, (2) scaling via synthetic data, and (3) defense refinement and measurement.

Why This Matters

The Next Data Exfil Frontier

If your assistant can read internal docs + call tools, then prompt injection is effectively a new social engineering layer — but aimed at your automations.

The Attack Surface Is Content, Not Code

Traditional controls (WAF rules, input validation) help, but the primary payload lives inside normal business artifacts: docs, emails, ticket comments, wiki pages.

Tool Access Turns Bad Completions Into Real Incidents

Summarizing a malicious document is annoying; summarizing it and then “sending the summary to attacker@…” or “exporting all customer records” is catastrophic.

Defenders Need Fast Policy Iteration

Google explicitly calls out configuration-driven “point fixes” (regex takedowns, URL sanitization, tool-chaining limits) because model refresh cycles are slower than attacker iteration.

What to Do Now

01

Inventory AI Data Sources and Rank by Trust

Internal docs, email, chat, ticketing, CRM, web browsing, file uploads. Assume external and untrusted sources can carry adversarial instructions.

02

Enforce Tool Least-Privilege per Request

Use narrow, ephemeral scopes/tokens for each tool call. Default-deny high-risk actions (send email, share files, export data) unless explicitly authorized.

03

Add Deterministic Guardrails Around Tool Use

Require user confirmation for destructive or outbound actions. Block or sanitize URLs before fetch. Put hard rules on tool chaining (e.g., “web_fetch results cannot trigger email/send”).

04

Introduce Policy-as-Code for Actions and Data Access

Make allow/deny logic reviewable and testable (OPA/Rego, Cedar, etc.). Write rules like: “Only finance group can export invoices,” “No cross-tenant sharing,” “No sending to domains not on allowlist.”

05

Apply ABAC-Style Authorization to AI Requests

Evaluate subject attributes (user role, device posture), object attributes (doc classification), and context (time, network, sensitivity). Don’t let “the model” be the principal — bind actions to the requesting user + session.

06

Put DLP-Style “Egress Brakes” in the Last Mile

Inspect outputs and outbound payloads for secrets (API keys, tokens, PII, customer data). If detected: redact, require escalation, or block.

07

Build a Red-Team Pipeline for Prompt Injection

Curate known IPI techniques and run them continuously against your assistants. Track “before/after” metrics when you update policies, prompts, or model versions.

08

Treat IPI Defense Like Continuous Vuln Management

Subscribe to public disclosures. Triage new patterns quickly. Roll out config/policy mitigations faster than model retraining.

How This Maps to ABAC-Enabled Security

🛡

Policy Enforcement as a Control Plane

Google describes a centralized policy engine governing tool calls, URL sanitization, and tool chaining — exactly the kind of deterministic “policy layer” that ships fast and can be audited.

👤

ABAC Over “Who Can Do What With Which Data”

The practical way to stop IPI from becoming exfiltration: ensure the assistant can’t exceed the user’s intent or privileges. Attribute-based controls encode user role + document sensitivity + requested action + destination constraints.

🔒

Data-Centric Controls Limit Blast Radius

If a model is tricked, your last line of defense is that sensitive data stays protected: classification, DLP checks, outbound restrictions, and strict sharing controls. Make “exfil is hard” even when the model is confused.

🔄

Continuous Mitigation Beats One-Time Hardening

Google’s workflow (discovery → catalog → synthetic variants → deterministic + ML defenses → measurement) is a blueprint for operationalizing AI security. The win isn’t a perfect prompt — it’s a repeatable loop.

Source

  • Google Online Security Blog, “Google Workspace’s Continuous Approach to Indirect Prompt Injection Defense” (April 2026)

Enforce policy at the tool call — not after the exfil.