The Lethal Trifecta: Setting Secure AI Baselines for Organisations

9/1/2025

The Lethal Trifecta: Setting Secure AI Baselines for Organisations

When my colleague shared Simon Willison's essay The Lethal Trifecta, it hit a nerve. We've been debating how to adopt AI tools securely across the organisation. The trifecta is a simple, memorable way to describe the core risk:

Access to private data - internal documents, code, emails.
Exposure to untrusted content - web pages, emails, user inputs.
External communication - the ability to send data out, through HTTP requests or APIs.

On their own, each vector is manageable. But if an AI system combines all three, it s a perfect target. An attacker only needs to slip in malicious instructions hidden in "untrusted content" (say, a web page or a phishing email). The model obediently follows those instructions: find the secrets in private data, and exfiltrate them via external communication.

This isn't science fiction. We've already seen it in the wild. Copilot, ChatGPT Plugins, Google Bard, Supabase MCP — all have reported or patched prompt injection exploits that fit this pattern. The danger is not whether these vendors fix vulnerabilities quickly (they usually do), but the fact that the combination itself is unsafe.

Why this matters for secure AI adoption

As someone responsible for enterprise security, I see the trifecta as a baseline framework. It gives us a quick lens to evaluate AI tools:

Does this workflow give the agent access to sensitive data?
Is it reading untrusted input without sanitisation?
Can it send outbound communications without oversight?

If the answer is “yes” to all three, you’re holding a live grenade.

In our environment, this is not hypothetical. Imagine layering an AI assistant on top of JumpCloud or Google Workspace to help with IT automation. If it can both read user data and act on external instructions, it could be tricked into emailing the full employee directory to an attacker. That’s not just a security breach — it’s an IPO-level governance incident.

Baseline controls to break the trifecta

The simplest mitigation: remove at least one leg of the trifecta.

If an agent must access private data → restrict or block outbound communication.
If an agent must send external messages → strip it of privileged access to sensitive data.
If it must read untrusted inputs → sandbox that step, validate the content, and require approvals for downstream actions.

Other guardrails we should adopt as policy:

Least privilege: give the agent the narrowest possible scope of data and actions.
Human-in-the-loop: require review before any outbound communication involving sensitive material.
Monitoring & logging: detect covert exfiltration (e.g. secrets hidden in log metadata).
Vendor/tool vetting: maintain an approved list of AI tools that pass a trifecta review.

Critical Reflection: Is the Trifecta Enough?

The lethal trifecta is a sharp design lens — but it is not a complete baseline for secure AI adoption. It highlights where the worst risks lie, yet leaves gaps in identity, accountability, and governance.

Identity & Access

Without SSO and role-based access controls (RBAC), even an AI agent that avoids the trifecta can be hijacked by insiders or compromised accounts.
Example: An internal AI assistant limited to internal docs (no external comms, no untrusted input) still becomes a liability if a contractor’s account is compromised due to weak MFA enforcement.
Baseline safeguard: enforce SSO across all AI tools (via JumpCloud or Okta) and apply RBAC so agents only operate with the minimum data scope tied to user identity.

Accountability

The trifecta helps prevent the obvious exfiltration patterns, but immutable audit logs are what allow us to prove what happened and respond effectively.
Example: Suppose an agent summarises sensitive emails but doesn’t exfiltrate them. Without detailed logs, you can’t know which inboxes were touched, whether the summaries were stored, or if the output contained regulated data. That blind spot could sink you in an audit.
Baseline safeguard: Admin and access logs stored in a tamper-proof system (e.g. SIEM like Splunk, or Google Workspace audit feeds ingested into Palo Alto Cortex XDR, with network visibility reinforced via Zscaler). Logs should capture which user invoked the agent, what data it touched, and what actions it took.

Governance & Compliance

Frameworks like MAS TRM (Singapore), ISO 27001, and NIST AI RMF require organisations to document controls, assign accountability, and review AI tools regularly. The trifecta doesn’t meet these obligations on its own.
Example: An AI plugin that avoids the trifecta may still violate MAS TRM requirements if no third-party risk assessment was done. In regulatory terms, the technical design doesn’t matter if the governance steps are skipped.
Baseline safeguard: subject every AI tool to a risk assessment, record it in the ISMS, and review annually against MAS TRM/ISO controls.

Threat Coverage

Prompt injection and data exfiltration are critical, but they are not the only AI risks.

Model bias: customer-facing chatbots trained on skewed data could generate discriminatory responses.
IP leakage: engineers pasting proprietary code into ChatGPT (Samsung’s 2023 case) expose sensitive IP.
Regulatory exposure: using AI without consent controls may breach GDPR/PDPA, even if the trifecta is avoided.
Baseline safeguard: enforce policies for acceptable data use, enable DLP monitoring in Zscaler, and run red-team exercises to stress-test AI behaviours.

A Realistic Solution Framework

So what does a practical baseline look like?

Trifecta as a Design Guardrail
- No AI system should ship if it enables all three legs (private data + untrusted input + external comms).
Identity & Access First
- SSO enforced for all AI tools.
- RBAC to keep agents scoped.
Audit & Monitoring
- Route AI usage logs into Cortex XDR for correlation.
- Enforce outbound monitoring with Zscaler to catch covert data leaks.
Governance Layer
- Maintain an AI registry.
- Risk-assess every AI tool before onboarding.
- Map controls to MAS TRM / ISO 27001 / NIST AI RMF.
Red-Team & Awareness
- Run prompt-injection simulations.
- Train staff on acceptable AI use (no source code dumps, no customer data in prompts).

This layered approach gives us design safety (trifecta) + operational accountability (SSO, logs, governance, monitoring).

What I’d Do Next

Map our AI pilots against the trifecta to see where the weak points are.
Kick off a red-team exercise to simulate prompt injection and covert exfiltration.
Formalise an AI adoption baseline that integrates design guardrails, IAM, logging, and governance.
Share the findings with engineering, product, and compliance so this becomes an org-wide standard, not just a security checklist.

Closing Thought

The lethal trifecta is not “too much” — it’s too little if treated as the only baseline. It’s a strong starting point, but not the finish line.

Every new AI project should begin with a single, disarming question:

“Which leg of the trifecta are we cutting off — and how are we enforcing accountability around the rest?”

That, I believe, should be the baseline for secure AI adoption.