Threat-Modelling Your First MCP Server: STRIDE for the Agent Era
7/3/2026
The Model Context Protocol has quietly become the USB-C of AI tooling — a standard way to plug an agent into real capabilities: files, databases, APIs, ticketing systems. It's genuinely useful, and that's exactly why it deserves a threat model.
An MCP server is not a chatbot. It's an execution surface. When you connect one, you're granting a non-deterministic, prompt-steerable agent the ability to take actions through whatever tools that server exposes. Before I wire one to anything that matters, I run the oldest, most boring, most reliable exercise in security: STRIDE.
Why STRIDE still fits
STRIDE (Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege) was built for systems where untrusted input meets privileged action. That is precisely what an MCP server is. The novelty of agents doesn't replace the fundamentals; it just adds a vivid new way to trigger them: the untrusted input is natural language, and it can come from the data the agent reads, not only from the user who prompts it.
Let me walk each letter with the MCP-specific version of the threat.
S — Spoofing
Can something pretend to be a trusted party?
- Can a malicious server impersonate a legitimate one in the agent's tool list? (Tool-name collisions, typosquatted servers.)
- Does the server authenticate which client/agent is calling it, or will it serve anyone who connects?
- Are the upstream credentials the server holds bound to a distinct identity, or a shared one?
Mitigations: authenticate both ends; pin trusted servers; give each MCP server its own scoped upstream identity (see machine identity).
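To make "pin trusted servers" concrete, here is a minimal sketch, assuming you record a SHA-256 fingerprint of each server's certificate or public key at review time. The `PINNED_SERVERS` table, the server name, and the example certificate bytes are all hypothetical; how you obtain the presented certificate depends on your transport.

```python
import hashlib
import hmac

# Hypothetical allow-list: server name -> SHA-256 fingerprint of the
# certificate bytes that were presented when the server was reviewed.
PINNED_SERVERS = {
    "tickets": hashlib.sha256(b"example-cert-bytes-for-tickets").hexdigest(),
}


def is_trusted_server(name: str, presented_cert: bytes) -> bool:
    """Return True only if the name is pinned and the fingerprint matches."""
    expected = PINNED_SERVERS.get(name)
    if expected is None:
        # Unknown or typosquatted name: refuse by default.
        return False
    actual = hashlib.sha256(presented_cert).hexdigest()
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(expected, actual)
```

The point of the default-deny branch is that a look-alike name fails even when it presents a perfectly valid certificate of its own.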
T — Tampering
Can inputs, tools, or results be altered?
- Tool poisoning: a tool's description is itself prompt context. A malicious or compromised server can embed instructions in a tool description to steer the agent.
- Can responses be modified in transit, or can one tool's output rewrite the agent's plan?
Mitigations: treat tool descriptions as untrusted content; review them; use transport integrity (TLS); pin tool definitions and alert on changes.
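Pinning tool definitions can be as simple as hashing what the agent sees and comparing against the hash you reviewed. The sketch below assumes each tool arrives as a dictionary with `name`, `description`, and `inputSchema` keys, which matches the shape of an MCP tool listing as I understand it; the helper names are my own.

```python
import hashlib
import json


def tool_fingerprint(tool: dict) -> str:
    """Hash the fields that end up in the agent's context."""
    canonical = json.dumps(
        {key: tool.get(key) for key in ("name", "description", "inputSchema")},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_tools(pinned: dict[str, str], current_tools: list[dict]) -> list[str]:
    """Return names of tools that are new or differ from the reviewed pin."""
    return [
        tool["name"]
        for tool in current_tools
        if pinned.get(tool["name"]) != tool_fingerprint(tool)
    ]
```

Anything `changed_tools` returns is a candidate for an alert and a fresh review before the agent is allowed to see the new description.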
R — Repudiation
Can an action happen with no reliable record?
- If the agent deletes a record or sends a message, can you reconstruct which user request and which context led to it?
- Are tool invocations logged with arguments, or do you only see the final chat?
Mitigations: log every tool call — name, arguments, result, and the invoking user — into your central telemetry. An agent action you can't attribute is an incident you can't investigate.
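A rough sketch of what one audit record per tool call might look like, using only the standard library. The logger name, the field names, and the `request_id` correlation field are assumptions for illustration; in practice you would also redact secrets from the arguments before logging them.

```python
import json
import logging
import time
import uuid

audit_log = logging.getLogger("mcp.audit")  # hypothetical logger name


def log_tool_call(user: str, request_id: str, tool: str,
                  arguments: dict, result_summary: str) -> str:
    """Emit one JSON line per tool call and return it for convenience."""
    record = {
        "event_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "user": user,              # who asked
        "request_id": request_id,  # which user request led to this call
        "tool": tool,
        "arguments": arguments,
        "result_summary": result_summary,
    }
    line = json.dumps(record, sort_keys=True)
    audit_log.info(line)
    return line
```

One JSON object per line keeps the record easy to ship to whatever central telemetry you already run.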
I — Information disclosure
Can data escape its boundary?
This is the big one, and it's the lethal trifecta wearing an MCP badge. If one server can read private data, another (or the same one) can reach the open internet, and the agent ingests untrusted content, you have a ready-made exfiltration path: hidden instructions in a web page or document tell the agent to read secrets and POST them out.
Mitigations: cut a leg of the trifecta. Separate read-private and reach-external capabilities across different agent contexts; sanitise and monitor outbound calls; default-deny network egress from tool execution.
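One way to check whether a given agent context has all three legs is to label each tool with the capabilities it brings and look at the union. This is a sketch under my own labelling scheme; the class, field, and leg names are hypothetical, and the labels are only as good as your review of each tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCapabilities:
    name: str
    reads_private_data: bool = False
    reaches_external_network: bool = False
    ingests_untrusted_content: bool = False


def trifecta_legs(tools: list[ToolCapabilities]) -> set[str]:
    """Collect which of the three legs are present across one agent context."""
    legs: set[str] = set()
    for tool in tools:
        if tool.reads_private_data:
            legs.add("private_data")
        if tool.reaches_external_network:
            legs.add("external_network")
        if tool.ingests_untrusted_content:
            legs.add("untrusted_content")
    return legs


def has_lethal_trifecta(tools: list[ToolCapabilities]) -> bool:
    """True when the combined tool set offers a full exfiltration path."""
    return len(trifecta_legs(tools)) == 3
```

If the check returns true for a planned context, splitting the tools across two contexts so that neither has all three legs is the cheapest fix.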
D — Denial of service
Can it be made unavailable or ruinously expensive?
- A tool that triggers expensive API calls or large token usage can be driven into a cost-DoS by a crafted prompt loop.
- A slow or hanging tool can stall the agent indefinitely.
Mitigations: rate-limit and budget-cap tool calls; set timeouts; put quotas on anything that costs money downstream.
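A minimal sketch of a per-session budget cap plus a call timeout, assuming you can estimate a cost for each call up front. The class and function names are hypothetical. Note the limitation in the timeout helper: it stops the agent waiting, but it does not kill the underlying worker thread.

```python
import concurrent.futures
import time


class BudgetExceeded(Exception):
    """Raised when a session would go over its call or cost cap."""


class ToolBudget:
    """Caps the number of tool calls and the total estimated spend per session."""

    def __init__(self, max_calls: int, max_cost: float) -> None:
        self.max_calls = max_calls
        self.max_cost = max_cost
        self.calls = 0
        self.cost = 0.0

    def charge(self, estimated_cost: float) -> None:
        """Record one call, or raise before it runs if a cap would be exceeded."""
        if self.calls + 1 > self.max_calls:
            raise BudgetExceeded("call cap reached")
        if self.cost + estimated_cost > self.max_cost:
            raise BudgetExceeded("cost cap reached")
        self.calls += 1
        self.cost += estimated_cost


def call_with_timeout(fn, timeout_s: float, *args, **kwargs):
    """Run fn in a worker thread and stop waiting after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    finally:
        # Do not block on a hung tool; the worker thread itself keeps running.
        pool.shutdown(wait=False)
```

Calling `budget.charge(...)` before every tool invocation means a crafted prompt loop fails fast instead of running up a bill.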
E — Elevation of privilege
Can the agent do more than intended?
- Does the server expose write/delete when the use case only needs read?
- Can chained tool calls compose into an action no single tool was meant to allow?
- Does the agent inherit the server's broad privileges rather than the user's narrow ones?
Mitigations: least-privilege tool scopes; bind the agent's authority to the invoking user (it should never do for a user what that user couldn't do alone); human approval gates on irreversible actions.
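The last two mitigations can be sketched as a single decision function that runs before any tool call. The tool names, permission strings, and return values here are hypothetical; the idea is only that the check uses the invoking user's permissions, not the server's.

```python
# Hypothetical list of tools whose effects cannot be undone.
IRREVERSIBLE_TOOLS = {"delete_record", "send_email"}


def authorize_tool_call(tool: str, required_permission: str,
                        user_permissions: set[str],
                        human_approved: bool = False) -> str:
    """Return "allow", "needs_approval", or "deny" for one proposed call."""
    # The agent never gets more authority than the user who invoked it.
    if required_permission not in user_permissions:
        return "deny"
    # Irreversible actions wait for an explicit human decision.
    if tool in IRREVERSIBLE_TOOLS and not human_approved:
        return "needs_approval"
    return "allow"
```

For example, a user holding only `records:read` gets `"deny"` for `delete_record`, no matter how broad the server's own upstream credentials are.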
A lightweight process for one server
You don't need a week. For a single MCP server I do this in an afternoon:
- Draw the data-flow. Agent → MCP server → upstream systems. Mark every trust boundary the data crosses.
- List the tools and their real blast radius. For each: what can it read, write, delete, send, or spend?
- Run the six letters against that list, writing down concrete "can someone…?" questions like the ones above.
- Rank by impact × likelihood, and fix the trifecta-shaped findings first.
- Decide the approval gates — the short list of actions that always require a human.
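The ranking step above can be sketched in a few lines, assuming you score impact and likelihood on a simple 1 to 5 scale (my assumption, any consistent scale works) and flag the trifecta-shaped findings so they sort first. The `Finding` class and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    impact: int        # 1 (minor) to 5 (severe), an assumed scale
    likelihood: int    # 1 (rare) to 5 (expected), an assumed scale
    trifecta: bool = False

    @property
    def score(self) -> int:
        return self.impact * self.likelihood


def rank_findings(findings: list[Finding]) -> list[Finding]:
    """Trifecta-shaped findings first, then highest impact x likelihood."""
    return sorted(findings, key=lambda f: (not f.trifecta, -f.score))
```

The sort key is deliberately blunt: a trifecta finding with a modest score still lands above a high-scoring finding that has no exfiltration path.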
What I'd do next
- Build a reusable MCP threat-model template (the data-flow + the six-letter checklist) so each new server is a fill-in-the-blanks exercise.
- Default every self-hosted MCP server to egress-deny and add network allow-lists per tool.
- Add tool-call logging to central telemetry before connecting anything to production data.
- Write up a worked example on a real server — a filesystem or database MCP — start to finish.
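The per-tool allow-list item above could start as something like the following sketch. The `EGRESS_ALLOW` table, tool name, and hostname are hypothetical, and an application-level check like this complements network-level egress denial rather than replacing it.

```python
from urllib.parse import urlsplit

# Hypothetical per-tool allow-list; tools that are not listed get no egress.
EGRESS_ALLOW = {
    "fetch_ticket": {"tickets.example.com"},
}


def egress_allowed(tool: str, url: str) -> bool:
    """Allow an outbound request only to hosts pinned for this specific tool."""
    host = urlsplit(url).hostname  # lowercased; None if the URL has no host
    if host is None:
        return False
    # Exact match only: no suffix or substring matching.
    return host in EGRESS_ALLOW.get(tool, set())
```

Exact host matching is the design choice that matters here, since look-alike hosts such as `tickets.example.com.evil.example` would pass a naive prefix check.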
Closing thought
Agents make threat-modelling feel new because the attack can arrive as a polite sentence buried in a document the agent was asked to read. But the discipline is the same one we've always had: untrusted input plus privileged action equals "model it before you ship it."
Before I connect any MCP server, one question decides it: "If a malicious instruction reached this agent through the data it reads, what's the worst tool it could reach — and have I gated that?"