Threat Modeling AI Systems: What Changes When the Model Is the Attack Surface

Traditional threat modeling — STRIDE, PASTA, attack trees — assumes you’re reasoning about deterministic systems. You enumerate inputs, trace data flows, identify trust boundaries, and the system behaves consistently within those constraints.

AI systems break that assumption at the foundation.

When the model is the attack surface, the threat landscape shifts in three fundamental ways:

1. The Input Space Is Unbounded

A SQL injection attack targets a parser with a known grammar. An LLM processes natural language: an effectively unbounded set of valid inputs, each potentially triggering different behavior. You can’t enumerate the attack surface; you can only probe it.

This means traditional coverage metrics lose most of their meaning. “We tested 1,000 inputs” tells you almost nothing about the 1,001st.
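To put a rough number on how little a clean test run proves, here is a minimal sketch of the standard zero-failure binomial bound. It assumes test inputs are independent draws from the same distribution the system will face, which is optimistic for security work since attackers choose their inputs. The function name is illustrative.

```python
def failure_rate_upper_bound(n_clean_trials: int, confidence: float = 0.95) -> float:
    """Largest per-input failure rate still consistent with observing zero
    failures in n independent trials, at the given confidence level."""
    if n_clean_trials <= 0:
        raise ValueError("need at least one trial")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Zero failures in n trials has probability (1 - p) ** n.
    # Solve (1 - p) ** n = 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / n_clean_trials)


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} clean trials -> failure rate could still be up to "
              f"{failure_rate_upper_bound(n):.4%}")
```

Under those assumptions, 1,000 clean trials still leave room for a failure rate of roughly 0.3%, or about 3 in every 1,000 inputs. Against an adversary who searches for bad inputs rather than sampling them at random, the real guarantee is weaker still.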

2. Behavior Is Probabilistic, Not Deterministic

The same prompt, sampled at temperature > 0, can produce different outputs on every call. Security controls that rely on consistent output formatting (content filters, structured output validators) can pass every test run and still fail in production, because a failure that occurs once in thousands of samples is unlikely to show up in a small test suite.
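One way to make this visible in testing is to sample the same prompt many times and measure how often a control fails, rather than asserting on a single output. The sketch below is illustrative only: `make_fake_model` is a hypothetical stand-in for a real sampled model call, and the "JSON object with a string `answer` field" schema is an assumption, not a standard.

```python
import json
import random
from typing import Callable


def is_valid_output(text: str) -> bool:
    """Structured-output check: a JSON object with a string 'answer' field."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and isinstance(parsed.get("answer"), str)


def estimate_failure_rate(generate: Callable[[str], str], prompt: str,
                          n_samples: int) -> float:
    """Call the model n_samples times on one prompt and return the
    fraction of outputs that fail validation."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    failures = sum(not is_valid_output(generate(prompt)) for _ in range(n_samples))
    return failures / n_samples


def make_fake_model(malformed_rate: float, seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM sampled at temperature > 0:
    returns free text instead of JSON with probability malformed_rate."""
    rng = random.Random(seed)

    def generate(prompt: str) -> str:
        if rng.random() < malformed_rate:
            return "Sure! Here is the answer: 42"  # breaks the JSON contract
        return json.dumps({"answer": "42"})

    return generate


if __name__ == "__main__":
    model = make_fake_model(malformed_rate=0.01, seed=0)
    rate = estimate_failure_rate(model, "What is 6 * 7?", n_samples=5_000)
    print(f"observed validation failure rate: {rate:.2%}")
```

The design choice is to report a rate instead of a pass/fail result: a control that fails 1% of the time will usually pass a handful of spot checks, but it cannot hide from a few thousand samples.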

3. The Training Data Is Part of the Attack Surface

Data poisoning and backdoor attacks corrupt what the model learns; membership inference attacks reveal what it was trained on. All three operate at a layer most security teams have never had to think about. The “supply chain” for an AI system includes every document it was trained on.

What To Do About It

The honest answer: adapt your threat model to match probabilistic systems.

Red team with adversarial prompts, not just functional test cases
Treat model outputs as untrusted user input — validate downstream, always
Monitor output distributions in production, not just errors
Include the model card and training provenance in your supply chain review
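As an illustration of the second item, the sketch below treats a model's tool-call output exactly like a request body from an anonymous user: parse it strictly, check it against an allowlist, and reject anything unexpected before it reaches downstream code. The action names, the `order_id` field, and the `ToolCall` type are all hypothetical, chosen only to make the example concrete.

```python
import json
from dataclasses import dataclass

# Hypothetical allowlist: the only actions downstream code will execute.
ALLOWED_ACTIONS = frozenset({"lookup_order", "send_receipt"})
MAX_ORDER_ID_LENGTH = 64


class UntrustedOutputError(ValueError):
    """Raised when model output fails validation and must not be acted on."""


@dataclass(frozen=True)
class ToolCall:
    action: str
    order_id: str


def parse_tool_call(raw_model_output: str) -> ToolCall:
    """Validate raw model output as strictly as any other untrusted input."""
    try:
        data = json.loads(raw_model_output)
    except json.JSONDecodeError as exc:
        raise UntrustedOutputError("output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise UntrustedOutputError("expected a JSON object")
    if set(data) != {"action", "order_id"}:
        raise UntrustedOutputError("unexpected or missing fields")

    action, order_id = data["action"], data["order_id"]
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise UntrustedOutputError("action is not on the allowlist")
    if (not isinstance(order_id, str)
            or len(order_id) > MAX_ORDER_ID_LENGTH
            or not (order_id.isascii() and order_id.isdigit())):
        raise UntrustedOutputError("order_id must be a short ASCII digit string")
    return ToolCall(action=action, order_id=order_id)


if __name__ == "__main__":
    print(parse_tool_call('{"action": "lookup_order", "order_id": "12345"}'))
    try:
        parse_tool_call('{"action": "delete_all_orders", "order_id": "1"}')
    except UntrustedOutputError as err:
        print(f"rejected: {err}")
```

The validator fails closed: anything it does not recognize is rejected, so a prompt injection that talks the model into emitting a new action or a malformed identifier stops here rather than in the database layer.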

The frameworks are catching up. MITRE ATLAS gives you an ATT&CK-style taxonomy of adversarial ML techniques. The OWASP Top 10 for LLM Applications covers the most common failure modes. But the mindset shift matters more than the framework: when behavior is probabilistic, security is statistical.