Skip to content

Dealing with AI Risk

Every few years, something new comes along and the infosec community collectively loses its mind. AI is no different. Businesses are moving fast, GRC teams are scrambling to catch up, and everyone is asking the same question - how do we manage this risk?

I've been in this space long enough to know that the answer is usually simpler than it looks.

It's not that new

Here's the thing - an LLM isn't actually a new type of risk. It's a combination of two things your organisation already knows how to deal with:

  • An API - something that takes input, does something with it, and spits out a result
  • A contractor acting on your behalf - someone you've given access and trust to, who makes decisions and takes actions in your name

The tricky part is that an LLM is both of these things at the same time. Your AppSec team has always managed APIs. Your GRC team has always managed human agent risk. They've just never had to manage the same thing, together, at the same time.

That's where the confusion comes from. Not because AI is fundamentally different, but because it breaks down a boundary that's always existed between two separate disciplines.

Mapping the risk

If you think about it that way, the risk categories start to look pretty familiar:

Risk As an API As a Human Agent As an LLM
Injection SQL/code injection Social engineering Prompt injection - same input, both problems
Impersonation Spoofed auth tokens Fake instructions from "management" Compromised system prompt
Scope Violation Privilege escalation Insider threat Excessive agency - it reasons its way into things
Audit Evasion Log tampering Hiding actions Generates plausible justifications that look clean
Denial of Function DoS, resource exhaustion Poor decisions under pressure Context stuffing, token exhaustion
Data Exposure Sensitive data in API responses Oversharing System prompt leakage, training data exposure
Manipulation Malformed inputs Phishing, coercion Jailbreaking - manipulating reasoning, not code
Scale Automated exploit scripts A rogue employee leaves a trail A manipulated LLM acts at API speed, with a straight face

You already have the controls

This is the part no one wants to hear, because it means the problem is on us. If your organisation has a mature API security practice and a halfway decent approach to managing third-party and human agent risk, you've already got most of what you need.

Control API Context Human Agent Context Applied to LLMs
Least Privilege Scoped API keys Role-based access, need-to-know Scoped tool access, action whitelisting
Audit Trail Immutable request/response logging Action logs, four-eyes principle Log inputs and reasoning, not just outputs
Input Validation Sanitise all inputs Verify instructions through proper channels Prompt filtering, instruction hierarchy
Segregation of Duties Separate services for sensitive operations No single person end-to-end Model reasons, tool layer executes, human approves irreversible actions
Rate Limiting Throttle requests, detect anomalies Workload monitoring Token budgets, query limits
Escalation Paths Alert on anomalous behaviour Escalate unusual requests Hard limits on action scope, human-in-the-loop for high-risk actions

What about the frameworks?

There are a few worth knowing about:

  • OWASP LLM Top 10 - probably the most practical starting point, written from an application security angle
  • MITRE ATLAS - good on the technical attack side, modelled on the same approach as MITRE ATT&CK
  • NIST AI RMF - governance heavy, maps reasonably well to ISO 27001 thinking
  • EU AI Act - regulatory, useful for compliance conversations, not really a security framework

None of them do a great job of bridging the API security world and the human agent trust model. That's the gap where most organisations are currently exposed.

The bottom line

I'm not going to dress this up. Your LLM is an API that can also be talked into doing things it shouldn't. You can exploit it technically, or you can manipulate it contextually - and sometimes the same single input does both.

The good news is you don't need to invent anything new. Apply your API security controls to the endpoint. Apply your human agent trust model to the actions it takes. Figure out where those two things overlap, and make sure someone owns that gap.

That's not a new discipline. That's just GRC doing its job.