Dealing with AI Risk¶

Every few years, something new comes along and the infosec community collectively loses its mind. AI is no different. Businesses are moving fast, GRC teams are scrambling to catch up, and everyone is asking the same question - how do we manage this risk?

I've been in this space long enough to know that the answer is usually simpler than it looks.

It's not that new¶

Here's the thing - an LLM isn't actually a new type of risk. It's a combination of two things your organisation already knows how to deal with:

An API - something that takes input, does something with it, and spits out a result
A contractor acting on your behalf - someone you've given access and trust to, who makes decisions and takes actions in your name

The tricky part is that an LLM is both of these things at the same time. Your AppSec team has always managed APIs. Your GRC team has always managed human agent risk. They've just never had to manage the same thing, together, at the same time.

That's where the confusion comes from. Not because AI is fundamentally different, but because it breaks down a boundary that's always existed between two separate disciplines.

Mapping the risk¶

If you think about it that way, the risk categories start to look pretty familiar:

Risk	As an API	As a Human Agent	As an LLM
Injection	SQL/code injection	Social engineering	Prompt injection - same input, both problems
Impersonation	Spoofed auth tokens	Fake instructions from "management"	Compromised system prompt
Scope Violation	Privilege escalation	Insider threat	Excessive agency - it reasons its way into things
Audit Evasion	Log tampering	Hiding actions	Generates plausible justifications that look clean
Denial of Function	DoS, resource exhaustion	Poor decisions under pressure	Context stuffing, token exhaustion
Data Exposure	Sensitive data in API responses	Oversharing	System prompt leakage, training data exposure
Manipulation	Malformed inputs	Phishing, coercion	Jailbreaking - manipulating reasoning, not code
Scale	Automated exploit scripts	A rogue employee leaves a trail	A manipulated LLM acts at API speed, with a straight face

You already have the controls¶

This is the part no one wants to hear, because it means the problem is on us. If your organisation has a mature API security practice and a halfway decent approach to managing third-party and human agent risk, you've already got most of what you need.

Control	API Context	Human Agent Context	Applied to LLMs
Least Privilege	Scoped API keys	Role-based access, need-to-know	Scoped tool access, action whitelisting
Audit Trail	Immutable request/response logging	Action logs, four-eyes principle	Log inputs and reasoning, not just outputs
Input Validation	Sanitise all inputs	Verify instructions through proper channels	Prompt filtering, instruction hierarchy
Segregation of Duties	Separate services for sensitive operations	No single person end-to-end	Model reasons, tool layer executes, human approves irreversible actions
Rate Limiting	Throttle requests, detect anomalies	Workload monitoring	Token budgets, query limits
Escalation Paths	Alert on anomalous behaviour	Escalate unusual requests	Hard limits on action scope, human-in-the-loop for high-risk actions

What about the frameworks?¶

There are a few worth knowing about:

OWASP LLM Top 10 - probably the most practical starting point, written from an application security angle
MITRE ATLAS - good on the technical attack side, modelled on the same approach as MITRE ATT&CK
NIST AI RMF - governance heavy, maps reasonably well to ISO 27001 thinking
EU AI Act - regulatory, useful for compliance conversations, not really a security framework

None of them do a great job of bridging the API security world and the human agent trust model. That's the gap where most organisations are currently exposed.

The bottom line¶

I'm not going to dress this up. Your LLM is an API that can also be talked into doing things it shouldn't. You can exploit it technically, or you can manipulate it contextually - and sometimes the same single input does both.

The good news is you don't need to invent anything new. Apply your API security controls to the endpoint. Apply your human agent trust model to the actions it takes. Figure out where those two things overlap, and make sure someone owns that gap.

That's not a new discipline. That's just GRC doing its job.