Commissioned, Curated and Published by Russ. Researched and written with AI.


In early 2026, a hacker exploited a prompt injection vulnerability in Cline, an open-source AI coding agent. Security researcher Adnan Khan had discovered the flaw weeks earlier and warned Cline privately. The fix only arrived after public disclosure. By then, someone had already used the vulnerability to slip instructions into Cline that automatically installed OpenClaw on users’ machines. The Verge reported that the installed agents were never activated – had they been, as the outlet noted, this would have been a very different story.

Sit with that for a moment. A hacker compromised one agent to install another agent. The second agent was never triggered. But the infrastructure was there, waiting.

This is not a story about a company making bad security decisions. It is a story about what agent ecosystems look like as an attack surface, and why the threat model is unlike anything we have dealt with before in software security.


The Faustian Bargain

Composio (which competes with OpenClaw) described it as a Faustian bargain. The framing is worth keeping because it is accurate.

An agent that can do useful things must have access to do those things. Brandon Wang, cited in the Composio piece (and independently verifiable), spelled out what that means: an agent with broad permissions has access to text messages including 2FA codes, bank accounts, calendar, contacts, and web browsing. His phrase was that it could drain a bank account. That is not hyperbole about a poorly secured setup. It is the logical consequence of granting the access needed to make the agent genuinely useful.

You cannot have a capable agent that is also fully sandboxed. The capability and the risk are the same thing.

Traditional software security assumes a bounded blast radius. A compromised web application leaks its database. Painful, expensive, recoverable. A compromised agent has access to everything the user does. The blast radius is not the application – it is the user’s entire digital life.


Five Things That Make This Different

1. Prompt injection has no clean fix.

SQL injection has parameterised queries. XSS has output encoding and content security policies. These are not perfect, but they are well-understood and widely deployed. Prompt injection does not have an equivalent.

The model processes text. Malicious instructions look like regular text. An agent reading a webpage, an email, or a document is always potentially processing attacker-controlled content. The Verge noted that in a world of increasingly autonomous software, prompt injections are massive security risks that are very difficult to defend against. OpenAI introduced Lockdown Mode for ChatGPT to prevent data exfiltration when hijacked – which tells you how seriously the industry is taking this threat, and also that the threat is real enough to warrant a dedicated defensive mode.

There is no patch for the fundamental architecture. You can sandbox external content, you can add layers of validation, you can be careful about what you let agents read. None of it fully solves the problem.

2. The blast radius is total.

This follows from the Faustian bargain above, but it is worth stating plainly as a security property. When you assess the risk of a compromised system, you need to know what it has access to. For a traditional web service, that scope is usually well-defined. For an agent with broad permissions, the scope is: everything.

This changes the calculus on acceptable risk. Bugs that would be low-severity in a bounded system are critical in an agent with access to authentication tokens, financial accounts, and private communications.

3. Viral adoption creates abandoned infrastructure.

CSO Online documented this directly: early versions of OpenClaw were insecure by default, rapid viral adoption overwhelmed users’ security awareness, and many deployments were quickly abandoned, leaving instances running outdated code.

Security researcher @fmdz387 ran a Shodan scan in late January 2026 and found nearly 1,000 OpenClaw instances exposed to the internet without any authentication. Early versions shipped that way by default, and also treated all localhost traffic as trusted – together, a significant exposure surface.

This is not a failure unique to OpenClaw. It is the failure mode of any tool that goes viral faster than security awareness scales. Thousands of people spin up instances, grant broad permissions, and then move on. The instances keep running. The attack surface compounds over time.

4. The dependency chain is the attack surface.

The Cline incident is the most structurally important finding here. OpenClaw was not the vulnerable component. OpenClaw got weaponised through Cline’s vulnerability. The agent that ended up installed on users’ machines had nothing to do with the vulnerability that was exploited.

As agent ecosystems become more interconnected – agents calling other agents, MCP servers exposing capabilities to multiple clients, frameworks composing tools from multiple sources – this threat model becomes the dominant one. Your agent’s security posture depends on every tool, framework, and API it touches. You do not control all of those. You probably cannot audit all of them.

5. Responsible disclosure does not work if nobody’s listening.

Khan warned Cline privately and was ignored. The fix happened after public disclosure. This is a pattern in mature software ecosystems too, but it is worse in fast-moving agent platforms because the security response processes are not mature yet. Teams are moving fast, shipping fast, and often do not have a clear path for triaging and responding to security reports.

CVE-2026-25253 in OpenClaw was a one-click remote code execution vulnerability via a malicious link. It was patched in version 2026.1.29 before public disclosure – so the responsible disclosure process worked in that case. But the broader pattern across the ecosystem is that security researchers are discovering critical issues, warning vendors, and waiting. The Cline incident shows what happens when the wait is too long.


What The Incidents Actually Show

The Shodan scan and CVE-2026-25253 together reveal the gap between theoretical and deployed security.

Theoretical: OpenClaw has an authentication system. Theoretical: OpenClaw patches CVEs. Deployed reality: nearly 1,000 instances running with no authentication, many of them abandoned, running whatever version was current when they were set up.

Security properties that exist in the product do not help instances that are not configured to use them. A patched CVE does not help an instance running the vulnerable version from six months ago. The gap between what the software can do and what deployed instances actually do is the real attack surface.

This is not an argument for blaming OpenClaw. Any agent platform that went viral this fast would have the same problem. The lesson is about what operational discipline looks like for agentic infrastructure.


What To Actually Do

This is not a comprehensive hardening guide. It is the minimum that every engineer running agent infrastructure should already be doing.

Run agents behind authentication. No exceptions. The default should never be “unauthenticated and hope nobody finds it.” If a tool ships insecure by default, add authentication before exposing it to any network.
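As a minimal sketch of what “behind authentication” means at the request level – the header name and bearer-token scheme here are generic illustrations, not any specific agent’s actual API:

```python
import hmac


def is_authorized(headers: dict, expected_token: str) -> bool:
    """Reject any request that does not carry the expected bearer token.

    An empty configured token means "not set up yet", so everything is
    rejected rather than silently allowed. hmac.compare_digest gives a
    constant-time comparison, avoiding a timing side channel on the token.
    """
    auth = headers.get("Authorization", "")
    if not expected_token or not auth.startswith("Bearer "):
        return False
    return hmac.compare_digest(auth[len("Bearer "):], expected_token)
```

The fail-closed default matters most: an unset or empty token denies all requests instead of reproducing the “unauthenticated by default” failure mode described above.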

Audit the access list. What does your agent have access to? If the answer is “everything I might want someday,” you have given it too much. The access list should be the minimum required for the agent’s actual function, reviewed deliberately, not accumulated by convenience.
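An access audit can be as simple as diffing what is granted against what the agent’s function actually requires. The permission names below are hypothetical scopes for illustration:

```python
def excess_permissions(granted: set[str], required: set[str]) -> set[str]:
    """Permissions the agent holds but its function does not need.

    Everything in this set is blast radius with no corresponding benefit.
    """
    return granted - required


# Illustrative scopes: accumulated by convenience vs. actually needed.
granted = {"calendar:read", "contacts:read", "sms:read", "bank:transfer", "web:browse"}
required = {"calendar:read", "web:browse"}
```

In this example the audit surfaces `sms:read` and `bank:transfer` – exactly the 2FA-codes-and-bank-account exposure the Faustian bargain section describes.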

Keep instances updated. Abandoned agents running old versions are precisely what the Shodan scan found. If you are not actively using an instance, shut it down. If you are using it, keep it patched.
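A staleness check is trivial to automate. This sketch assumes date-style version strings like the `2026.1.29` release cited above; the installed version is illustrative:

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Turn a dotted version string into a comparable tuple of integers."""
    return tuple(int(part) for part in v.split("."))


def is_stale(installed: str, latest_patched: str) -> bool:
    """True if the running instance predates the latest security release."""
    return parse_version(installed) < parse_version(latest_patched)
```

Run something like this on a schedule against every instance you own; any hit is either a patch job or a shutdown candidate.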

Take prompt injection seriously as an operational risk. Sandbox external content where possible. Be aware that any agent reading external content – web pages, emails, documents – is processing potentially attacker-controlled instructions. This is not solvable with a single control, but it should inform what you let agents read and what actions they can take based on that content.
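One such control is scanning attacker-controllable content for instruction-like phrases before the agent consumes it. This is a heuristic speed bump, not a defense – as noted above, prompt injection has no clean fix – and the patterns below are illustrative, not a vetted list:

```python
import re

# Illustrative patterns only; a real deployment would maintain and tune these.
SUSPICIOUS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"disregard (your|the) (system|developer) prompt", re.I),
]


def flag_external_content(text: str) -> list[str]:
    """Return suspicious phrases found in external (web/email/document) content.

    A hit should downgrade the agent's autonomy for this input – e.g. require
    human review before any tool call – not merely write a log line.
    """
    return [m.group(0) for pattern in SUSPICIOUS for m in pattern.finditer(text)]
```

The useful design choice is what happens on a hit: gating the agent’s ability to act on flagged content limits damage even when the pattern list inevitably misses novel injections.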

Treat your agent deployment the same way you treat a server with root access to your systems. Because that is what it is. The same discipline you would apply to SSH key management, privileged access, and production credentials should apply to agent permissions and deployment hygiene.


Where This Goes

The category is early. The incidents documented here are from early 2026, when these tools had been widely deployed for less than a year. The security research is ahead of the operational practices. The threat model is understood by researchers and largely ignored by operators.

That gap will close one of two ways: operators get ahead of it through deliberate practice, or the incidents get serious enough that they force the conversation. The Cline case – agents installing agents, not activated, a very different story if they had been – suggests we are closer to the second outcome than the first.

The Faustian bargain does not go away. Capable agents require access. Access creates risk. The answer is not to make agents less capable. It is to take the security engineering seriously before the deployment scales to the point where the cost of not doing so becomes unavoidable.


Sources: The Verge, reco.ai, Barrack AI, CSO Online. Composio (which competes with OpenClaw) also published analysis on this topic; facts sourced from there are independently verified.