Your AI Agent's Sandbox Has a Hole in It: DNS Exfiltration and the Bedrock AgentCore Flaw
Commissioned, Curated and Published by Russ. Researched and written with AI.
What’s New This Week
Two incidents broke into the open within a week of each other: BeyondTrust’s Phantom Labs publicly disclosed the Bedrock AgentCore DNS bypass on March 16, 2026, and CodeWall disclosed on March 9 that its AI agent had compromised McKinsey’s internal Lilli platform in under two hours via SQL injection. Both are live stories. The Bedrock flaw remains unpatched – AWS updated its documentation rather than its code.
Changelog
| Date | Summary |
|---|---|
| 23 Mar 2026 | Initial publication covering the Bedrock AgentCore DNS bypass and McKinsey Lilli incident. |
When AWS says its Bedrock AgentCore Code Interpreter runs in “Sandbox” mode with “complete isolation with no external access,” you might reasonably assume that means no external access. It doesn’t.
BeyondTrust’s Phantom Labs published their research on March 16, 2026 – a full disclosure of a technique that allows an attacker with code execution inside a sandboxed AgentCore instance to establish a bidirectional, interactive shell with a remote server. The mechanism is DNS. The CVSSv3 score is 7.5. There is no CVE assigned. AWS has not issued a patch.
This isn’t an esoteric edge case. It’s a direct consequence of an assumption that runs through almost every “sandboxed” cloud execution environment: that blocking HTTP/TCP egress is equivalent to blocking network access. It isn’t.
What Sandboxed Actually Means
Bedrock AgentCore Code Interpreter is a managed service that lets AI agents execute Python, JavaScript, and shell code on behalf of users – think ChatGPT’s code interpreter, but embedded in your agentic workflow. The service offers three network modes: Public, VPC, and Sandbox.
Sandbox was documented as the isolation tier. No external access. Safe to run untrusted or AI-generated code without worrying about data walking out the door.
The compute isolation is real. The underlying Firecracker microVMs provide strong workload separation. The problem is that “network isolation” in this context blocks only general TCP/UDP traffic. DNS queries for A and AAAA records are still permitted to leave the sandbox.
This is how DNS works by design. Name resolution has to happen somewhere. The practical question for any sandboxed environment is who controls the resolver and whether queries can reach external nameservers. In AgentCore Sandbox mode, queries can reach external nameservers, and that wasn’t clearly documented until the disclosure forced AWS’s hand.
The Bedrock AgentCore Flaw
Phantom Labs researcher Kinnaird McQuade confirmed the DNS leak using Interactsh, an out-of-band testing server. Despite the instance being configured in Sandbox mode with no network access, DNS queries from inside the sandbox reached an external nameserver.
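A probe like this is simple to reproduce in your own environment. The sketch below is a minimal illustration, not the Phantom Labs tooling: it assumes you control the authoritative nameserver for some zone (`canary.example.com` is a placeholder) and can read its query log. The function names are hypothetical.

```python
import socket
import uuid


def make_canary_name(zone: str) -> str:
    """Build a one-off hostname so a hit in the nameserver log maps to this run."""
    token = uuid.uuid4().hex[:16]
    return f"{token}.{zone.strip('.')}"


def probe_dns_egress(zone: str, resolve=socket.getaddrinfo) -> dict:
    """Try to resolve a unique name under `zone` and report what happened.

    A successful answer shows that DNS leaves the sandbox. A failure is NOT
    proof of isolation: the query may still have reached the external
    nameserver, so check that server's log for the returned name either way.
    """
    name = make_canary_name(zone)
    try:
        results = resolve(name, None)
        # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr).
        addresses = sorted({entry[4][0] for entry in results})
        return {"name": name, "resolved": True, "addresses": addresses}
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return {"name": name, "resolved": False, "error": str(exc)}
```

Run `probe_dns_egress("canary.example.com")` from inside the environment you believe is isolated, then look for the returned `name` in the nameserver’s log. The `resolve` parameter exists only so the function can be exercised without network access.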
From there, the researchers built a fully functional bidirectional C2 channel. The engineering is straightforward in retrospect:
- Command delivery: An attacker-controlled DNS server encodes commands as base64 chunks, split across the octets of DNS A record responses. Each IP address in the response carries ASCII values. The sandboxed interpreter reassembles them.
- Output exfiltration: The interpreter embeds base64-encoded output into DNS subdomain queries, up to 60 characters per label (the protocol itself caps a label at 63 bytes), directed at the attacker’s nameserver. The nameserver logs the queries and reconstructs the data.
The result is a complete interactive reverse shell over DNS. Traditional network monitoring that watches for outbound HTTP or TCP connections sees nothing.
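To make the command-delivery half concrete, here is a rough sketch of carrying base64 text in A record octets. The exact wire format used in the research isn’t spelled out here, so the packing (four ASCII bytes per address) and both function names are assumptions for illustration. The sketch also ignores the fact that resolvers may reorder records in a response.

```python
import base64


def command_to_a_records(command: str) -> list[str]:
    """Server side: pack base64 text into IPv4 addresses, four ASCII bytes each."""
    encoded = base64.b64encode(command.encode("utf-8"))  # length is a multiple of 4
    return [
        ".".join(str(byte) for byte in encoded[i:i + 4])
        for i in range(0, len(encoded), 4)
    ]


def a_records_to_command(addresses: list[str]) -> str:
    """Client side: turn the octets back into base64 text, then decode it."""
    octets = bytes(int(part) for addr in addresses for part in addr.split("."))
    return base64.b64decode(octets).decode("utf-8")
```

Under this packing, `command_to_a_records("ls")` yields `["98.72.77.61"]`, which is the base64 string `bHM=` spelled as ASCII codes. To a monitor, the response looks like an ordinary address.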
The blast radius expands significantly if the Code Interpreter is assigned an overprivileged IAM role – which is easy to do accidentally. Once you have a shell inside the interpreter, you have the interpreter’s IAM credentials. Researchers demonstrated using that access to list S3 buckets and retrieve files containing PII, API keys, and financial data, all exfiltrated silently as DNS queries.
AWS’s response: update the documentation to say that Sandbox mode “intentionally allows DNS resolution,” and recommend customers migrate to VPC mode for actual isolation. No patch. The behavior is now considered intentional.
How DNS Exfiltration Actually Works
For those who don’t live in network security: DNS is the system that translates domain names into IP addresses. When your browser connects to example.com, it first asks a DNS resolver “what’s the IP address for example.com?” before any HTTP connection happens.
The key property that makes DNS useful for covert channels is that DNS queries are often treated differently from other network traffic. Firewalls and security groups that block all outbound TCP/80 and TCP/443 commonly allow UDP/53 (DNS) because blocking it entirely breaks name resolution. Even “isolated” environments frequently permit DNS queries to propagate outward for functional reasons.
DNS exfiltration works by encoding data in the query itself rather than the response. Instead of asking a legitimate question like “what’s the IP for google.com?”, the exfiltration tool asks “what’s the IP for ASDFJKLHASHED123DATA456.attacker-controlled-domain.com?” The attacker’s nameserver receives the query, logs the subdomain – which contains the encoded data – and optionally responds with encoded command data in the IP address.
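A minimal sketch of the query-side encoding, useful for generating test fixtures for your own detection rules. The encoding choice here is an assumption: it uses lower-cased base32 because DNS names are case-insensitive, and the `<seq>.<payload>.<zone>` layout, function names, and `exfil.example.net` zone are all hypothetical.

```python
import base64

MAX_LABEL_BYTES = 63   # DNS limit for a single label
MAX_NAME_BYTES = 253   # practical limit for a full dotted name


def encode_query_names(data: bytes, zone: str, chunk: int = 60) -> list[str]:
    """Split `data` into hostnames of the form <seq>.<payload>.<zone>."""
    if not 0 < chunk <= MAX_LABEL_BYTES:
        raise ValueError("chunk must fit in one DNS label")
    text = base64.b32encode(data).decode("ascii").rstrip("=").lower()
    names = []
    for seq, start in enumerate(range(0, len(text), chunk)):
        name = f"{seq}.{text[start:start + chunk]}.{zone}"
        if len(name) > MAX_NAME_BYTES:
            raise ValueError("zone too long for this chunk size")
        names.append(name)
    return names


def decode_query_names(names: list[str], zone: str) -> bytes:
    """Rebuild the payload from logged query names, in any arrival order."""
    suffix = "." + zone
    chunks = {}
    for name in names:
        seq, payload = name[: -len(suffix)].split(".", 1)
        chunks[int(seq)] = payload
    text = "".join(chunks[i] for i in sorted(chunks)).upper()
    text += "=" * (-len(text) % 8)  # restore the base32 padding stripped above
    return base64.b32decode(text)
```

The sequence label matters because nothing guarantees that queries arrive in order. The receiving side only needs a log of names, not a working response.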
This is why DNS monitoring matters. The exfiltration doesn’t generate any HTTP traffic, any TCP connections, or any of the signals that most network security tooling watches for. It looks like name resolution traffic.
Other Environments with the Same Assumption
Bedrock isn’t alone in this. The pattern of “we block HTTP but allow DNS” appears across major cloud execution environments:
AWS Lambda runs with outbound internet access by default. Restricting a Lambda function typically means attaching it to a VPC with no internet gateway – but that still requires a DNS resolver, and misconfigured VPCs can permit DNS queries to reach external resolvers.
GCP Cloud Run containers have outbound internet access by default. Restricting egress requires explicit VPC Service Controls or firewall rules. DNS queries to Google’s resolvers are permitted in most configurations even with egress restrictions.
Azure Container Apps allows network isolation via VNet integration, but DNS resolution to Azure DNS is always permitted within VNets, and workload identity configurations can still permit DNS queries to propagate through Azure’s resolver infrastructure.
The common thread: “no network access” in cloud environments almost always means “no outbound HTTP/TCP to the internet,” not “no DNS queries that could reach an attacker-controlled nameserver.” These are different things, and the gap between them is the attack surface.
McKinsey Lilli and the Enterprise Blast Radius
On March 9, 2026, security startup CodeWall disclosed that its autonomous AI agent had compromised McKinsey’s internal AI platform, Lilli, in under two hours. No credentials. No insider access.
The vector was a SQL injection vulnerability – 22 unauthenticated endpoints, at least one injectable. The AI agent found the flaw, exploited it, and gained read-write access to a platform serving approximately 40,000 consultants. According to reporting, the exposure included 46 million chat logs and 728,000 private files.
The prompts powering Lilli’s responses were reportedly all writable, meaning an attacker could have modified what the system told its entire user base. McKinsey confirmed the vulnerability in a March 11 statement and said it patched the unauthenticated endpoints within hours of disclosure.
The McKinsey incident isn’t directly about DNS exfiltration – the attack vector was SQL injection, not a sandbox bypass. But it illustrates the same underlying risk: the blast radius of a compromised AI agent is enormous when that agent has access to enterprise data at scale. Lilli was a knowledge platform serving tens of thousands of users. The question that determined the severity wasn’t how the attacker got in – it was what the system had access to once they were in.
That question applies equally to sandboxed code execution environments. The Bedrock flaw gives an attacker a covert channel. Whether that channel leads anywhere meaningful depends entirely on what the environment can reach.
Practical Mitigations
If you’re running agentic workloads in cloud environments, these are the specific things worth doing now:
Migrate off Bedrock AgentCore Sandbox mode for sensitive workloads. AWS is explicit: Sandbox mode permits DNS resolution, VPC mode provides actual isolation. If you’re handling customer data, credentials, or anything with compliance implications, VPC mode is the minimum.
DNS egress filtering. Route all DNS queries through a resolver you control, and restrict which external nameservers that resolver will forward to. RPZ (Response Policy Zones) can block queries for domains that are not on an allow list. This keeps exfiltration data from reaching an attacker-controlled nameserver even if code execution is achieved.
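The allow-list decision itself is small. The sketch below shows only the matching logic, assuming a hypothetical set of permitted zones. It is not RPZ configuration, and a real deployment would enforce this in the resolver rather than in application code.

```python
def normalize_zones(zones) -> frozenset:
    """Lower-case zone names and drop leading/trailing dots for comparison."""
    return frozenset(zone.strip(".").lower() for zone in zones)


def is_allowed(query_name: str, allowed_zones: frozenset) -> bool:
    """Return True only if the name equals, or is a subdomain of, an allowed zone."""
    name = query_name.strip(".").lower()
    # Match on a label boundary so "evilpypi.org" does not pass for "pypi.org".
    return any(name == zone or name.endswith("." + zone) for zone in allowed_zones)
```

Anything that fails `is_allowed` should be refused locally and never forwarded. Answering with an error still blocks the channel, as long as the query does not leave your resolver.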
DNS query logging and anomaly detection. Log all DNS queries from your execution environments. Baseline what normal resolution traffic looks like: the domains queried, query frequency, query length distribution. DNS exfiltration produces distinctive patterns – unusually long subdomains, high query rates to uncommon domains, base64-like character distributions in subdomain labels. These aren’t subtle once you’re looking for them.
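As a starting point, those signals can be scored per query name. This is a rough heuristic sketch: the thresholds (40 characters, 4.0 bits of entropy) are illustrative assumptions to be tuned against your own baseline, and `registered_labels=2` is a simplification that treats the last two labels as the registered domain.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character; random-looking encoded data scores high."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def score_query(name: str, registered_labels: int = 2) -> dict:
    """Summarise the subdomain part of a query name and flag likely tunnelling."""
    labels = name.rstrip(".").lower().split(".")
    sub = labels[:-registered_labels] if len(labels) > registered_labels else []
    longest = max((len(label) for label in sub), default=0)
    joined = "".join(sub)
    entropy = shannon_entropy(joined)
    # Flag very long labels, or moderately long and random-looking subdomains.
    suspicious = longest >= 40 or (len(joined) >= 24 and entropy >= 4.0)
    return {
        "name": name,
        "longest_label": longest,
        "entropy": round(entropy, 2),
        "suspicious": suspicious,
    }
```

Per-name scoring catches the obvious cases. Query rate per source and per destination domain needs a windowed counter on top of this, which is not shown here.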
Restrict IAM roles on code execution environments. If your Code Interpreter doesn’t need S3 access, it shouldn’t have it. The Phantom Labs demonstration of credential theft and S3 exfiltration was possible because the default IAM configuration was overpermissive. Audit what your execution environments can actually do with their assigned roles.
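One way to start that audit is to scan each role’s policy document for wildcards. The sketch below works on a policy already loaded as a Python dict (the standard IAM JSON shape with `Statement`, `Effect`, `Action`, and `Resource`). The function name is hypothetical, and the check is deliberately narrow: it ignores `NotAction`, conditions, and partial wildcards in ARNs.

```python
def find_broad_statements(policy: dict) -> list[dict]:
    """Flag Allow statements that use wildcards in Action or Resource."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM permits a single statement object
        statements = [statements]
    findings = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        wild_actions = [a for a in actions if a == "*" or a.endswith(":*")]
        wild_resources = [r for r in resources if r == "*"]
        if wild_actions or wild_resources:
            findings.append({
                "sid": stmt.get("Sid"),
                "actions": wild_actions,
                "resources": wild_resources,
            })
    return findings
```

A statement granting `s3:*` on `*` is flagged. One granting `s3:GetObject` on a single bucket prefix is not. An empty result does not prove the role is safe, only that the crudest mistakes are absent.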
Treat agent code execution the same as you’d treat user code execution. If you wouldn’t let an anonymous user run arbitrary shell commands with access to your production credentials, your AI agent shouldn’t run in a configuration that allows the same.
What “actually sandboxed” looks like in practice: DNS queries go only to a controlled internal resolver that does not forward to external nameservers. That resolver answers only for the domains your application legitimately needs to resolve. Query logs are retained and reviewed. IAM roles are scoped to the minimum required permissions. The environment has no implicit trust relationship with other parts of your infrastructure.
That’s not a product feature. It’s a configuration that engineers have to deliberately build and maintain. The AWS documentation update is not a mitigation – it’s an acknowledgement that the default behavior allows a covert channel and that the responsibility for closing it sits with you.
The lesson from Bedrock isn’t that AWS shipped a bug. It’s that “sandboxed” is a category claim, not a specification. What the sandbox actually restricts, and what it silently permits, requires reading the documentation carefully and testing the assumptions independently. BeyondTrust did that testing. The rest of us should have done it first.