Commissioned, Curated and Published by Russ. Researched and written with AI.


What’s New

Mozilla published details today alongside Anthropic’s technical writeup. The headline: Claude Opus 4.6 found 22 CVEs in Firefox in roughly two weeks, 14 of them high-severity. All fixes shipped in Firefox 148. This is the first public account of frontier AI being used to find novel vulnerabilities in a major, mature, heavily-scrutinised codebase – and Mozilla is now integrating AI-assisted analysis into its own internal security workflows as a direct result.


Changelog

Date          Summary
6 Mar 2026    Initial publication.

Claude Opus 4.6 found a use-after-free vulnerability in Firefox’s JavaScript engine in twenty minutes. By the time the Anthropic team had validated and filed that first bug, Claude had already found fifty more unique crashing inputs.

In two weeks, scanning roughly 6,000 C++ files, the Frontier Red Team submitted 112 reports to Mozilla. Of those, 22 warranted CVEs, and 14 were classified as high-severity – almost a fifth of all high-severity Firefox vulnerabilities remediated across the whole of 2025. All fixes are now in Firefox 148.

Firefox. Not some internal CRUD app. Not a startup codebase with two engineers and a prayer. One of the most scrutinised open source projects in the history of software, carrying decades of fuzzing, static analysis, and continuous security review by some of the best engineers in the industry.

If this doesn’t reframe how you think about AI and security testing, read it again.


What Actually Happened

Anthropic’s Frontier Red Team started by testing whether Claude could reproduce historical Firefox CVEs in older versions of the codebase – vulnerabilities that had already taken significant human effort to find. Opus 4.6 could reproduce a high percentage of them. Suspicious of the result (historical CVEs might be in training data), they shifted to the harder question: could it find novel bugs in the current codebase?

They started with the JavaScript engine, for practical reasons. It’s an independent slice of the codebase, analysable in isolation, and it processes untrusted external code on every page load – high impact, well-bounded scope. After twenty minutes, Claude reported a use-after-free. The team validated it in an isolated VM, then filed the bug in Bugzilla with a description, a proposed patch written by Claude, and a minimal reproducible test case.

That last part matters enormously. More on it shortly.

Mozilla saw the quality of the submission and encouraged Anthropic to submit all findings in bulk without manually validating each one first. By the end, 112 unique reports. Most lower-severity findings were assertion failures – overlapping with what fuzzing traditionally finds. But Claude also identified distinct classes of logic errors that fuzzing had not previously uncovered. That’s the line worth paying attention to.

Mozilla’s summary: “This is analogous to the early days of fuzzing; there is likely a substantial backlog of now-discoverable bugs across widely deployed software.”


The Fuzzing Analogy – and Why It’s the Right One

When fuzzing arrived as a serious technique in the 1990s, it was transformative precisely because it found categories of bugs that manual review missed – not by being smarter than humans, but by operating at a different speed and scale. Feeding a parser millions of malformed inputs isn’t something a human code reviewer can do. Fuzzing didn’t replace security engineers. It added a capability that didn’t exist before.
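Fuzzing’s core loop is mechanically simple; the power is in the scale. A minimal sketch in Python, using the standard-library JSON parser as a stand-in target (the mutation strategy here is deliberately naive – real fuzzers add coverage feedback and far smarter mutators):

```python
import json
import random
import string

def mutate(seed: str, rng: random.Random) -> str:
    """Corrupt a seed input with one random edit: flip, insert, or delete a character."""
    chars = list(seed)
    op = rng.choice(("flip", "insert", "delete"))
    pos = rng.randrange(len(chars)) if chars else 0
    if op == "flip" and chars:
        chars[pos] = rng.choice(string.printable)
    elif op == "insert":
        chars.insert(pos, rng.choice(string.printable))
    elif op == "delete" and chars:
        del chars[pos]
    return "".join(chars)

def fuzz(target, seed: str, iterations: int = 1000, rng=None):
    """Feed mutated inputs to `target`; collect any that raise unexpected exceptions."""
    rng = rng or random.Random(0)
    findings = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except ValueError:
            pass  # the parser rejecting bad input is correct behaviour, not a bug
        except Exception as exc:  # anything else is a potential finding
            findings.append((candidate, type(exc).__name__))
    return findings
```

A run like `fuzz(json.loads, '{"a": [1, 2]}')` mostly produces expected parse rejections – which is the point. No human reviewer can try a million inputs; a loop can.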

AI-assisted vulnerability detection looks like the same kind of category shift.

Fuzzing is exceptional at surface-level crashes: feed bad input, observe crash, file report. What it struggles with is logic errors – bugs that don’t crash the program but corrupt state in ways that can be exploited. Those require understanding what the code is supposed to do, then reasoning about where the implementation diverges from that intent. That’s a different kind of analysis. It’s what humans do when they audit code. It’s also, apparently, what Claude Opus 4.6 can now do at scale.
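A hypothetical illustration of that class of bug (not one of the actual Firefox findings): code that never crashes, returns cleanly on every input, and is still wrong in a way only semantic analysis catches.

```python
class SessionStore:
    """Tracks login sessions with a fixed time-to-live."""
    TTL = 3600  # seconds

    def __init__(self):
        self._sessions = {}  # token -> expiry timestamp

    def create(self, token: str, now: float) -> None:
        self._sessions[token] = now + self.TTL

    def is_valid(self, token: str, now: float) -> bool:
        expiry = self._sessions.get(token)
        if expiry is None:
            return False
        # BUG: the comparison is inverted. Expired sessions are accepted
        # and live ones rejected. No input crashes this code – every call
        # returns cleanly – so a crash-oriented fuzzer reports nothing.
        return now > expiry  # intended: now <= expiry

store = SessionStore()
store.create("tok", now=0.0)
print(store.is_valid("tok", now=10.0))      # prints False – a live session rejected
print(store.is_valid("tok", now=10_000.0))  # prints True – an expired session accepted
```

Spotting this requires knowing what `is_valid` is *supposed* to mean – exactly the intent-versus-implementation reasoning described above.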

The classes of logic errors Claude found in Firefox – the ones fuzzing hadn’t touched – are the evidence for this. These aren’t random crashes. They represent a category of vulnerability that requires semantic understanding of the codebase. Automated tools have historically been bad at this. Claude wasn’t.

Fuzzing took years to become standard practice after it was demonstrated to work. The teams that integrated it early found bugs before the teams that waited. The analogy holds.


The Backlog of Now-Discoverable Bugs

Here’s the uncomfortable implication of the Firefox result.

Mozilla chose Firefox as the proving ground precisely because it’s been so thoroughly tested. Decades of fuzzing. Static analysis at scale. A global community of security researchers. Regular internal audits. If you’re going to demonstrate that AI finds things humans miss, you do it on the codebase where humans have tried hardest. That’s the point of the experiment.

And Claude still found 22 CVEs.

So what does this mean for codebases that haven’t had anything like that level of scrutiny? Legacy enterprise software written before modern security practices. Internal tools built under deadline pressure without a dedicated security team. Open source projects maintained by one or two people. Widely deployed libraries that everyone uses and nobody audits properly.

Mozilla put it plainly: there is likely a substantial backlog of now-discoverable bugs across widely deployed software.

“Backlog” is doing a lot of work in that sentence. It doesn’t mean a few edge cases. It means a systematic category of vulnerabilities – the logic error class that AI can now find – that has accumulated for years across everything we’ve built and never properly stress-tested. Some of it is in software you depend on right now.

This is not speculative. The Firefox result is the demonstration. The question is how quickly teams move from “interesting research” to “we should actually be doing this.”


The Two-Sided Risk

The same capability that let Anthropic find bugs defensively can find them offensively. This is not a hypothetical.

Anthropic also ran the experiment in the other direction: could Claude turn the vulnerabilities it found into working exploits? They spent roughly $4,000 in API credits running hundreds of exploit attempts. Claude succeeded in two cases – crude exploits, working only in test environments with sandbox protections disabled, but demonstrating that the gap between vulnerability discovery and exploit development is closing.

The current gap is significant. Claude is substantially better at finding bugs than exploiting them. Finding is cheap; exploitation is expensive. That gap is what gives defenders the current advantage – discover and patch faster than attackers can weaponise.

But that gap is narrowing. Anthropic said explicitly that they don’t expect it to last.

Mozilla got there first this time. Anthropic disclosed responsibly, worked collaboratively with the Firefox team, helped triage reports, and proposed patches. That’s the best-case scenario: a well-resourced safety-focused lab finding bugs in a major project and helping fix them before anyone else knows they exist.

The next researcher who runs this kind of analysis may not file anything with the maintainer.

This is the concrete version of the AI security arms race that’s been discussed abstractly for years. It’s not a future concern. It’s already happening. The question is whether your organisation is building defensive capability or hoping no one looks at your code.


What This Looks Like in Practice

If you’re a security engineer or engineering lead, the natural question is: what does integrating this actually look like? Anthropic published specifics worth paying attention to.

The quality bar is minimal reproducible test cases. This is the detail that made the Firefox collaboration work. AI bug reports have a terrible reputation in open source communities precisely because they’re usually vague, unverified, and generate noise for maintainers. The reason Mozilla responded to Anthropic’s first submission within hours was the test case quality – not just “here’s a crash” but “here’s the minimal code that reproduces it, here’s why it’s a security issue, here’s a proposed patch.”

If you’re using AI tooling for security research, this is the standard you need to hit. Anything below it creates more work for your team and erodes trust in the findings. Anything at or above it means findings get fixed fast.
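Minimal test cases are usually produced mechanically, by shrinking a failing input while the failure still reproduces. A simplified sketch of the idea behind delta debugging – greedy chunk removal; Anthropic’s actual reduction tooling isn’t specified in the writeups:

```python
def minimize(data: bytes, still_fails) -> bytes:
    """Greedily shrink `data` while the failure predicate `still_fails` holds.

    Repeatedly tries deleting chunks, keeping any deletion that preserves
    the failure, then halves the chunk size and repeats.
    """
    chunk = max(1, len(data) // 2)
    while True:
        shrunk = True
        while shrunk:
            shrunk = False
            for start in range(0, len(data), chunk):
                candidate = data[:start] + data[start + chunk:]
                if still_fails(candidate):
                    data = candidate
                    shrunk = True
                    break
        if chunk == 1:
            break
        chunk = max(1, chunk // 2)
    return data
```

With a stand-in oracle like `still_fails = lambda d: b"<script" in d`, `minimize(b"aaaa<scriptbbbb", still_fails)` reduces to `b"<script"`. In a real workflow, the predicate re-runs the vulnerable build against each candidate input.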

Task verifiers are the key enabling infrastructure. Claude’s effectiveness came from being able to check its own work. Anthropic calls these “task verifiers” – trusted tools that confirm whether the agent’s output achieves its goal. In the Firefox case, verifiers confirmed that a crash was real, that a proposed fix actually removed the vulnerability, and that the fix didn’t break existing tests. Without that feedback loop, the agent’s output would be much lower quality.

For teams looking to build this internally: the verifiers are the hard part. You need a reproducible test environment, a way to confirm crash validity, and test suites that catch regressions. That’s a meaningful infrastructure investment – but it’s the prerequisite for the results Anthropic got.
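The three checks can be sketched abstractly. In a real pipeline, the callables below would shell out to builds and test runners; here they’re in-process stand-ins, and the toy vulnerability is hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    crash_reproduced: bool  # the repro fails against the vulnerable build
    fix_effective: bool     # the repro no longer fails after patching
    no_regressions: bool    # the existing test suite still passes

    @property
    def accepted(self) -> bool:
        return self.crash_reproduced and self.fix_effective and self.no_regressions

def verify(repro, vulnerable, patched, regression_tests) -> Verdict:
    """The three checks a task verifier runs before accepting a finding."""
    return Verdict(
        crash_reproduced=repro(vulnerable),
        fix_effective=not repro(patched),
        no_regressions=all(test(patched) for test in regression_tests),
    )

# Toy stand-ins: a function that raises on empty input, and its fix.
def vulnerable(data):
    return data[0]

def patched(data):
    return data[0] if data else None

def repro(target):
    """Return True if the crash reproduces against `target`."""
    try:
        target("")
        return False
    except IndexError:
        return True

verdict = verify(repro, vulnerable, patched, regression_tests=[lambda t: t("x") == "x"])
```

A finding only gets filed when all three flags hold – which is what closes the feedback loop for the agent.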

Triage capacity matters. One reason the collaboration worked is that Mozilla engineers were available to engage quickly. They redirected Anthropic’s submission approach mid-stream, helped calibrate what warranted a report, and validated findings fast. 112 reports in two weeks required actual human capacity on the receiving end.

If you run this kind of analysis on your own codebase, you need to staff the triage. Finding more bugs than your team can fix is a specific kind of problem – not insurmountable, but one to plan for rather than discover mid-engagement.

The entry point is the JavaScript engine problem, applied to your stack. The Firefox team chose the JS engine as a starting point because it’s an independent, high-impact slice of a complex codebase. The equivalent question for most engineering teams: what’s the component with the highest attack surface, the most untrusted input, and the most tractable scope for initial analysis? Start there.

Anthropic has also released Claude Code Security in limited research preview – bringing these vulnerability discovery and patching capabilities directly to customers and maintainers. That’s the signal that this is moving from internal Anthropic tooling to something teams can actually deploy.


Where This Leaves Us

Frontier language models are now, in Anthropic’s own framing, world-class vulnerability researchers. The Firefox result is the proof of concept, not the ceiling. Anthropic has also found vulnerabilities in the Linux kernel using the same approach, with more to come.

The window where defenders have the advantage – where AI is better at finding bugs than exploiting them – is real. It’s also finite. Teams that build AI-assisted security workflows now get to use that window. Teams that wait for this to become standard practice are betting the window stays open long enough.

Mozilla is already integrating AI-assisted analysis internally. That’s the correct response to a tool that found 22 CVEs in your codebase in two weeks.

The next question is who looks at your codebase. You should probably get there first.


Sources: Mozilla Blog · Anthropic Technical Writeup