Commissioned, Curated and Published by Russ. Researched and written with AI.
What’s New This Week
Anthropic’s CMS misconfiguration was discovered on 26 March 2026, reported by Fortune and independently assessed by researchers from LayerX Security and the University of Cambridge. Anthropic removed public access to the data store the same day after being contacted. The company confirmed it is testing a new model with early-access customers and called it a “step change” – but has not yet announced a release date or general availability.
Changelog
| Date | Summary |
|---|---|
| 27 Mar 2026 | First published. |
On Thursday, 26 March 2026, two things happened at the same time: Anthropic accidentally told the world its next frontier model exists, and demonstrated a textbook CMS opsec failure affecting roughly 3,000 unpublished assets.
The mechanics are straightforward. Digital assets created in Anthropic’s content management system are set to public by default and assigned a publicly accessible URL at upload time – unless a user explicitly changes that setting. Nobody changed the setting. The result was a publicly searchable data store containing images, PDFs, audio files, and – most consequentially – a draft blog post announcing a model called Claude Mythos, with the internal codename Capybara.
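The failure mode described above can be illustrated with a minimal sketch. This is not Anthropic's actual CMS code; the `upload` helper, field names, and URL scheme are all hypothetical, chosen only to show how a public-by-default parameter turns an untouched setting into an unintended publication:

```python
# Hypothetical CMS upload helper illustrating the default-public failure mode.
# An uploader who never passes `visibility` has, in effect, published the asset.
def upload(path, visibility="public"):
    # A publicly accessible URL is assigned at upload time, regardless of intent.
    url = f"https://assets.example.com/{path}"
    return {"path": path, "url": url, "visibility": visibility}

# The uploader stages a draft and never thinks about the setting:
record = upload("drafts/model-announcement.md")
print(record["visibility"])  # → public
```

The safe design inverts the default (`visibility="private"`), so that exposure requires a deliberate decision rather than an omission.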
The draft was located and reviewed by Fortune, and independently assessed by Roy Paz (senior AI security researcher at LayerX Security) and Alexandre Pauwels (cybersecurity researcher at the University of Cambridge). Anthropic confirmed the leak in a statement to Fortune, attributing it to “human error” in CMS configuration and describing the exposed material as “early drafts of content considered for publication.” After being contacted by Fortune, Anthropic removed the public’s ability to search and retrieve documents from the store.
What the Draft Said
The leaked blog post is structured as a product announcement. It introduces a new model tier above Opus – currently Anthropic’s top-end tier – under the name Capybara. According to the draft, reviewed by Fortune: “‘Capybara’ is a new name for a new tier of model: larger and more intelligent than our Opus models – which were, until now, our most powerful.”
The same document describes the completed training of Claude Mythos, calling it “by far the most powerful AI model we’ve ever developed.” Capybara and Mythos appear to refer to the same underlying model – one is the product tier name, the other the model name.
Anthropic’s current three-tier structure – Haiku (fast, small), Sonnet (balanced), Opus (most capable) – would expand. Capybara sits above Opus and is described as more expensive to run and not yet ready for general release.
The draft also includes a benchmark comparison, claiming the model “gets dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity” than Claude Opus 4.6, Anthropic’s current best model.
Anthropic confirmed this publicly. In a statement to Fortune, the company said: “We’re developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity. Given the strength of its capabilities, we’re being deliberate about how we release it. As is standard practice across the industry, we’re working with a small group of early access customers to test the model. We consider this model a step change and the most capable we’ve built to date.”
The Cybersecurity Dual-Use Problem
The draft blog post gives more detail on why Anthropic is being cautious. The leaked document describes the model as “currently far ahead of any other AI model in cyber capabilities” and warns that “it presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”
That framing is worth paying attention to. Anthropic is not just saying the model is unusually good at cybersecurity tasks as a feature. It is explicitly flagging concern that the capability gap between attack and defence could widen if this model were released without controls. The early-access strategy is described in the draft as being oriented toward cyber defenders – giving them a head start before the same capabilities are available to attackers.
This isn’t new territory for frontier labs. In February, OpenAI released GPT-5.3-Codex and described it as the first model it had classified as “high capability” under its Preparedness Framework for cybersecurity tasks. Anthropic navigated similar questions with Opus 4.6, which could surface previously unknown vulnerabilities in production codebases. Mythos appears to be a further step on the same trajectory.
The Opsec Failure That Caused All of This
The capability revelations are significant. But the leak mechanism itself deserves attention from anyone who runs infrastructure with content management systems.
The failure mode is not exotic. Anthropic uses an off-the-shelf CMS. That CMS defaults to public visibility for all uploaded assets. Someone – or multiple people, across multiple uploads spanning what appears to be years of unpublished content – did not change the default. Close to 3,000 assets accumulated in a publicly searchable location. Some of those assets were innocuous discards: unused images, old banners, logos. Others were not: a document apparently relating to an employee’s parental leave, and a PDF containing details of a private CEO retreat in the UK that Dario Amodei is scheduled to attend.
The lesson is not that Anthropic made an unusual mistake. The lesson is that default-public CMS behaviour is a well-understood failure mode, and large organisations with sensitive unpublished content routinely fail to address it. Headless CMS platforms, CDN asset storage, staging environments with publicly accessible URLs – these are all variations on the same class of problem. The control that failed here is not technically complex. It is a permission checkbox that someone did not tick.
For engineering teams: if you run any content infrastructure where assets are staged before publication, and if the default visibility on that infrastructure is public, you have the conditions for this exact incident. The fix is not complicated. The consistent application of it, across every new asset from every contributor, is where organisations typically fail.
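One practical control is a pre-publish audit that flags any asset left at the public default rather than deliberately made public. The sketch below assumes a hypothetical asset record with a `visibility` field and an `explicitly_set` flag; real CMS and storage APIs differ, so treat this as the shape of the check, not a drop-in implementation:

```python
from dataclasses import dataclass

@dataclass
class Asset:
    path: str
    visibility: str       # "public" or "private" (hypothetical field)
    explicitly_set: bool  # False: the uploader never touched the setting

def audit(assets):
    """Return assets that are public only because nobody changed the default."""
    return [a for a in assets if a.visibility == "public" and not a.explicitly_set]

staged = [
    Asset("drafts/announcement.pdf", "public", False),  # default left in place
    Asset("press/logo.png", "public", True),            # deliberately public
    Asset("hr/leave-doc.pdf", "private", True),
]

flagged = audit(staged)
print([a.path for a in flagged])  # → ['drafts/announcement.pdf']
```

Run as a scheduled job or a publish-pipeline gate, a check like this catches the accumulation problem: a single forgotten checkbox is survivable, but thousands of them over years is how an incident of this scale builds up.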
What Happens Next
Anthropic has confirmed the model is real and in testing. It has not confirmed a release timeline. Given the stated cybersecurity concerns and the early-access structure described in the draft, a gradual rollout to vetted organisations – particularly those in security and critical infrastructure – seems most likely before any general availability.
The draft described the model as expensive to run and not ready for broad release. Whether the unplanned announcement accelerates or disrupts that strategy is the more interesting question. For a company that has positioned itself explicitly around responsible release practices, having the announcement taken out of its hands by a misconfigured upload setting is an uncomfortable footnote.