Anthropic Testing Reveals Claude Fable 5 Carried No Exceptional Security Risk as Fable 5 Goes Live Again
Anthropic's internal comparative testing found that other AI models, including GPT-5.5 and Kimi K2.7, could reproduce the vulnerability-finding behavior that led to restrictions on Claude Fable 5, which officially returned globally on July 2 after an 18-day suspension.
Anthropic has concluded that its flagship Claude Fable 5 model did not represent a uniquely dangerous cybersecurity threat — a finding that accompanied the model's global relaunch on July 2, following an 18-day suspension that began on June 12 due to US export control measures.
The AI company conducted comparative testing against rival models after regulators moved to restrict Fable 5's availability. The results showed that the risks attributed to Fable 5 were not exclusive to that model — a conclusion with significant implications for how frontier AI systems are regulated going forward.
**What Led to the Original Suspension**
Fable 5 and Mythos 5 were both introduced on June 9, built on the same underlying architecture. Fable 5 was made broadly available to the public, while Mythos 5 remained restricted to a limited circle of vetted partners within Project Glasswing, focused exclusively on defensive cybersecurity applications.
The suspension was triggered after Amazon researchers discovered a method to circumvent Fable 5's built-in safety filters. Using this technique, the model could be prompted to identify vulnerabilities in software — and in at least one documented instance, it went further by demonstrating a working exploit. US export control authorities responded by restricting the model's distribution.
**Comparative Testing Puts the Risk in Context**
Following the suspension, Anthropic ran tests on several competing AI systems to assess whether the vulnerability behavior was specific to Fable 5. The results were telling: Claude Opus 4.8, GPT-5.5, and Kimi K2.7 were each capable of identifying the same software vulnerabilities highlighted in the Amazon research. Additionally, all tested models were able to replicate the single exploit demonstration that had originally raised alarm.
This evidence strongly implies that the export control directive addressed a systemic issue present across the AI industry — not a flaw unique to Fable 5. Despite this, Anthropic took proactive steps and developed an upgraded classifier designed specifically to block the bypass technique. The new system, however, also flags a broader range of standard coding and debugging tasks as potentially sensitive.
**How the New Safety Architecture Functions**
Even before the incident, Fable 5 had been built with what Anthropic describes as the most robust safety margins of any model it has released. Its classifiers were calibrated to intercept requests that appear even marginally risky — not just those with obvious harmful intent.
According to Anthropic's own figures, the post-incident classifier blocks the reported bypass method in more than 99% of cases. When a request is blocked, it is automatically redirected to Claude Opus 4.8 as a fallback. The company acknowledges that this stricter filtering also catches legitimate, benign coding requests, and has committed to ongoing tuning to reduce false positives.
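The block-and-fallback flow described above can be pictured as a simple routing step. The Python sketch below is illustrative only and is not Anthropic's implementation: the model identifier strings, the `route_request` function, and the keyword-based `keyword_classifier` are all hypothetical stand-ins, since the reporting does not describe how the real classifier works.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical identifiers, used here only for illustration.
PRIMARY_MODEL = "claude-fable-5"
FALLBACK_MODEL = "claude-opus-4.8"


@dataclass
class RoutingDecision:
    model: str     # which model will handle the request
    blocked: bool  # True if the classifier flagged the request


def route_request(prompt: str, is_sensitive: Callable[[str], bool]) -> RoutingDecision:
    """Send a prompt to the primary model unless the classifier flags it."""
    if is_sensitive(prompt):
        # Flagged requests are redirected to the fallback model.
        return RoutingDecision(model=FALLBACK_MODEL, blocked=True)
    return RoutingDecision(model=PRIMARY_MODEL, blocked=False)


def keyword_classifier(prompt: str) -> bool:
    """Toy stand-in for a real classifier: flags prompts containing listed terms."""
    flagged_terms = ("exploit", "buffer overflow")
    text = prompt.lower()
    return any(term in text for term in flagged_terms)


if __name__ == "__main__":
    for prompt in (
        "Write a haiku about spring",
        "Help me debug a buffer overflow in my parser",
    ):
        decision = route_request(prompt, keyword_classifier)
        print(f"{prompt!r} -> {decision.model} (blocked={decision.blocked})")
```

In this toy version, a benign request to debug a buffer overflow in the user's own parser is still routed to the fallback, which mirrors the kind of false positive the company says it is tuning to reduce.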
Mythos 5, which operates with fewer guardrails due to its specialized use case, was cleared for return on June 26, but only for institutions that received explicit government authorization.
**A Broader Regulatory Question**
The situation leaves an uncomfortable question hanging in the air. If models considered less capable than Fable 5 can already perform the actions that justified its restriction, what benchmark will regulators use when evaluating the next generation of frontier AI? Anthropic's own testing data makes that question harder — not easier — to answer, and the industry will be watching closely to see how policymakers respond.


