Anthropic withholds powerful AI model after it finds zero-day flaws in major software
Anthropic’s newest model could exploit zero-day flaws in every major browser and operating system, so the company locked it behind a defensive access program.

Anthropic withheld Claude Mythos Preview from broad release after internal tests showed it could autonomously find and exploit zero-day flaws in every major operating system and every major web browser. The decision, announced April 7, underscored how quickly frontier AI has become a cybersecurity force multiplier and how unevenly its benefits and risks are likely to be distributed.
Anthropic instead limited access through Project Glasswing, a defensive initiative aimed at securing critical software for the AI era. Its launch partners included Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA and Palo Alto Networks, along with more than 40 additional organizations working on critical software infrastructure. Anthropic said the point was to give defenders a head start, but the narrow rollout also reflected a hard public-policy choice: the same capabilities that can help secure systems faster can also lower the barrier for attackers if they spread too widely.
The stakes reached Washington almost immediately. On April 14, Politico reported that the Commerce Department’s Center for AI Standards and Innovation was actively evaluating Mythos, and that staff at no fewer than two large federal agencies and three congressional committees had sought briefings or access for cyber defense work. That interest came despite the Trump administration’s ban on working with Anthropic, a signal that government institutions were already treating the model as a tool that could matter for national security, oversight and procurement decisions.
The broader shift is already visible in the numbers. DARPA’s AI Cyber Challenge concluded in August 2025 with $8.5 million awarded to three teams. In the final scoring round, those systems found 77% of synthetic vulnerabilities and patched 61% of them, in an average of 45 minutes per fix. They also uncovered 18 real zero-day vulnerabilities, including six in C and 12 in Java codebases. The results showed how AI can compress the time between finding a flaw and fixing it, but also how quickly that same automation can be turned toward exploitation.
Researchers at Georgetown’s Center for Security and Emerging Technology have argued that AI is pushing vulnerability discovery, patching and exploitation forward at the same time, letting defenders “shift left” earlier in development while widening the dual-use danger. OpenAI responded in kind on April 14 with GPT-5.4-Cyber and an expanded Trusted Access for Cyber program, now being scaled to thousands of verified individual defenders and hundreds of teams. The industry is moving toward tiered access and controlled release, but the core problem remains: the fastest protections may still reach major firms and federal agencies first, while smaller businesses, local governments and ordinary users wait exposed.