Anthropic’s Claude found 22 Firefox security flaws; Mozilla fixed 100+

Anthropic's Claude Opus 4.6 submitted 112 reports; Mozilla issued 22 CVEs, 14 high severity, and patched most in Firefox 148, showing AI can surface deep bugs fast.

Dr. Elena Rodriguez

Anthropic says its Claude Opus 4.6 model discovered more than 100 defects in the Firefox codebase during a focused test, leading Mozilla to fix over 100 issues and assign 22 CVEs to the security-sensitive bugs. The company published a detailed account on March 6, saying it had submitted 112 unique reports to Mozilla over a two-week period. Mozilla rated 14 of the 22 CVEs as high severity and shipped most of the fixes in Firefox 148, which rolled out on Feb. 24.

Anthropic described a sweeping scan that analyzed nearly 6,000 C++ files and produced a large volume of actionable findings. The firm said Claude generated 50 additional unique crashing inputs in the time it took researchers to validate and submit their first vulnerability. In one technical example Anthropic shared, the model reached a use-after-free condition in Firefox’s JavaScript engine after about 20 minutes of exploration; a human researcher then validated the finding in a virtualized environment to guard against false positives.

The collaboration pushed Mozilla into an operational sprint. Brian Grinstead, a Mozilla engineer, said, "This is a large influx. We did mobilize as sort of an incident response to get the 100+ bugs that were filed, triaged and most of them fixed." Teams across Mozilla triaged the reports and landed patches, and platform engineers began applying fixes within hours of receiving submissions, Anthropic wrote.

Not all of the reports described security vulnerabilities. Anthropic says roughly 90 of the 112 reports were non-security issues, including crash-inducing inputs, assertion failures and logic errors that traditional fuzzers had not flagged. The company framed the result broadly: "AI models can now independently identify high-severity vulnerabilities in complex software." Anthropic also cautioned about the limits of automated remediation and agent-generated patches: "We can't guarantee that all agent-generated patches that pass these tests are good enough to merge immediately. But task verifiers give us increased confidence that the produced patch will fix the specific vulnerability while preserving program functionality."

The effort also touched on exploit development as part of testing. Anthropic reports that it tasked Claude with developing a practical exploit and that the effort produced a working exploit for one vulnerability, CVE-2026-2796, which has since been patched. The company says the 14 high-severity vulnerabilities disclosed in this engagement represent almost a fifth of the high-severity Firefox vulnerabilities remediated in 2025, underscoring how quickly AI can surface serious flaws even in mature, widely reviewed projects. Logan Graham, head of Anthropic’s frontier red team, explained the choice of target: "We chose Firefox because it's one of the most well-tested and secure open-source projects in the world."

Anthropic rolled out a product called Claude Code Security in February that it says automates parts of the code security testing workflow. The company positions the Firefox collaboration as a proof point for the product and a wider shift in how maintainers must manage vulnerability intake. The speed and scale of the submissions highlight a new capacity gap: open-source projects may need larger triage teams and clearer disclosure pipelines to handle high volumes of plausible, machine-generated reports.

The episode shows AI tools can lower the cost of finding serious software bugs, while also raising operational and safety questions about bulk reporting and automated fixes. It pushes software maintainers to adapt disclosure practices and triage processes to a world where machines can find critical flaws in minutes rather than months.
