Mozilla says 271 vulnerabilities found by Mythos have "almost no false positives"

The disbelief was palpable when Mozilla’s CTO last month declared that AI-assisted vulnerability detection meant “zero-days are numbered” and “defenders finally have a chance to win, decisively.” After all, it looked like part of an all-too-familiar pattern: Cherry-pick a handful of impressive AI-achieved results, leave out any of the fine print that might paint a more nuanced picture, and let the hype train roll on.
Mindful of the skepticism, Mozilla on Thursday provided a behind-the-scenes look into its use of Anthropic Mythos—an AI model for identifying software vulnerabilities—to ferret out 271 Firefox security flaws over two months. In a post, Mozilla engineers said the finally ready-for-prime-time breakthrough they achieved was primarily the result of two things: (1) improvement in the models themselves and (2) Mozilla’s development of a custom “harness” that supported Mythos as it analyzed Firefox source code.
“Almost no false positives”
The engineers said their earlier brushes with AI-assisted vulnerability detection were fraught with “unwanted slop.” Typically, someone would prompt a model to analyze a block of code. The model would then produce plausible-reading bug reports, and often at unprecedented scales. Invariably, however, when human developers further investigated, they’d find a large percentage of the details had been hallucinated. The humans would then need to invest significant work handling the vulnerability reports the old-fashioned way.
Mozilla’s work with Mythos was different, Mozilla Distinguished Engineer Brian Grinstead said in an interview. The biggest differentiating factor was the use of an agent harness, a piece of code that wraps around an LLM to guide it through a series of specific tasks. For such a harness to be useful, it requires significant resources to customize it to the project-specific semantics, tooling, and processes it will be used for.
Grinstead described the harness his team built as “the code that drives the LLM in order to accomplish a goal. It gives the model instructions (e.g., ‘find a bug in this file’), provides it tools (e.g., allowing it to read/write files and evaluate test cases), then runs it in a loop until completion.” The harness gave Mythos access to the same tools and pipeline that human Mozilla developers use, including the special Firefox build they use for testing.
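The loop Grinstead describes (give instructions, expose tools, repeat until completion) can be sketched roughly as follows. This is a minimal illustration, not Mozilla's harness: the names `Action`, `Transcript`, `run_harness`, and `scripted_model`, the tool names, and the file `parser.c` are all hypothetical, and a scripted function stands in for the LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    """What the model asks for next (hypothetical shape, for illustration)."""
    tool: str            # tool name, or "done" to stop
    argument: str = ""   # single string argument, kept simple for the sketch


@dataclass
class Transcript:
    """Instruction plus a log of every step taken so far."""
    instruction: str
    steps: List[str] = field(default_factory=list)


Model = Callable[[Transcript], Action]
Tool = Callable[[str], str]


def run_harness(model: Model, tools: Dict[str, Tool],
                instruction: str, max_steps: int = 10) -> Transcript:
    """Drive the model in a loop: ask for an action, run the tool, record it."""
    transcript = Transcript(instruction)
    for _ in range(max_steps):
        action = model(transcript)
        if action.tool == "done":
            transcript.steps.append(f"done: {action.argument}")
            break
        tool = tools.get(action.tool)
        if tool is None:
            result = f"error: unknown tool {action.tool!r}"
        else:
            result = tool(action.argument)
        transcript.steps.append(
            f"{action.tool}({action.argument!r}) -> {result}")
    return transcript


def scripted_model(transcript: Transcript) -> Action:
    """Stand-in for an LLM: replays a fixed plan, one action per step."""
    plan = [
        Action("read_file", "parser.c"),
        Action("run_test", "case1"),
        Action("done", "crash reproduced"),
    ]
    return plan[len(transcript.steps)]


if __name__ == "__main__":
    fake_files = {"parser.c": "int main(void) { return 0; }"}
    demo_tools = {
        "read_file": lambda path: fake_files.get(path, "missing"),
        "run_test": lambda name: "crashed",
    }
    log = run_harness(scripted_model, demo_tools, "find a bug in this file")
    for step in log.steps:
        print(step)
```

In a real harness the model call would go to an LLM and the tools would touch the actual build and test pipeline; the `max_steps` cap is an assumption added here so the loop always terminates.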
It works when given a very clear, easily machine-verified goal. They've basically taken what they've learned from machine-speed training and applied it to finding memory bugs, where, if the process crashed, you've succeeded. So what is described here, at least, is quite a narrow window of capability: it applies when there is a very clear success/failure model that can be automatically marked by another process, model, or algorithm.
The harness is what turns an AI model into an AI system, and it is absolutely key to success.
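The "if the process crashed, you've succeeded" check in the comment above can be illustrated with a small pass/fail oracle. This is a sketch under assumptions, not anything from Mozilla's pipeline: the function name `crashed` is hypothetical, it treats "killed by a signal" as the only success criterion, and it relies on the POSIX behavior of `subprocess`, where a negative return code means the child was terminated by a signal.

```python
import subprocess
import sys
from typing import List


def crashed(command: List[str], timeout_s: float = 10.0) -> bool:
    """Return True if the command was terminated by a signal (POSIX only)."""
    try:
        completed = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # Assumption for this sketch: a hang is not counted as a crash.
        return False
    # On POSIX, returncode == -N means the child died from signal N.
    return completed.returncode < 0


if __name__ == "__main__":
    abort_cmd = [sys.executable, "-c", "import os; os.abort()"]
    clean_cmd = [sys.executable, "-c", "print('ok')"]
    print("abort case crashed:", crashed(abort_cmd))
    print("clean case crashed:", crashed(clean_cmd))
```

Because the verdict comes from the operating system rather than from the model's own report, a hallucinated bug simply fails the check, which is the narrow but reliable success signal the comment describes.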
It is a new arms race: How fast will the AI attack tools improve relative to the AI defense tools? If history is any lesson, the defenders will not be able to keep ahead of the attackers. I hope I am wrong, but history says I am correct.
Will AI change our future history? Only time will tell.
And great article -- appropriately skeptical but not negative.
There is real public good achieved here, and that's exactly the way for commercial companies to earn goodwill. Especially since covering 20+ years of historical code is largely a one-off (yes, there will be even more competent models; and yes, running them on FF and other major / foundational OSS projects is a win-win).
The alternative was just releasing it once ready and getting blamed for bad people using it.
There is no alternative of stopping work on AI; other AIs are getting there too, some with fewer safeguards, and some will probably be open-weights, for which any safeguards can probably be disabled.
For now, I think the reasonable stance here is to give Mozilla the benefit of the doubt and to point out that it's not just Mythos one has to be worried about. People have tunnel vision. The forest itself is changing. The newer models are all closing in on useful contributions when properly directed to detecting problems in existing code bases. That's what armchair experts and luddites are missing. It's a paradigm shift much like the one automated fuzzers and automatically generated testing harnesses gave us a few years ago (and which generated similar backlash). Conservative programmers can bury their heads in the sand all they want, but *LM users are going to blow right past them in the near future, much like fuzzer users blew past people who stuck to meticulously piecing through code in a debugger, or like a skilled debugger user outperforms someone who never moved past inserting print statements in code. 30 years ago, no college CS course taught how to build test harnesses for software or bothered considering input sanitization as anything beyond a UX exercise. Now, building testing harnesses and security considerations, including fuzzing tools, language-agnostic and language-specific advanced debugging techniques, and input management techniques, are part of any well-crafted CS course. The question isn't if, it's when managing *LM tooling becomes equally required in CS degrees.

Facts Only

* Mozilla used Anthropic Mythos to identify 271 Firefox security flaws over two months.
* The breakthrough was achieved through two factors: improvement in the AI models and Mozilla’s custom "harness."
* Earlier attempts at AI-assisted detection resulted in "unwanted slop" and hallucinations.
* The custom harness wrapped the LLM to guide it through tasks, providing it access to Mozilla's specific tooling and build processes.
* The harness enabled Mythos to use the same tools used by human Mozilla developers, including the special Firefox build.
* The harness was described as the code that drives the LLM to accomplish a goal by providing instructions and tools.
* The success window is narrow, requiring a clear, machine-verifiable success/failure model.
* The work involved covering over 20 years of historical code.

Executive Summary

Mozilla used Anthropic Mythos, an AI model, to identify 271 security flaws in Firefox over two months. The successful detection was achieved primarily through two factors: improvements in the underlying AI models and the development of a custom "harness." This harness allowed the AI to operate within the specific tooling and pipelines used by Mozilla developers, which was crucial for accurate results. Earlier attempts at AI-assisted vulnerability detection were prone to "unwanted slop" and hallucinations when used without such a specialized harness. Mozilla engineers found that the harness, which acts as an agent guiding the LLM through specific tasks, was key to achieving high accuracy with minimal false positives. The experience demonstrated that effective AI application in security detection requires customizing the model's access to specific, project-specific operational tools.

Full Take

The narrative positions AI vulnerability detection as a definitive end to zero-day threats, leveraging hype ("zero-days are numbered") to mask the specific, incremental engineering work required to achieve success. This dynamic frames the development process as an aggressive arms race where the focus is on speed rather than methodological rigor. The key insight is that AI utility is not inherent in the model itself, but emerges only when custom systems (harnesses) are built to bridge the gap between general AI capability and specific software engineering contexts. This suggests that the real breakthrough is not the AI but the engineering layer, the specialized tooling that dictates how the AI interacts with the codebase. The shift described, from human developers meticulously piecing through code to relying on automated systems, is mirrored by a broader paradigm shift in computer science education, where specialized skills like building testing harnesses and input sanitization must evolve into core curriculum. The pattern detected is the use of hyperbolic, fear-based framing to distract from the necessary, detailed infrastructural work, and the implication that complexity is solved by superior tools rather than deeper understanding.
Patterns detected: ARC-0043 Motte-and-Bailey, ARC-0024 Ambiguity, ARC-0011 Moral Panic

Sentinel — Human

Confidence

The text is very likely human-written, exhibiting a distinct, reflective voice and complex rhetorical structuring that points toward a human analyst rather than synthetic generation.

Signals Detected

Sentence length variance and complex, reflective phrasing

Presence of idiosyncratic emphasis, personal voice, and reflective tone ('I hope I am wrong,' 'I think the reasonable stance here is')

Argumentative structure flows from a specific, historical observation (fuzzers vs. debuggers) to a modern conclusion, rather than simple report aggregation.

The claims are grounded in specific, verifiable internal details (Mozilla engineers, specific project names, model types) rather than broad, unsupported assertions.

Human Indicators

The text employs a reflective, philosophical tone and includes personal hedging and rhetorical pauses ('I hope I am wrong,' 'I think the reasonable stance here is'), which is characteristic of human, non-optimized writing.

The analysis synthesizes historical context (fuzzers, debuggers) with a modern technological development, showing a layered, personalized argument structure.

The discussion of the paradigm shift feels driven by a specific, learned perspective rather than a purely synthesized data summary.