Firefox's vast codebase, comprising tens of thousands of files and millions of lines of code, makes manual bug detection difficult. To address this, the team employs AI agents, described as "coding archaeologists," that analyze code for semantically introduced bugs and navigate complex systems using advanced tools like the Mythos model. These agents identified nearly 500 security bugs in recent months by simulating an attacker's perspective, generating test cases, and leveraging existing tooling (e.g., fuzzers) to detect vulnerabilities. Early AI-generated findings, however, often contained inaccuracies, which prompted a shift toward integrated systems that verify findings and act on them systematically. The team developed a custom harness to validate AI results; it filters out false positives and aligns AI output with real-world workflows so that every reported finding is actionable. This combination of advanced models and supporting infrastructure has streamlined security improvements while keeping the process efficient.
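The filtering step of such a harness can be illustrated with a minimal sketch. It does not describe the team's actual harness: the `Finding` type, the `reproduces` and `triage` helpers, and the rule "keep a finding only if its test case fails on every attempt" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Finding:
    """A hypothetical AI-reported finding with an attached test case."""
    title: str
    path: str
    reproducer: Callable[[], None]  # expected to raise if the bug is real


def reproduces(finding: Finding, attempts: int = 3) -> bool:
    """Return True only if the reproducer fails on every attempt."""
    for _ in range(attempts):
        try:
            finding.reproducer()
        except Exception:
            continue  # failed as claimed; try again to rule out flakiness
        return False  # ran cleanly at least once: treat as unconfirmed
    return True


def triage(findings: List[Finding]) -> Tuple[List[Finding], List[Finding]]:
    """Split findings into (confirmed, rejected) based on reproduction."""
    confirmed: List[Finding] = []
    rejected: List[Finding] = []
    for finding in findings:
        (confirmed if reproduces(finding) else rejected).append(finding)
    return confirmed, rejected


def out_of_bounds_read() -> None:
    """Stand-in for a real crash: indexes past the end of a buffer."""
    buffer = [0, 1, 2]
    buffer[5]


def harmless_call() -> None:
    """Stand-in for a false positive: nothing goes wrong."""
    sum([1, 2, 3])


reports = [
    Finding("OOB read in decoder", "media/decoder.cpp", out_of_bounds_read),
    Finding("Suspected overflow", "gfx/blend.cpp", harmless_call),
]
confirmed, rejected = triage(reports)
```

Here only the first report survives triage. In a real setting the reproducer would more plausibly launch a sanitizer-instrumented build in a subprocess; plain callables keep the sketch self-contained.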
The process begins with prioritization: a lightweight LLM-based scoring pass ranks files so that analysis focuses on those most likely to contain memory safety issues or to be exposed to user input. Agents then work iteratively, testing scenarios, refining hypotheses, and generating verifiable fixes, while verification subagents check that test cases are valid and patches are effective. Challenges remain, including scaling AI-driven solutions to large, complex codebases and balancing automation with human oversight. Existing tools like Codex Security excel at patching specific issues but cannot resolve recurring bugs globally without human expertise. The team emphasizes integrating AI with existing infrastructure (e.g., bug bounty programs, fuzzers) rather than replacing human workflows, and advocates open-source collaboration to address security and scalability challenges. The long-term goal is zero bugs, pursued through continuous refinement of models, verification systems, and prioritization strategies tailored to Firefox's scale and complexity.
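The prioritization step can be sketched as a score-and-rank pass over files. This is an illustration under stated assumptions, not the team's pipeline: `keyword_scorer` is a hypothetical stand-in for the LLM call, and the hint lists, weights, and `budget` parameter are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

# A scorer maps (path, source text) to a priority score. In the workflow
# described above this would be an LLM call; here it is a plain function.
Scorer = Callable[[str, str], float]

# Hypothetical signals, chosen only for illustration.
MEMORY_HINTS = ("memcpy", "reinterpret_cast", "malloc")
INPUT_HINTS = ("parse", "decode", "deserialize")


def keyword_scorer(path: str, source: str) -> float:
    """Stand-in scorer: weight memory-unsafe idioms above input handling."""
    text = source.lower()
    memory = sum(text.count(hint) for hint in MEMORY_HINTS)
    exposed = sum(text.count(hint) for hint in INPUT_HINTS)
    return 2.0 * memory + 1.0 * exposed


def prioritize(
    files: Dict[str, str],
    scorer: Scorer = keyword_scorer,
    budget: int = 2,
) -> List[Tuple[str, float]]:
    """Return the `budget` highest-scoring files, best first."""
    scored = [(path, scorer(path, source)) for path, source in files.items()]
    # Sort by descending score, then by path so ties are deterministic.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:budget]


sample_files = {
    "a.cpp": "memcpy(dst, src, n); parse(x);",
    "b.cpp": "int add(int a, int b) { return a + b; }",
    "c.cpp": "decode(buf);",
}
queue = prioritize(sample_files)
# queue == [("a.cpp", 3.0), ("c.cpp", 1.0)]
```

Passing the scorer as a parameter keeps the ranking logic independent of how scores are produced, so a model-backed scorer could replace the keyword heuristic without changing `prioritize`.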