Cisco Scales AI Bug Hunting to 1.8 Billion Lines of Code, But Remains Quiet on the Results

The Floodlight Approach to Security
Cisco is betting heavily on the ability of frontier AI models to dismantle the traditional timeline of software security audits. In a recent update, Anthony Grieco, Senior Vice President of Cisco’s security and trust organization, said the company used a combination of Anthropic’s Claude Mythos Preview and OpenAI’s GPT 5.5-Cyber to scan 1.8 billion lines of code in just eight weeks.
To put that scale into perspective, Grieco noted that performing a manual audit of that magnitude would typically require eight years of effort from Cisco’s advanced security teams. The scan covered more than 25 different programming languages across the company’s diverse product portfolio, utilizing what Grieco described as a “human-guided harness” to keep the AI focused and reduce noise.
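The scale comparison can be made concrete with some back-of-the-envelope arithmetic. The sketch below uses only the three figures quoted above (1.8 billion lines, eight weeks, eight years); the 52-week year and the function names are assumptions introduced for illustration, and the result says nothing about how Cisco actually scheduled the work.

```python
# Figures quoted in the article.
LINES_SCANNED = 1_800_000_000  # 1.8 billion lines of code
AI_WEEKS = 8                   # duration of the AI-assisted scan
MANUAL_YEARS = 8               # Grieco's estimate for a manual audit

# Assumption (not from the article): a year is treated as 52 weeks.
WEEKS_PER_YEAR = 52


def lines_per_day(total_lines: int, weeks: float) -> float:
    """Average lines covered per calendar day over the given number of weeks."""
    return total_lines / (weeks * 7)


def speedup(manual_years: float, ai_weeks: float,
            weeks_per_year: int = WEEKS_PER_YEAR) -> float:
    """Ratio of the manual timeline to the AI-assisted timeline."""
    return manual_years * weeks_per_year / ai_weeks


ai_rate = lines_per_day(LINES_SCANNED, AI_WEEKS)
manual_rate = lines_per_day(LINES_SCANNED, MANUAL_YEARS * WEEKS_PER_YEAR)

print(f"AI-assisted scan: {ai_rate:,.0f} lines per day")
print(f"Manual estimate:  {manual_rate:,.0f} lines per day")
print(f"Implied speedup:  {speedup(MANUAL_YEARS, AI_WEEKS):.0f}x")
```

Under these assumptions, the claim works out to roughly a 52-fold compression of the timeline. That is a statement about throughput only; it carries no information about how many findings the scan produced.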
The efficiency is impressive, and Cisco claims a false positive rate of under 3 percent. Grieco likened the shift from traditional targeted security evaluations to this AI-driven approach to “switching from a flashlight to a flood light to illuminate a dark room.” However, there is a glaring omission in the reporting: Cisco has not disclosed how many actual vulnerabilities the models uncovered, nor whether those holes have been patched.
The ‘Black Box’ of AI Bug Hunting
The lack of specific data from Cisco stands in stark contrast to other early adopters of these high-capability models. In May, Palo Alto Networks, another partner in the same ecosystem, reported that after one month of using frontier models including Mythos, it identified 26 CVEs representing 75 underlying security issues across 130 products. For Palo Alto, that was more than five times its usual pace of discovery, as the company typically discloses fewer than five CVEs per month.
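For comparison, the Palo Alto Networks figures are specific enough to work with. The sketch below derives two ratios from the numbers reported above; the function names are hypothetical, and because the baseline is stated only as "fewer than five" CVEs per month, the increase factor is a lower bound rather than an exact value.

```python
# Figures reported by Palo Alto Networks, as quoted in the article.
CVES_REPORTED = 26
UNDERLYING_ISSUES = 75
PRODUCTS_AFFECTED = 130

# The article gives the usual rate only as "fewer than five" CVEs per month,
# so 5 is an upper limit on the baseline, not a measured value.
BASELINE_CVES_PER_MONTH_CEILING = 5


def issues_per_cve(issues: int, cves: int) -> float:
    """Average number of underlying security issues grouped under each CVE."""
    return issues / cves


def min_increase_factor(cves: int, baseline_ceiling: int) -> float:
    """Lower bound on the jump in monthly CVE disclosures.

    The true baseline is below the ceiling, so the real factor is higher.
    """
    return cves / baseline_ceiling


print(f"Issues per CVE: {issues_per_cve(UNDERLYING_ISSUES, CVES_REPORTED):.2f}")
print(
    "Increase over baseline: more than "
    f"{min_increase_factor(CVES_REPORTED, BASELINE_CVES_PER_MONTH_CEILING):.1f}x"
)
```

None of these ratios can be computed for Cisco, because the corresponding counts have not been published.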
Cisco’s silence on the specific count of bugs found raises questions about the actual efficacy of the tool versus the marketing narrative of “transformative power.” The speed of the scan is easy to measure, but the quality of the “actionable intelligence” remains unverified without a reported number of critical finds.
Anthropic’s Controlled Expansion
The technology powering these scans, specifically Anthropic’s Mythos, is not available to the general public. It is currently restricted to “Project Glasswing,” a highly vetted partner program. Anthropic recently announced a significant expansion of this cohort, adding 150 new organizations and bringing the total to roughly 200 partners.
The caution surrounding Mythos is rooted in the model’s potential for misuse. Anthropic has previously suggested that the model’s ability to find and exploit security holes is so potent that an unrestricted release could provide bad actors with a turnkey toolkit for catastrophic cyberattacks. Consequently, access is limited to government agencies and corporate entities that meet stringent security requirements.
The expanded Glasswing program now reaches beyond the U.S., including a heavy presence in South Korea. Reports indicate that the Korea Internet and Security Agency (KISA), Samsung Electronics, SK hynix, and SK Telecom have joined the program. Anthropic also noted that the new cohort diversifies the industries involved, adding partners from healthcare, water, power, and communications—sectors where a single unpatched vulnerability in a shared codebase can have systemic real-world consequences.
A Closing Window for Defense
The rush to integrate these models into the development lifecycle is driven by a growing fear that the “adversary” is already using them. Executives at Palo Alto Networks have previously warned of a narrow window—perhaps only three to five months—before AI-driven exploits become the industry norm.
By integrating AI threat detection and automated auditing, companies like Cisco and Rubrik are attempting to automate the defense side of the equation. But as these tools move from the preview phase to the production phase, the industry will need more than anecdotes about speed; it will need a transparent accounting of how many doors these AI models are actually locking before the other side finds the key.