Google GTIG Documents First AI-Generated Zero-Day Exploit

Google's Threat Intelligence Group confirmed the first AI-generated zero-day exploit, targeting 2FA logic in an open-source web admin tool via LLM-written code.

    Google’s Threat Intelligence Group published a report on May 11, 2026, confirming what it describes as the first in-the-wild zero-day exploit developed using a large language model — a working Python script that bypassed two-factor authentication protections in a widely deployed open-source web administration tool.

    How GTIG Identified the Exploit as AI-Generated Code

    GTIG researchers attributed the exploit’s authorship to an LLM based on three technical fingerprints in the Python script. First, the code contained “educational docstrings”: inline comments that explain the script’s own logic step by step in the style of tutorial material, a pattern absent from exploits written by experienced human operators. Second, it included a CVSS severity score that no standards body had assigned, apparently generated by the model itself rather than drawn from a real vulnerability database entry. Third, the script’s overall structure matched what GTIG described as a “structured, textbook Pythonic format highly characteristic of LLMs”: methodical and well organized in ways that contrast with the terse, pragmatic style typical of experienced exploit authors.
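    To make the fingerprints concrete, the toy scorer below checks a Python source string for rough analogues of the three signals GTIG described. This is a hypothetical illustration, not GTIG’s methodology: the function name, the phrase list, and the thresholds are all invented here, and real attribution work rests on far more evidence than static text features.

    ```python
    import ast
    import re

    # Tutorial-style phrasing that rarely appears in terse operator-written code.
    # This list is an assumption made for illustration, not a vetted corpus.
    TUTORIAL_PHRASES = ("step 1", "step 2", "this function", "first, we", "note that")

    def llm_style_signals(source: str) -> dict:
        """Return three boolean heuristics loosely modeled on GTIG's fingerprints."""
        tree = ast.parse(source)
        funcs = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
        documented = [f for f in funcs if ast.get_docstring(f)]
        # Signal 1: "educational docstrings" -- every function narrates itself.
        fully_documented = bool(funcs) and len(documented) == len(funcs)
        # Signal 2: tutorial tone in comments or docstrings.
        lowered = source.lower()
        tutorial_tone = any(p in lowered for p in TUTORIAL_PHRASES)
        # Signal 3: a CVSS score embedded in the code itself, which real
        # exploit scripts almost never carry.
        embeds_cvss = re.search(r"cvss[^0-9]{0,10}\d{1,2}\.\d", lowered) is not None
        return {
            "fully_documented": fully_documented,
            "tutorial_tone": tutorial_tone,
            "embeds_cvss": embeds_cvss,
        }
    ```

    Running the scorer over a snippet that narrates itself and embeds a made-up severity score trips all three signals; the same heuristics stay quiet on typical undocumented, comment-free operator code.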

    Google stated it ruled out Gemini as the model used but could not determine which specific LLM produced the script. The affected vendor was not publicly named, consistent with coordinated disclosure protocol. No CVE has been formally assigned.

    Why Semantic Logic Bugs Are LLMs’ Natural Exploitation Target

    The vulnerability the exploit targeted was a semantic logic flaw in the tool’s authentication decision-making — not a memory corruption bug such as a buffer overflow or use-after-free. This distinction matters for understanding where AI-assisted exploitation is most effective. LLMs reason well about code logic and control flow, making them capable of identifying authentication bypass conditions, business logic errors, and decision-tree flaws. Low-level binary exploitation — finding memory corruption primitives and writing shellcode — requires a different class of technical reasoning that current models handle less reliably. The zero-day targeting a 2FA authentication logic path is consistent with LLMs’ comparative strengths, and suggests that authentication flows and access control logic represent a priority risk area for AI-assisted attack development.
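    The report does not publish the vulnerable code, so the sketch below is a generic, hypothetical example of the bug class: a 2FA decision path where the server infers enrollment from the shape of the request instead of from its own records. Both function names and fields are invented for illustration.

    ```python
    from typing import Optional

    def verify_login_buggy(user: dict, password_ok: bool, otp: Optional[str]) -> bool:
        # Bug: a missing otp field is treated as "user not enrolled in 2FA",
        # so an attacker who simply omits the field skips the second factor.
        if password_ok and otp is None:
            return True
        return password_ok and otp == user.get("expected_otp")

    def verify_login_fixed(user: dict, password_ok: bool, otp: Optional[str]) -> bool:
        # Fix: enrollment status comes from the server-side account record,
        # never from whether the client chose to send an OTP.
        if not password_ok:
            return False
        if user.get("totp_enrolled"):
            return otp is not None and otp == user.get("expected_otp")
        return True
    ```

    A flaw like this is purely semantic: every line type-checks and no memory is corrupted, which is exactly the kind of control-flow reasoning current LLMs handle well.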

    GTIG notified the affected vendor and disrupted the attack before mass exploitation occurred, according to the report.

    Nation-State Actors Already Using AI Across Earlier Phases of the Attack Chain

    The May 11 disclosure marks the first case where AI was used to write working exploit code itself, but GTIG’s report also documents wider AI adoption across adjacent attack phases by named threat actors. Chinese state-affiliated groups APT27, APT45, UNC2814, UNC5673, and UNC6201 have been observed using AI tools to accelerate reconnaissance operations and generate targeted social engineering content. A separate Russian operation, tracked under the internal designation “Overload,” is using AI voice cloning to conduct precision vishing attacks against specific individuals.

    How Threat Actors Are Bypassing Commercial AI Rate Limits at Scale

    GTIG assessed that threat actors are “industrializing access to premium AI models” — automating account creation and rotating infrastructure to circumvent the usage rate limits and content safety restrictions that commercial AI services enforce. This pattern suggests that the availability of commercial AI services is not itself an effective barrier to adversarial use; determined operators have already built systematic workarounds. The implication is that the exploit-writing capabilities demonstrated in this zero-day case are available to a broader range of threat actors than those who control their own LLM infrastructure.

    The zero-day’s development and deployment in a live attack marks a qualitative shift in the threat model for application security programs. Organizations that previously modeled time-to-exploitation windows of days or weeks after public vulnerability disclosure must now account for adversaries who can generate working exploit code directly from code analysis, potentially before a vulnerability is publicly known.
