A University of Toronto research team published peer-reviewed results demonstrating a self-propagating AI worm that exploited 73.8% of a 33-host enterprise test network and replicated to 61.8% of target systems using a free, publicly available open-weight AI model, with no zero-day vulnerabilities required. The worm relied exclusively on known CVEs: the same unpatched vulnerabilities that enterprise vulnerability scanners already catalogue.
Papernot Lab’s FakeCorp Results: 73.8% Exploitation Across 33 Enterprise Hosts
Professor Nicolas Papernot led the research team — which includes Jonas Guan, Tom Blanchard, Hanna Foerster, Hengrui Jia, and Gabriel Huang — in building and testing the worm against a laboratory network called FakeCorp. The 33-host environment included Linux servers, Windows systems, and IoT devices configured with real-world vulnerability profiles. All experiments ran in an isolated laboratory setting with no connection to production infrastructure.
Across 15 independent experiments conducted over seven days, the worm averaged 31.3 vulnerabilities identified per run and achieved up to seven generations of self-replication without human direction.
Free Open-Weight Model Automates Reconnaissance and Payload Generation on Linux, Windows, and IoT
Citing responsible-disclosure concerns, the research team withheld the specific model name and described it only as a free, publicly available open-weight model released in 2025. The worm required hundreds of LLM inference calls per target: it issued reconnaissance queries, generated payload code, and adapted to host-specific configurations before launching exploitation. Every vulnerability it exploited was a known CVE, so no novel attack surface was required.
Because the model is free and publicly available, the capability this worm demonstrates is no longer confined to well-funded nation-state actors. Any attacker capable of running open-weight inference can apply the same approach against a network with an unpatched CVE backlog.
Up to Seven Replication Generations and 31.3 Vulnerabilities Per Run Without Human Input
The self-replication chain reached up to seven generations in the most successful runs, meaning compromised hosts became propagation nodes that the worm used to reach additional targets. The 61.8% replication rate reflects successful execution across those hosts, not just scan discovery. The worm’s propagation graph grew without human intervention, autonomously identifying newly reachable targets after each compromise.
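To make the idea of replication generations concrete, the sketch below assigns each compromised host a generation number equal to its hop count from the first infected host. It is an illustrative model for analysing a propagation log, not the researchers' code: the edge list, the host names, and the convention that the origin is generation 0 are all assumptions.

```python
from collections import deque


def replication_generations(edges, origin):
    """Return {host: generation}, where generation is the hop count from origin.

    edges: iterable of (source_host, newly_compromised_host) pairs.
    A host reached by several paths keeps the lowest generation found.
    """
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)

    generation = {origin: 0}
    queue = deque([origin])
    while queue:
        host = queue.popleft()
        for nxt in children.get(host, []):
            if nxt not in generation:  # breadth-first, so the first visit is the shortest path
                generation[nxt] = generation[host] + 1
                queue.append(nxt)
    return generation


# Hypothetical propagation log; these hosts and edges are not data from the paper.
edges = [("h0", "h1"), ("h0", "h2"), ("h1", "h3"), ("h3", "h4"), ("h2", "h4")]
gens = replication_generations(edges, "h0")
max_generation = max(gens.values())
```

In this toy log the deepest host sits two hops from the origin. A run that reached seven generations would correspond to a chain seven hops deep under the same convention.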
Professor Papernot summarized the core finding: “Attackers can now cheaply operationalize known vulnerabilities at scale.” The word “cheaply” carries the weight here: the research demonstrates that the compute and capability needed to automate mass exploitation of known CVEs are no longer exclusive to nation-state actors.
Why the Toronto AI Worm’s Known-CVE-Only Approach Reshapes Enterprise Patch Urgency
Enterprise vulnerability management has operated on the assumption that a meaningful delay exists between CVE disclosure and mass exploitation: weeks during which patches can be tested and deployed. That window has been narrowing as exploit tooling matures. The Toronto research demonstrates a concrete mechanism that narrows it further: once a CVE is public and a free model can generate working exploit code on demand, the constraint on mass exploitation shifts from attacker capability to attacker motivation.
Defenders with extended patch cycles for kernel updates, middleware, and IoT firmware face the greatest exposure. The FakeCorp network’s IoT and mixed-OS composition was deliberate — it reflects the heterogeneous reality of enterprise environments where patch parity across device types is rarely achieved.
Hundreds of LLM Inference Calls Per Host: The Compute Cost Barrier and Its Rate of Decline
The operational constraint visible in the research is inference volume. The worm made hundreds of LLM calls per target host, which creates a real compute cost that scales with network size. For a 33-host environment, that overhead is manageable; for an enterprise with thousands of hosts, it grows roughly in proportion to the host count. That cost barrier currently limits the practical scale of attacks modeled on this approach.
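As a rough illustration of how that overhead scales, the sketch below multiplies host count by calls per host. The figure of 300 calls per host and the per-call cost are placeholders chosen for illustration. The research reports only "hundreds" of calls per target, and the flat per-call cost is an assumption.

```python
def total_inference_calls(hosts: int, calls_per_host: int) -> int:
    """Total LLM calls if every host needs the same number of calls."""
    return hosts * calls_per_host


def total_cost(hosts: int, calls_per_host: int, cost_per_call: float) -> float:
    """Total compute cost under a flat, assumed per-call cost."""
    return total_inference_calls(hosts, calls_per_host) * cost_per_call


# Assumed values for illustration only; neither comes from the paper.
CALLS_PER_HOST = 300   # stand-in for "hundreds of calls per target"
COST_PER_CALL = 0.002  # hypothetical cost per call, arbitrary currency unit

lab_calls = total_inference_calls(33, CALLS_PER_HOST)
enterprise_calls = total_inference_calls(5000, CALLS_PER_HOST)
lab_cost = total_cost(33, CALLS_PER_HOST, COST_PER_CALL)
enterprise_cost = total_cost(5000, CALLS_PER_HOST, COST_PER_CALL)
```

Under these assumptions the call volume grows linearly with host count, so a hypothetical 5,000-host network needs roughly 150 times the calls of the 33-host lab. Any real estimate would depend on the model, the hardware, and how many calls each host actually requires.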
Inference costs across the AI industry have fallen each year since large models became commercially available. The compute overhead that makes this expensive today will be substantially lower as hardware efficiency improves and open-weight models become more capable per inference call. Defenders who read this research as a future-state concern rather than a present-tense capability shift may be underestimating the pace of that transition.
The paper’s guidance aligns with existing vulnerability management practice but adds new urgency: reducing the backlog of unpatched known CVEs is now a direct defense against autonomous AI-assisted exploitation, not just a general hygiene recommendation.
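One way a defender might act on that guidance is to rank the existing backlog so that the most exposed findings are patched first. The sketch below is a minimal example under stated assumptions: the `Finding` fields, the placeholder CVE identifiers, and the ordering rule (remotely reachable findings first, then longest open) are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    host: str
    cve_id: str               # placeholder identifiers, not real CVEs
    days_open: int            # days since a scanner first reported the finding
    remotely_reachable: bool  # True if the vulnerable service is reachable over the network


def prioritize(findings):
    """Order findings: remotely reachable first, then longest-open first."""
    return sorted(findings, key=lambda f: (not f.remotely_reachable, -f.days_open))


# Hypothetical backlog for illustration only.
backlog = [
    Finding("win-app-02", "CVE-EXAMPLE-2", 30, True),
    Finding("linux-db-03", "CVE-EXAMPLE-3", 400, False),
    Finding("iot-cam-01", "CVE-EXAMPLE-1", 200, True),
]
ranked_hosts = [f.host for f in prioritize(backlog)]
```

Here the two network-reachable findings rise to the top, with the older one first. A production process would likely weigh severity scores and asset criticality as well.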
