Zach Anderson
Apr 10, 2026 23:18
Anthropic releases safety tips as Undertaking Glasswing reveals frontier AI fashions can now discover and exploit vulnerabilities sooner than human defenders.
Anthropic dropped a sobering evaluation this week: inside two years, AI fashions will uncover huge numbers of software program vulnerabilities which have sat unnoticed in code for years—and chain them into working exploits. The corporate’s safety groups launched detailed defensive suggestions alongside Undertaking Glasswing, their initiative to deploy Claude Mythos Preview’s capabilities for cyber protection.
The mathematics right here is not sophisticated. If attackers can use frontier fashions to automate vulnerability discovery and exploit era, the window between a patch dropping and a working exploit showing shrinks dramatically. Anthropic’s safety engineers have watched this occur in their very own testing.
What Their Analysis Really Discovered
In accordance with Anthropic’s technical findings, AI fashions excel at recognizing signatures of recognized vulnerabilities in unpatched methods. Reversing a patch right into a working exploit—precisely the form of mechanical evaluation these fashions deal with properly—used to require specialised expertise. Now it is changing into automated.
The corporate famous that publicly out there fashions beneath Mythos functionality ranges can already discover severe vulnerabilities that conventional code opinions missed for prolonged intervals. Mozilla Firefox vulnerabilities found by AI scanning function one documented instance.
The Defensive Playbook
Anthropic’s suggestions prioritize controls that maintain even towards attackers with limitless persistence and AI help. Friction-based safety measures—further pivot hops, fee limits, non-standard ports—lose effectiveness when adversaries can grind by tedious steps robotically.
Their prime priorities:
Patch velocity issues greater than ever. Web-facing purposes ought to obtain patches inside 24 hours of an exploit changing into out there. The CISA Recognized Exploited Vulnerabilities catalog ought to be handled as an emergency queue. Anthropic recommends utilizing EPSS (Exploit Prediction Scoring System) for prioritizing every little thing else.
Put together for 10x vulnerability report quantity. Over the subsequent two years, consumption and triage processes will face stress they’ve by no means skilled. Organizations nonetheless working weekly spreadsheet conferences will not maintain tempo.
Scan your personal code with frontier fashions earlier than attackers do. This was Anthropic’s single most emphasised suggestion. Legacy code that predates present overview practices—particularly code whose unique authors have moved on—represents the highest-value goal for proactive scanning.
Zero Belief Will get Actual
The steering pushes laborious towards hardware-bound credentials and identity-based service isolation. A compromised construct server should not attain manufacturing databases. A compromised laptop computer should not contact construct infrastructure.
Static API keys, embedded credentials, and shared service-account passwords are described as “among the many first issues an attacker with model-assisted code evaluation will discover.”
For Smaller Operations
Organizations with out devoted safety groups bought particular recommendation: allow computerized updates all over the place, desire managed companies over self-hosting, use passkeys or {hardware} safety keys, and activate free safety tooling from code hosts like GitHub’s Dependabot and CodeQL.
Open-source maintainers ought to anticipate elevated vulnerability report quantity—some beneficial, some automated noise. Publishing a SECURITY.md with clear consumption processes helps separate sign from spam.
Anthropic dedicated to updating this steering as Undertaking Glasswing progresses. For enterprises monitoring SOC 2 and ISO 27001 compliance, most suggestions map on to current controls. The distinction now could be urgency.
Picture supply: Shutterstock



















