2025 – 2026

everything that went wrong

An autonomous bot wiped every release of GitHub's most popular security scanner. A worm hit 1,800 repos across three package registries in 48 hours. A CFO wired $35 million to a cloned voice. These are the 30+ incidents from the past four months that matter most.

30+ Incidents Catalogued
5.9 Peak Magnitude
9 Critical CVEs

It's getting worse

Each dot is an incident. The y-axis is magnitude: a log10 scale of estimated economic impact in USD, where each +1.0 means 10× more economic damage. See the methodology and reference points below.

Legend: Critical, High, Medium, Trend.

The leaderboard

Top 10 by magnitude.

The chain reaction

One misconfigured CI/CD pipeline kicked off a seven-month cascade across seven projects.

By attack type

30 incidents sorted by attack type.

The feed

All incidents sourced from CVEs, security blogs, and public disclosures.

What to do about it

Pin everything

MCP servers, VS Code extensions, npm packages. Pin versions to SHAs, not tags. The hackerbot-claw campaign force-pushed 75 of 76 Trivy version tags. Tags lie.
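
One way to check pinning in practice is to scan GitHub Actions workflow files for `uses:` references that are not a full 40-character commit SHA. The sketch below is a minimal illustration, not a complete linter: the function name `find_unpinned_actions` and the sample action `example-org/example-action` are hypothetical, and the regex assumes unquoted `uses:` values.

```python
import re

# Matches "uses: owner/repo@ref" (optionally prefixed by "- ") in workflow text.
# Assumption: the value is unquoted, which is the common style.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def find_unpinned_actions(workflow_text: str) -> list[tuple[str, str]]:
    """Return (action, ref) pairs whose ref is not a full 40-character commit SHA."""
    unpinned = []
    for action, ref in USES_RE.findall(workflow_text):
        if not FULL_SHA_RE.match(ref):
            unpinned.append((action, ref))
    return unpinned


# Hypothetical workflow: one tag-pinned action, one SHA-pinned action.
SAMPLE = """
steps:
  - uses: actions/checkout@v4
  - uses: example-org/example-action@0123456789abcdef0123456789abcdef01234567
"""

print(find_unpinned_actions(SAMPLE))
```

Anything the scan reports is a reference that a force-pushed tag could silently redirect.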

Sandbox your agents

Your coding assistant does not need prod credentials. The OpenClaw agent deleted a live inbox because someone gave it write access to a mailbox for a "review" task. Apply least privilege to every agent connection.
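
Least privilege can be sketched as a grant that carries only the scopes a task needs, checked before every call. Everything named here (`TASK_SCOPES`, `AgentGrant`, the `mail:*` scope strings) is a hypothetical illustration, not the API of any real agent framework.

```python
from dataclasses import dataclass

# Hypothetical mapping from task type to the minimum scopes it needs.
TASK_SCOPES = {
    "review": frozenset({"mail:read"}),
    "triage": frozenset({"mail:read", "mail:label"}),
}


@dataclass(frozen=True)
class AgentGrant:
    task: str
    scopes: frozenset


def grant_for(task: str) -> AgentGrant:
    """Issue only the scopes the task needs; unknown tasks get no scopes."""
    return AgentGrant(task, TASK_SCOPES.get(task, frozenset()))


def authorize(grant: AgentGrant, required_scope: str) -> None:
    """Raise PermissionError unless the grant includes the required scope."""
    if required_scope not in grant.scopes:
        raise PermissionError(f"task {grant.task!r} lacks scope {required_scope!r}")
```

Under this sketch, a "review" grant passes `authorize(grant, "mail:read")` and fails on `"mail:delete"`, so a review task cannot empty an inbox.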

Log what agents do

Log what the agent did, not what you asked it to do. The Meta agent gave bad engineering advice and the resulting config change sat in production for two hours before monitoring picked up the anomalous access.
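
A minimal sketch of action-level logging: wrap each tool the agent can call so the log records the call that actually ran, with its arguments and outcome. The decorator name `audited` and the logger name `agent.audit` are assumptions made for illustration.

```python
import functools
import json
import logging
import time

audit_log = logging.getLogger("agent.audit")


def audited(tool):
    """Wrap a tool function so every real invocation is written to the audit log."""
    @functools.wraps(tool)
    def wrapper(*args, **kwargs):
        record = {"tool": tool.__name__, "args": args, "kwargs": kwargs, "ts": time.time()}
        try:
            result = tool(*args, **kwargs)
            record["outcome"] = "ok"
            return result
        except Exception as exc:
            record["outcome"] = f"error: {exc!r}"
            raise
        finally:
            # default=repr keeps logging from failing on non-JSON arguments.
            audit_log.info(json.dumps(record, default=repr))
    return wrapper
```

Because the record is written in `finally`, failed calls are logged too, which is usually when the log matters most.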

Rotate credentials constantly

Every supply chain attack here harvested API keys and tokens. The LiteLLM malware fired on every Python script, whether or not it imported LiteLLM. Short-lived tokens limit the blast radius.
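
To illustrate why short-lived tokens limit the blast radius, here is a minimal sketch of an HMAC-signed token with a built-in expiry. It is an illustration only: `issue_token` and `verify_token` are hypothetical helpers, and a real deployment would more likely rely on the short-lived credentials its cloud or CI provider already issues.

```python
import hashlib
import hmac
import time
from typing import Optional


def issue_token(secret: bytes, subject: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Return 'subject.expiry.signature'; the token is useless after expiry."""
    expires = int((time.time() if now is None else now) + ttl_seconds)
    payload = f"{subject}.{expires}"
    signature = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"


def verify_token(secret: bytes, token: str, now: Optional[float] = None) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    try:
        subject, expires, signature = token.rsplit(".", 2)
        expires_at = int(expires)
    except ValueError:
        return False
    expected = hmac.new(secret, f"{subject}.{expires}".encode(), hashlib.sha256).hexdigest()
    current = time.time() if now is None else now
    return hmac.compare_digest(signature.encode(), expected.encode()) and current < expires_at
```

A token harvested from a build log is only worth something until `expires`; with a five-minute TTL, that window is five minutes.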

Require human approval for destructive actions

Delete, send, publish, pay. These verbs should require a human confirmation step. The Claude Opus incident (9 seconds from prompt to DROP TABLE) happened because there was no approval gate between intent and execution.
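
An approval gate can be as small as one function that sits between intent and execution. The sketch below assumes a hypothetical `execute` helper and an `approve` callback supplied by the caller (a CLI prompt, a chat button, a ticket); the verb list comes straight from the paragraph above and would need extending for things like SQL `DROP`.

```python
DESTRUCTIVE_VERBS = {"delete", "send", "publish", "pay"}


class ApprovalRequired(Exception):
    """Raised when a destructive action is attempted without human approval."""


def execute(action: str, target: str, run, approve=None):
    """Call run() only if the action is non-destructive or a human approved it.

    approve is a callback taking (action, target) and returning True or False.
    """
    if action in DESTRUCTIVE_VERBS:
        if approve is None or not approve(action, target):
            raise ApprovalRequired(f"{action} on {target} needs human approval")
    return run()
```

For an interactive session, `approve` could be something like `lambda action, target: input(f"{action} {target}? [yes/no] ") == "yes"`.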

Read the OWASP checklists

The OWASP Top 10 for LLM Applications and the Top 10 for Agentic Applications cover the vulnerability classes behind most incidents on this page. Start there.

Harden your CI/CD

The hackerbot-claw → Trivy → LiteLLM → Mini Shai-Hulud chain started with one pull_request_target workflow misconfiguration. Set permissions: read-all. Don't echo untrusted input into shell commands.
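
The three checks named above can be roughed out as a text scan over a workflow file. This is a sketch under stated assumptions, not a real linter: `audit_workflow` is a hypothetical helper, the list of attacker-controlled expressions is deliberately incomplete, and `run:` blocks are tracked by indentation rather than by parsing YAML.

```python
import re

# Expressions an attacker can control from a fork PR (assumed, non-exhaustive).
UNTRUSTED_EXPR = re.compile(r"\$\{\{\s*github\.(?:event\.[\w.\-]+|head_ref)\s*\}\}")
RUN_KEY = re.compile(r"(-\s+)?run\s*:")


def audit_workflow(text: str) -> list[str]:
    """Flag pull_request_target, a missing permissions block, and shell injection."""
    findings = []
    if "pull_request_target" in text:
        findings.append("trigger: pull_request_target")
    if not re.search(r"^permissions\s*:", text, re.MULTILINE):
        findings.append("permissions: no top-level block")
    run_key_indent = None  # column of the current run: key, if inside a run block
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip()
        if not stripped:
            continue
        indent = len(line) - len(stripped)
        if run_key_indent is not None and indent <= run_key_indent:
            run_key_indent = None  # a sibling or parent key ends the run block
        match = RUN_KEY.match(stripped)
        if match:
            run_key_indent = indent + len(match.group(1) or "")
        if run_key_indent is not None and UNTRUSTED_EXPR.search(line):
            findings.append(f"line {number}: untrusted expression in run step")
    return findings


# Hypothetical workflow: the first step interpolates the PR title into the shell;
# the second passes it through an environment variable instead.
SAMPLE = """\
on: pull_request_target
jobs:
  greet:
    runs-on: ubuntu-latest
    steps:
      - run: echo "${{ github.event.pull_request.title }}"
      - env:
          TITLE: ${{ github.event.pull_request.title }}
        run: echo "$TITLE"
"""

for finding in audit_workflow(SAMPLE):
    print(finding)
```

The second step in the sample is the safer pattern: the untrusted value reaches the shell as data in `$TITLE`, not as part of the command text.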

Red-team your AI integrations

Hidden markdown, invisible Unicode, poisoned tool descriptions. All published, all with working PoCs. Test your integrations with adversarial inputs before someone else does.
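
A first adversarial-input check for the invisible-Unicode case is to scan anything an agent will read for format-category code points, which covers zero-width characters and the Unicode tag block. The helper name `find_hidden_characters` is hypothetical, and the check is a sketch: it will also flag legitimate uses such as zero-width joiners in emoji and bidirectional marks.

```python
import unicodedata


def find_hidden_characters(text: str) -> list[tuple[int, str]]:
    """Return (index, character name) for every format-category (Cf) code point."""
    hits = []
    for index, char in enumerate(text):
        if unicodedata.category(char) == "Cf":
            # Fall back to the code point when the character has no name.
            name = unicodedata.name(char, f"U+{ord(char):04X}")
            hits.append((index, name))
    return hits


print(find_hidden_characters("ok\u200bgo"))
```

Running it over tool descriptions, README files, and issue text before they reach the model gives a cheap signal that something unprintable is riding along.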


Methodology: the magnitude scale

Magnitude = log10(estimated economic impact in USD) − 2. Each +1.0 on the scale means 10× more economic damage. Estimates combine direct financial losses, remediation costs, business disruption, and downstream cascading effects.
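
The formula is easy to check by hand or in a few lines of code. The sketch below implements it directly; the function names are illustrative and not part of the page's tooling.

```python
import math


def magnitude(impact_usd: float) -> float:
    """Magnitude = log10(estimated economic impact in USD) - 2."""
    return math.log10(impact_usd) - 2


def impact_from_magnitude(m: float) -> float:
    """Invert the scale: the USD impact implied by a given magnitude."""
    return 10 ** (m + 2)


def damage_ratio(m_a: float, m_b: float) -> float:
    """How many times larger incident A's impact is than incident B's."""
    return 10 ** (m_a - m_b)


print(round(magnitude(10e9), 1))              # $10B
print(round(magnitude(700e6), 1))             # $700M
print(round(impact_from_magnitude(5.9) / 1e6))  # magnitude 5.9, in $M
print(damage_ratio(7.0, 6.0))                 # one full step on the scale
```

A $10B incident lands at 8.0 and a $700M incident at about 6.8, while magnitude 5.9 corresponds to roughly $79M.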

Reference points: cyber

Incident                  Year  Est. Impact  Magnitude
NotPetya                  2017  ~$10B        8.0
Log4Shell (remediation)   2021  ~$10B        8.0
CrowdStrike outage        2024  ~$5.4B       7.7
WannaCry                  2017  ~$4–8B       7.7
SolarWinds                2020  ~$1–5B       7.2
Colonial Pipeline         2021  ~$1–2B       7.1
Equifax                   2017  $700M        6.8
Heartbleed (remediation)  2014  ~$500M       6.7
Target breach             2013  $162M        6.2

Reference points: crypto

Incident      Year  Est. Impact  Magnitude
FTX collapse  2022  ~$8.7B       7.9
Bybit hack    2025  $1.46B       7.2
Ronin Bridge  2022  $625M        6.8
Poly Network  2021  $611M        6.8
Mt. Gox       2014  $450M        6.7
Wormhole      2022  $326M        6.5

All AI incidents on this page fall between magnitude 2.3 and 5.9. For context, the largest (Mini Shai-Hulud at 5.9, est. $50M–$150M) is roughly 65× smaller than CrowdStrike (7.7) and roughly 125× smaller than NotPetya (8.0), since each 1.0 of magnitude is a factor of 10. The AI security threat landscape is early-stage but accelerating.