AISI revises AI cyber task doubling to 4.7 months

The AI Security Institute (AISI) published evaluations showing that frontier AI models are advancing rapidly in cyber capabilities. In February 2026, AISI estimated that the length of cyber tasks models can complete doubles every 4.7 months, down from the 8-month doubling time it estimated in November 2025. Newer checkpoints, including an updated Mythos Preview checkpoint and GPT-5.5, saturate the existing task suite, which makes their estimated time horizons highly uncertain.
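To put the two doubling times in perspective, here is a rough sketch of the arithmetic. It assumes a constant doubling time, which is a simplification, since AISI itself says the rate has been speeding up. The function names and the 1-hour and 8-hour task lengths are illustrative and do not come from the AISI post.

```python
import math


def annual_growth_factor(doubling_months: float) -> float:
    """Factor by which completable task length grows in 12 months,
    assuming a fixed doubling time (a simplification)."""
    return 2 ** (12 / doubling_months)


def months_to_reach(start_hours: float, target_hours: float,
                    doubling_months: float) -> float:
    """Months of steady doubling needed to go from start_hours to target_hours."""
    return doubling_months * math.log2(target_hours / start_hours)


# Doubling times quoted in the summary above.
estimates = {"November 2025": 8.0, "February 2026": 4.7}

for label, doubling in estimates.items():
    factor = annual_growth_factor(doubling)
    # Hypothetical example: growing from a 1-hour task to an 8-hour task
    # takes exactly three doublings.
    months = months_to_reach(1.0, 8.0, doubling)
    print(f"{label}: x{factor:.1f} per year, {months:.1f} months for 1h -> 8h")
```

Under these assumptions, an 8-month doubling time means task length a little less than triples in a year, while a 4.7-month doubling time means it grows almost sixfold.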

Original post

Our evaluations show that frontier AI's cyber capabilities are advancing quickly. The length of cyber tasks frontier models can complete has been doubling every few months, and this rate has become faster over time, with recent models exceeding our previous trends. 🧵

8:49 AM · May 13, 2026

Everyone has seen the @waitbutwhy cartoon of AI capability growth with a "you are here" indicator just before the exponential really starts, but the independent assessments of both METR and the UK's AISI do seem to show that we are past that point now (until we hit a slowdown?)

4:29 AM · May 14, 2026 · 31.3K Views

@waitbutwhy Anyhow, not sure what to do with that knowledge because there is no easy suggestion to help people adapt to super-exponential ability gain, so hold on to your butts.

Ethan Mollick@emollick
4:32 AM · May 14, 2026 · 8.8K Views

Notably, our new post tests a newer checkpoint of Mythos than AISI was initially provided. The new one has significantly different performance than our initial run. I've seen this mismatch pollute the discourse (prior to us getting the new checkpoint), so good to update.

AI Security Institute@AISecurityInst

Our evaluations show that frontier AI's cyber capabilities are advancing quickly. The length of cyber tasks frontier models can complete has been doubling every few months, and this rate has become faster over time, with recent models exceeding our previous trends. 🧵

3:49 PM · May 13, 2026 · 115.5K Views
3:55 PM · May 13, 2026 · 47 Views

'The length of cyber tasks frontier models can complete has been doubling every few months, and this rate has become faster over time, with recent models exceeding our previous trends.'

Andrew Curran@AndrewCurran_
3:54 PM · May 13, 2026 · 3.4K Views
3:54 PM · May 13, 2026 · 1.6K Views

UK AI SECURITY INSTITUTE:
- November 2025 estimate that AI cyber abilities double in ability every 8 months
- In February 2026, re-estimated doubling to be every 4.7 months
- Claude Mythos and GPT-5.5 are more capable than a 4.7 month doubling rate would suggest

AI Security Institute@AISecurityInst

Our cyber range results illustrate this step-up. Since our first Mythos evaluation, we received access to a newer Mythos Preview checkpoint. On a 32-step corporate network attack we estimate takes a human expert ~20 hours, this checkpoint completes the full attack in 6/10 attempts.

3:49 PM · May 13, 2026 · 389.4K Views
12:34 AM · May 14, 2026 · 5.6K Views

> Caveats: Mythos Preview (new) and GPT-5.5 saturate the task suite, resulting in highly uncertain time horizons

4:04 PM · May 13, 2026 · 3.5K Views