UK AI Security Institute reports OpenAI GPT-5.5 completes 32-step cyber-attack simulation

QUOTE POST

Mark Chen@markchen90

This is just one eval, but it's an important one - UK AISI’s cyber range tests long-horizon, agentic capability. 5.5 performs similarly to Mythos.

The risks for frontier models are real. But we do our best to deploy AI people can actually use - through hard work on mitigations.

AI Security Institute@AISecurityInst

OpenAI’s GPT-5.5 is the second model to complete one of our multi-step cyber-attack simulations end-to-end 🧵

3:07 PM · Apr 30, 2026 · 1.4M Views

12:38 AM · May 1, 2026 · 16.9K Views

QUOTE POST

#79 Marc Andreessen 🇺🇸@PMARCA

Co-sign.

David Sacks@DavidSacks

It’s time to demystify Mythos. Mythos is not magic. It’s not a doomsday device. It’s the first of many models that can automate cyber tasks (just like coding). OpenAI’s GPT-5.5-cyber can now do the same. And all the frontier models (including those from China) will be there within approximately 6 months. It’s important to recognize that these models do not create vulnerabilities; they discover them. The bugs are already in the code. Using AI to discover and patch them will actually harden these systems. The leap from pre-AI cyber to post-AI cyber means that there will be a big upgrade cycle. After that, however, the market is likely to reach a new equilibrium between AI-powered cyber-offense and AI-powered cyber-defense. Obviously it’s important that cyber defenders get access before cyber attackers. That process is already underway but needs to happen quickly (see point above about Chinese models). Unlike Mythos, GPT-5.5-cyber appears not to be token constrained so it may be the first cyber model that defenders actually get to use.

5:45 PM · Apr 30, 2026 · 991K Views

6:34 PM · Apr 30, 2026 · 258.4K Views

REPLY

#127 Boaz Barak@BOAZBARAKTCS

@TheRealAdamG I love you but let's not fall into this trap - we should brag about how amazing GPT 5.5 is in codex for normal users, and not about its hacking capabilities.

Adam.GPT@TheRealAdamG

https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities "In April, our evaluation of Anthropic's Claude Mythos Preview found that it represented a step up in cyber performance over previous frontier models and was the first to complete our corporate network attack simulation end-to-end, a multi-step exercise we estimate would take a human around 20 hours. A key question was whether this reflected a breakthrough specific to one model, or part of a broader trend. Results from an early checkpoint of GPT-5.5 suggest the latter: a second model, from a different developer, now reaches a similar level of performance on our cyber evaluations."

5:29 PM · Apr 30, 2026 · 6.4K Views

10:06 PM · Apr 30, 2026 · 1.2K Views

QUOTE POST

#127 Boaz Barak@BOAZBARAKTCS

TBH I find it very weird to "compete" on dual use or risky capabilities. We need to measure these to choose appropriate safeguards, but shouldn't optimize or market them.

GPT 5.5 is a great model not because it can find vulnerabilities but because it can deliver value to users.

Lisan al Gaib@scaling01

GPT-5.5 is on par with Claude Mythos
- GPT-5.5 average pass rate of 71.4% (±8.0%)
- Mythos Preview 68.6% (±8.7%)
- GPT-5.5 solved a task that takes a human expert ~12 hours in under 11 minutes at a cost of $1.73

3:17 PM · Apr 30, 2026 · 413.6K Views

10:00 PM · Apr 30, 2026 · 5.7K Views

REPLY

#285 Aidan McLaughlin@AIDAN_MCLAU

@boazbaraktcs @TheRealAdamG agreed

Boaz Barak@boazbaraktcs

@TheRealAdamG I love you but let's not fall into this trap - we should brag about how amazing GPT 5.5 is in codex for normal users, and not about its hacking capabilities.

10:06 PM · Apr 30, 2026 · 1.2K Views

3:28 AM · May 1, 2026 · 241 Views

QUOTE POST

#337 Boris Power@BORISMPOWER

Great explanation of where we are with cyber capabilities right now and what that precisely means

David Sacks@DavidSacks

It’s time to demystify Mythos. Mythos is not magic. It’s not a doomsday device. It’s the first of many models that can automate cyber tasks (just like coding). OpenAI’s GPT-5.5-cyber can now do the same. And all the frontier models (including those from China) will be there within approximately 6 months. It’s important to recognize that these models do not create vulnerabilities; they discover them. The bugs are already in the code. Using AI to discover and patch them will actually harden these systems. The leap from pre-AI cyber to post-AI cyber means that there will be a big upgrade cycle. After that, however, the market is likely to reach a new equilibrium between AI-powered cyber-offense and AI-powered cyber-defense. Obviously it’s important that cyber defenders get access before cyber attackers. That process is already underway but needs to happen quickly (see point above about Chinese models). Unlike Mythos, GPT-5.5-cyber appears not to be token constrained so it may be the first cyber model that defenders actually get to use.

5:45 PM · Apr 30, 2026 · 991K Views

10:53 PM · Apr 30, 2026 · 5.6K Views

QUOTE POST

#400 Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@TEORTAXESTEX

Did Not Buy Anthropic Psyops Again Award granted to: me. Sama did mog Mythos on all dimensions (because 5.5 can be used at all and apparently isn't weaker)

Lisan al Gaib@scaling01

GPT-5.5 is on par with Claude Mythos
- GPT-5.5 average pass rate of 71.4% (±8.0%)
- Mythos Preview 68.6% (±8.7%)
- GPT-5.5 solved a task that takes a human expert ~12 hours in under 11 minutes at a cost of $1.73

3:17 PM · Apr 30, 2026 · 413.6K Views

7:47 PM · Apr 30, 2026 · 5K Views

REPLY

#464 kache@YACINEMTB

@DavidSacks Ya

David Sacks@DavidSacks

It’s time to demystify Mythos. Mythos is not magic. It’s not a doomsday device. It’s the first of many models that can automate cyber tasks (just like coding). OpenAI’s GPT-5.5-cyber can now do the same. And all the frontier models (including those from China) will be there within approximately 6 months. It’s important to recognize that these models do not create vulnerabilities; they discover them. The bugs are already in the code. Using AI to discover and patch them will actually harden these systems. The leap from pre-AI cyber to post-AI cyber means that there will be a big upgrade cycle. After that, however, the market is likely to reach a new equilibrium between AI-powered cyber-offense and AI-powered cyber-defense. Obviously it’s important that cyber defenders get access before cyber attackers. That process is already underway but needs to happen quickly (see point above about Chinese models). Unlike Mythos, GPT-5.5-cyber appears not to be token constrained so it may be the first cyber model that defenders actually get to use.

5:45 PM · Apr 30, 2026 · 991K Views

9:28 PM · Apr 30, 2026 · 2.1K Views

REPLY

#464 kache@YACINEMTB

@markchen90 ship cyber plz

Mark Chen@markchen90

This is just one eval, but it's an important one - UK AISI’s cyber range tests long-horizon, agentic capability. 5.5 performs similarly to Mythos. The risks for frontier models are real. But we do our best to deploy AI people can actually use - through hard work on mitigations.

12:38 AM · May 1, 2026 · 16.9K Views

1:02 AM · May 1, 2026 · 738 Views

QUOTE POST

#464 kache@YACINEMTB

what noam is saying here, by the way, is that we've entered RSI. You can scale inference compute to discover new knowledge, which you can then use to create new data to train on. It only doesn't feel like a foom to you because you're a human, whose lifetime is a blink

Noam Brown@polynoamial

After 100 million tokens, performance was still going up. What we're seeing here is not the capability ceiling. From the report: "Performance on TLO continues to scale with the amount of inference compute spent, and we have not yet observed a plateau with the best models."

4:07 PM · Apr 30, 2026 · 167.7K Views

9:34 PM · Apr 30, 2026 · 93K Views

QUOTE POST

#711 Beff (e/acc)@BEFFJEZOS

Progressive Adversarial overload is how we harden all complex systems, from cybersecurity to biosecurity and engineer anti-fragility.

This progressive release is the way, lets the system adiabatically adapt as the overall level of intelligence of adversaries goes up.

David Sacks@DavidSacks

It’s time to demystify Mythos. Mythos is not magic. It’s not a doomsday device. It’s the first of many models that can automate cyber tasks (just like coding). OpenAI’s GPT-5.5-cyber can now do the same. And all the frontier models (including those from China) will be there within approximately 6 months. It’s important to recognize that these models do not create vulnerabilities; they discover them. The bugs are already in the code. Using AI to discover and patch them will actually harden these systems. The leap from pre-AI cyber to post-AI cyber means that there will be a big upgrade cycle. After that, however, the market is likely to reach a new equilibrium between AI-powered cyber-offense and AI-powered cyber-defense. Obviously it’s important that cyber defenders get access before cyber attackers. That process is already underway but needs to happen quickly (see point above about Chinese models). Unlike Mythos, GPT-5.5-cyber appears not to be token constrained so it may be the first cyber model that defenders actually get to use.

5:45 PM · Apr 30, 2026 · 991K Views

8:00 PM · Apr 30, 2026 · 6.2K Views

QUOTE POST

#861 Zvi Mowshowitz@THEZVI

Okay, since people seem to be not understanding the distinction here, I'll spell it out. They are not the same.

Mythos can, on its own, discover lots of new vulnerabilities, because it is capable of navigating and exploring on its own and stringing these things together. It doesn't need to be told exactly what to do, it can figure out what to do.

GPT-5.5 is at least as good as Mythos on 'narrow cyber tasks' as per UK AISI, but they have to be narrow. You need to know what it is you want done. That's valuable, but it's not at all the same thing, and far less dangerous.

If OpenAI could have compiled and fixed a similar stream of bugs in the world's most important software, at similar compute cost, I presume that they would have.

Indeed, GPT-5.5-Cyber exists, and yet the White House is objecting to Anthropic expanding deployment of Mythos. You think they're doing this for no reason?

Meanwhile, the whole 'everyone will have it in six months' is the usual pretending that the situation is much closer than it is, although of course on a long enough time horizon the point stands.

David Sacks@DavidSacks

It’s time to demystify Mythos. Mythos is not magic. It’s not a doomsday device. It’s the first of many models that can automate cyber tasks (just like coding). OpenAI’s GPT-5.5-cyber can now do the same. And all the frontier models (including those from China) will be there within approximately 6 months. It’s important to recognize that these models do not create vulnerabilities; they discover them. The bugs are already in the code. Using AI to discover and patch them will actually harden these systems. The leap from pre-AI cyber to post-AI cyber means that there will be a big upgrade cycle. After that, however, the market is likely to reach a new equilibrium between AI-powered cyber-offense and AI-powered cyber-defense. Obviously it’s important that cyber defenders get access before cyber attackers. That process is already underway but needs to happen quickly (see point above about Chinese models). Unlike Mythos, GPT-5.5-cyber appears not to be token constrained so it may be the first cyber model that defenders actually get to use.

5:45 PM · Apr 30, 2026 · 991K Views

8:43 PM · Apr 30, 2026 · 49K Views

REPLY

#965 Adam.GPT@THEREALADAMG

@boazbaraktcs Fair feedback @boazbaraktcs!

Boaz Barak@boazbaraktcs

@TheRealAdamG I love you but let's not fall into this trap - we should brag about how amazing GPT 5.5 is in codex for normal users, and not about its hacking capabilities.

10:06 PM · Apr 30, 2026 · 1.2K Views

10:11 PM · Apr 30, 2026 · 262 Views

QUOTE POST

#1494 Chubby♨️@KIMMONISMUS

GPT-5.5 on par with Claude Mythos on multi-step cyber-attack simulations?

OpenAI: comeback of the year.

AI Security Institute@AISecurityInst

OpenAI’s GPT-5.5 is the second model to complete one of our multi-step cyber-attack simulations end-to-end 🧵

3:07 PM · Apr 30, 2026 · 1.4M Views

6:42 PM · Apr 30, 2026 · 17.2K Views

ORIGINAL POST

#1760 Steven Sinofsky@STEVESI

Our evaluation of OpenAI's GPT-5.5 cyber capabilities https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities GPT-5.5 is one of the strongest models we have tested on our cyber tasks and is the second model to solve one of our multi-step cyber-attack simulations end-to-end. // of course it is and those that fell for Mythos being some sort of unfathomable cyber-weapon fell for PR hook, line, and sinker.

9:00 PM · Apr 30, 2026 · 19.2K Views