METR estimates 16-hour time horizon for Anthropic's Claude Mythos Preview

METR evaluated an early version of Anthropic's Claude Mythos Preview in March 2026 as part of a risk assessment, estimating a 50% time horizon of at least 16 hours (95% CI: 8.5–55 hours) on its task suite. The model more than doubled the time horizon of the next-best system on METR's 80% success-rate measure, and the estimate sits at the upper end of what METR can measure with its existing software engineering and agentic tasks.

Original post

We evaluated an early version of Claude Mythos Preview for risk assessment during a limited window in March 2026. We estimated a 50%-time-horizon of at least 16hrs (95% CI 8.5hrs to 55hrs) on our task suite, at the upper end of what we can measure without new tasks.

4:41 PM · May 8, 2026
