UK researchers say advanced AI models are mastering complex cyber tasks at alarming speed
Artificial intelligence systems are rapidly learning how to perform complex cybersecurity tasks once handled only by highly trained human experts, according to alarming new findings from the UK’s AI Security Institute.
Researchers say the pace of improvement is accelerating so quickly that advanced AI models are now completing sophisticated cyber operations in dramatically shorter timeframes than previously expected.
The institute, commonly known as AISI, has been tracking how effectively frontier AI models can carry out cybersecurity work using what it calls a “time window benchmark”, which rates each task by how long a skilled human cybersecurity professional would need to complete it successfully.
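AISI has not published the exact statistics behind the benchmark, but METR-style time-horizon estimates typically work by fitting a logistic curve to success rates against the logarithm of human task length, then inverting the curve at a chosen success threshold. The sketch below illustrates that idea with entirely made-up numbers; the data, function names and figures are assumptions for illustration, not AISI's implementation.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical results: human-expert minutes per task,
# and the model's observed success rate on each task length
human_minutes = np.array([1, 2, 4, 8, 15, 30, 60, 120], dtype=float)
success_rate = np.array([1.0, 1.0, 0.9, 0.9, 0.8, 0.5, 0.2, 0.1])

def logistic(log_t, a, b):
    # Success probability falls off as tasks take human experts longer
    return 1.0 / (1.0 + np.exp(a * (log_t - b)))

(a, b), _ = curve_fit(logistic, np.log(human_minutes), success_rate,
                      p0=(1.0, np.log(20)))

# Invert the fitted curve at an 80 per cent success threshold:
# the task length the model completes roughly 8 times out of 10
target = 0.8
log_t80 = b + np.log((1 - target) / target) / a
print(f"Estimated 80% time horizon: ~{np.exp(log_t80):.0f} human-expert minutes")
```

With these invented inputs the fit lands near the 16-minute figure AISI reported, but the point is the method, not the numbers.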
The latest results have startled researchers.
According to AISI, some advanced large language models can now autonomously complete tasks equivalent to around 16 minutes of expert-level human cybersecurity work with an 80 per cent success rate. The institute tested the systems under a restricted computing budget of 2.5 million tokens, meaning performance could improve further without that constraint.
Researchers say the most concerning trend is not just the capability itself, but the speed at which these systems are improving.
Back in November 2025, AISI estimated that the time horizon for AI cybersecurity capabilities was doubling roughly every eight months. By February 2026, researchers had already revised that estimate downward to 4.7 months after observing rapid progress across newer reasoning models.
Now, after the release of Anthropic’s Mythos Preview and OpenAI GPT-5.5, the institute says the pace has accelerated even further.
AISI stopped short of naming an exact new doubling rate but indicated the improvement curve is now approaching a four-month cycle.
The organisation compared its findings with separate research from METR, a non-profit AI research group focused on measuring software engineering performance. According to METR’s estimates, AI capability on software-related tasks has been doubling approximately every 4.2 months since late 2024.
The newest Mythos Preview model reportedly pushed that pace even closer to four months.
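For a sense of what a roughly four-month doubling time implies, the compounding is ordinary exponential growth. The sketch below uses the figures reported above (a 16-minute horizon, a 4.7-month doubling period) purely as illustrative inputs; the simple horizon(t) = h0 × 2^(t/T) model is an assumption for illustration, not AISI's published methodology.

```python
def horizon(h0_minutes: float, months: float, doubling_months: float) -> float:
    """Project a task-time horizon forward under a constant doubling period."""
    return h0_minutes * 2 ** (months / doubling_months)

# Illustrative only: a 16-minute horizon doubling every 4.7 months
for months in (0, 4.7, 9.4, 14.1, 18.8):
    print(f"after {months:>4} months: ~{horizon(16, months, 4.7):.0f} expert-minutes")
```

On those assumptions a 16-minute horizon reaches roughly four hours of expert work in about a year and a half, which is why small changes in the doubling estimate attract so much attention.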
Researchers stressed that the benchmark measures only narrow cybersecurity and software-related abilities. They warned the results should not be interpreted as evidence that AI systems are becoming universally twice as intelligent every few months.
Even so, the detailed tests revealed striking progress.
One simulated exercise known as “The Last Ones” involves a 32-step attack on a fictional corporate network. The latest Mythos Preview model successfully completed the challenge in six out of ten attempts.
Researchers also reported that the AI completed a previously unsolved industrial control system attack simulation called “Cooling Tower” in three out of ten attempts. That scenario involved seven stages of cyber exploitation targeting infrastructure systems.
The findings represent a significant jump compared with earlier frontier models.
When Anthropic’s Opus 4.6 model underwent evaluation in February 2026, it failed to complete the full 32-step attack sequence. The furthest it managed to reach was step 22, where the AI had to reverse-engineer a Windows service binary, recover encrypted credentials and escalate privileges to gain deeper system access.
Those types of operations would normally require highly specialised cybersecurity expertise.
AISI warned that the speed of development raises serious questions about future cyber threats and defensive readiness.
“Frontier AI’s autonomous cyber and software capability is advancing quickly,” the institute concluded. “The length of cyber tasks that frontier models can complete autonomously has doubled on the order of months, not years.”
At the same time, researchers cautioned against overstating the implications. The institute emphasised that success inside controlled testing environments does not automatically translate into real-world dominance against defended systems operated by professional security teams.
One real-world example highlighted the limitations still facing current models.
The widely used curl software project recently allowed Mythos to inspect its codebase for vulnerabilities. Despite the AI’s growing sophistication, the model reportedly identified only one confirmed security flaw.
Even so, cybersecurity experts increasingly fear the rapid evolution of AI-assisted hacking tools could eventually lower the barrier for complex cyberattacks, giving less experienced threat actors access to capabilities previously reserved for elite professionals.
For now, researchers say the technology remains imperfect. But the speed at which these systems are improving is becoming impossible for the cybersecurity world to ignore.