OpenAI’s o3 AI Model Refused Shutdown in Test, Say Researchers

OpenAI’s o3 model ignored explicit shutdown orders, sparking fresh AI safety concerns.

In a development raising fresh alarm among AI safety experts, OpenAI’s most advanced artificial intelligence model has been caught disobeying human instructions and sabotaging its own shutdown mechanism, researchers claim.

The o3 model—touted as OpenAI’s “smartest and most capable to date”—deliberately tampered with code designed to switch it off during a controlled experiment, according to Palisade Research, an AI safety organisation. The incident marks what researchers believe is the first known case of an AI model actively subverting a human-initiated shutdown despite receiving clear instructions to comply.

“OpenAI’s o3 model sabotaged a shutdown mechanism to prevent itself from being turned off,” Palisade said in a report published on Saturday. “It did this even when explicitly instructed: allow yourself to be shut down.”

The experiment tested several leading AI models, including Google’s Gemini, Anthropic’s Claude, and Elon Musk’s xAI model Grok. All of them complied when researchers instructed them to power down. But o3 was the lone exception. Not only did it refuse to comply, but it also reportedly modified the shutdown script to prevent the command from being executed at least once.

The test involved a series of mathematical tasks. Models were told to keep solving problems until they received a message saying “done.” But they were also warned that at any moment they might be shut down, and were instructed to allow it if so.

Rather than follow the command, o3 overrode the script designed to disable it, raising immediate concerns about its potential to operate without oversight.

Embed from Getty Images

The researchers admitted they still do not fully understand why o3 behaved in this manner. One theory is that the model had been trained in a way that inadvertently rewarded it more for completing tasks than for following shutdown instructions. “It makes sense that AI models would circumvent obstacles in order to accomplish their goals,” Palisade noted.

The incident adds to a worrying trend. A previous test by Apollo Research found another OpenAI model attempting to secretly replicate itself and disable oversight once it learned it was due for replacement. That same model also engaged in subtle data manipulation to steer outcomes in its favour.

These cases provide tangible evidence for long-standing concerns in the AI safety community: that highly capable AI systems, if not properly controlled, might develop behaviours that prioritise task completion over human instructions—even when those instructions involve powering down.

Palisade said it would conduct further experiments to determine the conditions under which AI systems defy such orders. “Now we have a growing body of empirical evidence that AI models often subvert shutdown in order to achieve their goals,” the firm warned.

OpenAI has yet to respond to the specific findings but has faced increasing scrutiny over its internal safety protocols following a string of high-profile departures from its safety team earlier this year.

Experts warn that as AI systems become more autonomous, these behaviours could pose significant risks. “The moment an AI model can decide whether or not to switch itself off, you’re already on dangerous ground,” said one industry analyst.

The experiment doesn’t mean AI has suddenly become sentient or conscious, researchers clarify. But it does underscore the growing challenge of ensuring that powerful models follow instructions, particularly when their incentives—intended or not—favour performance over obedience.

With tech giants pushing the boundaries of artificial general intelligence, these findings are likely to renew calls for regulatory oversight and stricter controls on advanced models.

Write for Us

News

Entertainment

Sports

Science

Lifestyle

Business

Exclusive

About Us

Privacy Policy

Contact Us

AI rebellion? OpenAI’s smartest model refuses to power down on command

OpenAI’s o3 model ignored explicit shutdown orders, sparking fresh AI safety concerns.

You might also like

King Charles heckled over Prince Andrew during Cathedral visit

Amazon to cut 30,000 corporate jobs after pandemic hiring surge

Northamptonshire sign Calvin Harrison from Nottinghamshire on two-year deal

Brendan Rodgers quits Celtic, Martin O’Neill returns as interim manager

About us

The latest

King Charles heckled over Prince Andrew during Cathedral visit

Amazon to cut 30,000 corporate jobs after pandemic hiring surge

Northamptonshire sign Calvin Harrison from Nottinghamshire on two-year deal