How Close Are Today’s AI Models to AGI—And to Self-Improving into Superintelligence?

Are We Seeing the First Steps Toward AI Superintelligence?

Today’s leading AI models can already write and refine their own software. The question is whether that self-improvement can ever snowball into true superintelligence.

[Image: Digital human face composed of glowing particles connects to a futuristic microchip emitting bright data streams. Credit: KTSDESIGN/Science Photo Library]

The Matrix, The Terminator—so much of our science fiction is built around the dangers of superintelligent artificial intelligence: a system that exceeds the best humans across nearly all cognitive domains. OpenAI CEO Sam Altman and Meta CEO Mark Zuckerberg have predicted we’ll achieve such AI in the coming years. Yet machines like those depicted as battling humanity in those movies would have to be far more advanced than ChatGPT, not to mention more capable of making Excel spreadsheets than Microsoft Copilot. So how can anyone think we’re remotely close to artificial superintelligence?

One answer goes back to 1965, when statistician Irving John Good introduced the idea of an “ultraintelligent machine.” He wrote that once it became sufficiently sophisticated, a computer would rapidly improve itself. If this seems far-fetched, consider how AlphaGo Zero—an AI system developed at DeepMind in 2017 to play the ancient Chinese board game Go—was built. Using no data from human games, AlphaGo Zero played itself millions of times, achieving in days an improvement that would have taken a human a lifetime and that allowed it to defeat the previous versions of AlphaGo that had already beaten the world’s best human players. Good’s idea was that any system that was sufficiently intelligent to rewrite itself would create iterations of itself, each one smarter than the previous and even more capable of improvement, triggering an “intelligence explosion.”
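Good’s loop is easy to sketch: generate a candidate successor, score it, and keep it only if it beats the current version. The toy Python below is my own illustration of that premise, not anything resembling a real lab’s training code; the “system” is just a number, and the assumption that a smarter system takes proportionally bigger improvement steps is exactly the premise Good asked us to entertain.

import random

# A toy version of Good's recursive self-improvement loop (my illustration,
# not any lab's real code). The "system" here is just a skill score;
# evaluate() and propose_successor() stand in for real benchmarking and
# retraining.

def evaluate(system: float) -> float:
    return system  # in reality: benchmarks, match play, held-out tasks

def propose_successor(system: float) -> float:
    # Good's premise: a smarter system is better at improving itself,
    # so the size of each improvement step grows with current skill.
    return system + random.uniform(0.0, 0.1) * system

def self_improve(initial: float, generations: int = 50) -> float:
    current = initial
    for _ in range(generations):
        candidate = propose_successor(current)
        if evaluate(candidate) > evaluate(current):  # keep only real gains
            current = candidate
    return current

print(self_improve(1.0))  # compounding growth, if the premise holds

Make each improvement step a fixed size instead, and the runaway, compounding growth turns into a slow linear crawl; that single assumption is what much of the debate turns on.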

The question, then, is how close we are to that first system capable of autonomous self-improvement. Though the runaway systems Good described aren’t here yet, self-improving computers are—at least in narrow domains. AI is already running code on itself. OpenAI’s Codex and Anthropic’s Claude Code can work independently for an hour or more writing new code or updating existing code. Using Codex recently, I thumbed a prompt into my phone while on a walk, and it made a working website before I reached home. In the hands of skilled coders, such systems can do dramatically more, from reorganizing large code bases to sketching entirely new ways to build the software in the first place.


So why hasn’t a model powering ChatGPT quietly coded itself into ultraintelligence? The hitch is in the phrase above: “in the hands of skilled coders.” Despite AI’s impressive improvements, our current systems still rely on humans to set goals, design experiments and decide which changes count as genuine progress. They’re not yet capable of evolving independently in a robust way, which makes some talk about imminent superintelligence seem blown out of proportion—unless, of course, current AI systems are closer than they appear to being able to self-improve in increasingly broad slices of their abilities.

One area in which they already look superhuman is how much information they can absorb and manipulate. The most advanced models are trained on far more text than any human could read in a lifetime—from poetry to history to the sciences. They can also keep track of far longer stretches of text while they work. Already, with commercially available systems such as ChatGPT and Gemini, I can upload a stack of books and have the AI synthesize and critique them in a way that would take a human weeks. That doesn’t mean the result is always correct or insightful—but it does mean that, in principle, a system like this could read its own documentation, logs, and code and propose changes at a speed and scale no engineering team could match.
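What might “reading its own code and proposing changes” look like in practice? Mechanically, something like the hypothetical sketch below, in which ask_model() is a placeholder for a call to whatever large language model you like, not a real function from any library; for now, the suggested patch would still land on a human reviewer’s desk.

from pathlib import Path

# Hypothetical sketch of a system reviewing its own source code.
# ask_model() is a placeholder for a call to some large language model API;
# it is not a real function from any library.

def ask_model(prompt: str) -> str:
    raise NotImplementedError("stand-in for an LLM API call")

def propose_patch(source_file: str) -> str:
    code = Path(source_file).read_text()
    prompt = ("Below is a program's source code. Propose one concrete change, "
              "as a unified diff, that would make it faster or more reliable:\n\n"
              + code)
    return ask_model(prompt)  # for now, a human decides whether to apply it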

Reasoning, however, is where these systems lag—though that’s no longer true in certain focused areas. DeepMind’s AlphaDev and related systems have already found new, more efficient algorithms for tasks such as sorting, results that are now used in real-world code and that go beyond simple statistical mimicry. Other models excel at formal mathematics and graduate-level science questions that resist simple pattern-matching. We can debate the value of any particular benchmark—and researchers are doing exactly that—but there’s no question that some AI systems have become capable of discovering solutions humans had not previously found.

If the systems already have these abilities, what, then, is the missing piece? One answer is artificial general intelligence (AGI), the sort of dynamic, flexible reasoning that lets humans learn something in one field and apply it to others. As I’ve previously written, we keep shifting our definitions of AGI as machines master new skills. But for the superintelligence question, what matters is not the label we attach; it’s whether a system can use its skills to reliably redesign and upgrade itself.

And this brings us back to Good’s “intelligence explosion.” If we do build systems with that kind of flexible, humanlike reasoning across many domains, what will separate them from superintelligence? Advanced models are already trained on more science and literature than any human, have far greater working memories and show extraordinary reasoning skills in limited domains. Once that missing piece of flexible reasoning is in place, and once we allow such systems to deploy those skills on their own code, data and training processes, could the leap to fully superhuman performance be shorter than we imagine?

Not everyone agrees. Some researchers believe we have yet to fundamentally understand intelligence and that this missing piece will take longer than expected to engineer. Others speak of AGI being achieved in a few years, leading to further advances far beyond human capacities. In 2024 Altman publicly suggested that superintelligence could arrive “in a few thousand days.”

If this sounds too much like science fiction, consider that AI companies regularly run safety tests on their systems to make sure they can’t go into a runaway self-improvement loop. METR, an independent AI safety group, evaluates models on how long a complex task they can reliably carry out before failing. This past November its tests of GPT-5.1-Codex-Max came in at around two hours and 42 minutes. That is a huge leap from the few minutes GPT-4 managed on the same metric, but it isn’t the situation Good described.
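METR’s headline number is a “time horizon”: roughly, the length of task, measured by how long it would take a skilled human, at which the model still succeeds about half the time. The toy calculation below is my own simplification of that idea, using made-up data rather than METR’s actual tasks or code; it fits a simple curve of success against task length and reports where that curve crosses one half.

import math

# Toy estimate of a "time horizon" in the spirit of METR's metric (my own
# simplification with made-up data, not METR's actual code): fit success
# probability as a logistic function of log task length, then report the
# length at which the fitted success rate crosses 50 percent.

# (human_minutes_to_complete, did_the_model_succeed)
results = [(2, True), (5, True), (15, True), (30, True), (60, True),
           (90, False), (120, True), (240, False), (480, False), (960, False)]

def success_prob(minutes, a, b):
    return 1.0 / (1.0 + math.exp(-(a + b * math.log(minutes))))

def log_likelihood(a, b):
    return sum(math.log(max(1e-9, success_prob(m, a, b) if ok
                            else 1.0 - success_prob(m, a, b)))
               for m, ok in results)

# Crude grid search for the best-fitting curve; a real analysis would use a
# proper maximum-likelihood fit.
a, b = max(((i / 10.0, j / 10.0) for i in range(-100, 101)
            for j in range(-100, 0)),
           key=lambda params: log_likelihood(*params))

print(f"50% time horizon: about {math.exp(-a / b):.0f} human-minutes")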

Anthropic runs similar tests on its AI systems. “To be clear, we are not yet at ‘self-improving AI,’” wrote the company’s co-founder and head of policy Jack Clark in October, “but we are at the stage of ‘AI that improves bits of the next AI, with increasing autonomy.’”

If AGI is achieved, and we add human-level judgment to an immense information base, vast working memory and extraordinary speed, Good’s idea of rapid self-improvement starts to look less like science fiction. The real question is whether we’ll stop at “mere human”—or risk overshooting.

