The Limits of Large Language Models: A Turning Point in AI’s Quest for AGI

August 25, 2025

In our push towards machines that truly think and reason like us, it’s becoming clear that large language models (LLMs) are starting to hit a plateau. AI developers, who once saw these models as stepping stones to artificial general intelligence (AGI), are now realising that pure scaling might not be enough. As AI expert Gary Marcus puts it, “Nobody with intellectual integrity should still believe that pure scaling will get us to AGI.”
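
To see the sceptics’ point, it helps to look at the power-law shape that scaling studies such as Kaplan et al. (2020) report, in which loss falls only as a small power of parameter count. The sketch below uses roughly the constants reported there, purely for illustration: each tenfold jump in model size buys a smaller absolute improvement.

```python
# Sketch of the diminishing returns behind the scaling debate. Under a
# power-law fit of the form L(N) = (N_c / N) ** alpha (as in Kaplan et
# al., 2020), each 10x increase in parameter count N yields a smaller
# absolute drop in loss. Constants are illustrative approximations.

N_C = 8.8e13    # critical parameter count (rough value from the paper)
ALPHA = 0.076   # scaling exponent (rough value from the paper)

def predicted_loss(n_params: float) -> float:
    """Power-law prediction of loss for a model with n_params parameters."""
    return (N_C / n_params) ** ALPHA

previous = None
for n in [1e9, 1e10, 1e11, 1e12, 1e13]:
    loss = predicted_loss(n)
    gain = "" if previous is None else f"  (improvement: {previous - loss:.4f})"
    print(f"{n:.0e} params -> loss {loss:.4f}{gain}")
    previous = loss
```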

Look at OpenAI, for example. The organisation has poured billions into developing LLMs and is now one of the world’s most valuable startups, with a valuation that might exceed $500 billion. Despite the incredible popularity of products like ChatGPT, the long-awaited leap to AGI feels more distant than we’d hoped, with profitability still proving elusive and the initial hype cooling off.

Meanwhile, tech giants like Google, Meta, and Anthropic are also doubling down on scaling LLMs. There’s a growing concern that this collective push might be fuelling an AI bubble. Even OpenAI’s CEO, Sam Altman, has acknowledged that the industry’s excitement sometimes runs ahead of what the technology can actually deliver, even though AI continues to represent a significant technical advance.

Critics have also highlighted a fundamental challenge: these models excel at recognising patterns but stumble at genuine logical reasoning. A study by Apple researchers, aptly titled “The Illusion of Thinking,” points out that while LLMs are impressive at processing data, they often fall short on complex reasoning tasks. Andrew Gelman from Columbia University likened their performance to “jogging and running”: capable of covering a lot of ground quickly but lacking the focus needed for deeper insights.
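
For context, the Apple study probed models with controllable puzzles, Tower of Hanoi among them. The sketch below is a standard recursive solution, not anything from the paper itself; it shows how compact the exact procedure is, while the study reports that model accuracy collapses as the required move sequence grows, even though the underlying algorithm never changes.

```python
# Tower of Hanoi, one of the controllable puzzles used in "The Illusion
# of Thinking". A few lines of exact recursion solve any instance; the
# study's point is that models' accuracy degrades as the number of disks
# (and thus the length of the required move sequence) increases.

def hanoi(n: int, source: str, target: str, spare: str, moves: list) -> None:
    """Append the optimal move sequence for n disks to `moves`."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)  # clear the smaller disks
    moves.append((source, target))              # move the largest disk
    hanoi(n - 1, spare, target, source, moves)  # restack on top of it

moves: list = []
hanoi(8, "A", "C", "B", moves)
print(len(moves))  # 2**8 - 1 = 255 moves, exact at any problem size
```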

Another hurdle is the tendency towards misinterpretation and hallucination. Research in Germany has found that hallucination rates in LLMs can range from 7% to 12% across several languages, suggesting that simply piling up more data might not smooth out these issues.
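
The German study’s benchmark and judging procedure aren’t detailed here, so the following is only a minimal sketch of how such a per-language figure is typically computed: the share of answers an evaluator flags as unsupported. Every name and number in it is hypothetical.

```python
# Minimal sketch of computing a per-language hallucination rate: the
# fraction of model answers an evaluator judged unsupported. The data
# below is hypothetical and stands in for a real benchmark run.

from collections import defaultdict

# (language, was_hallucination) pairs from a hypothetical evaluation
judgements = [
    ("de", False), ("de", True), ("de", False), ("de", False),
    ("en", False), ("en", False), ("en", True), ("en", False),
]

counts: dict = defaultdict(lambda: [0, 0])  # language -> [hallucinated, total]
for lang, hallucinated in judgements:
    counts[lang][0] += int(hallucinated)
    counts[lang][1] += 1

for lang, (bad, total) in sorted(counts.items()):
    print(f"{lang}: {bad / total:.1%} hallucination rate ({bad}/{total})")
```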

This has spurred interest in alternative approaches. Instead of relying solely on LLMs, some researchers are exploring world models: systems that build an internal simulation of an environment and learn by predicting how it evolves. Pioneers like Fei-Fei Li and Yann LeCun argue that such models might bring us closer to truly human-like AI. After all, as Fei-Fei Li reminds us, “Language doesn’t exist in nature,” highlighting the need for systems that understand and interact with the physical world.
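
As a toy illustration of the core idea (and emphatically not Li’s or LeCun’s actual architectures), the sketch below fits a transition model that maps a state and an action to the next state from interaction data, then queries it to predict an outcome without touching the environment again.

```python
# Toy world-model sketch: learn a transition function
# f(state, action) -> next_state from logged interactions, then use it
# to predict outcomes "in imagination". A linear least-squares stand-in
# for what real systems learn with deep networks; the dynamics here are
# hypothetical.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: next_state = A @ state + B @ action + noise
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.5]])

states = rng.normal(size=(500, 2))
actions = rng.normal(size=(500, 1))
next_states = states @ A_true.T + actions @ B_true.T + 0.01 * rng.normal(size=(500, 2))

# Fit the world model by least squares from [state, action] to next_state
X = np.hstack([states, actions])
W, *_ = np.linalg.lstsq(X, next_states, rcond=None)

# Query the learned model: predict the outcome of a candidate action
state, action = np.array([1.0, 0.0]), np.array([2.0])
predicted = np.concatenate([state, action]) @ W
print("predicted next state:", predicted)  # close to A_true @ state + B_true @ action
```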

As we look to the future, whether through world models, multi-agent systems, or embodied AI, the path to AGI is sure to involve both hard challenges and genuine breakthroughs. If you’ve ever wrestled with the limitations of current tech, it’s reassuring to see experts hard at work on fresh strategies to push past today’s boundaries.
