Are Reasoning AI Models Running Out of Steam?


The AI industry has been riding a wave of rapid progress, especially in developing reasoning AI models that outperform previous systems in solving complex tasks like coding and math. But that momentum may be slowing sooner than expected. A new analysis by research nonprofit Epoch AI suggests that performance gains from these models could plateau within a year—posing serious challenges for the industry.

Reasoning models, such as OpenAI’s latest o3, have shown strong results across technical benchmarks. Unlike conventional models, they expend extra computing power at inference time to work through problems. That extra compute boosts performance, but it comes with a tradeoff: slower task completion and higher costs.

These models are trained in two major phases. First, they are pretrained on massive datasets, just as standard models are. Then comes a reinforcement learning stage, where the model receives feedback on how well it solves difficult problems. This second step is what gives them their reasoning ability, but it is also resource-intensive.

Until recently, labs like OpenAI hadn’t poured significant compute into the reinforcement learning phase. That changed with o3, which reportedly used ten times the training compute of its predecessor, o1, with most of that extra power likely going into reinforcement learning. OpenAI has confirmed that its next-generation models will lean even more heavily on reinforcement learning, using more compute in that stage than in the initial pretraining.

But scaling up reinforcement learning has its limits. Josh You, the analyst behind Epoch’s report, notes that compute for standard model training has been growing steadily, roughly quadrupling each year, while compute devoted to reinforcement learning is scaling far faster, increasing tenfold every few months. Growth that fast cannot last: once reinforcement learning consumes a large share of the total training budget, it can only grow as fast as that budget does. If the trend continues, reasoning model progress could converge with the broader AI frontier by 2026, hitting a ceiling that compute alone can’t overcome.
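To see why the catch-up happens so quickly, here is a rough back-of-the-envelope projection, not Epoch’s actual model: it just compounds the two growth rates the report cites, with hypothetical starting budgets (reinforcement learning beginning at 1% of pretraining compute) and an assumed “every few months” of roughly every four months.

```python
# Illustrative sketch only: project how fast reinforcement learning (RL)
# compute catches up to pretraining compute under the cited growth rates.
# Starting budgets and the RL growth period are hypothetical assumptions.

PRETRAIN_GROWTH_PER_YEAR = 4.0   # "quadrupling annually"
RL_GROWTH_PER_PERIOD = 10.0      # "tenfold every few months"
RL_PERIODS_PER_YEAR = 3          # assume ~every four months

def compute_after(years: float, start: float, annual_growth: float) -> float:
    """Compound a compute budget over a number of years."""
    return start * annual_growth ** years

# Hypothetical starting point: RL gets 1% of the pretraining budget.
pretrain_start, rl_start = 1.0, 0.01
rl_annual_growth = RL_GROWTH_PER_PERIOD ** RL_PERIODS_PER_YEAR  # 1000x per year

for year in (0.5, 1.0, 1.5, 2.0):
    pre = compute_after(year, pretrain_start, PRETRAIN_GROWTH_PER_YEAR)
    rl = compute_after(year, rl_start, rl_annual_growth)
    print(f"year {year}: RL / pretraining compute ratio = {rl / pre:.2f}")
```

Under these assumptions, reinforcement learning compute overtakes pretraining compute within about a year, at which point the tenfold-every-few-months pace can no longer be sustained independently of the overall budget.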

Computing Limits Aren’t the Only Problem

Even if more computing power were available, that might not be enough. According to Epoch, research overhead is another growing concern. Reasoning AI models require highly specialized research teams and longer development cycles. If those costs stay high, they could slow or even cap progress. Simply scaling compute won’t be sustainable if the human cost of innovation rises just as quickly.

Worse still, reasoning models come with flaws that can’t be fixed by brute force alone. While powerful, they often hallucinate more than conventional models—making mistakes in ways that are harder to detect. They also cost significantly more to operate. That makes them harder to scale commercially, even if performance improves.

The takeaway from Epoch’s report is clear: the current boom in reasoning model performance may be short-lived. If compute scaling hits its ceiling and research costs continue climbing, the industry may need to rethink its strategy. New methods—not just more power—may be required to push reasoning AI forward.

For an industry that has invested heavily in this next generation of AI, the possibility of hitting a wall is alarming. It raises a difficult question: what happens when smarter models can’t get much smarter?
