'Catastrophic overtraining' could harm large language AI models that are trained on more data for the sake of training
Source: techradar.com
Wayne Williams
13 April 2025
University researchers found less is sometimes more when it comes to LLMs
(Image credit: Shutterstock / NicoElNino)
Researchers from top US universities warn extending pre-training can be detrimental to performance
Too much pre-training can deliver worse performance due to something akin to the butterfly effect
The more they are pre-trained, the more they become sensitive to small changes that could disrupt the end result
Researchers from Carnegie Mellon, Stanford, Harvard, and Princeton are challenging one of AI development’s accepted core beliefs: that more pre-training data always means better performance.
As reported by HPCwire, a new paper discusses the concept of “catastrophic overtraining,” whereby extended pre-training can harm a model’s performance after fine-tuning.
The researchers compared two versions of the OLMo-1B model, one trained on 2.3 trillion tokens and another on 3 trillion. Despite the larger training set, the more extensively trained model reportedly performed up to 3% worse on benchmarks like AlpacaEval and ARC.
Reaching the inflection point
This performance drop, the study claims, is linked to a phenomenon called “progressive sensitivity.”
As the token count increases, the model becomes more fragile. Even small tweaks, such as adjustments during fine-tuning or the introduction of noise, can reverse earlier gains.
The authors demonstrated this by injecting Gaussian noise into pre-trained models, noting that performance degraded more sharply the longer the model was trained.
The point where this additional training starts to degrade performance is called the “inflection point.”
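The noise-injection probe described above can be sketched in miniature with NumPy. This is a toy illustration of the measurement protocol only: a fixed linear map stands in for a real checkpoint, and all dimensions and noise scales are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a pre-trained model: a fixed linear map.
# (The actual study perturbed OLMo-1B checkpoints; sizes here are arbitrary.)
d_in, d_out, n = 32, 8, 100
W = rng.normal(size=(d_out, d_in))   # "model weights"
X = rng.normal(size=(n, d_in))       # "evaluation inputs"

def outputs(weights):
    return X @ weights.T

baseline = outputs(W)
noise = rng.normal(size=W.shape)     # one fixed Gaussian draw, rescaled below

# Sweep noise magnitudes and record how far the perturbed model's outputs
# drift from the clean ones, relative to the clean output norm.
sigmas = (0.0, 0.01, 0.05, 0.1)
drifts = []
for sigma in sigmas:
    perturbed = outputs(W + sigma * noise)
    drifts.append(np.linalg.norm(perturbed - baseline)
                  / np.linalg.norm(baseline))

for sigma, drift in zip(sigmas, drifts):
    print(f"sigma={sigma:<5} relative output drift={drift:.4f}")
```

In the paper's framing, a more "progressively sensitive" model is one whose performance curve under this kind of sweep falls off more steeply; comparing the curves of a shorter-trained and a longer-trained checkpoint is what revealed the sharper degradation.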
Once this point is reached, the benefits of further training are outweighed by the risk of internal instability. The study found that this tipping point often occurs beyond 2.5 trillion tokens in smaller models such as OLMo-1B.
“Catastrophic overtraining may be inevitable... especially when the pre-training and fine-tuning tasks are misaligned,” the authors warn in their paper, which you can access through the arXiv pre-print server.
While the researchers are not suggesting an end to pre-training, they do feel that developers should consider just how much pre-training is enough. As the paper concludes, “Our findings call for a renewed focus on model scaling that considers the entire training pipeline.”
For AI developers chasing scale, the message seems clear: sometimes, less really is more.
Wayne Williams is a freelancer writing news for TechRadar Pro. He has been writing about computers, technology, and the web for 30 years. In that time he wrote for most of the UK’s PC magazines, and launched, edited and published a number of them too.