
Regularization

Last reviewed: April 2026

A set of techniques that prevent AI models from memorising training data too closely, helping them perform better on new, unseen data.

Regularization is any technique that prevents a machine learning model from fitting too closely to its training data, a problem called overfitting. An overfitted model performs brilliantly on data it has seen but poorly on new data, which defeats the purpose of building a predictive model.

Why overfitting happens

Machine learning models are pattern finders. Given enough capacity, a model will find patterns in everything β€” including the noise and random quirks that are specific to the training dataset. A model predicting house prices might memorise that the three most expensive houses in the training data were all painted blue, and then incorrectly learn that blue paint increases value.

Regularization forces the model to learn general patterns rather than memorising specifics.
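In practice, most regularization methods work by adding a penalty term to the model's training loss. The sketch below is illustrative only (the function and parameter names, such as `penalized_loss` and `lam`, are our own, not from this glossary): the model is penalised both for prediction error and for having large weights, so memorising noise becomes expensive.

```python
import numpy as np

# Illustrative sketch: a regularized loss adds a weight-size penalty,
# scaled by a strength hyperparameter lam, to the ordinary data loss.
def penalized_loss(weights, X, y, lam=0.1):
    preds = X @ weights
    data_loss = np.mean((preds - y) ** 2)   # mean squared error on the data
    penalty = lam * np.sum(weights ** 2)    # L2-style penalty on weight size
    return data_loss + penalty
```

Setting `lam` to zero recovers the unregularized loss; larger values push the model toward smaller, simpler weights.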

Common regularization techniques

  • L1 regularization (Lasso): Adds a penalty based on the absolute size of the model's weights. This encourages the model to set unimportant weights to exactly zero, effectively performing feature selection.
  • L2 regularization (Ridge): Adds a penalty based on the squared size of weights. This encourages smaller weights overall, preventing any single feature from dominating predictions.
  • Dropout: Used in neural networks. During training, random neurons are temporarily switched off, forcing the network to learn redundant representations rather than relying on specific pathways.
  • Early stopping: Monitor the model's performance on a validation set during training and stop when performance begins to degrade, even if training loss is still improving.
  • Data augmentation: Artificially expand the training dataset by creating modified versions of existing examples (rotating images, adding noise, paraphrasing text).
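To make the L2 (Ridge) idea above concrete, here is a minimal sketch of ridge regression solved in closed form. The names (`ridge_fit`, `lam`) and the synthetic data are ours, chosen for illustration; the key point is that the penalty term shrinks the learned weights toward zero.

```python
import numpy as np

# Illustrative sketch of ridge (L2) regression in closed form:
# w = (X^T X + lam * I)^(-1) X^T y
# The lam * I term shrinks the learned weights toward zero.
def ridge_fit(X, y, lam=1.0):
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Synthetic data: only three of five features actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_w = np.array([2.0, -1.0, 0.0, 0.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)

w_plain = ridge_fit(X, y, lam=0.0)   # ordinary least squares, no penalty
w_ridge = ridge_fit(X, y, lam=10.0)  # heavily regularized
# The regularized solution has smaller overall weight magnitude.
```

An L1 (Lasso) penalty has no such closed form, which is why libraries solve it iteratively; its distinctive effect is driving unimportant weights to exactly zero rather than merely shrinking them.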

How to tell if you need regularization

The classic sign of overfitting is a large gap between training performance and validation performance. If your model achieves 98 percent accuracy on training data but only 75 percent on new data, it has memorised rather than learned.
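The diagnostic described above can be reduced to a single number, sometimes called the generalization gap. The helper below is a sketch of ours, not a standard API, using the article's example figures:

```python
# Sketch of the overfitting check described above: compare accuracy on
# training data with accuracy on held-out validation data.
def generalization_gap(train_accuracy, validation_accuracy):
    return train_accuracy - validation_accuracy

gap = generalization_gap(0.98, 0.75)  # the article's example figures
# A large gap (here 0.23) suggests the model has memorised rather than learned.
```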

Regularization in large language models

LLMs use regularization extensively during training. Dropout is common in transformer architectures. The massive scale of the training data itself acts as a form of regularization: with billions of diverse examples, the model is less likely to memorise quirks from any single source.
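The dropout mechanism mentioned above can be sketched in a few lines. This is the common "inverted dropout" variant, under our own naming; each activation is zeroed with probability `p` during training, and the survivors are scaled up so the expected output is unchanged.

```python
import numpy as np

# Illustrative sketch of inverted dropout as applied during training.
# Each activation is zeroed with probability p; survivors are scaled
# by 1/(1-p) so the expected activation value stays the same.
def dropout(activations, p=0.5, rng=None):
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) >= p  # True = neuron stays on
    return activations * mask / (1.0 - p)
```

At inference time the layer is simply skipped; because of the training-time rescaling, no adjustment is needed when all neurons are active.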

The underfitting counterpart

Too much regularization causes the opposite problem, underfitting, where the model is too constrained to learn the real patterns in the data. Good model development involves finding the right balance between the two.


Why This Matters

Regularization is what separates a model that works in the lab from one that works in production. When evaluating AI solutions or reviewing model performance reports, understanding regularization helps you ask the right questions about whether a model will generalise to your real-world data or merely perform well on test benchmarks.


Learn More

Continue learning in Advanced

This topic is covered in our lesson: How AI Models Learn and Generalise