Overfitting
When an AI model performs excellently on training data but poorly on new data because it has memorised specific examples rather than learning general patterns.
Overfitting occurs when a machine learning model learns the training data too well – memorising specific examples, including their noise and quirks, rather than learning the general patterns that would help it perform well on new, unseen data.
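A minimal sketch of this in code (a toy NumPy example invented for illustration, not taken from the source): a degree-9 polynomial has enough parameters to pass through ten noisy training points exactly, so it memorises the noise, while a degree-3 fit is forced to capture only the broad trend.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten noisy samples of a simple underlying function (assumed toy setup).
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.3, size=10)

# Fresh samples from the same process, standing in for unseen data.
x_test = np.linspace(0.02, 0.98, 50)
y_test = np.sin(2 * np.pi * x_test) + rng.normal(0.0, 0.3, size=50)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Degree 9 can interpolate all 10 points, fitting the noise exactly;
# degree 3 can only capture the broad shape of the data.
overfit = np.polyfit(x_train, y_train, 9)
modest = np.polyfit(x_train, y_train, 3)

print(f"deg 9: train={mse(overfit, x_train, y_train):.4f} "
      f"test={mse(overfit, x_test, y_test):.4f}")
print(f"deg 3: train={mse(modest, x_train, y_train):.4f} "
      f"test={mse(modest, x_test, y_test):.4f}")
```

The degree-9 model's training error is essentially zero, yet its test error is worse than the simpler model's: memorisation, not learning.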
The analogy
Imagine a student who memorises every answer in a textbook but cannot solve a problem worded differently. They score perfectly on practice tests (training data) but fail the real exam (new data). That is overfitting.
How to detect overfitting
The classic signal: training accuracy is high but validation/test accuracy is significantly lower. If your model gets ninety-eight per cent on training data but only seventy per cent on the test set, it has overfit.
Plot learning curves – training loss and validation loss over epochs. If training loss keeps decreasing while validation loss starts increasing, the model has begun memorising rather than learning.
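Both checks can be sketched as small helper functions (the 0.1 gap threshold here is an arbitrary illustrative choice, not a standard value):

```python
def accuracy_gap(train_acc, val_acc, threshold=0.1):
    """Return the train/validation gap and whether it exceeds the threshold."""
    gap = round(train_acc - val_acc, 2)
    return gap, gap > threshold

def divergence_epoch(train_loss, val_loss):
    """First epoch where validation loss rises while training loss still
    falls, or None if the curves never diverge that way."""
    for t in range(1, len(val_loss)):
        if val_loss[t] > val_loss[t - 1] and train_loss[t] < train_loss[t - 1]:
            return t
    return None

# The example from the text: 98% train accuracy vs 70% test accuracy.
print(accuracy_gap(0.98, 0.70))  # (0.28, True)

# Training loss keeps falling; validation loss turns upward at epoch 4.
train_loss = [1.0, 0.7, 0.5, 0.35, 0.25, 0.18]
val_loss = [1.0, 0.8, 0.65, 0.60, 0.63, 0.70]
print(divergence_epoch(train_loss, val_loss))  # 4
```

In practice you would feed these functions the metric histories your training loop already records.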
Why overfitting happens
- Model too complex – the model has more capacity (parameters) than the problem requires
- Insufficient data – not enough examples to represent the full range of real-world variation
- Training too long – the model has seen the data too many times and starts memorising
- Noisy data – the model learns the noise along with the signal
- Leaky features – features that accidentally encode the answer (like a patient ID that correlates with the diagnosis)
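The leaky-feature case can be made concrete with a deliberately silly "model": a lookup table keyed on a unique patient ID (hypothetical data invented for this sketch). It scores perfectly on records it has seen and is reduced to guessing on everyone else.

```python
import random

random.seed(0)

# Hypothetical records: (patient_id, diagnosis). Each ID appears once.
train = [(pid, random.choice([0, 1])) for pid in range(100)]

# A model that keys on the leaky ID feature is just a lookup table.
model = dict(train)

train_acc = sum(model[pid] == y for pid, y in train) / len(train)

# New patients have IDs the table has never seen, so it can only
# fall back on a default guess (0 for unknown IDs here).
test = [(pid, random.choice([0, 1])) for pid in range(100, 200)]
test_acc = sum(model.get(pid, 0) == y for pid, y in test) / len(test)

print(f"train accuracy: {train_acc:.2f}")  # perfect: 1.00
print(f"test accuracy:  {test_acc:.2f}")   # roughly chance level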
How to prevent overfitting
- More data – the simplest and most effective solution when possible
- Data augmentation – artificially expanding the training set (for images: flips, crops, rotations, added noise)
- Regularisation – adding penalties for model complexity (L1, L2 regularisation)
- Dropout – randomly deactivating neurons during training to prevent co-dependency
- Early stopping – monitoring validation loss and stopping training before overfitting begins
- Simpler models – using fewer layers, fewer parameters, or a less complex architecture
- Cross-validation – evaluating on multiple different train-test splits for more reliable estimates
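To pick one item from the list: early stopping can be sketched as a patience-based loop. This is a simplified illustration, not any particular framework's API (most frameworks ship their own callbacks for this).

```python
def train_with_early_stopping(train_step, val_loss_fn, max_epochs, patience=3):
    """Run train_step each epoch; stop once validation loss has not
    improved for `patience` consecutive epochs.
    Returns (best_epoch, best_loss)."""
    best_loss, best_epoch, stale = float("inf"), -1, 0
    for epoch in range(max_epochs):
        train_step()
        loss = val_loss_fn()
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch, best_loss

# Simulated validation losses that bottom out and then creep back up.
losses = [5.0, 4.0, 3.0, 2.0, 1.5, 1.2, 1.1, 1.15, 1.2, 1.3, 1.4, 1.5]
it = iter(losses)
print(train_with_early_stopping(lambda: None, lambda: next(it),
                                max_epochs=len(losses)))  # (6, 1.1)
```

In a real setup you would also restore the weights saved at the best epoch, since the model has kept training past it.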
Underfitting: the opposite problem
If a model is too simple to capture the patterns in the data, it underfits – performing poorly on both training and test data. The goal is to find the sweet spot between underfitting and overfitting.
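One way to see the whole spectrum in a toy NumPy example (an assumed setup, not from the source): sweep the polynomial degree and score each fit on held-out data. Very low degrees underfit, very high degrees overfit, and the held-out error identifies the middle ground.

```python
import numpy as np

rng = np.random.default_rng(1)

# 40 noisy samples of sin(2*pi*x); first 25 for fitting, rest held out.
x = rng.uniform(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, size=40)
x_fit, y_fit, x_val, y_val = x[:25], y[:25], x[25:], y[25:]

def val_mse(degree):
    """Fit a polynomial of the given degree, score it on held-out data."""
    coeffs = np.polyfit(x_fit, y_fit, degree)
    return float(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))

scores = {d: val_mse(d) for d in range(10)}
best = min(scores, key=scores.get)
print(f"best degree by held-out error: {best}")
```

Degree 0 (a constant) underfits badly, and the held-out error picks an intermediate degree: neither extreme wins on data the model has not seen.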
Why this matters
Overfitting is the most common reason AI models fail in production after performing well during development. Understanding it helps you recognise warning signs early, ask the right questions about model evaluation, and ensure that the impressive demo results will actually translate to real-world performance.