Overfitting
When an AI model performs excellently on training data but poorly on new data because it has memorised specific examples rather than learning general patterns.
Overfitting occurs when a machine learning model learns the training data too well – memorising specific examples, including their noise and quirks, rather than learning the general patterns that would help it perform well on new, unseen data.
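A minimal sketch of this in code (a toy NumPy example invented for illustration, not taken from the source): a degree-9 polynomial has enough parameters to pass through ten noisy training points exactly, so it memorises the noise, while a degree-3 fit is forced to capture only the broad trend.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten noisy samples of a simple underlying function (assumed toy setup).
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.3, size=10)

# Fresh samples from the same process, standing in for unseen data.
x_test = np.linspace(0.02, 0.98, 50)
y_test = np.sin(2 * np.pi * x_test) + rng.normal(0.0, 0.3, size=50)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Degree 9 can interpolate all 10 points, fitting the noise exactly;
# degree 3 can only capture the broad shape of the data.
overfit = np.polyfit(x_train, y_train, 9)
modest = np.polyfit(x_train, y_train, 3)

print(f"deg 9: train={mse(overfit, x_train, y_train):.4f} "
      f"test={mse(overfit, x_test, y_test):.4f}")
print(f"deg 3: train={mse(modest, x_train, y_train):.4f} "
      f"test={mse(modest, x_test, y_test):.4f}")
```

The degree-9 model's training error is essentially zero, yet its test error is worse than the simpler model's: memorisation, not learning.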
The analogy
Imagine a student who memorises every answer in a textbook but cannot solve a problem worded differently. They score perfectly on practice tests (training data) but fail the real exam (new data). That is overfitting.
How to detect overfitting
The classic signal: training accuracy is high but validation/test accuracy is significantly lower. If your model gets ninety-eight per cent on training data but only seventy per cent on the test set, it has overfit.
Plot learning curves – training loss and validation loss over epochs. If training loss keeps decreasing while validation loss starts increasing, the model has begun memorising rather than learning.
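Both checks can be sketched as small helper functions (the 0.1 gap threshold here is an arbitrary illustrative choice, not a standard value):

```python
def accuracy_gap(train_acc, val_acc, threshold=0.1):
    """Return the train/validation gap and whether it exceeds the threshold."""
    gap = round(train_acc - val_acc, 2)
    return gap, gap > threshold

def divergence_epoch(train_loss, val_loss):
    """First epoch where validation loss rises while training loss still
    falls, or None if the curves never diverge that way."""
    for t in range(1, len(val_loss)):
        if val_loss[t] > val_loss[t - 1] and train_loss[t] < train_loss[t - 1]:
            return t
    return None

# The example from the text: 98% train accuracy vs 70% test accuracy.
print(accuracy_gap(0.98, 0.70))  # (0.28, True)

# Training loss keeps falling; validation loss turns upward at epoch 4.
train_loss = [1.0, 0.7, 0.5, 0.35, 0.25, 0.18]
val_loss = [1.0, 0.8, 0.65, 0.60, 0.63, 0.70]
print(divergence_epoch(train_loss, val_loss))  # 4
```

In practice you would feed these functions the metric histories your training loop already records.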
Why overfitting happens
- Model too complex – the model has more capacity (parameters) than the problem requires
- Insufficient data – not enough examples to represent the full range of real-world variation
- Training too long – the model has seen the data too many times and starts memorising
- Noisy data – the model learns the noise along with the signal
- Leaky features – features that accidentally encode the answer (like a patient ID that correlates with the diagnosis)
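The leaky-feature case can be made concrete with a deliberately silly "model": a lookup table keyed on a unique patient ID (hypothetical data invented for this sketch). It scores perfectly on records it has seen and is reduced to guessing on everyone else.

```python
import random

random.seed(0)

# Hypothetical records: (patient_id, diagnosis). Each ID appears once.
train = [(pid, random.choice([0, 1])) for pid in range(100)]

# A model that keys on the leaky ID feature is just a lookup table.
model = dict(train)

train_acc = sum(model[pid] == y for pid, y in train) / len(train)

# New patients have IDs the table has never seen, so it can only
# fall back on a default guess (0 for unknown IDs here).
test = [(pid, random.choice([0, 1])) for pid in range(100, 200)]
test_acc = sum(model.get(pid, 0) == y for pid, y in test) / len(test)

print(f"train accuracy: {train_acc:.2f}")  # perfect: 1.00
print(f"test accuracy:  {test_acc:.2f}")   # roughly chance level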
How to prevent overfitting
- More data – the simplest and most effective solution when possible
- Data augmentation – artificially expanding the training set (for images: flips, crops, rotations, added noise)
- Regularisation – adding penalties for model complexity (L1, L2 regularisation)
- Dropout – randomly deactivating neurons during training to prevent co-dependency
- Early stopping – monitoring validation loss and stopping training before overfitting begins
- Simpler models – using fewer layers, fewer parameters, or a less complex architecture
- Cross-validation – evaluating on multiple different train-test splits for more reliable estimates
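To pick one item from the list: early stopping can be sketched as a patience-based loop. This is a simplified illustration, not any particular framework's API (most frameworks ship their own callbacks for this).

```python
def train_with_early_stopping(train_step, val_loss_fn, max_epochs, patience=3):
    """Run train_step each epoch; stop once validation loss has not
    improved for `patience` consecutive epochs.
    Returns (best_epoch, best_loss)."""
    best_loss, best_epoch, stale = float("inf"), -1, 0
    for epoch in range(max_epochs):
        train_step()
        loss = val_loss_fn()
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch, best_loss

# Simulated validation losses that bottom out and then creep back up.
losses = [5.0, 4.0, 3.0, 2.0, 1.5, 1.2, 1.1, 1.15, 1.2, 1.3, 1.4, 1.5]
it = iter(losses)
print(train_with_early_stopping(lambda: None, lambda: next(it),
                                max_epochs=len(losses)))  # (6, 1.1)
```

In a real setup you would also restore the weights saved at the best epoch, since the model has kept training past it.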
Underfitting: the opposite problem
If a model is too simple to capture the patterns in the data, it underfits – performing poorly on both training and test data. The goal is to find the sweet spot between underfitting and overfitting.
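One way to see the whole spectrum in a toy NumPy example (an assumed setup, not from the source): sweep the polynomial degree and score each fit on held-out data. Very low degrees underfit, very high degrees overfit, and the held-out error identifies the middle ground.

```python
import numpy as np

rng = np.random.default_rng(1)

# 40 noisy samples of sin(2*pi*x); first 25 for fitting, rest held out.
x = rng.uniform(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, size=40)
x_fit, y_fit, x_val, y_val = x[:25], y[:25], x[25:], y[25:]

def val_mse(degree):
    """Fit a polynomial of the given degree, score it on held-out data."""
    coeffs = np.polyfit(x_fit, y_fit, degree)
    return float(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))

scores = {d: val_mse(d) for d in range(10)}
best = min(scores, key=scores.get)
print(f"best degree by held-out error: {best}")
```

Degree 0 (a constant) underfits badly, and the held-out error picks an intermediate degree: neither extreme wins on data the model has not seen.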
Why this matters
Overfitting is the most common reason AI models fail in production after performing well during development. Understanding it helps you recognise warning signs early, ask the right questions about model evaluation, and ensure that the impressive demo results will actually translate to real-world performance.