Dropout
A regularisation technique where randomly selected neurons are temporarily deactivated during training, forcing the network to develop more robust and generalisable features.
Dropout is a regularisation technique used during neural network training. At each training step, a random subset of neurons is temporarily "dropped out": deactivated and ignored. This forces the remaining neurons to compensate, preventing the network from becoming overly reliant on any single neuron or group of neurons.
How dropout works
During each training step:
- Each neuron has a probability (typically 50%) of being temporarily removed.
- The remaining neurons process the input and produce the output.
- Weights are updated based on the reduced network.
- In the next step, a different random set of neurons is dropped.
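The per-step masking above can be sketched in plain Python. This is a minimal illustration, not a framework implementation; `dropout_mask` is a hypothetical helper name chosen for this example:

```python
import random

def dropout_mask(activations, rate, rng=random):
    """Zero each activation independently with probability `rate`,
    as happens to a layer's outputs at one training step."""
    return [0.0 if rng.random() < rate else a for a in activations]

# Example: drop roughly half of a layer's outputs this step.
random.seed(0)
layer_output = [0.8, 1.2, 0.5, 2.0, 0.3, 1.1]
masked = dropout_mask(layer_output, rate=0.5)
# Re-running on the next step produces a different random mask.
```

Each surviving activation passes through unchanged, while dropped ones contribute nothing to the layers downstream for that step.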
At inference time (when the model is actually being used), all neurons are active. To compensate, each neuron's output is multiplied by the keep probability (1 minus the dropout rate), so its expected contribution matches what downstream neurons saw during training.
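The inference-time scaling can be sketched as follows. This is a toy pure-Python illustration (`scale_for_inference` is an invented name); in practice, frameworks apply the scaling automatically:

```python
def scale_for_inference(activations, rate):
    """All neurons are active at inference, so each output is multiplied
    by the keep probability (1 - rate) to match its expected
    training-time contribution."""
    keep = 1.0 - rate
    return [a * keep for a in activations]

# A neuron that outputs 2.0 under 50% dropout contributed 1.0 on average
# during training, so at inference its output is scaled from 2.0 to 1.0.
scaled = scale_for_inference([2.0, 4.0], rate=0.5)  # -> [1.0, 2.0]
```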
Why dropout prevents overfitting
Overfitting occurs when a model memorises the training data rather than learning general patterns. Dropout combats this in several ways:
- Redundancy: Because any neuron might be dropped, the network cannot rely on specific neurons to memorise specific training examples. It must distribute knowledge across many neurons.
- Ensemble effect: Training with dropout is mathematically similar to training many slightly different networks and averaging their predictions, an approach known to improve generalisation.
- Feature independence: Neurons cannot co-adapt (learn to work only in combination with specific other neurons) because their partners change randomly at each step.
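The ensemble effect can be checked numerically: averaging the outputs of many randomly masked "sub-networks" converges to the full network's output scaled by the keep probability. A small pure-Python sketch, using an assumed toy weighted sum in place of a real network:

```python
import random

def masked_sum(weights, inputs, rate, rng):
    """One 'sub-network': a weighted sum where each input is
    independently dropped with probability `rate`."""
    return sum(0.0 if rng.random() < rate else w * x
               for w, x in zip(weights, inputs))

rng = random.Random(0)
weights, inputs, rate = [0.5, -1.0, 2.0], [1.0, 2.0, 3.0], 0.5

# The full network's output, scaled by the keep probability.
full_scaled = sum(w * x for w, x in zip(weights, inputs)) * (1 - rate)

# Average over many random sub-networks.
n = 20000
avg = sum(masked_sum(weights, inputs, rate, rng) for _ in range(n)) / n
# avg lands close to full_scaled: the ensemble of sub-networks
# behaves like one scaled network.
```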
The intuition
Think of a team where any member might be absent on any given day. The team cannot rely on one expert for a critical task; everyone must have some capability to cover. This makes the team more resilient. Dropout achieves the same effect in neural networks.
Practical considerations
- Dropout rate: The typical rate is 0.5 (50% of neurons dropped) for hidden layers and 0.2 (20%) for input layers. These are starting points; the optimal rate depends on the network and task.
- Training time: Dropout typically requires more training steps to converge, because each step updates only a random sub-network. Note that in most implementations the dropped neurons are still computed and simply masked, so individual steps are not usually faster.
- Modern alternatives: While dropout remains widely used, newer techniques like batch normalisation, weight decay, and data augmentation can serve similar purposes. Many modern architectures use a combination.
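One further practical point: most modern frameworks implement "inverted" dropout, where surviving activations are scaled up by 1/(1 − rate) during training so that inference needs no scaling at all. A minimal pure-Python sketch (`inverted_dropout` is an illustrative name, not a real library API):

```python
import random

def inverted_dropout(activations, rate, training, rng=random):
    """Inverted dropout: survivors are scaled up by 1/(1 - rate) at
    training time, so the inference path simply returns the
    activations unchanged."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [0.0 if rng.random() < rate else a / keep
            for a in activations]
```

Scaling at training time rather than inference time keeps the deployed model's forward pass simple and identical whether or not dropout was used.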
Historical significance
Introduced by Geoffrey Hinton and colleagues in 2012, dropout was one of the key innovations that made deep learning practical. Before dropout, deep networks were notoriously prone to overfitting. Dropout provided a simple, effective solution that required almost no additional complexity.
Why this matters
Dropout illustrates a fundamental principle in AI: models perform better when they are forced to be robust rather than allowed to take shortcuts. Understanding regularisation techniques like dropout helps you evaluate whether an AI model has been properly trained and is likely to perform reliably on new, unseen data.