
Backpropagation

Last reviewed: April 2026

The training algorithm that teaches neural networks by calculating how much each weight contributed to errors and adjusting them to reduce mistakes.

Backpropagation β€” short for backward propagation of errors β€” is the algorithm that makes neural network training possible. It is how a network learns from its mistakes.

How backpropagation works

Training a neural network involves three phases repeated millions of times:

  1. Forward pass β€” input data flows through the network, layer by layer, producing a prediction
  2. Loss calculation β€” the prediction is compared to the correct answer using a loss function, producing an error score
  3. Backward pass (backpropagation) β€” the error is sent backwards through the network, and calculus (specifically, the chain rule) is used to calculate how much each weight contributed to the error

Once the contributions are known, each weight is adjusted slightly to reduce the error. This adjustment is controlled by the learning rate β€” too large and the model overshoots; too small and training takes forever.
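The three phases and the learning-rate update can be sketched in a few lines of plain Python. This is a deliberately tiny example, not a real network: a single trainable weight in the model y = w * x, fit with a squared-error loss, and all the data and learning-rate values are illustrative.

```python
# A minimal sketch of the training loop on a one-weight "network"
# y = w * x with squared-error loss. Values are illustrative.
w = 0.0                     # the one trainable weight, started at zero
learning_rate = 0.05

x, y_true = 3.0, 6.0        # one training example; the ideal weight is 2
for step in range(100):
    y_pred = w * x                        # 1. forward pass: prediction
    loss = (y_pred - y_true) ** 2         # 2. loss calculation: error score
    grad_w = 2 * (y_pred - y_true) * x    # 3. backward pass: dloss/dw
    w -= learning_rate * grad_w           # adjust weight, scaled by learning rate

print(round(w, 2))  # converges to 2.0
```

Try raising the learning rate to see the overshoot described above: at 0.05 the weight settles smoothly on 2.0, but at a much larger value each update jumps past the target and the loss grows instead of shrinking.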

Why backpropagation matters

Before backpropagation became practical in the 1980s, there was no efficient way to train multi-layer neural networks. Single-layer networks could only solve simple problems. Backpropagation unlocked deep networks with many layers, which can learn complex patterns in language, images, and more.

The chain rule connection

Backpropagation relies on a fundamental concept from calculus called the chain rule. Each layer's contribution to the error depends on the layers after it. The chain rule lets you decompose this nested dependency into manageable calculations, working backwards from the output to the input.
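As a concrete illustration, here is the chain rule applied by hand to a made-up two-layer network with sigmoid activations. The weights and input values are arbitrary; the point is the structure of the backward pass, where each step multiplies in one layer's local derivative, working from the loss back toward the input.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values: input x, target y, and two weights
x, y, w1, w2 = 1.0, 0.5, 0.3, -0.8

# Forward pass, keeping intermediates for the backward pass
h = sigmoid(w1 * x)          # hidden activation
out = sigmoid(w2 * h)        # network output
loss = (out - y) ** 2

# Backward pass: the chain rule, output to input
d_out = 2 * (out - y)               # dloss/dout
d_z2 = d_out * out * (1 - out)      # through the output sigmoid (s' = s(1-s))
d_w2 = d_z2 * h                     # gradient for w2
d_h = d_z2 * w2                     # error signal sent back to the hidden layer
d_z1 = d_h * h * (1 - h)            # through the hidden sigmoid
d_w1 = d_z1 * x                     # gradient for w1
```

Note that `d_w1` reuses every factor already computed for `d_w2`; this reuse of intermediate results, layer by layer, is what makes backpropagation efficient compared with differentiating each weight from scratch.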

Challenges with backpropagation

  • Vanishing gradients — in very deep networks, the error signal can become vanishingly small by the time it reaches early layers, causing them to stop learning. Techniques like ReLU activation and residual connections address this.
  • Computational cost — for large models with billions of parameters, backpropagation requires enormous computing resources.
  • Local minima — the algorithm can get stuck in suboptimal solutions, though in practice this is less problematic than once feared.
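The vanishing-gradient problem can be made concrete with a back-of-envelope calculation. The sigmoid's derivative never exceeds 0.25, so each sigmoid layer multiplies the backward error signal by at most that factor; the numbers below are a best-case sketch, not a real network.

```python
# Best case for a 20-layer sigmoid network: the backward signal shrinks
# by at least a factor of 0.25 (the sigmoid derivative's maximum) per layer.
signal = 1.0
for layer in range(20):
    signal *= 0.25
print(signal)  # about 9.1e-13: early layers receive almost no gradient

# ReLU's derivative is exactly 1 for active units, so the signal
# can pass through many layers without shrinking.
relu_signal = 1.0
for layer in range(20):
    relu_signal *= 1.0
print(relu_signal)  # 1.0
```

This is why the choice of activation function, mentioned in the first bullet above, matters so much for training depth.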

Why This Matters

Backpropagation is the engine behind every neural network you interact with. Understanding it demystifies how AI learns and helps explain why training large models is so expensive β€” every parameter must be adjusted through countless backward passes over massive datasets.
