
Loss Function

Last reviewed: April 2026

A mathematical function that measures how far a model's predictions are from the correct answers, providing the error signal that drives learning during training.

A loss function (also called a cost function or objective function) is the mathematical formula that measures how wrong a model's predictions are. It produces a single number, the loss, which the training algorithm works to minimise.

Why loss functions matter

The loss function defines what "good" means for a model. It is the model's report card during training: a lower loss means better predictions. The entire training process (gradient descent, backpropagation, parameter updates) exists to reduce this number.

Common loss functions

  • Mean Squared Error (MSE): for regression tasks. Measures the average squared difference between predicted and actual values, penalising large errors heavily.
  • Cross-Entropy Loss: for classification tasks. Measures how different the predicted probability distribution is from the actual labels. Used in nearly all classification models.
  • Binary Cross-Entropy: a special case of cross-entropy for two-class (yes/no) classification.
  • Huber Loss: behaves like MSE for small errors and like absolute error for large ones, making it less sensitive to outliers than pure MSE.
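The first two losses above can be computed in a few lines. A minimal NumPy sketch (the function names are illustrative, not taken from any particular library):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: average of squared differences."""
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy for labels in {0, 1} and predicted probabilities.

    Probabilities are clipped away from 0 and 1 so the log never blows up.
    """
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

print(mse([3.0, 5.0], [2.5, 5.5]))                # 0.25
print(binary_cross_entropy([1, 0], [0.9, 0.2]))   # ~0.164
```

Note how squaring in MSE makes an error of 2 cost four times as much as an error of 1, which is exactly the "penalises large errors heavily" behaviour described above.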

Choosing the right loss function

The loss function must match your task:

  • Predicting a continuous value (price, temperature)? Use MSE or MAE.
  • Classifying into categories (spam/not spam, sentiment)? Use cross-entropy.
  • Ranking items (search results, recommendations)? Use a ranking loss, such as a pairwise or listwise loss.
  • Generating text? Language models are trained with cross-entropy loss on the next-token prediction task.
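The last bullet deserves unpacking: "cross-entropy over next-token prediction" just means scoring, at every position in a sequence, how much probability the model assigned to the token that actually came next. A toy NumPy sketch under assumed shapes (the function name is our own):

```python
import numpy as np

def next_token_cross_entropy(logits, target_ids):
    """Average cross-entropy over a sequence of next-token predictions.

    logits: (seq_len, vocab_size) raw scores from a language model.
    target_ids: (seq_len,) index of the correct next token at each position.
    """
    logits = np.asarray(logits, dtype=float)
    # Numerically stable log-softmax: subtract each row's max first.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability the model gave to each correct token.
    picked = log_probs[np.arange(len(target_ids)), target_ids]
    return -picked.mean()

# Toy example: vocabulary of 4 tokens, sequence of 3 positions.
logits = [[2.0, 0.5, 0.1, 0.1],
          [0.2, 3.0, 0.1, 0.4],
          [0.3, 0.2, 2.5, 0.1]]
targets = [0, 1, 2]
print(next_token_cross_entropy(logits, targets))  # small: model favours the right tokens
```

A model that spreads probability uniformly over a 4-token vocabulary would score ln(4) ≈ 1.386 per position; confident correct predictions drive the loss towards zero.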

The loss function shapes behaviour

The model optimises for whatever you measure. If your loss function penalises false negatives more than false positives, the model will err on the side of caution. This is why choosing the right loss function is a design decision with real-world consequences.
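One common way to encode that asymmetry is to weight the two error types differently inside the loss. A sketch of a weighted binary cross-entropy, where `fn_weight` is a hypothetical parameter of our own choosing:

```python
import numpy as np

def weighted_bce(y_true, p_pred, fn_weight=5.0, eps=1e-12):
    """Binary cross-entropy that penalises false negatives more heavily.

    fn_weight (an assumed, illustrative parameter) scales the loss on
    positive examples, so missing a true positive costs fn_weight times
    as much as raising a false alarm of the same confidence.
    """
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    return -np.mean(fn_weight * y * np.log(p) + (1 - y) * np.log(1 - p))

# The same confident mistake, scored two ways:
miss_positive = weighted_bce([1.0], [0.1])  # false negative: heavy penalty
false_alarm = weighted_bce([0.0], [0.9])    # false positive: normal penalty
print(miss_positive > false_alarm)          # True
```

A model trained against this loss learns to flag borderline cases rather than miss them, which is often what you want in, say, medical screening or fraud detection.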

Monitoring loss during training

  • Training loss decreasing: the model is learning.
  • Validation loss decreasing: the learning generalises to new data.
  • Training loss decreasing but validation loss increasing: the model is overfitting.
  • Loss not decreasing: the learning rate may be too high or too low, or the model architecture may be insufficient.
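The overfitting pattern above is usually caught automatically with early stopping. A minimal sketch of the idea (names are illustrative, not from any specific framework): stop when validation loss has not improved for `patience` consecutive epochs.

```python
def should_stop(val_losses, patience=3):
    """Return True if the last `patience` epochs all failed to beat the best
    validation loss seen before them."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return all(v >= best_before for v in val_losses[-patience:])

healthy = [1.0, 0.8, 0.6, 0.5, 0.45]        # still improving
overfit = [1.0, 0.8, 0.6, 0.65, 0.7, 0.75]  # validation loss turning upward
print(should_stop(healthy))  # False
print(should_stop(overfit))  # True
```

In practice you would also checkpoint the model at its best validation loss, so stopping late costs nothing.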

Why This Matters

The loss function quietly shapes every AI model's behaviour. Understanding it helps you recognise that AI performance is not accidental β€” it is the result of specific design choices about what to optimise for. When a model behaves unexpectedly, the loss function is often the first place to investigate.
