
Precision and Recall

Last reviewed: April 2026

Two complementary metrics for classification models: precision measures how many predicted positives were correct; recall measures how many actual positives were found.

Precision and recall are two complementary metrics that together give a much clearer picture of classification model performance than accuracy alone. They answer different questions about the same model.

Precision: "When the model says yes, how often is it right?"

Precision = True Positives / (True Positives + False Positives)

A spam filter with high precision rarely marks legitimate emails as spam. When it flags something, you can trust that it is probably spam. But it might miss some actual spam.
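As a minimal sketch, the precision formula above can be computed directly from counts of true and false positives. The spam-filter numbers below are made up for illustration:

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that were actually positive."""
    if tp + fp == 0:
        return 0.0  # no positive predictions; conventions vary here
    return tp / (tp + fp)

# A filter that flagged 50 emails, 45 of which really were spam:
print(precision(tp=45, fp=5))  # 0.9
```

Note that false negatives do not appear anywhere in this calculation: precision says nothing about the spam the filter missed.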

Recall: "Of all the actual positives, how many did the model find?"

Recall = True Positives / (True Positives + False Negatives)

A spam filter with high recall catches almost all spam. Very little gets through. But it might also flag some legitimate emails.
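Recall follows the same pattern, but swaps false positives for false negatives. Again a sketch with illustrative counts:

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that the model found."""
    if tp + fn == 0:
        return 0.0  # no actual positives in the data
    return tp / (tp + fn)

# 45 spam emails caught, 15 spam emails missed:
print(recall(tp=45, fn=15))  # 0.75
```

Here it is false positives that vanish from the calculation: recall says nothing about how many legitimate emails were wrongly flagged.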

The precision-recall trade-off

In most systems, improving precision reduces recall and vice versa. You can catch more spam (higher recall) but you will also flag more legitimate emails (lower precision). The right balance depends on the cost of each type of error.

When to prioritise precision

  • Content recommendation: better to recommend fewer items than to recommend irrelevant ones
  • Email filtering: a false positive (legitimate email in spam) is very costly
  • Criminal justice: convicting an innocent person is worse than letting a guilty one go free

When to prioritise recall

  • Medical screening: missing a cancer diagnosis is worse than a false alarm that leads to further testing
  • Fraud detection: missing actual fraud is worse than investigating some legitimate transactions
  • Safety systems: missing a genuine threat is worse than occasional false alarms

F1 Score: balancing both

The F1 score is the harmonic mean of precision and recall, providing a single number that balances both. It is most useful when you care equally about precision and recall.

F1 = 2 * (Precision * Recall) / (Precision + Recall)
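A direct translation of the F1 formula, using the precision and recall values from the spam-filter sketches above (0.9 and 0.75, both illustrative):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # undefined when both are zero; return 0 by convention
    return 2 * precision * recall / (precision + recall)

print(f1_score(0.9, 0.75))  # roughly 0.818
```

Because the harmonic mean is dominated by the smaller value, a model with precision 1.0 but recall 0.1 scores only about 0.18, not the 0.55 an arithmetic mean would suggest.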

Precision-recall curves

By sweeping the model's classification threshold from high to low, you can trace a curve showing how precision and recall trade off against each other. This curve is more informative than any single number and helps you choose the threshold that best fits your use case.
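The sweep can be sketched in a few lines: treat each distinct model score as a candidate threshold, classify everything at or above it as positive, and record precision and recall at each step. The scores and labels below are toy values, not real model output:

```python
def pr_curve(scores, labels):
    """Precision and recall at each distinct score threshold.

    scores: model confidence for the positive class
    labels: ground truth, 1 = positive, 0 = negative
    """
    points = []
    for t in sorted(set(scores), reverse=True):
        preds = [s >= t for s in scores]
        tp = sum(p and y for p, y in zip(preds, labels))
        fp = sum(p and not y for p, y in zip(preds, labels))
        fn = sum((not p) and y for p, y in zip(preds, labels))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        points.append((t, prec, rec))
    return points

# Toy confidence scores and ground-truth labels:
for t, p, r in pr_curve([0.9, 0.8, 0.6, 0.3], [1, 1, 0, 1]):
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

Even on this toy data the trade-off is visible: lowering the threshold lifts recall towards 1.0 while precision dips as false positives creep in. Libraries such as scikit-learn provide an optimised equivalent (precision_recall_curve) for real workloads.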


Why This Matters

In business applications, the cost of false positives and false negatives is almost never equal. A fraud detection system that misses ninety per cent of fraud is useless, even though its overall accuracy can still look excellent when fraud is rare. Understanding precision and recall helps you define the right performance requirements for your specific use case rather than defaulting to accuracy alone.
