
Confusion Matrix

Last reviewed: April 2026

A table that shows the breakdown of a classification model's correct and incorrect predictions, revealing exactly where and how the model makes mistakes.

A confusion matrix is a table that breaks down a classification model's predictions by outcome (for a binary classifier, into four cells), showing not just whether the model is right or wrong, but exactly how it is wrong. It is one of the most useful tools for understanding model performance.

The four cells (binary classification)

For a model that predicts "yes" or "no" (such as spam detection):

  • True Positives (TP) — model said "spam" and it was spam. Correct.
  • True Negatives (TN) — model said "not spam" and it was not spam. Correct.
  • False Positives (FP) — model said "spam" but it was not spam. A false alarm.
  • False Negatives (FN) — model said "not spam" but it was spam. A missed detection.
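As a minimal sketch, the four counts can be tallied directly from paired actual and predicted labels; the labels and data below are illustrative, not from the article:

```python
# Tally the four confusion-matrix cells for a binary spam classifier.
# "spam" is the positive class; the data is a made-up example.
actual    = ["spam", "spam", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham",  "ham", "spam", "spam", "ham"]

tp = sum(a == "spam" and p == "spam" for a, p in zip(actual, predicted))
tn = sum(a == "ham"  and p == "ham"  for a, p in zip(actual, predicted))
fp = sum(a == "ham"  and p == "spam" for a, p in zip(actual, predicted))
fn = sum(a == "spam" and p == "ham"  for a, p in zip(actual, predicted))

print(tp, tn, fp, fn)  # prints: 2 2 1 1
```

The four counts always sum to the total number of predictions, which is a quick sanity check on any confusion matrix you build.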

Why the breakdown matters

Accuracy alone tells you the overall percentage correct, but the confusion matrix tells you the story behind that number. Two models with identical accuracy can have very different error patterns:

  • Model A catches ninety-five per cent of spam but sends ten per cent of legitimate emails to the spam folder
  • Model B catches seventy per cent of spam but almost never misclassifies legitimate email

Which is better depends on your use case. The confusion matrix makes this trade-off visible.
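To make the trade-off concrete, here is a hypothetical worked example (the counts are invented to mirror the two models above): both models score the same accuracy on 1,000 emails, yet their confusion matrices tell very different stories.

```python
# Two hypothetical models on the same 1000 emails: 200 spam, 800 legitimate.
# Both reach 94% accuracy, but their error patterns differ sharply.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

# Model A: catches 95% of spam (190/200) but flags 50 legitimate emails.
a = dict(tp=190, fn=10, fp=50, tn=750)
# Model B: catches only 71% of spam (142/200) but flags just 2.
b = dict(tp=142, fn=58, fp=2, tn=798)

print(accuracy(**a))  # prints: 0.94
print(accuracy(**b))  # prints: 0.94
```

Accuracy alone cannot distinguish the two; only the cell counts reveal which errors each model makes.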

Metrics derived from the confusion matrix

  • Precision — of all the items the model labelled positive, what fraction actually were? (TP / (TP + FP))
  • Recall — of all the actual positives, what fraction did the model catch? (TP / (TP + FN))
  • F1 Score — the harmonic mean of precision and recall, balancing both
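These three formulas take only a few lines to compute; the counts below are hypothetical, chosen only to show the arithmetic:

```python
# Compute precision, recall, and F1 from illustrative confusion-matrix counts.
tp, fp, fn = 190, 50, 10  # hypothetical counts

precision = tp / (tp + fp)  # 190 / 240
recall    = tp / (tp + fn)  # 190 / 200
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Note that the true-negative count never appears in any of the three formulas, which is why these metrics are preferred over accuracy when negatives vastly outnumber positives.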

Multi-class confusion matrices

For models with more than two classes, the matrix expands. A three-class model produces a three-by-three grid. This reveals which specific classes the model confuses with each other β€” invaluable for understanding failure patterns.
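A multi-class matrix can be built with one counter per (actual, predicted) pair; the class names and data here are invented for illustration:

```python
# Build a 3x3 confusion matrix for a hypothetical three-class model.
# Rows are actual classes, columns are predicted classes.
classes = ["cat", "dog", "bird"]  # illustrative class names
actual    = ["cat", "cat", "dog", "dog", "bird", "bird", "cat", "dog"]
predicted = ["cat", "dog", "dog", "dog", "bird", "cat",  "cat", "bird"]

index = {c: i for i, c in enumerate(classes)}
matrix = [[0] * len(classes) for _ in classes]
for a, p in zip(actual, predicted):
    matrix[index[a]][index[p]] += 1

for c, row in zip(classes, matrix):
    print(c, row)
```

Correct predictions land on the diagonal; any large off-diagonal cell pinpoints a specific pair of classes the model confuses.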


Why This Matters

When evaluating AI for high-stakes decisions — medical diagnosis, fraud detection, content moderation — the type of error matters as much as the error rate. A confusion matrix shows you the full picture, helping you decide whether a model's error pattern is acceptable for your specific use case.
