Core AI

Supervised Learning

Last reviewed: April 2026

A machine learning approach where the model learns from labelled examples — input data paired with correct answers. The most common type of machine learning in business applications.

In supervised learning, every training example includes both the input and the correct output. The model studies thousands or millions of these input-output pairs, identifies patterns, and learns to predict the correct output for new inputs it has never seen before.

The analogy

Imagine teaching someone to identify dog breeds using flashcards. You show them a photo (input) and tell them the breed (label): "This is a Labrador. This is a Beagle. This is a Golden Retriever." After hundreds of examples, they learn to identify breeds on their own — even for photos they have never seen.

Supervised learning works the same way, except with data instead of flashcards.

How supervised learning works

Collect labelled data: Gather examples where you know the correct answer. For spam detection, this means thousands of emails labelled as "spam" or "not spam."
Train the model: The model processes the labelled data, looking for patterns that distinguish one category from another. What word patterns appear more often in spam?
Validate: Test the model on labelled data it has not seen during training to check its accuracy.
Deploy: Use the trained model to make predictions on new, unlabelled data.
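
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production spam filter: the emails are made up, the scorer is a simplified naive Bayes-style word counter that ignores class priors, and the names train, predict, and labelled_emails are hypothetical.

```python
import math
from collections import Counter

# Step 1: collect labelled data (tiny made-up examples, for illustration only).
labelled_emails = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("cheap loans click here", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("please review the project report", "not spam"),
    ("lunch on monday with the team", "not spam"),
    ("project report attached for review", "not spam"),
]

# Hold back two labelled emails so validation uses data unseen in training.
train_set = labelled_emails[:3] + labelled_emails[4:7]
validation_set = [labelled_emails[3], labelled_emails[7]]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in examples:
        word_counts[label].update(text.split())
    return word_counts


def predict(word_counts, text):
    """Return the label whose smoothed word frequencies best match the text."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["not spam"])
    scores = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())
        # Add-one smoothing keeps unseen words from dominating the score.
        scores[label] = sum(
            math.log((counts[word] + 1) / (total + len(vocabulary)))
            for word in text.split()
        )
    return max(scores, key=scores.get)


# Step 2: train the model on the training portion only.
model = train(train_set)

# Step 3: validate on the held-back labelled emails.
correct = sum(predict(model, text) == label for text, label in validation_set)
accuracy = correct / len(validation_set)
print(f"Validation accuracy: {accuracy:.0%}")

# Step 4: deploy, that is, predict a label for a new, unlabelled email.
print(predict(model, "free prize inside"))
```

With only eight examples the accuracy figure means very little; the point is the shape of the workflow, in which validation data is kept separate from training data.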

The two main supervised learning tasks

Classification: Predicting a category. "Is this email spam or not spam?" "Is this customer likely to churn or stay?" "Is this transaction fraudulent or legitimate?" The output is a discrete category.
Regression: Predicting a number. "What will this house sell for?" "How many units will we sell next quarter?" "What is the expected lifetime value of this customer?" The output is a continuous value.
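
The difference between the two tasks shows up in the type of value the model returns. The sketch below uses made-up numbers and two deliberately simple methods chosen for illustration (a least-squares line for regression, a nearest-neighbour lookup for classification); the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nearest_neighbour_label(examples, query):
    """Classification: return the label of the closest labelled example."""
    _, closest_label = min(examples, key=lambda ex: abs(ex[0] - query))
    return closest_label


# Regression: floor area in square metres -> sale price (made-up numbers).
areas = [50, 80, 100, 120]
prices = [150_000, 240_000, 300_000, 360_000]
slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90 + intercept  # a continuous value

# Classification: months since last purchase -> churn label (made-up numbers).
customers = [(1, "stay"), (2, "stay"), (9, "churn"), (12, "churn")]
predicted_label = nearest_neighbour_label(customers, 10)  # a discrete category

print(predicted_price, predicted_label)
```

Both functions learn from input-output pairs; only the kind of output differs.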

Supervised learning in business

Supervised learning powers many everyday business applications:

Email spam filters: Trained on millions of emails labelled as spam or not spam
Credit scoring: Trained on historical loan data with outcomes (defaulted or repaid)
Image recognition: Trained on millions of images with labels (product categories, defects, faces)
Sentiment analysis: Trained on reviews labelled as positive, negative, or neutral
Sales forecasting: Trained on historical sales data with actual outcomes
Medical diagnosis: Trained on patient data with confirmed diagnoses

The labelling challenge

The biggest bottleneck in supervised learning is creating labelled data. Someone has to go through thousands of examples and mark the correct answer. This is:

Time-consuming: Labelling 10,000 images can take weeks
Expensive: Professional labellers or domain experts cost money
Error-prone: Inconsistent labelling degrades model quality

This is one reason pre-trained models (like LLMs) are so valuable. They were trained on vast amounts of text that was effectively self-labelled: the task is to predict the next word, so the text itself supplies the correct answer and no manual labelling is needed for that stage.
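
To see why next-word prediction needs no human labeller, consider how raw text can be cut into input-output pairs automatically. The sketch below is a simplification: it splits on whitespace, whereas real LLMs work on tokens and far longer contexts, and the function name and context_size parameter are hypothetical.

```python
def next_word_pairs(text, context_size=3):
    """Turn raw text into (context words, next word) training pairs."""
    words = text.split()
    pairs = []
    for i in range(1, len(words)):
        # The preceding words are the input; the word at position i is the label.
        context = words[max(0, i - context_size):i]
        pairs.append((context, words[i]))
    return pairs


pairs = next_word_pairs("the cat sat on the mat")
for context, target in pairs:
    print(" ".join(context), "->", target)
```

Every sentence yields several labelled examples for free, which is what makes training on very large text collections practical.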

Supervised learning vs LLMs

LLMs have changed how many supervised learning tasks are approached. Instead of training a custom model on labelled data, you can often prompt an LLM to perform the same task:

Instead of training a sentiment analysis model, you can ask Claude: "Is this review positive or negative?"
Instead of building a classification model, you can use few-shot prompting: include a handful of labelled examples in the prompt and ask the LLM to label a new item in the same way

For many business tasks, LLM-based approaches are faster to implement and need little or no labelled data: a direct question uses none, and few-shot prompting uses only a handful of examples.
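
A few-shot prompt is just text assembled from a small number of labelled examples plus the new item. The sketch below only builds the prompt string; sending it to a model depends on the provider's API and is left out. The function name, the example reviews, and the prompt wording are all hypothetical.

```python
def build_few_shot_prompt(examples, new_review):
    """Assemble a sentiment prompt from labelled examples and one new review."""
    lines = ["Classify each review as positive or negative.", ""]
    for review, label in examples:
        lines.append(f"Review: {review}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final item is left unlabelled for the model to complete.
    lines.append(f"Review: {new_review}")
    lines.append("Sentiment:")
    return "\n".join(lines)


examples = [
    ("Arrived quickly and works perfectly.", "positive"),
    ("Broke after two days and support never replied.", "negative"),
]
prompt = build_few_shot_prompt(examples, "Great value, would buy again.")
print(prompt)
```

Two examples stand in here for the thousands a custom classifier would typically need, which is the trade-off this section describes.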

Why This Matters

Supervised learning is the foundation of most practical machine learning in business. Understanding it helps you recognise when your organisation has the labelled data needed for a custom ML project versus when a pre-trained LLM can handle the task without custom training. This distinction can save significant time and money: many projects that would once have required months of data labelling and model training can now be handled with well-crafted prompts.

Related Terms

Machine Learning (ML)

A type of AI where systems learn patterns from data instead of following explicitly programmed rules. The system improves its performance through experience.

Classification

An AI task that assigns input to predefined categories. Spam detection, sentiment analysis, and image recognition are all classification tasks.

Regression

An AI task that predicts a numerical value based on input data. Sales forecasting, price estimation, and demand prediction are all regression tasks.

Unsupervised Learning

A machine learning approach where the model finds patterns in data without being given correct answers. Used for discovering hidden structure, grouping similar items, and detecting anomalies.

Training Data

The dataset used to teach an AI model. The quality, size, and composition of training data directly determine what the AI can and cannot do well.

Deep Learning

A subset of machine learning that uses neural networks with many layers to learn complex patterns. The 'deep' refers to the number of layers, not the depth of understanding.

Learn More

Continue learning in Foundations

This topic is covered in our lesson: AI vs Machine Learning vs Deep Learning
