Supervised Learning
A machine learning approach where the model learns from labelled examples — input data paired with correct answers. The most common type of machine learning in business applications.
Supervised learning is a type of machine learning where the model learns from labelled examples — training data that includes both the input and the correct output. The model studies thousands or millions of these input-output pairs, identifies patterns, and learns to predict the correct output for new inputs it has never seen before.
The analogy
Imagine teaching someone to identify dog breeds using flashcards. You show them a photo (input) and tell them the breed (label): "This is a Labrador. This is a Beagle. This is a Golden Retriever." After hundreds of examples, they learn to identify breeds on their own — even for photos they have never seen.
Supervised learning works the same way, except with data instead of flashcards.
How supervised learning works
- Collect labelled data: Gather examples where you know the correct answer. For spam detection, this means thousands of emails labelled as "spam" or "not spam."
- Train the model: The model processes the labelled data, looking for patterns that distinguish one category from another. For example, which words appear more often in spam than in legitimate email?
- Validate: Test the model on labelled data it has not seen during training to check its accuracy.
- Deploy: Use the trained model to make predictions on new, unlabelled data.
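The four steps above can be sketched in a few lines of Python. This is a toy illustration only: the emails are invented, and the word-counting "model" is a deliberately simple stand-in for the algorithms a real spam filter would use.

```python
from collections import Counter

# Step 1: collect labelled data (tiny, made-up examples for illustration).
labelled_emails = [
    ("win a free prize now", "spam"),
    ("claim your free money", "spam"),
    ("free prize waiting claim now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("please review the quarterly report", "not spam"),
    ("lunch on monday", "not spam"),
]

# Labelled examples kept aside for validation; the model never trains on these.
held_out = [
    ("free money prize", "spam"),
    ("monday report review", "not spam"),
]


def train(examples):
    """Step 2: count how often each word appears under each label."""
    counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts


def predict(model, text):
    """Score a message by how often its words were seen under each label."""
    words = text.split()
    spam_score = sum(model["spam"][word] for word in words)
    ham_score = sum(model["not spam"][word] for word in words)
    return "spam" if spam_score > ham_score else "not spam"


model = train(labelled_emails)

# Step 3: validate on the held-out labelled examples.
correct = sum(predict(model, text) == label for text, label in held_out)
accuracy = correct / len(held_out)

# Step 4: deploy, i.e. predict on a new message that has no label.
prediction = predict(model, "claim your free prize")
```

The key design point is the split between `labelled_emails` and `held_out`: accuracy measured on examples the model has already seen would say little about how it handles new ones.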
The two main supervised learning tasks
- Classification: Predicting a category. "Is this email spam or not spam?" "Is this customer likely to churn or stay?" "Is this transaction fraudulent or legitimate?" The output is a discrete category.
- Regression: Predicting a number. "What will this house sell for?" "How many units will we sell next quarter?" "What is the expected lifetime value of this customer?" The output is a continuous value.
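The difference between the two tasks shows up in what the model returns. The sketch below uses invented numbers and two very simple techniques chosen for readability (a nearest-example lookup for classification, a least-squares straight line for regression); real projects would normally use an ML library and far more data.

```python
# Classification: the output is a discrete category.
# Hypothetical data: (logins per month, what the customer did).
churn_examples = [
    (1, "churn"), (2, "churn"), (3, "churn"),
    (10, "stay"), (12, "stay"), (15, "stay"),
]


def classify(examples, value):
    """Return the label of the labelled example closest to `value`."""
    closest = min(examples, key=lambda example: abs(example[0] - value))
    return closest[1]


# Regression: the output is a continuous number.
# Hypothetical data: house size in square metres and sale price in thousands.
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]


def fit_line(xs, ys):
    """Fit price = slope * size + intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)

churn_prediction = classify(churn_examples, 4)   # a category
price_prediction = slope * 90 + intercept        # a number
```

Both functions learn from labelled examples, so both are supervised learning; only the type of answer differs.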
Supervised learning in business
Supervised learning powers many everyday business applications:
- Email spam filters: Trained on millions of emails labelled as spam or not spam
- Credit scoring: Trained on historical loan data with outcomes (defaulted or repaid)
- Image recognition: Trained on millions of images with labels (product categories, defects, faces)
- Sentiment analysis: Trained on reviews labelled as positive, negative, or neutral
- Sales forecasting: Trained on historical sales data with actual outcomes
- Medical diagnosis: Trained on patient data with confirmed diagnoses
The labelling challenge
The biggest bottleneck in supervised learning is creating labelled data. Someone has to go through thousands of examples and mark the correct answer. This is:
- Time-consuming: Labelling 10,000 images can take weeks
- Expensive: Professional labellers or domain experts cost money
- Error-prone: Inconsistent labelling degrades model quality
This is why pre-trained models (like LLMs) are so valuable: their initial training uses vast amounts of text that is effectively self-labelled, because the "correct answer" at each point is simply the next word. That avoids most of the manual labelling bottleneck.
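To see why next-word prediction needs no human labeller, the sketch below turns a plain sentence into input-label pairs automatically. It is a simplification: the function name and the three-word context are illustrative choices, and real LLMs work on sub-word tokens and much longer contexts.

```python
def next_word_pairs(text, context_size=3):
    """Turn raw text into (context, next word) training pairs."""
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = " ".join(words[i - context_size:i])
        label = words[i]  # the "correct answer" comes from the text itself
        pairs.append((context, label))
    return pairs


pairs = next_word_pairs("the cat sat on the mat")
# pairs[0] is ("the cat sat", "on"): the input and its label, with no human involved.
```

Every sentence in the training text yields labelled examples this way, which is how such models can learn from far more data than anyone could label by hand.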
Supervised learning vs LLMs
LLMs have changed how many supervised learning tasks are approached. Instead of training a custom model on labelled data, you can often prompt an LLM to perform the same task:
- Instead of training a sentiment analysis model, you can ask Claude: "Is this review positive or negative?"
- Instead of building a classification model, you can use few-shot prompting with examples
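For many business tasks, LLM-based approaches are faster to implement and need little or no task-specific training data. A small set of labelled examples is still useful for checking that the answers are accurate.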
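Few-shot prompting can be sketched as plain string building: a handful of labelled examples go into the prompt itself instead of into a training run. The reviews, the wording of the instruction, and the function name below are all illustrative assumptions, and the step of sending the prompt to an LLM is left out because it depends on the provider you use.

```python
def build_few_shot_prompt(examples, new_review):
    """Build a sentiment prompt that shows labelled examples before the new review."""
    lines = ["Classify each review as positive or negative.", ""]
    for review, label in examples:
        lines.append(f"Review: {review}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final review has no label; the model is expected to fill it in.
    lines.append(f"Review: {new_review}")
    lines.append("Sentiment:")
    return "\n".join(lines)


few_shot_examples = [
    ("Arrived quickly and works perfectly.", "positive"),
    ("Broke after two days and support never replied.", "negative"),
]

prompt = build_few_shot_prompt(few_shot_examples, "Great value for the price.")
```

Two examples stand in here for the thousands a custom classifier would need, which is where the saving in time and labelling cost comes from.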
Why This Matters
Supervised learning is the foundation of most practical machine learning in business. Understanding it helps you recognise when your organisation has the labelled data needed for a custom ML project versus when a pre-trained LLM can handle the task without custom training. This distinction can save significant time and money: many projects that would once have required months of data labelling and model training can now be solved with well-crafted prompts.
Continue learning in Foundations
This topic is covered in our lesson: AI vs Machine Learning vs Deep Learning