Core AI

Decision Tree

Last reviewed: April 2026

A machine learning model that makes predictions by following a series of if-then rules, splitting data into branches like a flowchart until reaching a decision.

A decision tree is a machine learning model that makes predictions by asking a series of yes-or-no questions about the data, branching at each answer until it reaches a final prediction. The result reads much like a flowchart.

How decision trees work

Imagine predicting whether a customer will churn. The tree might start with: "Is their contract month-to-month?" If yes, go left and ask the next question: "Have they called support more than three times?" If yes, predict churn. Each split is chosen to separate the data into groups that are as pure as possible, meaning each group contains mostly one outcome.
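Written as code, the churn example is just nested if-then rules. The sketch below is illustrative only: the function name, the argument names, and the "stay" outcomes on the branches the text does not describe are assumptions, not part of any real model.

```python
def predict_churn(contract_type: str, support_calls: int) -> str:
    """Follow the example questions from the text, top to bottom."""
    if contract_type == "month-to-month":   # first question
        if support_calls > 3:               # second question
            return "churn"
        return "stay"   # assumed leaf: this branch is not described in the text
    return "stay"       # assumed leaf: this branch is not described in the text


print(predict_churn("month-to-month", 5))
```

A trained tree has the same shape, but the questions, thresholds, and leaf predictions are learned from data rather than written by hand.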

The algorithm learns which questions to ask and in what order by finding the splits that create the most homogeneous groups. It uses metrics like Gini impurity or information gain to measure how well each potential split separates the data.
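To make the split search concrete, here is a minimal sketch of Gini impurity and a threshold search over a single numeric feature. The toy data and the helper names are hypothetical, and real libraries use faster, more general versions of the same idea.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions (0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)


def best_threshold(values, labels):
    """Try a threshold midway between each pair of adjacent distinct values."""
    best = (None, gini(labels))  # (threshold, impurity); None means no split helps
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = weighted_gini(left, right)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: support calls per customer and whether they churned.
calls = [0, 1, 2, 4, 5, 6]
churned = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_threshold(calls, churned)
print(threshold, impurity)
```

A full tree-building algorithm repeats this search over every feature, picks the best split, and then recurses into each child group until a stopping rule is met.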

Strengths of decision trees

Interpretable: you can trace exactly why the model made any prediction by following the branches. This makes them popular in regulated industries.
Little data preparation needed: they work with numerical and categorical data and are largely unaffected by feature scale and outliers, so they need little preprocessing. Whether missing values and unencoded categories are handled automatically depends on the implementation.
Fast: both training and prediction are computationally cheap.
Non-linear: they capture non-linear relationships and feature interactions that linear models miss.

Weaknesses

Overfitting: left unconstrained, a single decision tree will memorise the quirks of the training data and then perform poorly on new data. Limiting the depth or pruning the tree helps.
Instability: small changes in the training data can produce a completely different tree.
Limited accuracy: a single tree rarely matches the performance of more complex models.

Ensemble methods

A common remedy for these weaknesses is to combine many trees:

Random Forests: train hundreds of trees, each on a random sample of the rows and a random subset of the features, then average their predictions (or take a majority vote for classification). Averaging many varied trees reduces both overfitting and instability.
Gradient Boosted Trees (XGBoost, LightGBM): train trees sequentially, with each new tree correcting the errors of the previous ones. Often the best-performing model for structured business data.
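The two ensemble ideas can be sketched with one-split trees ("stumps") on a single feature. This is a simplified illustration under stated assumptions: the data and function names are hypothetical, the bagging part omits the random feature subsets a real Random Forest uses, and the boosting part uses plain squared-error residuals rather than the algorithms in XGBoost or LightGBM.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree: (threshold, left_mean, right_mean)."""
    mean = sum(ys) / len(ys)
    best = (float("inf"), mean, mean)  # fallback: no split, always predict the mean
    best_err = sum((y - mean) ** 2 for y in ys)
    distinct = sorted(set(xs))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if err < best_err:
            best, best_err = (t, lm, rm), err
    return best


def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm


def fit_bagged(xs, ys, n_trees=25, seed=0):
    """Bagging: fit each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagged(stumps, x):
    """Average the predictions of all stumps."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


def fit_boosted(xs, ys, n_trees=50, learning_rate=0.5):
    """Boosting: each new stump is fitted to the errors left by the previous ones."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict_boosted(model, x):
    base, stumps, lr = model
    return base + lr * sum(predict_stump(s, x) for s in stumps)


def mse(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data with three rough levels.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0, 5.1, 4.9]

single = fit_stump(xs, ys)
forest = fit_bagged(xs, ys)
boosted = fit_boosted(xs, ys)

mse_single = mse(lambda x: predict_stump(single, x), xs, ys)
mse_boosted = mse(lambda x: predict_boosted(boosted, x), xs, ys)
print(mse_single, mse_boosted)
```

The bagged model smooths out the quirks of any one stump by averaging, while the boosted model adds small corrections step by step. On this toy data the boosted model fits the training points more closely than a single stump, though a close training fit alone says nothing about performance on new data.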

Why This Matters

Decision trees and their ensemble variants are the workhorses of business AI. For structured data like sales records, customer databases, and financial data, gradient boosted trees often match or outperform deep learning while being faster and cheaper to train. A single tree is also far easier to interpret, although ensembles give up some of that transparency. Tree-based models are the right starting point for most tabular data problems.

Related Terms

Machine Learning (ML)

A type of AI where systems learn patterns from data instead of following explicitly programmed rules. The system improves its performance through experience.

Classification

An AI task that assigns input to predefined categories. Spam detection, sentiment analysis, and image recognition are all classification tasks.

Supervised Learning

A machine learning approach where the model learns from labelled examples — input data paired with correct answers. The most common type of machine learning in business applications.

Artificial Intelligence (AI)

Software that can perform tasks that normally require human intelligence, such as understanding language, recognising patterns, and making decisions.
