Cross-Validation
A statistical technique for evaluating AI models by splitting data into multiple training and testing subsets to get a reliable estimate of real-world performance.
Cross-validation is a method for assessing how well a machine learning model will perform on data it has never seen before. Instead of splitting your data into a single training set and test set, cross-validation creates multiple splits and evaluates the model on each one.
Why a simple train/test split is not enough
Imagine you have a dataset of 10,000 customer records and you want to build a model that predicts churn. You split the data: 8,000 for training and 2,000 for testing. The model scores 85% accuracy on the test set. Is that reliable?
Perhaps not. That particular random split might have put all the easy cases in the test set. Or the test set might not be representative of the full population. A single split gives you a single number, and you have no idea how much that number would change with a different split.
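A toy illustration of this instability, using plain Python: the dataset, the churn rate, and the trivial always-predict-no-churn "model" below are all invented for the sketch, but they show how the same data scored against several different random hold-out splits yields a spread of accuracies rather than one fixed number.

```python
import random

random.seed(0)
# Toy dataset: 1,000 records, each labelled churn (1) or no-churn (0),
# with roughly 20% churners.
data = [(i, 1 if random.random() < 0.2 else 0) for i in range(1000)]

def evaluate_split(data, rng):
    """Shuffle, hold out the last 20% as a test set, and score a trivial
    'model' that always predicts no-churn (0)."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    test = shuffled[800:]
    correct = sum(1 for _, label in test if label == 0)
    return correct / len(test)

# Five different random splits of the SAME data give five different scores.
scores = [evaluate_split(data, random.Random(seed)) for seed in range(5)]
print(min(scores), max(scores))
```

The gap between the lowest and highest score is exactly the uncertainty that a single train/test split hides from you.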
How k-fold cross-validation works
The most common approach is k-fold cross-validation (typically k=5 or k=10):
- Divide your data into k equal-sized subsets (folds)
- For each fold: train the model on the other k-1 folds and test on the held-out fold
- Repeat until every fold has served as the test set exactly once
- Average the results across all k runs
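The steps above can be sketched in a few lines of plain Python. This is a minimal, library-free version; it assumes you supply a `score(train, test)` function (a name invented here) that trains your model on `train` and returns its accuracy on `test`.

```python
def k_fold_scores(data, k, score):
    """Split data into k folds; each fold serves as the test set once."""
    fold_size = len(data) // k
    scores = []
    for i in range(k):
        # Fold i is held out for testing...
        test = data[i * fold_size:(i + 1) * fold_size]
        # ...and the model trains on the remaining k-1 folds.
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        scores.append(score(train, test))
    return scores

# Dummy scorer for demonstration: reports the test fold's share of the data.
data = list(range(100))
scores = k_fold_scores(data, 5, lambda train, test: len(test) / len(data))
print(scores)            # one score per fold
print(sum(scores) / 5)   # the averaged cross-validation estimate
```

In practice you would pass a real training-and-scoring function (or use a library helper such as scikit-learn's `cross_val_score`), but the fold bookkeeping is exactly this loop.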
With 5-fold cross-validation, you get five accuracy scores instead of one. If they are all between 83% and 87%, you can be confident the model genuinely performs around 85%. If they range from 60% to 95%, something is wrong: the model is unstable or the data contains problematic patterns.
Variants of cross-validation
- Stratified k-fold: Ensures each fold has the same proportion of each class. Essential when your data is imbalanced, for example when only 5% of customers churn.
- Leave-one-out: Each individual data point serves as its own test set. Very thorough but computationally expensive for large datasets.
- Time series cross-validation: Respects the temporal order of the data by always training on past observations and testing on future ones. Critical for financial forecasting, demand forecasting, and any other time-dependent prediction.
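The time series variant is the easiest to get wrong by hand, so here is a minimal sketch of the expanding-window scheme it describes: each split trains on everything up to a point in time and tests on the next block. The function name and split sizing are this sketch's own choices (they mirror the behaviour of scikit-learn's `TimeSeriesSplit`).

```python
def time_series_splits(n, n_splits):
    """Yield (train_indices, test_indices) pairs in temporal order.
    Training data always precedes test data; the window expands each split."""
    test_size = n // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train_end = i * test_size
        yield (list(range(train_end)),
               list(range(train_end, train_end + test_size)))

# 12 time steps, 3 splits: train on [0..2] test [3..5], then
# train on [0..5] test [6..8], then train on [0..8] test [9..11].
for train, test in time_series_splits(12, 3):
    print(train, "->", test)
```

Note that, unlike plain k-fold, early observations appear in several training sets and no split ever tests on the past.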
Cross-validation and overfitting
Cross-validation is one of the best defences against overfitting, the problem where a model memorises training data but fails on new data. By testing on multiple held-out sets, cross-validation gives you a realistic picture of generalisation performance rather than an optimistic one.
When to use cross-validation
Use cross-validation whenever you are selecting between models, tuning hyperparameters, or reporting performance metrics. It costs more computation time than a single split, but the reliability of the results is well worth it. In production settings, once you have selected your model via cross-validation, you retrain it on the full dataset before deployment.
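The model-selection workflow described above can be sketched as follows. Everything here is hypothetical scaffolding: `candidates`, `folds`, and the `score` callable stand in for your real models, fold splits, and training-and-scoring routine; the point is only that every candidate is judged on the same folds by its mean score.

```python
def select_model(candidates, folds, score):
    """Pick the candidate with the best mean cross-validation score.
    candidates: dict mapping name -> model
    score(model, train, test) -> float"""
    means = {name: sum(score(m, tr, te) for tr, te in folds) / len(folds)
             for name, m in candidates.items()}
    best = max(means, key=means.get)
    return best, means

# Dummy demonstration: the 'models' are just fixed scores and the folds
# are placeholders, so the scorer simply returns the model itself.
folds = [(None, None)] * 5
candidates = {"model_a": 0.82, "model_b": 0.87}
best, means = select_model(candidates, folds, lambda m, tr, te: m)
print(best)
```

After `select_model` picks a winner, the last step from the text still applies: retrain that winner on the full dataset before deploying it.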
Why This Matters
When a vendor claims their AI model achieves 95% accuracy, cross-validation is how you verify that claim is robust and not an artefact of a lucky data split. Understanding this technique helps you ask the right questions about model evaluation and avoid deploying models that look good on paper but fail in production.
Continue learning in Practitioner
This topic is covered in our lesson: Building Your First AI Workflow
Training your team on AI? Enigmatica offers structured enterprise training built on this curriculum. Explore enterprise AI training →