Normalisation
The process of scaling numerical data to a standard range or distribution, ensuring that no single feature dominates model training simply because of its scale.
Normalisation (also spelled normalization) rescales numerical data so that features with different units or ranges contribute comparably to model training. Without normalisation, features with large values can dominate the learning process.
Why normalisation matters
Imagine a model using two features: annual salary (ranging from 20,000 to 200,000) and years of experience (ranging from 0 to 40). Without normalisation, salary would have a much larger influence on the model simply because its numbers are bigger – not because it is more important. Normalisation puts all features on an equal footing.
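The salary example can be sketched directly. With made-up values for three people, Euclidean distance is driven almost entirely by the salary axis until both features are min-max scaled, at which point the ordering of "nearest neighbour" flips:

```python
import math

def euclidean(p, q):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

# (annual salary, years of experience) -- illustrative values
a = (30_000, 2)
b = (32_000, 35)  # similar salary, very different experience
c = (60_000, 3)   # different salary, very similar experience

# Unscaled: the salary axis dominates, so b looks far closer to a than c does
assert euclidean(a, b) < euclidean(a, c)

# Min-max scale using the ranges from the text (salary 20k-200k, years 0-40)
def scale(p):
    return ((p[0] - 20_000) / (200_000 - 20_000), p[1] / 40)

# Scaled: the ordering flips -- c, with near-identical experience, is nearer
assert euclidean(scale(a), scale(b)) > euclidean(scale(a), scale(c))
```

The numbers are hypothetical, but the reversal is exactly the failure mode described above: before scaling, a 33-year experience gap matters less than a 2,000-unit salary gap.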
Common normalisation techniques
- Min-max scaling – rescales values to a fixed range, typically 0 to 1. Formula: (value - min) / (max - min). Good when the data has known, stable bounds, but sensitive to outliers.
- Z-score standardisation – centres data at zero with a standard deviation of one. Formula: (value - mean) / standard deviation. Good for data that follows a roughly normal distribution.
- Robust scaling – uses the median and interquartile range instead of the mean and standard deviation. Less affected by outliers.
- Log transformation – applies a logarithm to compress the range of highly skewed data (common for income, prices, or page views).
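The four techniques can be sketched with the standard library alone (the sample values are made up; real pipelines typically reach for a library such as scikit-learn instead):

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample

# Min-max scaling: (value - min) / (max - min) -> range [0, 1]
lo, hi = min(data), max(data)
minmax = [(x - lo) / (hi - lo) for x in data]

# Z-score standardisation: (value - mean) / standard deviation
mu = statistics.mean(data)       # 5.0
sigma = statistics.pstdev(data)  # 2.0 (population standard deviation)
zscores = [(x - mu) / sigma for x in data]

# Robust scaling: (value - median) / interquartile range
q1, median, q3 = statistics.quantiles(data, n=4)
robust = [(x - median) / (q3 - q1) for x in data]

# Log transformation for skewed data; log1p(x) = log(1 + x) handles zeros
skewed = [1.0, 10.0, 100.0, 1000.0]
logged = [math.log1p(x) for x in skewed]
```

Note how min-max output always spans exactly [0, 1], while z-scores have no fixed bounds: a single extreme outlier would squash every other min-max value toward zero, which is why robust scaling exists.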
When normalisation is essential
- Distance-based algorithms – k-nearest neighbours, clustering, and support vector machines are highly sensitive to feature scales
- Gradient descent – training converges much faster with normalised features because the loss landscape is more uniform
- Neural networks – batch normalisation and layer normalisation are standard components that normalise activations between layers
- Regularisation – techniques like L1 and L2 regularisation penalise large weights, so features must be on similar scales for fair penalisation
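The gradient-descent point can be made concrete with a toy quadratic loss. When one "feature scale" is 100x the other, the learning rate must stay small enough for the steep direction, so the shallow direction crawls; with equal scales, a much larger rate converges in a handful of steps. This is a hypothetical illustration, not a production training loop:

```python
def steps_to_converge(s1, s2, lr, tol=1e-3, max_steps=200_000):
    """Gradient descent on loss = (s1*w1 - 1)^2 + (s2*w2 - 1)^2,
    where s1 and s2 play the role of feature scales."""
    w1 = w2 = 0.0
    for step in range(1, max_steps + 1):
        w1 -= lr * 2 * s1 * (s1 * w1 - 1)
        w2 -= lr * 2 * s2 * (s2 * w2 - 1)
        if abs(s1 * w1 - 1) < tol and abs(s2 * w2 - 1) < tol:
            return step
    return max_steps

# Mismatched scales: the rate must stay tiny to avoid diverging along the
# steep (s=100) direction, so the flat direction needs tens of thousands of steps
slow = steps_to_converge(100.0, 1.0, lr=0.00004)

# Equal scales: a 10,000x larger rate is stable and converges almost immediately
fast = steps_to_converge(1.0, 1.0, lr=0.4)
assert fast < 10 < slow
```

This is the "loss landscape is more uniform" claim in miniature: normalising the features makes the curvature similar in every direction, so one learning rate works well for all of them.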
When normalisation is unnecessary
- Tree-based models – decision trees, random forests, and gradient boosted trees split on thresholds and are unaffected by feature scale
- Pre-normalised data – data that is already on a consistent scale (percentages, binary flags)
Normalisation in deep learning
Modern neural networks use internal normalisation layers:
- Batch normalisation – normalises activations across a mini-batch
- Layer normalisation – normalises across features within each example (used in transformers)
These techniques stabilise and accelerate training, and are now standard in most architectures.
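The core of layer normalisation is easy to sketch without any framework: each example's activations are shifted to mean zero and scaled to roughly unit variance. Real layers also apply learned scale and shift parameters, which this minimal sketch omits:

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalise one example's activations to mean 0, variance ~1.
    Real implementations also learn a per-feature scale and shift;
    eps guards against division by zero for constant inputs."""
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    return [(v - mu) / math.sqrt(var + eps) for v in x]

# Whatever the input scale, the output statistics are the same,
# which is what keeps activations stable from layer to layer
small = layer_norm([2.0, 4.0, 6.0, 8.0])
large = layer_norm([2_000.0, 4_000.0, 6_000.0, 8_000.0])
```

Because `small` and `large` normalise to (nearly) identical vectors, the layers downstream see a consistent input distribution regardless of how activations drift during training.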
Why This Matters
Normalisation is a fundamental data preparation step that can dramatically improve model performance with minimal effort. If your data science team skips it, distance-based models and neural networks will perform poorly regardless of how good the underlying data is. It is one of the simplest yet most impactful preprocessing steps.