Hyperparameter Tuning
The process of finding the optimal configuration settings for an AI model: the knobs you set before training begins that determine how the model learns.
Hyperparameter tuning is the process of finding the best configuration settings for a machine learning model. Unlike model parameters (which are learned during training), hyperparameters are set before training begins and control how the learning process itself works.
Parameters versus hyperparameters
- Parameters: The values the model learns during training, such as the weights in a neural network. You do not set these; the training process finds them.
- Hyperparameters: The settings you choose before training starts, such as the learning rate, the number of layers, the batch size, and the dropout rate. These control how training happens.
Think of it as the difference between what a student learns (parameters) and the study method they use (hyperparameters). You can choose the study method; the knowledge is acquired through the process.
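The distinction is easy to see in a toy training loop. In this sketch, everything capitalised is a hyperparameter we choose up front, while `w` is the parameter the loop learns; the data and all values are illustrative.

```python
import random

# Hyperparameters: chosen by us before training starts.
LEARNING_RATE = 0.05
EPOCHS = 100

# Toy data following y = 3x, so the ideal learned parameter is w = 3.
data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]

# Parameter: found by the training loop, not set by us.
w = random.uniform(-1.0, 1.0)

for _ in range(EPOCHS):
    for x, y in data:
        error = w * x - y               # prediction error on this example
        w -= LEARNING_RATE * error * x  # gradient step updates the parameter

print(round(w, 3))  # ends very close to 3.0 regardless of the random start
```

Change `LEARNING_RATE` or `EPOCHS` and the *path* of learning changes; the value being learned is still `w`.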
Common hyperparameters
- Learning rate: How much the model adjusts its weights in response to each error. Too high and the model overshoots; too low and training is painfully slow.
- Batch size: How many training examples the model processes before updating its weights. Larger batches are more stable but use more memory.
- Number of epochs: How many times the model sees the entire training dataset.
- Network architecture: The number of layers, neurons per layer, and activation functions.
- Regularisation strength: How aggressively overfitting is penalised (dropout rate, weight decay coefficient).
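In practice these settings are often gathered into a single configuration object before training starts. The names and values below are illustrative, not a recommendation; frameworks differ in what they call each setting.

```python
# A typical hyperparameter configuration, collected in one place.
config = {
    "learning_rate": 1e-3,        # step size for each weight update
    "batch_size": 32,             # examples processed per weight update
    "epochs": 20,                 # passes over the full training set
    "hidden_layers": [128, 64],   # architecture: neurons per hidden layer
    "dropout_rate": 0.2,          # regularisation: fraction of units dropped
    "weight_decay": 1e-4,         # regularisation: L2 penalty coefficient
}
```

Keeping them in one structure makes it easy to log, compare, and sweep over configurations later.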
Tuning approaches
- Grid search: Try every combination of hyperparameter values from a predefined grid. Thorough but computationally expensive.
- Random search: Try random combinations. Surprisingly effective: because only a few hyperparameters usually matter, random sampling covers many distinct values of each one, and research shows it often finds good configurations faster than grid search.
- Bayesian optimisation: Use a probabilistic model to predict which combinations are likely to perform well, focusing the search on promising regions.
- Population-based training: Run multiple training runs simultaneously, periodically copying hyperparameters from the best-performing runs. Used by DeepMind for large-scale experiments.
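A minimal sketch of the first two approaches, with a stand-in scoring function (a real search would train and validate a model for each candidate; the score function and value ranges here are invented for illustration):

```python
import itertools
import random

def evaluate(lr, batch_size):
    # Stand-in for "train with these settings, return validation score".
    # Peaks at lr = 0.01 and batch_size = 64 by construction.
    return -abs(lr - 0.01) - abs(batch_size - 64) / 1000

grid = {
    "lr": [0.0001, 0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64, 128],
}

# Grid search: every combination on the predefined grid (4 x 4 = 16 runs).
grid_best = max(
    itertools.product(grid["lr"], grid["batch_size"]),
    key=lambda c: evaluate(*c),
)

# Random search: sample 8 combinations, drawing lr from a continuous
# log-uniform range instead of a fixed list.
random.seed(0)
random_best = max(
    ((10 ** random.uniform(-4, -1), random.choice([16, 32, 64, 128]))
     for _ in range(8)),
    key=lambda c: evaluate(*c),
)

print(grid_best)  # (0.01, 64) under this toy score
```

Note that random search evaluates half as many candidates here yet explores learning rates the grid can never reach, which is the usual argument in its favour.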
Why tuning matters
The same model architecture with different hyperparameters can produce wildly different results. A neural network with a learning rate of 0.001 might achieve 95% accuracy, while the same network with a learning rate of 0.1 might fail to learn at all. Proper tuning is often the difference between a model that works and one that does not.
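The learning-rate sensitivity described above can be reproduced exactly on a one-parameter problem. This sketch minimises the toy function f(w) = 10w² by gradient descent; the function and step counts are chosen purely to make the failure modes visible.

```python
def descend(lr, steps=200, w0=5.0):
    """Minimise f(w) = 10 * w**2 by gradient descent; the gradient is 20 * w."""
    w = w0
    for _ in range(steps):
        w -= lr * 20 * w  # each update scales w by (1 - 20 * lr)
    return w

print(abs(descend(0.001)))  # ~0.09: converging, slowly, toward the minimum at 0
print(abs(descend(0.1)))    # 5.0: the step factor is -1, so w oscillates forever
print(abs(descend(0.12)))   # enormous: every step overshoots and training diverges
```

Same "model", same data, three different learning rates: one works, one goes nowhere, one blows up.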
Automated approaches
Modern tools like Optuna, Ray Tune, and Weights & Biases Sweeps automate much of the hyperparameter tuning process. These tools manage the search, track results, and can early-stop unpromising configurations to save compute.
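The early-stopping idea these tools rely on can be sketched without any library. The following is a toy version of successive halving, one common strategy: every round, the training budget doubles and the worst half of the candidates is dropped. The `noisy_score` function and all names here are hypothetical stand-ins, not the API of any real tool.

```python
import random

def noisy_score(config, budget):
    # Stand-in for "train this config for `budget` epochs, return
    # validation score"; better configs score higher as budget grows.
    return config["quality"] * budget / (budget + 5) + random.gauss(0, 0.01)

def successive_halving(configs, budget=1, rounds=3):
    """Double the budget each round; early-stop the worse half of the pool."""
    survivors = list(configs)
    for _ in range(rounds):
        survivors.sort(key=lambda c: noisy_score(c, budget), reverse=True)
        survivors = survivors[: max(1, len(survivors) // 2)]
        budget *= 2
    return survivors[0]

random.seed(42)
pool = [{"id": i, "quality": random.uniform(0.5, 0.95)} for i in range(8)]
best = successive_halving(pool)
print(best["id"])
```

Real libraries add smarter search on top of this pruning (Bayesian suggestion of candidates, distributed execution, result tracking), but the compute saving comes from the same move: stop spending budget on configurations that are already losing.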
Practical advice
Start with established defaults from the literature or framework documentation. Tune the most impactful hyperparameters first (learning rate is almost always the most important). Use validation performance, not training performance, to evaluate configurations. And document everything: reproducibility depends on recording exact hyperparameter values.
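The "document everything" advice can be as simple as serialising the configuration deterministically and deriving a run identifier from it. This is a minimal sketch using only the standard library; the field names are illustrative.

```python
import hashlib
import json

# The exact configuration used for this run, random seed included.
config = {"learning_rate": 3e-4, "batch_size": 32, "epochs": 10, "seed": 42}

# Sorted keys make the record deterministic, so the same configuration
# always yields the same serialised text and the same fingerprint.
blob = json.dumps(config, sort_keys=True, indent=2)
run_id = hashlib.sha256(blob.encode()).hexdigest()[:8]  # short run identifier

print(run_id)  # store `blob` alongside the results under this id
```

With this in place, any reported result can be traced back to the exact hyperparameter values that produced it.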
Why This Matters
Hyperparameter tuning is where much of the "art" in AI model development lies. Understanding this process helps you appreciate why two teams using the same algorithm can get very different results, and why AI development involves systematic experimentation rather than simply pressing a "train" button.