
Hyperparameter

Last reviewed: April 2026

A configuration setting chosen before training begins — like learning rate, batch size, or number of layers — that controls how a model learns rather than what it learns.

A hyperparameter is a configuration value set before training begins that controls the training process itself. Unlike regular parameters (the weights and biases a model learns from data), hyperparameters are chosen by the practitioner and typically stay fixed during training, although some, such as the learning rate, may follow a predefined schedule.

Parameters vs. hyperparameters

  • Parameters — learned from data during training. A neural network's millions of weights are parameters. You do not set them; the training algorithm discovers them.
  • Hyperparameters — set by the practitioner before training. They control how the model learns, not what it learns.
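The distinction shows up clearly in a minimal training loop. In this toy sketch (illustrative only, not a production setup), the learning rate and epoch count are fixed up front, while the weight is discovered from the data:

```python
# Toy example: fitting y = w * x with gradient descent.
# learning_rate and the epoch count are hyperparameters (set by us);
# the weight w is a parameter (learned from the data).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # true relationship: y = 2x

learning_rate = 0.05  # hyperparameter: chosen before training starts
w = 0.0               # parameter: discovered by the training loop

for _ in range(200):  # number of epochs is another hyperparameter
    for x, y in data:
        error = w * x - y
        w -= learning_rate * error * x  # gradient of squared error w.r.t. w

print(round(w, 3))  # converges close to 2.0
```

Change the learning rate to a poor value (say 0.5) and the same loop diverges, which is exactly why hyperparameter choice matters so much.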

Common hyperparameters

  • Learning rate — how large each update step is during gradient descent. Often considered the single most influential hyperparameter.
  • Batch size — how many training examples are processed together before updating weights.
  • Number of epochs — how many times the model sees the entire training dataset.
  • Number of layers and neurons — the architecture of the neural network.
  • Dropout rate — the fraction of neurons randomly deactivated during training to prevent overfitting.
  • Regularisation strength — how much the model is penalised for complexity.
  • Temperature — in language models, controls randomness in text generation.
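As an illustration of the last item, temperature works by rescaling a model's output scores (logits) before they are turned into probabilities. The sketch below uses made-up logits for three candidate tokens:

```python
import math

# Sketch of how temperature reshapes a next-token distribution.
# The logits are invented values for illustration.
def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.5)   # sharper: top token dominates
high = softmax_with_temperature(logits, 2.0)  # flatter: more randomness

print(low[0] > high[0])  # True: low temperature concentrates probability
```

Lower temperatures make the most likely token even more likely (more deterministic output); higher temperatures flatten the distribution (more varied output).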

How hyperparameters are chosen

  • Manual tuning — try different values based on experience and intuition
  • Grid search — systematically try every combination of a predefined set of values
  • Random search — try random combinations, which is often more efficient than grid search
  • Bayesian optimisation — use previous results to intelligently choose which combinations to try next
  • Learning rate schedules — adjust the learning rate automatically during training rather than committing to a single fixed value
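The first three strategies can be sketched in a few lines. In this toy example, a made-up scoring function stands in for a real training run, and the candidate values are illustrative only:

```python
import itertools
import random

# Stand-in for "train a model with these hyperparameters and return
# validation accuracy". The formula is invented for illustration:
# it peaks at learning_rate=0.01, batch_size=64.
def evaluate(learning_rate, batch_size):
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(batch_size - 64) / 1000

learning_rates = [0.001, 0.01, 0.1]
batch_sizes = [16, 32, 64, 128]

# Grid search: try every combination (3 * 4 = 12 runs).
grid_best = max(itertools.product(learning_rates, batch_sizes),
                key=lambda combo: evaluate(*combo))

# Random search: a fixed budget of randomly sampled combinations.
random.seed(0)
random_trials = [(random.choice(learning_rates), random.choice(batch_sizes))
                 for _ in range(6)]
random_best = max(random_trials, key=lambda combo: evaluate(*combo))

print(grid_best)  # (0.01, 64) — the optimum of the toy score
```

Grid search is exhaustive but grows exponentially with the number of hyperparameters; random search covers the same space with a fixed budget, which is why it is often more efficient in practice.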

The impact of hyperparameters

Good hyperparameter choices can be the difference between a model that works brilliantly and one that fails completely — even with identical data and architecture. This is why experienced ML engineers are valuable: they have intuition about which hyperparameters matter most for which problems.

Hyperparameters in LLM usage

When using language models via APIs, the settings you control include temperature, top-p, and maximum tokens, alongside the system prompt. These generation settings play the role of hyperparameters: they significantly affect output quality and should be tuned for each use case.
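For illustration, a request to a chat-completion-style API might carry these settings in a payload like the one below. Every field name and value here is an assumption for illustration; exact names and defaults vary by provider, so check your provider's API reference:

```python
# Hypothetical request payload for a chat-completion-style API.
# Field names are illustrative, not any specific provider's schema.
payload = {
    "model": "example-model",  # placeholder model name
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarise hyperparameters in one line."},
    ],
    "temperature": 0.2,  # low: more deterministic, suits factual tasks
    "top_p": 0.9,        # nucleus sampling cutoff
    "max_tokens": 100,   # hard cap on output length (and cost)
}

print(payload["temperature"])
```

A common rule of thumb is to tune either temperature or top-p rather than both at once, so the effect of each change is easier to attribute.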


Why This Matters

Hyperparameters are the tuning knobs of AI. When a model underperforms, the issue is often hyperparameter configuration rather than data quality or architecture. Understanding this helps you have productive conversations with data science teams and set realistic expectations for the iterative nature of model development.
