
Hyperparameter

Last reviewed: April 2026

A configuration setting chosen before training begins — like learning rate, batch size, or number of layers — that controls how a model learns rather than what it learns.

A hyperparameter is a configuration value set before training begins that controls the training process itself. Unlike regular parameters (the weights and biases a model learns from data), hyperparameters are chosen by the practitioner and typically stay fixed during training, although some, such as the learning rate, may follow a predefined schedule.

Parameters vs. hyperparameters

  • Parameters — learned from data during training. A neural network's millions of weights are parameters. You do not set them; the training algorithm discovers them.
  • Hyperparameters — set by the practitioner before training. They control how the model learns, not what it learns.
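The distinction shows up clearly in a minimal training loop. In this toy sketch (illustrative only, not a production setup), the learning rate and epoch count are fixed up front, while the weight is discovered from the data:

```python
# Toy example: fitting y = w * x with gradient descent.
# learning_rate and the epoch count are hyperparameters (set by us);
# the weight w is a parameter (learned from the data).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # true relationship: y = 2x

learning_rate = 0.05  # hyperparameter: chosen before training starts
w = 0.0               # parameter: discovered by the training loop

for _ in range(200):  # number of epochs is another hyperparameter
    for x, y in data:
        error = w * x - y
        w -= learning_rate * error * x  # gradient of squared error w.r.t. w

print(round(w, 3))  # converges close to 2.0
```

Change the learning rate to a poor value (say 0.5) and the same loop diverges, which is exactly why hyperparameter choice matters so much.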

Common hyperparameters

  • Learning rate — how large each update step is during gradient descent. Often considered the single most influential hyperparameter.
  • Batch size — how many training examples are processed together before updating weights.
  • Number of epochs — how many times the model sees the entire training dataset.
  • Number of layers and neurons — the architecture of the neural network.
  • Dropout rate — the fraction of neurons randomly deactivated during training to prevent overfitting.
  • Regularisation strength — how much the model is penalised for complexity.
  • Temperature — in language models, controls randomness in text generation.
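As an illustration of the last item, temperature works by rescaling a model's output scores (logits) before they are turned into probabilities. The sketch below uses made-up logits for three candidate tokens:

```python
import math

# Sketch of how temperature reshapes a next-token distribution.
# The logits are invented values for illustration.
def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.5)   # sharper: top token dominates
high = softmax_with_temperature(logits, 2.0)  # flatter: more randomness

print(low[0] > high[0])  # True: low temperature concentrates probability
```

Lower temperatures make the most likely token even more likely (more deterministic output); higher temperatures flatten the distribution (more varied output).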

How hyperparameters are chosen

  • Manual tuning — try different values based on experience and intuition
  • Grid search — systematically try every combination of a predefined set of values
  • Random search — try random combinations, which is often more efficient than grid search
  • Bayesian optimisation — use previous results to intelligently choose which combinations to try next
  • Learning rate schedules — adjust the learning rate automatically during training rather than committing to a single fixed value
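The first three strategies can be sketched in a few lines. In this toy example, a made-up scoring function stands in for a real training run, and the candidate values are illustrative only:

```python
import itertools
import random

# Stand-in for "train a model with these hyperparameters and return
# validation accuracy". The formula is invented for illustration:
# it peaks at learning_rate=0.01, batch_size=64.
def evaluate(learning_rate, batch_size):
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(batch_size - 64) / 1000

learning_rates = [0.001, 0.01, 0.1]
batch_sizes = [16, 32, 64, 128]

# Grid search: try every combination (3 * 4 = 12 runs).
grid_best = max(itertools.product(learning_rates, batch_sizes),
                key=lambda combo: evaluate(*combo))

# Random search: a fixed budget of randomly sampled combinations.
random.seed(0)
random_trials = [(random.choice(learning_rates), random.choice(batch_sizes))
                 for _ in range(6)]
random_best = max(random_trials, key=lambda combo: evaluate(*combo))

print(grid_best)  # (0.01, 64) — the optimum of the toy score
```

Grid search is exhaustive but grows exponentially with the number of hyperparameters; random search covers the same space with a fixed budget, which is why it is often more efficient in practice.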

The impact of hyperparameters

Good hyperparameter choices can be the difference between a model that works brilliantly and one that fails completely — even with identical data and architecture. This is why experienced ML engineers are valuable: they have intuition about which hyperparameters matter most for which problems.

Hyperparameters in LLM usage

When using language models via APIs, the settings you control include temperature, top-p, and maximum tokens, alongside the system prompt. These generation settings play the role of hyperparameters: they significantly affect output quality and should be tuned for each use case.
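For illustration, a request to a chat-completion-style API might carry these settings in a payload like the one below. Every field name and value here is an assumption for illustration; exact names and defaults vary by provider, so check your provider's API reference:

```python
# Hypothetical request payload for a chat-completion-style API.
# Field names are illustrative, not any specific provider's schema.
payload = {
    "model": "example-model",  # placeholder model name
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarise hyperparameters in one line."},
    ],
    "temperature": 0.2,  # low: more deterministic, suits factual tasks
    "top_p": 0.9,        # nucleus sampling cutoff
    "max_tokens": 100,   # hard cap on output length (and cost)
}

print(payload["temperature"])
```

A common rule of thumb is to tune either temperature or top-p rather than both at once, so the effect of each change is easier to attribute.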


Why This Matters

Hyperparameters are the tuning knobs of AI. When a model underperforms, the issue is often hyperparameter configuration rather than data quality or architecture. Understanding this helps you have productive conversations with data science teams and set realistic expectations for the iterative nature of model development.
