
Parameters

Last reviewed: April 2026

The total number of adjustable values in an AI model. A model with more parameters can capture more complex patterns but requires more computing power to train and run.

Parameters are the adjustable numerical values inside an AI model that determine its behaviour. In practical terms, when someone says a model has "70 billion parameters" or "400 billion parameters," they are describing the model's size — the number of individual values it uses to process information.
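To make concrete what gets counted, here is a minimal sketch totting up the weights and biases of a toy feed-forward network. The layer sizes are invented for illustration, not drawn from any real model:

```python
# Parameter counting for a tiny feed-forward network (illustrative
# sketch; layer sizes are made up, not from any real model).
def dense_layer_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# A toy 3-layer network: 512 -> 1024 -> 1024 -> 512
layers = [(512, 1024), (1024, 1024), (1024, 512)]
total = sum(dense_layer_params(i, o) for i, o in layers)
print(f"{total:,} parameters")  # prints 2,099,712 parameters
```

Every one of those values is adjusted during training; a "70 billion parameter" model is the same idea scaled up by several orders of magnitude.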

Parameters and model capability

More parameters generally means a more capable model:

  • Small models (1-7 billion parameters): Good for simple tasks — basic text generation, classification, straightforward question answering. Fast and cheap to run.
  • Medium models (13-70 billion parameters): Capable of more nuanced reasoning, better writing quality, and handling more complex instructions. A good balance of capability and cost.
  • Large models (100+ billion parameters): State-of-the-art capabilities — sophisticated reasoning, nuanced understanding, complex code generation. Expensive to train and run.
  • Frontier models (hundreds of billions+): Models like GPT-4o, Claude, and Gemini. The most capable available, but the most expensive.

However, the relationship between parameters and capability is not linear. A well-trained 70-billion parameter model can outperform a poorly trained 200-billion parameter model. Training data quality, architecture design, and fine-tuning all matter as much as raw size.

Parameters vs model quality

Parameter count is an imperfect but useful proxy for model capability. Two important caveats:

  • Diminishing returns: Doubling parameters does not double capability; each additional increment of scale yields a smaller improvement.
  • Architecture matters: The transformer architecture uses parameters more efficiently than older architectures. A 70-billion parameter transformer can outperform a 500-billion parameter older model.

The practical impact of parameter count

Parameter count directly affects three things you care about:

  • Cost: More parameters means more computation per query, which means higher API costs. A query to a 400-billion parameter model costs more than the same query to a 7-billion parameter model.
  • Speed: Larger models take longer to generate responses. For real-time applications, this latency matters.
  • Infrastructure: Running open-source models locally requires GPU memory proportional to parameter count. A 70-billion parameter model needs significantly more hardware than a 7-billion parameter model.
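The infrastructure point can be made concrete with back-of-the-envelope arithmetic: at 16-bit precision each parameter occupies two bytes, so weight memory scales directly with parameter count. A minimal sketch (it ignores activations, KV cache, and framework overhead, which add real usage on top):

```python
# Rough GPU-memory estimate for model weights (sketch; real serving
# also needs memory for activations, KV cache, and framework overhead).
def weight_memory_gb(params_billions, bytes_per_param=2):
    """fp16/bf16 weights use 2 bytes each; 4-bit quantisation ~0.5."""
    return params_billions * 1e9 * bytes_per_param / 1e9

for size in (7, 70, 400):
    print(f"{size}B params ~ {weight_memory_gb(size):,.0f} GB in fp16")
# 7B ~ 14 GB, 70B ~ 140 GB, 400B ~ 800 GB
```

This is why a 7-billion parameter model fits on a single consumer GPU while a 70-billion parameter model typically needs multiple data-centre GPUs or aggressive quantisation.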

Choosing the right model size

The best model is not always the biggest:

  • For simple classification or extraction tasks, small models are fast and cost-effective
  • For general business writing and analysis, medium models offer the best value
  • For complex reasoning, code generation, or tasks requiring nuanced judgment, large models justify their cost
  • Many organisations use a mix: small models for routine tasks, large models for complex ones
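The mix-of-models approach in the last point can be sketched as a trivial router. The model names and the routing heuristic here are hypothetical, purely illustrative; production routers use richer signals than a task label:

```python
# Minimal model-routing sketch (hypothetical model names; the
# heuristic is illustrative, not a production classifier).
SMALL, LARGE = "small-model", "large-model"

def pick_model(task_type: str) -> str:
    """Send routine tasks to the cheap model, everything else up."""
    routine = {"classification", "extraction", "summary"}
    return SMALL if task_type in routine else LARGE

assert pick_model("extraction") == SMALL       # cheap and fast
assert pick_model("code-generation") == LARGE  # worth the cost
```

Even a crude router like this can cut API spend substantially when most traffic is routine.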

Parameter-efficient techniques

New techniques allow smaller models to punch above their weight:

  • Distillation: Training a small model to mimic a large model's behaviour
  • LoRA (Low-Rank Adaptation): Fine-tuning only a small fraction of parameters, making customisation cheaper
  • Mixture of Experts: Activating only a subset of parameters for each query, reducing computational cost while maintaining a large total parameter count
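The LoRA arithmetic is easy to sketch: a rank-r adapter on a d_in x d_out weight matrix trains r x (d_in + d_out) values instead of d_in x d_out. The dimensions below are illustrative, not taken from any particular model:

```python
# LoRA trainable-parameter arithmetic (sketch; dimensions are
# illustrative). A rank-r adapter adds two small matrices of shape
# (d_in, r) and (r, d_out) in place of updating the full matrix.
d_in, d_out, rank = 4096, 4096, 8

full = d_in * d_out            # 16,777,216 values in the full matrix
lora = rank * (d_in + d_out)   # 65,536 trainable adapter values

print(f"LoRA trains {lora / full:.2%} of this layer")
# prints: LoRA trains 0.39% of this layer
```

Training well under 1% of the values per layer is what makes fine-tuning large models affordable on modest hardware.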

Why This Matters

Parameter count is the most commonly cited metric when comparing AI models, and understanding what it actually means helps you make informed purchasing and deployment decisions. It explains why some models are more expensive than others, why some are faster, and why the biggest model is not always the best choice for every task. This knowledge directly impacts your AI budget and architecture decisions.
