Model Weights
The numerical values inside a neural network that determine how it processes information. Weights are what the model learns during training — they encode its knowledge and capabilities.
Model weights are the internal numerical values of a neural network that determine how it transforms input into output. When we say a model has been "trained," what we mean is that its weights have been adjusted — billions of times — until the model produces useful results. The weights are, in a very real sense, what the model knows.
What weights actually are
Imagine a neural network as a vast web of connections between nodes. Each connection has a number associated with it — its weight. When data flows through the network, each weight determines how much influence one node has on the next. A high weight means a strong connection; a low weight means a weak one; a negative weight means an inhibitory effect.
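This idea can be shown in a few lines of Python. The sketch below is a toy illustration of a single node, not how real frameworks implement it: each input is multiplied by its connection's weight, the results are summed, and an activation function decides the node's output.

```python
def relu(x):
    """A common activation function: pass positives through, zero out negatives."""
    return max(0.0, x)

def node_output(inputs, weights, bias):
    # Each weight scales how much the corresponding input influences this
    # node: large = strong connection, near zero = weak, negative = inhibitory.
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return relu(total)

inputs = [0.5, 1.0, 0.25]
weights = [0.9, -0.4, 0.1]   # one learned number per connection
print(node_output(inputs, weights, bias=0.2))
```

A real model is simply billions of these weighted connections stacked into layers, with the output of one layer feeding the next.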
A modern large language model like Claude or GPT-4o has hundreds of billions of these weights. Together, they encode everything the model learned from its training data — language patterns, factual knowledge, reasoning strategies, writing styles, and more.
How weights are learned
During training, the model processes enormous amounts of data and adjusts its weights to minimise errors. The adjustment loop is gradient descent, with a technique called backpropagation calculating how much each weight contributed to the error:
- The model makes a prediction based on its current weights
- The prediction is compared to the correct answer
- The error is measured
- The weights are adjusted slightly to reduce the error
- This cycle repeats billions of times across the training dataset
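The cycle above can be sketched in miniature. This toy example shrinks the idea down to a single weight and a single training example, so the "backpropagation" step is just the derivative of a squared error; real training repeats this over billions of weights and examples.

```python
w = 0.0                 # the model's single weight, before training
x, target = 2.0, 6.0    # we want the model to learn y = 3x
lr = 0.05               # learning rate: how big each adjustment is

for step in range(200):
    prediction = w * x           # 1. predict using the current weight
    error = prediction - target  # 2-3. compare to the answer, measure the error
    gradient = 2 * error * x     # 4. the slope of the squared error w.r.t. w
    w -= lr * gradient           #    nudge w slightly to reduce the error

print(round(w, 4))  # prints 3.0
```

No one told the program that the answer was 3 ; the repeated predict-measure-adjust cycle discovered it, which is exactly what happens (at vastly greater scale) when a model's weights converge during training.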
Over time, the weights converge to values that produce consistently useful output. No human specifies what these values should be — the model discovers them from the data.
Why weights matter for business
Understanding model weights helps you grasp several important concepts:
- Model size: When you hear "a 70-billion parameter model," those parameters are the weights. More weights generally means more capability but also more computational cost.
- Model files: When you download an open-source model like Llama, you are downloading its weight files. These are the model's learned knowledge in numerical form.
- Fine-tuning: When you fine-tune a model on your data, you are adjusting its weights to perform better on your specific tasks.
- Quantisation: A technique that reduces the precision of weights (for example, from 32-bit to 4-bit numbers) to make models smaller and faster, with some quality trade-off.
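The quantisation trade-off in the last bullet can be made concrete. The sketch below is a simplified uniform quantiser (production schemes are more sophisticated, often quantising per block with outlier handling): each weight is stored as a small integer plus one shared scale factor, and decoding it back shows the small precision loss.

```python
def quantise(weights, bits=4):
    """Map floats to signed integers in [-levels, levels] plus a shared scale."""
    levels = 2 ** (bits - 1) - 1                  # e.g. 7 for signed 4-bit
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) for w in weights], scale

def dequantise(ints, scale):
    """Recover approximate floats from the stored integers."""
    return [i * scale for i in ints]

weights = [0.81, -0.35, 0.02, -0.7]
ints, scale = quantise(weights)       # ints fit in 4 bits instead of 32
restored = dequantise(ints, scale)
# restored is close to the originals but not exact: that gap is the quality trade-off
```

Storing 4-bit integers instead of 32-bit floats cuts the weight file to roughly an eighth of its size, which is why quantised models run on much smaller hardware.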
Open weights vs closed weights
Some AI labs release their models' weights (open-weight models such as Llama, Mistral, and Gemma), allowing anyone to download, run, and modify them. Others keep their weights proprietary (closed models such as GPT-4o and Claude), offering access only through APIs.
Open-weight models give organisations more control over their data and deployment but require technical infrastructure to run. Closed models are easier to use but mean your data passes through the provider's servers.
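The infrastructure requirement is easy to estimate with back-of-the-envelope arithmetic: the memory needed just to hold the weights is the parameter count times the bits per weight. The figures below are illustrative only and exclude the extra memory inference itself requires.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory (in GB) to hold the weights alone."""
    return n_params * bits_per_weight / 8 / 1e9   # bits -> bytes -> gigabytes

n = 70e9  # a "70-billion parameter" open-weight model
for bits in (32, 16, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(n, bits):.0f} GB")
```

At full 32-bit precision a 70B model needs around 280 GB for its weights alone, far beyond a single consumer GPU, which is why quantisation and serious hardware planning go hand in hand with open-weight deployment.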
Why This Matters
Model weights are the intellectual property of AI companies — they represent billions of pounds in training investment. Understanding weights helps you evaluate the trade-offs between open and closed models, understand why model sizes vary, and make informed decisions about fine-tuning and deployment. When a vendor talks about model parameters or open-source AI, you will know exactly what they mean.
Continue learning in Foundations
This topic is covered in our lesson: How Large Language Models Actually Work