Model Weights
The numerical values inside a neural network that determine how it processes information. Weights are what the model learns during training — they encode its knowledge and capabilities.
Model weights are the internal numerical values of a neural network that determine how it transforms input into output. When we say a model has been "trained," what we mean is that its weights have been adjusted — billions of times — until the model produces useful results. The weights are, in a very real sense, what the model knows.
What weights actually are
Imagine a neural network as a vast web of connections between nodes. Each connection has a number associated with it — its weight. When data flows through the network, each weight determines how much influence one node has on the next. A high weight means a strong connection; a low weight means a weak one; a negative weight means an inhibitory effect.
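This idea can be shown in a few lines of Python. The sketch below is a toy illustration of a single node, not how real frameworks implement it: each input is multiplied by its connection's weight, the results are summed, and an activation function decides the node's output.

```python
def relu(x):
    """A common activation function: pass positives through, zero out negatives."""
    return max(0.0, x)

def node_output(inputs, weights, bias):
    # Each weight scales how much the corresponding input influences this
    # node: large = strong connection, near zero = weak, negative = inhibitory.
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return relu(total)

inputs = [0.5, 1.0, 0.25]
weights = [0.9, -0.4, 0.1]   # one learned number per connection
print(node_output(inputs, weights, bias=0.2))
```

A real model is simply billions of these weighted connections stacked into layers, with the output of one layer feeding the next.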
A modern large language model like Claude or GPT-4o has hundreds of billions of these weights. Together, they encode everything the model learned from its training data — language patterns, factual knowledge, reasoning strategies, writing styles, and more.
How weights are learned
During training, the model processes enormous amounts of data and adjusts its weights to minimise errors. The adjustment loop is gradient descent, with a technique called backpropagation calculating how much each weight contributed to the error:
- The model makes a prediction based on its current weights
- The prediction is compared to the correct answer
- The error is measured
- The weights are adjusted slightly to reduce the error
- This cycle repeats billions of times across the training dataset
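The cycle above can be sketched in miniature. This toy example shrinks the idea down to a single weight and a single training example, so the "backpropagation" step is just the derivative of a squared error; real training repeats this over billions of weights and examples.

```python
w = 0.0                 # the model's single weight, before training
x, target = 2.0, 6.0    # we want the model to learn y = 3x
lr = 0.05               # learning rate: how big each adjustment is

for step in range(200):
    prediction = w * x           # 1. predict using the current weight
    error = prediction - target  # 2-3. compare to the answer, measure the error
    gradient = 2 * error * x     # 4. the slope of the squared error w.r.t. w
    w -= lr * gradient           #    nudge w slightly to reduce the error

print(round(w, 4))  # prints 3.0
```

No one told the program that the answer was 3 ; the repeated predict-measure-adjust cycle discovered it, which is exactly what happens (at vastly greater scale) when a model's weights converge during training.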
Over time, the weights converge to values that produce consistently useful output. No human specifies what these values should be — the model discovers them from the data.
Why weights matter for business
Understanding model weights helps you grasp several important concepts:
- Model size: When you hear "a 70-billion parameter model," those parameters are the weights. More weights generally means more capability but also more computational cost.
- Model files: When you download an open-source model like Llama, you are downloading its weight files. These are the model's learned knowledge in numerical form.
- Fine-tuning: When you fine-tune a model on your data, you are adjusting its weights to perform better on your specific tasks.
- Quantisation: A technique that reduces the precision of weights (for example, from 32-bit to 4-bit numbers) to make models smaller and faster, with some quality trade-off.
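The quantisation trade-off in the last bullet can be made concrete. The sketch below is a simplified uniform quantiser (production schemes are more sophisticated, often quantising per block with outlier handling): each weight is stored as a small integer plus one shared scale factor, and decoding it back shows the small precision loss.

```python
def quantise(weights, bits=4):
    """Map floats to signed integers in [-levels, levels] plus a shared scale."""
    levels = 2 ** (bits - 1) - 1                  # e.g. 7 for signed 4-bit
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) for w in weights], scale

def dequantise(ints, scale):
    """Recover approximate floats from the stored integers."""
    return [i * scale for i in ints]

weights = [0.81, -0.35, 0.02, -0.7]
ints, scale = quantise(weights)       # ints fit in 4 bits instead of 32
restored = dequantise(ints, scale)
# restored is close to the originals but not exact: that gap is the quality trade-off
```

Storing 4-bit integers instead of 32-bit floats cuts the weight file to roughly an eighth of its size, which is why quantised models run on much smaller hardware.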
Open weights vs closed weights
Some AI labs release their models' weights (open-weight models such as Llama, Mistral, and Gemma), allowing anyone to download, run, and modify them. Others keep their weights proprietary (closed models such as GPT-4o and Claude), offering access only through APIs.
Open-weight models give organisations more control over their data and deployment but require technical infrastructure to run. Closed models are easier to use but mean your data passes through the provider's servers.
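The infrastructure requirement is easy to estimate with back-of-the-envelope arithmetic: the memory needed just to hold the weights is the parameter count times the bits per weight. The figures below are illustrative only and exclude the extra memory inference itself requires.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory (in GB) to hold the weights alone."""
    return n_params * bits_per_weight / 8 / 1e9   # bits -> bytes -> gigabytes

n = 70e9  # a "70-billion parameter" open-weight model
for bits in (32, 16, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(n, bits):.0f} GB")
```

At full 32-bit precision a 70B model needs around 280 GB for its weights alone, far beyond a single consumer GPU, which is why quantisation and serious hardware planning go hand in hand with open-weight deployment.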
Why This Matters
Model weights are the intellectual property of AI companies — they represent billions of pounds in training investment. Understanding weights helps you evaluate the trade-offs between open and closed models, understand why model sizes vary, and make informed decisions about fine-tuning and deployment. When a vendor talks about model parameters or open-source AI, you will know exactly what they mean.
Continue learning in Foundations
This topic is covered in our lesson: How Large Language Models Actually Work