GPU (Graphics Processing Unit)

Last reviewed: April 2026

A specialised processor originally designed for rendering graphics but now essential for training and running AI models. GPUs can perform thousands of calculations simultaneously.

A GPU, or Graphics Processing Unit, is a specialised computer chip originally designed to render images and video for games and graphics applications. It turns out that the same kind of parallel processing that makes GPUs excellent at rendering pixels also makes them ideal for the mathematical operations that power AI. Today, GPUs are the backbone of the entire AI industry.

Why GPUs matter for AI

Traditional CPUs (the main processor in your computer) are optimised to execute a handful of complex tasks very quickly, largely one after another. GPUs are designed to handle thousands of simple tasks at the same time. This is called parallel processing.

AI training and inference involve enormous numbers of matrix multiplications — mathematical operations that can be broken into thousands of independent calculations. A GPU can perform these calculations in parallel, making AI workloads run 10-100x faster than on a CPU alone.
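The independence of those calculations is the whole story. A minimal sketch in plain Python (kept deliberately small for clarity): every cell of a matrix product depends only on one row of A and one column of B, so all cells can, in principle, be computed simultaneously.

```python
# Each output cell of a matrix product is an independent dot product of
# one row of A with one column of B -- exactly the kind of work a GPU
# spreads across thousands of cores at once.
A = [[1, 2], [3, 4]]   # 2x2 matrix
B = [[5, 6], [7, 8]]   # 2x2 matrix

def cell(i, j):
    # One independent calculation: row i of A dotted with column j of B
    return sum(A[i][k] * B[k][j] for k in range(len(B)))

# A CPU works through these cells largely in sequence; a GPU computes
# thousands of such cells in parallel.
C = [[cell(i, j) for j in range(2)] for i in range(2)]
print(C)  # [[19, 22], [43, 50]]
```

Real AI workloads do the same thing at vastly larger scale: a single transformer layer multiplies matrices with millions of entries, each entry an independent calculation of exactly this shape.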

The GPU landscape

The AI GPU market is dominated by NVIDIA, whose chips power the vast majority of AI training and inference worldwide:

  • NVIDIA H100/H200: The current standard for AI training and inference in data centres
  • NVIDIA A100: The previous generation, still widely used
  • NVIDIA RTX series: Consumer GPUs that can run smaller AI models locally
  • AMD MI300: An emerging competitor to NVIDIA's data centre GPUs
  • Apple Silicon (M-series): Integrated GPU capabilities that can run smaller models on laptops

GPUs and AI costs

GPU availability and cost are major factors in AI economics:

  • Training a frontier model requires thousands of GPUs running for months, costing hundreds of millions of pounds
  • A single high-end AI GPU (H100) costs approximately £25,000-40,000
  • AI companies invest billions in GPU clusters — entire data centres filled with GPUs
  • GPU scarcity has been a bottleneck for AI development, with waiting lists for the latest chips
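The scale of these figures follows from simple arithmetic. A back-of-envelope sketch, using entirely illustrative numbers (cluster size, duration, and hourly rate are assumptions, and real training runs also pay for networking, power, storage, and failed experiments on top of raw GPU time):

```python
# Back-of-envelope GPU cost of a large training run.
# All three inputs are illustrative assumptions, not real figures.
gpus = 10_000              # GPUs in the training cluster (assumption)
hours = 90 * 24            # three months of continuous training
cost_per_gpu_hour = 3.00   # assumed rate in pounds per GPU-hour

total_cost = gpus * hours * cost_per_gpu_hour
print(f"£{total_cost:,.0f}")  # £64,800,000
```

Even with these conservative assumptions the raw GPU time alone runs to tens of millions of pounds, which is why total frontier-model budgets reach the hundreds of millions quoted above.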

This is why AI API pricing exists: instead of buying your own GPUs, you pay a provider (OpenAI, Anthropic, Google) a fraction of a penny per query to use their GPU infrastructure.
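The per-query economics can be sketched the same way. The rate and token count below are illustrative assumptions, not any real provider's pricing; the point is that amortising shared GPU infrastructure over huge request volumes drives the per-query cost down to a fraction of a penny:

```python
# Why API pricing works out to a fraction of a penny per query.
# Both inputs are illustrative assumptions, not real provider rates.
price_per_million_tokens = 2.50  # assumed £ per million tokens
tokens_per_query = 1_000         # assumed prompt + response length

cost_per_query = price_per_million_tokens * tokens_per_query / 1_000_000
print(f"£{cost_per_query:.4f} per query")  # £0.0025 per query
```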

Cloud GPU access

You do not need to buy GPUs to use AI. Cloud providers offer GPU access on demand:

  • AWS, Google Cloud, Azure: Rent GPU instances by the hour for training or running models
  • Specialised providers: Lambda, CoreWeave, and others focus specifically on GPU cloud services
  • AI APIs: The simplest approach — use Claude, GPT, or Gemini through their APIs and the provider handles all GPU infrastructure

GPUs and your AI strategy

For most organisations, the GPU question boils down to:

  • API access (most common): You never think about GPUs. The AI provider manages everything. Best for most businesses.
  • Cloud GPUs: You rent GPU time to run your own models. Useful for custom AI applications or data privacy requirements.
  • On-premise GPUs: You buy and manage your own GPU hardware. Only necessary for organisations with strict data sovereignty requirements or very high-volume AI workloads.
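The decision logic above can be sketched as a small function. The function name, parameters, and thresholds are illustrative assumptions, not a formal rubric; real decisions also weigh budget, in-house expertise, and regulatory detail:

```python
# A minimal sketch of the three-way GPU decision described above.
# Parameter names and the ordering of checks are assumptions.
def gpu_strategy(strict_data_sovereignty: bool,
                 very_high_volume: bool,
                 runs_custom_models: bool) -> str:
    if strict_data_sovereignty or very_high_volume:
        return "on-premise GPUs"   # own and manage your own hardware
    if runs_custom_models:
        return "cloud GPUs"        # rent GPU time for your own models
    return "API access"            # let the provider handle everything

print(gpu_strategy(False, False, False))  # API access
print(gpu_strategy(False, False, True))   # cloud GPUs
print(gpu_strategy(True, False, True))    # on-premise GPUs
```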

Why This Matters

GPUs are the physical infrastructure that makes AI possible, and their cost and availability directly impact AI pricing, performance, and strategy. Understanding GPUs helps you evaluate why AI services cost what they do, why some models are faster than others, and whether your organisation should invest in its own GPU infrastructure or use cloud-based AI services. For most businesses, the answer is API access — but knowing why saves you from unnecessary infrastructure investments.
