TPU (Tensor Processing Unit)
A TPU, or Tensor Processing Unit, is a custom-designed computer chip created by Google specifically for machine learning workloads. While GPUs are general-purpose parallel processors adapted for AI, TPUs are purpose-built from the ground up for the mathematical operations — particularly tensor operations — that neural networks rely on.
What is a tensor?
A tensor is a multi-dimensional array of numbers. In AI, tensors are the fundamental data structure:
- A single number is a scalar (0-dimensional tensor)
- A list of numbers is a vector (1-dimensional tensor)
- A table of numbers is a matrix (2-dimensional tensor)
- Higher-dimensional arrays are tensors (3+ dimensions)
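The ranks above can be sketched in plain Python, using nested lists in place of a real array library (in practice you would use NumPy, JAX, or TensorFlow). The `shape` helper here is illustrative, not a standard function:

```python
# Illustrative sketch of tensor ranks using plain Python lists.

def shape(t):
    """Return the shape of a nested-list 'tensor' as a tuple."""
    if not isinstance(t, list):
        return ()          # a scalar has no dimensions
    return (len(t),) + shape(t[0])

scalar = 3.14                          # 0-dimensional: a single number
vector = [1.0, 2.0, 3.0]               # 1-dimensional: a list of numbers
matrix = [[1, 2], [3, 4], [5, 6]]      # 2-dimensional: a table of numbers
tensor3 = [[[0] * 4 for _ in range(2)] for _ in range(3)]  # 3-dimensional

print(shape(scalar))   # ()
print(shape(vector))   # (3,)
print(shape(matrix))   # (3, 2)
print(shape(tensor3))  # (3, 2, 4)
```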
Neural network computations are essentially tensor operations — multiplying, adding, and transforming these multi-dimensional arrays. TPUs are optimised specifically for these operations.
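As a concrete sketch, a single dense neural-network layer is just a matrix multiplied by a vector, plus a bias vector: exactly the kind of tensor operation a TPU's matrix units accelerate. The plain-Python version below is for illustration only; real workloads run these operations through libraries such as JAX or TensorFlow:

```python
# A single dense layer, y = W*x + b, written as plain tensor operations.

def dense_layer(W, x, b):
    """Multiply weight matrix W by input vector x, then add bias vector b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

W = [[0.5, -1.0],
     [2.0,  0.0]]        # 2x2 weight matrix
x = [1.0, 3.0]           # input vector
b = [0.1, -0.1]          # bias vector

print(dense_layer(W, x, b))  # [-2.4, 1.9]
```

A trained model applies millions of these multiply-and-add steps per prediction, which is why hardware built around fast matrix multiplication pays off.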
TPU vs GPU
Both TPUs and GPUs accelerate AI workloads, but they differ:
- GPUs (primarily NVIDIA) are versatile, widely available, and supported by a mature software ecosystem. They handle a wide range of AI tasks well.
- TPUs (Google only) are designed specifically for machine learning. They can be more efficient for certain workloads but are only available through Google Cloud.
In practice, most AI development happens on GPUs because of their wider availability and software support. TPUs are primarily used within Google's own AI projects and by organisations deeply integrated with Google Cloud.
Generations of TPUs
Google has released multiple TPU generations, each more powerful:
- TPU v1 (2016): Google's first custom AI chip, used internally for inference
- TPU v2/v3: Made available through Google Cloud for external developers
- TPU v4: Significant performance improvements, used to train Google's PaLM and Gemini models
- TPU v5 and v6 (Trillium): The latest generations, with further performance and efficiency gains
Why TPUs matter
TPUs represent an important trend: the move toward specialised AI hardware. As AI workloads have grown, general-purpose processors have become insufficient. The race to build the most efficient AI chips now involves:
- Google (TPUs)
- NVIDIA (GPUs optimised for AI)
- AMD (MI-series GPUs)
- Amazon (Trainium and Inferentia chips)
- Microsoft (Maia chips)
- Various startups (Cerebras, Groq, SambaNova)
This competition drives down costs and improves performance, which ultimately benefits everyone using AI services.
Practical implications
For most business users, TPUs are invisible — they power some Google AI services behind the scenes, but you interact with them through APIs, not directly. The key takeaway is understanding why custom AI chips exist: AI workloads are so massive and economically important that companies are designing specialised hardware to run them more efficiently. This trend will continue to improve AI performance and reduce costs over time.
Why This Matters
TPUs illustrate the broader trend of purpose-built AI hardware, which directly impacts AI pricing and performance. Understanding that companies like Google invest billions in custom chips helps you appreciate why AI capabilities are improving so rapidly and why costs continue to fall. For strategic planning, the hardware competition between Google, NVIDIA, AMD, and others suggests that AI will continue to become cheaper and more accessible.
Continue learning in Foundations
This topic is covered in our lesson: How Large Language Models Actually Work