TPU (Tensor Processing Unit)
A TPU, or Tensor Processing Unit, is a custom-designed computer chip created by Google specifically for machine learning workloads. While GPUs are general-purpose parallel processors adapted for AI, TPUs are purpose-built from the ground up for the mathematical operations — particularly tensor operations — that neural networks rely on.
What is a tensor?
A tensor is a multi-dimensional array of numbers. In AI, tensors are the fundamental data structure:
- A single number is a scalar (0-dimensional tensor)
- A list of numbers is a vector (1-dimensional tensor)
- A table of numbers is a matrix (2-dimensional tensor)
- Higher-dimensional arrays are tensors (3+ dimensions)
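The ranks above can be sketched in plain Python, using nested lists in place of a real array library (in practice you would use NumPy, JAX, or TensorFlow). The `shape` helper here is illustrative, not a standard function:

```python
# Illustrative sketch of tensor ranks using plain Python lists.

def shape(t):
    """Return the shape of a nested-list 'tensor' as a tuple."""
    if not isinstance(t, list):
        return ()          # a scalar has no dimensions
    return (len(t),) + shape(t[0])

scalar = 3.14                          # 0-dimensional: a single number
vector = [1.0, 2.0, 3.0]               # 1-dimensional: a list of numbers
matrix = [[1, 2], [3, 4], [5, 6]]      # 2-dimensional: a table of numbers
tensor3 = [[[0] * 4 for _ in range(2)] for _ in range(3)]  # 3-dimensional

print(shape(scalar))   # ()
print(shape(vector))   # (3,)
print(shape(matrix))   # (3, 2)
print(shape(tensor3))  # (3, 2, 4)
```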
Neural network computations are essentially tensor operations — multiplying, adding, and transforming these multi-dimensional arrays. TPUs are optimised specifically for these operations.
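As a concrete sketch, a single dense neural-network layer is just a matrix multiplied by a vector, plus a bias vector: exactly the kind of tensor operation a TPU's matrix units accelerate. The plain-Python version below is for illustration only; real workloads run these operations through libraries such as JAX or TensorFlow:

```python
# A single dense layer, y = W*x + b, written as plain tensor operations.

def dense_layer(W, x, b):
    """Multiply weight matrix W by input vector x, then add bias vector b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

W = [[0.5, -1.0],
     [2.0,  0.0]]        # 2x2 weight matrix
x = [1.0, 3.0]           # input vector
b = [0.1, -0.1]          # bias vector

print(dense_layer(W, x, b))  # [-2.4, 1.9]
```

A trained model applies millions of these multiply-and-add steps per prediction, which is why hardware built around fast matrix multiplication pays off.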
TPU vs GPU
Both TPUs and GPUs accelerate AI workloads, but they differ:
- GPUs (primarily NVIDIA) are versatile, widely available, and supported by a mature software ecosystem. They handle a wide range of AI tasks well.
- TPUs (Google only) are designed specifically for machine learning. They can be more efficient for certain workloads but are only available through Google Cloud.
In practice, most AI development happens on GPUs because of their wider availability and software support. TPUs are primarily used within Google's own AI projects and by organisations deeply integrated with Google Cloud.
Generations of TPUs
Google has released multiple TPU generations, each more powerful:
- TPU v1 (2016): Google's first custom AI chip, used internally for inference
- TPU v2/v3: Made available through Google Cloud for external developers
- TPU v4: Significant performance improvements, used to train Google's PaLM and Gemini models
- TPU v5 and v6 (Trillium): The latest generations, with further performance and efficiency gains
Why TPUs matter
TPUs represent an important trend: the move toward specialised AI hardware. As AI workloads have grown, general-purpose processors have become insufficient. The race to build the most efficient AI chips now involves:
- Google (TPUs)
- NVIDIA (GPUs optimised for AI)
- AMD (MI-series GPUs)
- Amazon (Trainium and Inferentia chips)
- Microsoft (Maia chips)
- Various startups (Cerebras, Groq, SambaNova)
This competition drives down costs and improves performance, which ultimately benefits everyone using AI services.
Practical implications
For most business users, TPUs are invisible — they power some Google AI services behind the scenes, but you interact with them through APIs, not directly. The key takeaway is understanding why custom AI chips exist: AI workloads are so massive and economically important that companies are designing specialised hardware to run them more efficiently. This trend will continue to improve AI performance and reduce costs over time.
Why This Matters
TPUs illustrate the broader trend of purpose-built AI hardware, which directly impacts AI pricing and performance. Understanding that companies like Google invest billions in custom chips helps you appreciate why AI capabilities are improving so rapidly and why costs continue to fall. For strategic planning, the hardware competition between Google, NVIDIA, AMD, and others suggests that AI will continue to become cheaper and more accessible.
Continue learning in Foundations
This topic is covered in our lesson: How Large Language Models Actually Work