Variational Autoencoder (VAE)
A generative AI architecture that learns to compress data into a compact representation and generate new, similar data from that representation.
A variational autoencoder is a type of generative AI model that learns to compress data into a compact representation and then reconstruct it, or generate entirely new, similar data. VAEs are used in image generation, drug discovery, anomaly detection, and data augmentation.
How a basic autoencoder works
An autoencoder has two halves:
- Encoder: Compresses input data into a smaller representation (called the latent space or bottleneck)
- Decoder: Reconstructs the original data from the compressed representation
The model is trained by feeding it data and asking it to reproduce that data after passing through the bottleneck. To succeed, the encoder must learn what information is essential and discard what is not.
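The encode-bottleneck-decode structure can be sketched in a few lines. This is an illustrative toy with random linear weights, not a trained model; in practice both halves are neural networks whose weights are learned by minimizing reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear autoencoder: 4-dimensional inputs, 2-dimensional bottleneck.
# The weights here are random placeholders; training would adjust them
# to minimize the reconstruction error below.
W_enc = rng.normal(size=(4, 2))   # encoder: input -> latent
W_dec = rng.normal(size=(2, 4))   # decoder: latent -> reconstruction

def encode(x):
    return x @ W_enc              # compress into the 2-d latent space

def decode(z):
    return z @ W_dec              # reconstruct the 4-d input

x = rng.normal(size=(1, 4))
z = encode(x)
x_hat = decode(z)

# Mean squared reconstruction error: the quantity training minimizes.
mse = np.mean((x - x_hat) ** 2)
```

Because the bottleneck has fewer dimensions than the input, the encoder cannot simply copy the data through; it must keep only the information that best supports reconstruction.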
What makes a VAE "variational"
A standard autoencoder produces a single point in the latent space for each input. A VAE instead produces a probability distribution: a mean and variance that define a region in latent space. During training, the model:
- Encodes the input into a distribution (not a single point)
- Samples a random point from that distribution
- Decodes the sampled point back into data
This randomness has a crucial benefit: the latent space becomes smooth and continuous. Nearby points in the latent space produce similar outputs, and you can sample random points to generate entirely new data.
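The encode-sample-decode steps above can be sketched as follows. The mean and log-variance values are made-up placeholders standing in for encoder outputs; the sampling form shown (the "reparameterization trick") and the KL-divergence term are standard parts of VAE training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder encoder outputs for one input: a mean and log-variance
# per latent dimension (log-variance keeps the variance positive).
mu = np.array([0.5, -1.0])
log_var = np.array([0.1, -0.2])

# Reparameterization trick: sample z = mu + sigma * eps with eps drawn
# from a standard normal. Writing the sample this way keeps it
# differentiable in mu and log_var, so the encoder can be trained
# by backpropagation; z would then be passed to the decoder.
eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * log_var) * eps

# KL-divergence penalty (summed over latent dimensions) that pulls the
# encoded distributions toward a standard normal prior. This is what
# keeps the latent space smooth and suitable for random sampling.
kl = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))
```

The KL term is the "variational" part of the loss: without it, the encoder could scatter inputs arbitrarily across the latent space, and sampling random points would produce garbage.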
Applications
- Image generation: Generate new faces, artwork, or product designs by sampling from the latent space
- Drug discovery: Explore chemical spaces to generate novel molecular structures with desired properties
- Anomaly detection: Learn what "normal" data looks like, then flag data that produces high reconstruction error
- Data augmentation: Generate synthetic training examples to supplement limited datasets
- Style transfer: Manipulate specific attributes by navigating the latent space
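The anomaly-detection use above hinges on one idea: a model trained to compress and reconstruct "normal" data will reconstruct unusual data poorly. As a minimal sketch, the snippet below uses a principal-component projection as a stand-in for a trained autoencoder; the data and threshold are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" data: elongated along one axis, tight along the other.
normal = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.3]])

# Stand-in for a learned 1-d bottleneck: the leading principal direction.
center = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - center, full_matrices=False)
direction = vt[0]

def reconstruction_error(x):
    z = (x - center) @ direction               # "encode" to 1-d
    x_hat = center + np.outer(z, direction)    # "decode" back to 2-d
    return np.linalg.norm(x - x_hat, axis=1)

# Flag anything reconstructed worse than 99% of the normal data.
threshold = np.percentile(reconstruction_error(normal), 99)
outlier = np.array([[0.0, 5.0]])               # far off the normal axis
is_anomaly = reconstruction_error(outlier)[0] > threshold
```

A real VAE adds a second signal: inputs whose encoded distribution sits far from the prior are also suspicious, even when reconstruction error is moderate.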
VAEs vs GANs vs diffusion models
VAEs are one of three major generative model families:
- VAEs: Produce slightly blurry but diverse outputs. Training is stable. Good for structured generation and anomaly detection.
- GANs (Generative Adversarial Networks): Produce sharper images but are harder to train (unstable, prone to mode collapse) and less diverse.
- Diffusion models: The current state of the art for image generation (Stable Diffusion, DALL-E). Higher quality than both VAEs and GANs, but slower at generation because sampling requires many denoising steps.
Why VAEs still matter
Despite being overshadowed by diffusion models for image generation, VAEs remain valuable because they provide a meaningful latent space: a compressed representation that captures the essential features of the data. This makes them useful for tasks beyond generation, including clustering, interpolation, and understanding data structure.
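Interpolation, mentioned above, is one place the smooth latent space pays off: blending two latent codes and decoding each step yields plausible in-between outputs. A minimal sketch, with made-up latent codes standing in for encoder outputs:

```python
import numpy as np

# Two latent codes, as a trained encoder might produce for two inputs.
# The values are illustrative placeholders.
z_a = np.array([0.0, 1.0])
z_b = np.array([2.0, -1.0])

# Linear interpolation in latent space; in a real VAE, each step would
# be passed through the decoder to render the in-between output.
steps = [(1 - t) * z_a + t * z_b for t in np.linspace(0.0, 1.0, 5)]
```

In a plain autoencoder the same walk can cross "dead" regions that decode to nonsense; the VAE's KL regularization is what makes these intermediate points meaningful.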
Why This Matters
VAEs represent a fundamental approach to generative AI with applications beyond image creation. Understanding them helps you evaluate AI tools for anomaly detection, drug discovery, and data augmentation, and appreciate how different generative architectures offer different trade-offs between quality, speed, and controllability.
Continue learning in Advanced
This topic is covered in our lesson: Neural Network Architectures Explained