Sequence-to-Sequence (Seq2Seq)
A model architecture that converts one sequence of data into another, originally designed for machine translation and now underlying many AI text tasks.
Sequence-to-sequence is a model architecture designed to transform one sequence into another. It takes an input sequence (like a sentence in English) and produces an output sequence (like the same sentence in French). This architecture was a breakthrough for machine translation and laid the groundwork for modern AI assistants.
How it works
A seq2seq model has two main components:
- Encoder: Reads the input sequence and compresses it into a fixed-length representation called a context vector. This vector captures the meaning of the entire input.
- Decoder: Takes the context vector and generates the output sequence one token at a time. At each step, it considers the context vector and all previously generated tokens.
The encoder processes the input; the decoder produces the output. This simple division of labour handles remarkably complex transformations.
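The division of labour can be sketched in a few lines. This is a toy illustration, not a trainable model: the weights are random, the dimensions are arbitrary assumptions, and plain RNN cells stand in for whatever the encoder and decoder actually use. The point is the flow, where the encoder compresses the input into one context vector and the decoder unrolls from it token by token.

```python
import numpy as np

rng = np.random.default_rng(0)
emb, hid, vocab = 8, 16, 10              # embedding, hidden, and vocab sizes (arbitrary)

E = rng.normal(0, 0.1, (vocab, emb))     # token embedding table
W_xh = rng.normal(0, 0.1, (emb, hid))    # input-to-hidden weights
W_hh = rng.normal(0, 0.1, (hid, hid))    # hidden-to-hidden weights
W_hy = rng.normal(0, 0.1, (hid, vocab))  # decoder projection to vocabulary

def encode(token_ids):
    """Read the input sequence; the final hidden state is the context vector."""
    h = np.zeros(hid)
    for t in token_ids:
        h = np.tanh(E[t] @ W_xh + h @ W_hh)
    return h

def decode(context, max_len=5, start_token=0):
    """Generate the output one token at a time, starting from the context."""
    h, tok, out = context, start_token, []
    for _ in range(max_len):
        h = np.tanh(E[tok] @ W_xh + h @ W_hh)
        tok = int(np.argmax(h @ W_hy))   # greedily pick the most likely next token
        out.append(tok)
    return out

context = encode([3, 1, 4, 1])           # encoder: whole input -> one vector
output = decode(context)                 # decoder: one vector -> output sequence
print(context.shape, output)
```

Note that the decoder sees the input only through `context`: everything the encoder read must squeeze through that single fixed-length vector, which is exactly the bottleneck discussed below.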
Applications
Seq2seq models power many AI capabilities:
- Machine translation: Converting text from one language to another
- Text summarisation: Converting a long document into a short summary
- Question answering: Converting a question into an answer
- Chatbots: Converting a user message into a response
- Code generation: Converting a natural language description into code
- Speech recognition: Converting audio sequences into text
Evolution of seq2seq
The original seq2seq models (2014) used recurrent neural networks, which processed input one token at a time. This worked but was slow and struggled with long sequences because the fixed-length context vector became a bottleneck.
The attention mechanism (2015) solved this by allowing the decoder to look back at all encoder positions, not just the compressed context vector. The decoder could focus on the most relevant parts of the input at each generation step.
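The core of attention fits in one function. A minimal sketch with dot-product scoring, using hand-built orthogonal encoder states so the behaviour is easy to predict; real models learn these vectors and often use a learned scoring function instead.

```python
import numpy as np

def attention(decoder_state, encoder_states):
    """Score every encoder position against the current decoder state,
    softmax the scores, and return the weighted average of encoder states."""
    scores = encoder_states @ decoder_state           # one score per input position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # softmax over positions
    return weights @ encoder_states, weights

enc = np.eye(4)               # 4 input positions with orthogonal toy states
dec = enc[2] * 5.0            # a decoder state aligned with position 2
context, w = attention(dec, enc)
print(int(w.argmax()))        # the decoder attends mostly to position 2
```

Because the context vector is recomputed at every decoding step, the decoder can focus on different input positions for different output tokens, rather than relying on one compressed summary.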
The transformer architecture (2017) took this further by removing the sequential processing entirely, using self-attention to process all positions simultaneously. Modern LLMs are essentially advanced seq2seq systems built entirely on transformer architecture.
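The "all positions simultaneously" idea is visible in the scaled dot-product self-attention formula. A minimal numpy sketch (random weights, arbitrary sizes, single head, no masking or multi-head machinery): every token's output is computed from every other token in one matrix product, with no sequential loop.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over all positions at once."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                  # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # all-pairs similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # mix values by attention

rng = np.random.default_rng(0)
d = 16
X = rng.normal(size=(6, d))                           # 6 tokens, processed in parallel
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                                      # one output vector per token
```

Contrast this with the RNN sketch above: there is no per-token loop, so all six positions can be computed in parallel on modern hardware, which is a large part of why transformers scaled where RNNs did not.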
Why the concept still matters
Even though the specific architectures have evolved, the seq2seq framework remains the mental model for understanding how AI transforms inputs into outputs. When you give Claude a prompt and receive a response, you are using a seq2seq system, just a vastly more powerful one than the original 2014 design.
Why This Matters
The seq2seq framework is the conceptual foundation for understanding how AI generates text, translates languages, and answers questions. Understanding this architecture helps you grasp why AI has certain capabilities and limitations, and how modern transformer-based models evolved from earlier designs.
Continue learning in Advanced
This topic is covered in our lesson: Neural Network Architectures Explained