Recurrent Neural Network (RNN)
A type of neural network designed to process sequential data by maintaining a memory of previous inputs, once widely used for text and time-series tasks.
A recurrent neural network is a type of neural network that processes data in sequence, maintaining an internal memory that carries information from one step to the next. Unlike standard neural networks that treat each input independently, RNNs are designed to understand order and context, making them a natural fit for text, speech, and time-series data.
How RNNs work
Imagine reading a sentence word by word. As you read each word, you carry the meaning of previous words in your memory. An RNN does something similar. At each step, it takes two inputs: the current data point and its memory of what came before. It produces an output and updates its memory, then moves to the next step.
This loop (process, remember, move forward) is what makes RNNs "recurrent." The same network is applied at every step, with memory flowing through the sequence.
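The loop described above can be sketched in a few lines of NumPy. The sizes, weights, and data here are illustrative placeholders, not part of any real model; the point is that the same weights are reused at every step, with the hidden state `h` acting as the memory.

```python
import numpy as np

# Hypothetical sizes for illustration: 4-dim inputs, 8-dim hidden memory.
input_size, hidden_size = 4, 8
rng = np.random.default_rng(0)

# The same weights are reused at every step; this sharing is the recurrence.
W_x = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden (memory)
b = np.zeros(hidden_size)

def rnn_step(x, h):
    """One step: combine the current input x with the previous memory h."""
    return np.tanh(W_x @ x + W_h @ h + b)

sequence = rng.normal(size=(5, input_size))  # five data points, in order
h = np.zeros(hidden_size)                    # memory starts empty
for x in sequence:
    h = rnn_step(x, h)                       # process, remember, move forward

print(h.shape)  # the final memory summarises the whole sequence
```

At the end of the loop, `h` is a fixed-size summary of everything the network has seen, which is what a downstream layer (say, a classifier) would consume.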
The vanishing gradient problem
RNNs have a fundamental weakness: they struggle to learn dependencies that span many steps. During training, the gradient that connects an early input to a late output is a product of one factor per step, and when those repeated multiplications shrink the gradient toward zero, early inputs stop influencing learning. This is the vanishing gradient problem. An RNN processing a long document might effectively forget what the opening paragraph said by the time it reaches the conclusion.
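A toy scalar RNN makes the multiplication effect concrete. The weight value and step count below are arbitrary choices for illustration; with a recurrent weight below 1 in magnitude, the chain-rule product across 100 steps collapses toward zero.

```python
# Toy scalar RNN h_t = tanh(w * h_{t-1}): the gradient of the final state
# with respect to the first is a product of one factor per step.
import math

w = 0.9            # a recurrent weight below 1 in magnitude (illustrative)
h = 0.5            # arbitrary starting state
factors = []
for _ in range(100):
    h = math.tanh(w * h)
    # d h_t / d h_{t-1} = w * (1 - tanh(...)^2), always below |w| here
    factors.append(w * (1.0 - h * h))

gradient = math.prod(factors)  # chain rule across 100 steps
print(gradient)  # vanishingly small: the first step barely affects the last
```

Each per-step factor is below 0.9, so the 100-step product is smaller than 0.9^100, which is why early inputs effectively stop mattering.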
Two architectures were developed to address this:
- LSTM (Long Short-Term Memory): Adds gates that control what information to keep, update, or discard, allowing the network to maintain relevant information over much longer sequences.
- GRU (Gated Recurrent Unit): A simplified version of LSTM that achieves similar results with fewer parameters.
RNNs vs transformers
RNNs dominated sequence processing until 2017, when the transformer architecture arrived. Transformers process entire sequences simultaneously rather than step by step, making them faster to train and better at capturing long-range dependencies. Today, transformers power virtually all large language models.
However, RNNs remain relevant in specific scenarios: real-time systems where low latency matters, embedded devices with limited memory, and certain time-series applications where sequential processing is natural.
Why you still encounter RNNs
Many production systems still use LSTM or GRU models for tasks like anomaly detection in time-series data, real-time speech processing, and sensor data analysis. Understanding RNNs helps you evaluate whether a legacy system needs upgrading or whether it is still fit for purpose.
Why This Matters
While transformers have largely replaced RNNs for language tasks, recurrent architectures remain in production across many industries. Understanding RNNs helps you assess existing AI systems, recognise when an upgrade to transformer-based models would deliver meaningful improvement, and understand the historical evolution that led to today's AI capabilities.
This topic is covered in our lesson: Neural Network Architectures Explained