Recurrent Neural Network (RNN)
A type of neural network designed to process sequential data by maintaining a memory of previous inputs, once widely used for text and time-series tasks.
A recurrent neural network is a type of neural network that processes data in sequence, maintaining an internal memory that carries information from one step to the next. Unlike standard neural networks that treat each input independently, RNNs are designed to understand order and context, making them a natural fit for text, speech, and time-series data.
How RNNs work
Imagine reading a sentence word by word. As you read each word, you carry the meaning of previous words in your memory. An RNN does something similar. At each step, it takes two inputs: the current data point and its memory of what came before. It produces an output and updates its memory, then moves to the next step.
This loop (process, remember, move forward) is what makes RNNs "recurrent." The same network is applied at every step, with memory flowing through the sequence.
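The loop described above can be sketched in a few lines of NumPy. The sizes, weights, and data here are illustrative placeholders, not part of any real model; the point is that the same weights are reused at every step, with the hidden state `h` acting as the memory.

```python
import numpy as np

# Hypothetical sizes for illustration: 4-dim inputs, 8-dim hidden memory.
input_size, hidden_size = 4, 8
rng = np.random.default_rng(0)

# The same weights are reused at every step; this sharing is the recurrence.
W_x = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input -> hidden
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden (memory)
b = np.zeros(hidden_size)

def rnn_step(x, h):
    """One step: combine the current input x with the previous memory h."""
    return np.tanh(W_x @ x + W_h @ h + b)

sequence = rng.normal(size=(5, input_size))  # five data points, in order
h = np.zeros(hidden_size)                    # memory starts empty
for x in sequence:
    h = rnn_step(x, h)                       # process, remember, move forward

print(h.shape)  # the final memory summarises the whole sequence
```

At the end of the loop, `h` is a fixed-size summary of everything the network has seen, which is what a downstream layer (say, a classifier) would consume.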
The vanishing gradient problem
RNNs have a fundamental weakness: they struggle to learn dependencies that span many steps. During training, the gradient that connects an early input to a late output is a product of one factor per step, and when those repeated multiplications shrink the gradient toward zero, early inputs stop influencing learning. This is the vanishing gradient problem. An RNN processing a long document might effectively forget what the opening paragraph said by the time it reaches the conclusion.
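A toy scalar RNN makes the multiplication effect concrete. The weight value and step count below are arbitrary choices for illustration; with a recurrent weight below 1 in magnitude, the chain-rule product across 100 steps collapses toward zero.

```python
# Toy scalar RNN h_t = tanh(w * h_{t-1}): the gradient of the final state
# with respect to the first is a product of one factor per step.
import math

w = 0.9            # a recurrent weight below 1 in magnitude (illustrative)
h = 0.5            # arbitrary starting state
factors = []
for _ in range(100):
    h = math.tanh(w * h)
    # d h_t / d h_{t-1} = w * (1 - tanh(...)^2), always below |w| here
    factors.append(w * (1.0 - h * h))

gradient = math.prod(factors)  # chain rule across 100 steps
print(gradient)  # vanishingly small: the first step barely affects the last
```

Each per-step factor is below 0.9, so the 100-step product is smaller than 0.9^100, which is why early inputs effectively stop mattering.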
Two architectures were developed to address this:
- LSTM (Long Short-Term Memory): Adds gates that control what information to keep, update, or discard, allowing the network to maintain relevant information over much longer sequences.
- GRU (Gated Recurrent Unit): A simplified version of LSTM that achieves similar results with fewer parameters.
RNNs vs transformers
RNNs dominated sequence processing until 2017, when the transformer architecture arrived. Transformers process entire sequences simultaneously rather than step by step, making them faster to train and better at capturing long-range dependencies. Today, transformers power virtually all large language models.
However, RNNs remain relevant in specific scenarios: real-time systems where low latency matters, embedded devices with limited memory, and certain time-series applications where sequential processing is natural.
Why you still encounter RNNs
Many production systems still use LSTM or GRU models for tasks like anomaly detection in time-series data, real-time speech processing, and sensor data analysis. Understanding RNNs helps you evaluate whether a legacy system needs upgrading or whether it is still fit for purpose.
Why This Matters
While transformers have largely replaced RNNs for language tasks, recurrent architectures remain in production across many industries. Understanding RNNs helps you assess existing AI systems, recognise when an upgrade to transformer-based models would deliver meaningful improvement, and understand the historical evolution that led to today's AI capabilities.
This topic is covered in our lesson: Neural Network Architectures Explained