Beam Search
A text generation strategy that explores multiple possible continuations simultaneously and selects the sequence with the highest overall probability.
Beam search is a decoding algorithm used in AI text generation that keeps track of multiple candidate sequences at each step and ultimately selects the one with the highest cumulative probability.
The generation problem
When a language model generates text, it predicts one token at a time. At each step, there are thousands of possible next tokens, each with a different probability. The challenge is choosing a path through these possibilities that produces coherent, high-quality text.
How beam search works
Beam search maintains a fixed number of candidate sequences; that number is called the "beam width." If the beam width is 3, the algorithm keeps the top 3 most probable sequences at each step.
At step one, it picks the 3 most likely first tokens. At step two, it expands each of those 3 sequences with all possible next tokens, scores the resulting combinations, and keeps only the top 3 overall. This continues until the sequences are complete.
By considering multiple paths simultaneously, beam search avoids committing to a locally good choice that leads to a poor overall sequence.
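The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production decoder: the bigram table stands in for a real language model, and the names (`beam_search`, `step_probs`, `BIGRAMS`) are invented for this example. Scores are summed log-probabilities, which is how real implementations avoid numerical underflow when multiplying many small probabilities.

```python
import math

def beam_search(step_probs, beam_width=3, max_len=3):
    """Keep the `beam_width` highest-scoring candidate sequences at each step.

    `step_probs(seq)` returns a dict of next-token probabilities given the
    sequence so far. Scores are cumulative log-probabilities.
    """
    beams = [((), 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            # Expand each kept sequence with every possible next token.
            for token, p in step_probs(seq).items():
                candidates.append((seq + (token,), score + math.log(p)))
        # Keep only the top `beam_width` sequences overall.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams

# A toy bigram table standing in for a real model (made up for
# illustration; None marks the start of a sequence).
BIGRAMS = {
    None:   {"a": 0.45, "the": 0.40, "dogs": 0.15},
    "a":    {"bit": 0.34, "lot": 0.33, "tad": 0.33},
    "the":  {"cat": 0.9, "dog": 0.1},
    "cat":  {"sat": 0.9, "ran": 0.1},
    "dogs": {"bark": 1.0},
    "bit":  {"more": 1.0}, "lot": {"more": 1.0}, "tad": {"more": 1.0},
    "bark": {"loudly": 1.0},
}

def step_probs(seq):
    return BIGRAMS[seq[-1] if seq else None]

best_seq, _ = beam_search(step_probs, beam_width=3, max_len=3)[0]
print(" ".join(best_seq))  # -> the cat sat
```

Note that greedy decoding would start with "a" (the single most probable first token, at 0.45) and end up with "a bit more" (overall probability 0.153), while beam search keeps "the" alive long enough to find "the cat sat" (0.324) — exactly the locally-good-but-globally-poor trap described above.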
Beam search vs other strategies
- Greedy decoding: Always picks the single most probable next token. Fast but often produces repetitive or suboptimal text.
- Beam search: Explores multiple paths. Better quality but more computationally expensive.
- Sampling with temperature: Randomly selects from probable tokens, introducing variety. Better for creative text.
- Top-k and top-p sampling: Constrained randomness that balances quality and diversity.
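To make the contrast with sampling concrete, here is a minimal sketch of temperature sampling over a single next-token distribution. The function name and the toy distribution are invented for this example; real libraries apply the same rescaling to model logits.

```python
import math, random

def sample_with_temperature(probs, temperature, rng=random):
    """Rescale a next-token distribution by `temperature`, then sample.

    temperature < 1 sharpens the distribution (closer to greedy);
    temperature > 1 flattens it (more variety).
    """
    logits = {t: math.log(p) / temperature for t, p in probs.items()}
    z = sum(math.exp(l) for l in logits.values())
    scaled = {t: math.exp(l) / z for t, l in logits.items()}
    tokens, weights = zip(*scaled.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

probs = {"the": 0.6, "a": 0.3, "cat": 0.1}
# At a very low temperature this behaves almost like greedy decoding,
# nearly always returning "the"; at a high temperature, "a" and "cat"
# are chosen much more often.
print(sample_with_temperature(probs, temperature=0.1))
```

Greedy decoding is the limiting case: `max(probs, key=probs.get)` with no randomness at all.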
When beam search is used
Beam search excels in tasks where there is a clearly "correct" output: machine translation, speech recognition, and structured data generation. For open-ended creative writing or conversation, sampling-based methods are generally preferred because beam search tends to produce safe, repetitive text.
Practical impact
Most users never configure beam search directly. But understanding it explains why AI-generated text sometimes feels "safe" or predictable: deterministic decoding strategies like beam search optimise for the most probable output rather than the most interesting one.
Why This Matters
Understanding beam search helps you grasp why AI models sometimes produce bland or repetitive text and why adjusting generation settings like temperature can dramatically change output quality. It is foundational knowledge for anyone tuning AI outputs for specific use cases.