Temperature Scaling
A parameter that controls the randomness of AI model outputs: lower temperatures produce more predictable responses, while higher temperatures increase creativity and variety.
Temperature is a parameter that controls how random or deterministic an AI model's outputs are. It is one of the most important and commonly adjusted settings when using language models.
How temperature works
When a language model predicts the next token, it assigns a probability to every possible token in its vocabulary. Temperature scales these probabilities before a token is selected.
- Low temperature (0.0-0.3): Sharpens the probability distribution. The most likely tokens become even more likely, and unlikely tokens become nearly impossible. The model becomes more predictable and deterministic.
- Medium temperature (0.4-0.7): A balanced distribution that allows some variety while staying coherent.
- High temperature (0.8-1.5+): Flattens the probability distribution. Less likely tokens get a greater chance of being selected, introducing more randomness, creativity, and unpredictability.
At temperature 0, the model always selects the most probable token, so the output is completely deterministic. As temperature increases, the model is increasingly willing to take less probable paths.
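The scaling described above can be sketched in a few lines: divide the model's raw token scores (logits) by the temperature before applying softmax. This is a minimal illustration with toy logits, not any particular model's implementation.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by temperature, then normalize into probabilities."""
    if temperature <= 0:
        # Temperature 0 is greedy decoding: all probability on the top token.
        probs = [0.0] * len(logits)
        probs[max(range(len(logits)), key=lambda i: logits[i])] = 1.0
        return probs
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # toy next-token scores
low = softmax_with_temperature(logits, 0.2)   # sharpened: top token dominates
high = softmax_with_temperature(logits, 1.5)  # flattened: more even spread
```

Running this shows the effect directly: at temperature 0.2 the top token takes nearly all the probability mass, while at 1.5 the distribution spreads out across all three tokens.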
Choosing the right temperature
- Low temperature for: factual answers, data extraction, code generation, technical writing, structured output, any task where accuracy matters more than creativity.
- Medium temperature for: general conversation, business writing, summarization, balanced tasks requiring both accuracy and natural language.
- High temperature for: creative writing, brainstorming, generating diverse options, fiction, poetry, any task where variety and novelty are desired.
Temperature in practice
Most AI APIs expose temperature as a configurable parameter. Default values are typically around 0.7-1.0. For production applications where consistency matters, lower temperatures are generally preferred. For creative tools and idea generation, higher temperatures produce more interesting results.
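The consistency-versus-variety trade-off is easy to see by repeatedly sampling from the same toy distribution at two temperatures. This sketch uses a seeded random generator so the demo is repeatable; the vocabulary and logits are invented for illustration.

```python
import math
import random

def sample_tokens(logits, vocab, temperature, n, seed=0):
    """Draw n tokens from temperature-scaled probabilities (seeded for the demo)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    rng = random.Random(seed)
    return rng.choices(vocab, weights=weights, k=n)

vocab = ["the", "a", "one"]
logits = [3.0, 1.0, 0.5]
low = sample_tokens(logits, vocab, temperature=0.2, n=20)   # mostly the top token
high = sample_tokens(logits, vocab, temperature=1.5, n=20)  # a more varied mix
```

At temperature 0.2 nearly every draw is the top token, which is why low temperatures suit production tasks that need consistency; at 1.5 the other tokens appear regularly.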
Temperature vs other sampling parameters
Temperature works alongside other parameters like top-p (nucleus sampling) and top-k. Top-p limits selection to the smallest set of tokens whose cumulative probability exceeds a threshold. Top-k limits selection to the k most probable tokens. These parameters can be combined with temperature for fine-grained control over output behaviour.
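Both filters described above can be sketched as operations on an already temperature-scaled probability list: top-k zeroes out everything outside the k most probable tokens, and top-p keeps the smallest high-probability set, then both renormalize. This is an illustrative sketch, not any library's implementation.

```python
def top_k_filter(probs, k):
    """Keep only the k most probable tokens and renormalize."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(ranked[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]

def top_p_filter(probs, p_threshold):
    """Keep the smallest set of tokens whose cumulative probability
    meets p_threshold, then renormalize (nucleus sampling)."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in ranked:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p_threshold:
            break
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]

probs = [0.5, 0.3, 0.15, 0.05]  # already temperature-scaled and normalized
narrowed_k = top_k_filter(probs, 2)   # only the top two tokens survive
narrowed_p = top_p_filter(probs, 0.8) # tokens kept until cumulative prob >= 0.8
```

In a typical pipeline, temperature scaling is applied first, then top-k or top-p prunes the tail of the distribution before a token is drawn.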
A common mistake
Setting temperature to 0 does not guarantee identical outputs for the same prompt across all API calls due to floating-point arithmetic and hardware variations. Near-deterministic behaviour requires temperature 0 plus careful control of other randomness sources.
Why This Matters
Temperature is the single most impactful setting for controlling AI output quality. Understanding how to adjust it for different tasks lets you get more accurate results for analytical work and more creative results for brainstorming, all with the same model.
Continue learning in Essentials
This topic is covered in our lesson: Configuring AI for Better Results