Context Engineering (Practice)
The discipline of designing and managing all information provided to an AI model (system prompts, retrieved documents, conversation history, and tool outputs) to maximise response quality.
Context engineering is the practice of systematically designing and managing everything that goes into an AI model's context window: not just the user's prompt, but the system prompt, retrieved documents, conversation history, tool outputs, and any other information the model uses to generate its response.
Beyond prompt engineering
Prompt engineering focuses on crafting effective user prompts. Context engineering is broader, encompassing the entire information environment the model operates within:
- System prompts: Instructions, persona definitions, and behavioural guidelines
- Retrieved context: Documents, knowledge base entries, and data pulled in via RAG
- Conversation history: Previous turns in a multi-turn conversation
- Tool outputs: Results from function calls, API responses, and database queries
- Examples: Few-shot demonstrations that show the model the expected format and quality
- Metadata: User information, session state, and application context
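Assembling these components is ultimately an ordering problem: everything the model sees must be merged into one sequence. A minimal sketch in Python, where the message schema, field names, and delimiters are illustrative assumptions rather than any specific API:

```python
def build_context(system_prompt, retrieved_docs, history, tool_outputs, examples):
    """Assemble every input the model will see into one ordered message list."""
    messages = [{"role": "system", "content": system_prompt}]
    # Few-shot examples: demonstrate the expected format before the real turns.
    for ex in examples:
        messages.append({"role": "user", "content": ex["input"]})
        messages.append({"role": "assistant", "content": ex["output"]})
    # Retrieved context: clearly delimited so the model can tell it apart.
    if retrieved_docs:
        docs = "\n\n".join(f"<doc>{d}</doc>" for d in retrieved_docs)
        messages.append({"role": "user", "content": f"Reference material:\n{docs}"})
    # Prior turns and tool results, in chronological order.
    messages.extend(history)
    for out in tool_outputs:
        messages.append({"role": "tool", "content": out})
    return messages
```

The key design choice is that each component gets a fixed, predictable position, which makes the context easier to debug and evaluate than ad hoc string concatenation.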
Why context engineering matters
In practice, model performance often depends as much on what is in the context as on which specific model is used. A well-engineered context with a smaller model can outperform a poorly engineered context with a larger, more expensive model.
Key principles
- Relevance: Include only information the model needs for the specific task. Every irrelevant token wastes context space and can distract the model.
- Structure: Organise context with clear headers, sections, and formatting. Models process well-structured context more effectively than unstructured text dumps.
- Positioning: Place the most important information at the beginning and end of the context, where models attend most effectively.
- Compression: Use concise language and structured formats to maximise the useful information per token.
- Freshness: Ensure retrieved context is current and relevant. Stale information produces stale responses.
- Deduplication: Remove redundant information that appears in multiple retrieved documents.
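The deduplication principle is straightforward to mechanise. A minimal sketch, assuming retrieved chunks are plain strings and that whitespace- and case-normalised equality is a good enough notion of "redundant":

```python
import hashlib

def deduplicate(chunks):
    """Drop chunks whose normalised text has already been seen."""
    seen, unique = set(), []
    for chunk in chunks:
        # Normalise case and whitespace so trivial variants hash identically.
        key = hashlib.sha256(" ".join(chunk.lower().split()).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique
```

A production system might instead use embedding similarity to catch near-duplicates that differ in wording, at the cost of an extra model call per chunk.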
Context engineering for RAG systems
In retrieval-augmented generation, context engineering is particularly critical:
- How many documents to retrieve (too few risks missing information; too many can overwhelm the model)
- How to order retrieved documents (most relevant first, or strategically positioned)
- Whether to summarise long documents before including them
- How to handle contradictions between retrieved documents
- What metadata to include alongside document content
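The ordering question above interacts with the positioning principle: models tend to attend best to the start and end of the context. A minimal sketch of one common strategy, alternating ranked documents so the strongest land at the edges (the scoring scheme and input shape are assumptions for illustration):

```python
def order_for_attention(docs_with_scores):
    """Place the highest-scoring documents at the start and end of the context,
    pushing weaker ones toward the middle, where attention tends to be weakest."""
    ranked = sorted(docs_with_scores, key=lambda d: d[1], reverse=True)
    front, back = [], []
    for i, (doc, _) in enumerate(ranked):
        # Alternate: best doc to the front, second best to the back, and so on.
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]
```

With scores [("a", 0.9), ("b", 0.5), ("c", 0.8), ("d", 0.7)], the two strongest documents, "a" and "c", end up first and last respectively.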
Context engineering for agents
For AI agent systems, context engineering includes:
- Managing growing conversation history (when to summarise versus when to keep verbatim)
- Formatting tool outputs for model consumption
- Maintaining planning and reasoning state across multiple steps
- Balancing task instructions with safety guidelines
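The first point, deciding when to summarise history, can be sketched as a simple budget check. This is a hypothetical helper, not any framework's API; the character budget stands in for a real token count, and `summarise` would normally be an LLM call:

```python
def trim_history(history, max_chars=4000, keep_recent=4, summarise=None):
    """Keep the most recent turns verbatim; compress older ones once the budget
    is exceeded."""
    total = sum(len(m["content"]) for m in history)
    if total <= max_chars or len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    # A real agent would summarise with a model call; fall back to a stub note.
    summary = summarise(old) if summarise else f"[{len(old)} earlier turns omitted]"
    return [{"role": "system", "content": f"Conversation summary: {summary}"}] + recent
```

Keeping the last few turns verbatim preserves the details the agent is most likely to need, while the summary retains the gist of earlier planning steps at a fraction of the token cost.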
Measuring context quality
Evaluate context engineering through:
- Response quality: Are answers accurate, complete, and well-structured?
- Token efficiency: Are you achieving good results with fewer tokens?
- Consistency: Do similar queries produce consistent responses?
- Edge case handling: Does the context support appropriate behaviour for unusual inputs?
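Token efficiency, the second criterion above, can be tracked with a simple ratio. A minimal sketch, where the quality score is assumed to come from whatever evaluation you already run (human rating, LLM judge, or exact-match accuracy):

```python
def token_efficiency(quality_score, prompt_tokens, completion_tokens):
    """Quality gained per thousand tokens spent; useful for comparing two
    context-engineering variants on the same task set."""
    total = prompt_tokens + completion_tokens
    return 1000 * quality_score / total if total else 0.0
```

On its own the number is meaningless; its value comes from comparing variants, e.g. confirming that a trimmed context holds quality steady while the efficiency figure rises.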
Why This Matters
Context engineering is arguably the highest-leverage skill for improving AI application quality. Understanding these principles helps you get dramatically better results from existing models without changing the model itself β often the most cost-effective improvement available.