Context Engineering (Practice)
The discipline of designing and managing all information provided to an AI model (system prompts, retrieved documents, conversation history, and tool outputs) to maximise response quality.
Context engineering is the practice of systematically designing and managing everything that goes into an AI model's context window: not just the user's prompt, but the system prompt, retrieved documents, conversation history, tool outputs, and any other information the model uses to generate its response.
Beyond prompt engineering
Prompt engineering focuses on crafting effective user prompts. Context engineering is broader, encompassing the entire information environment the model operates within:
- System prompts: Instructions, persona definitions, and behavioural guidelines
- Retrieved context: Documents, knowledge base entries, and data pulled in via RAG
- Conversation history: Previous turns in a multi-turn conversation
- Tool outputs: Results from function calls, API responses, and database queries
- Examples: Few-shot demonstrations that show the model the expected format and quality
- Metadata: User information, session state, and application context
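Assembling these components is ultimately an ordering problem: everything the model sees must be merged into one sequence. A minimal sketch in Python, where the message schema, field names, and delimiters are illustrative assumptions rather than any specific API:

```python
def build_context(system_prompt, retrieved_docs, history, tool_outputs, examples):
    """Assemble every input the model will see into one ordered message list."""
    messages = [{"role": "system", "content": system_prompt}]
    # Few-shot examples: demonstrate the expected format before the real turns.
    for ex in examples:
        messages.append({"role": "user", "content": ex["input"]})
        messages.append({"role": "assistant", "content": ex["output"]})
    # Retrieved context: clearly delimited so the model can tell it apart.
    if retrieved_docs:
        docs = "\n\n".join(f"<doc>{d}</doc>" for d in retrieved_docs)
        messages.append({"role": "user", "content": f"Reference material:\n{docs}"})
    # Prior turns and tool results, in chronological order.
    messages.extend(history)
    for out in tool_outputs:
        messages.append({"role": "tool", "content": out})
    return messages
```

The key design choice is that each component gets a fixed, predictable position, which makes the context easier to debug and evaluate than ad hoc string concatenation.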
Why context engineering matters
In practice, model performance often depends as much on what is in the context as on which specific model is used. A well-engineered context with a smaller model can outperform a poorly engineered context with a larger, more expensive model.
Key principles
- Relevance: Include only information the model needs for the specific task. Every irrelevant token wastes context space and can distract the model.
- Structure: Organise context with clear headers, sections, and formatting. Models process well-structured context more effectively than unstructured text dumps.
- Positioning: Place the most important information at the beginning and end of the context, where models attend most effectively.
- Compression: Use concise language and structured formats to maximise the useful information per token.
- Freshness: Ensure retrieved context is current and relevant. Stale information produces stale responses.
- Deduplication: Remove redundant information that appears in multiple retrieved documents.
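The deduplication principle is straightforward to mechanise. A minimal sketch, assuming retrieved chunks are plain strings and that whitespace- and case-normalised equality is a good enough notion of "redundant":

```python
import hashlib

def deduplicate(chunks):
    """Drop chunks whose normalised text has already been seen."""
    seen, unique = set(), []
    for chunk in chunks:
        # Normalise case and whitespace so trivial variants hash identically.
        key = hashlib.sha256(" ".join(chunk.lower().split()).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique
```

A production system might instead use embedding similarity to catch near-duplicates that differ in wording, at the cost of an extra model call per chunk.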
Context engineering for RAG systems
In retrieval-augmented generation, context engineering is particularly critical:
- How many documents to retrieve (too few risks missing information; too many can overwhelm the model)
- How to order retrieved documents (most relevant first, or strategically positioned)
- Whether to summarise long documents before including them
- How to handle contradictions between retrieved documents
- What metadata to include alongside document content
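The ordering question above interacts with the positioning principle: models tend to attend best to the start and end of the context. A minimal sketch of one common strategy, alternating ranked documents so the strongest land at the edges (the scoring scheme and input shape are assumptions for illustration):

```python
def order_for_attention(docs_with_scores):
    """Place the highest-scoring documents at the start and end of the context,
    pushing weaker ones toward the middle, where attention tends to be weakest."""
    ranked = sorted(docs_with_scores, key=lambda d: d[1], reverse=True)
    front, back = [], []
    for i, (doc, _) in enumerate(ranked):
        # Alternate: best doc to the front, second best to the back, and so on.
        (front if i % 2 == 0 else back).append(doc)
    return front + back[::-1]
```

With scores [("a", 0.9), ("b", 0.5), ("c", 0.8), ("d", 0.7)], the two strongest documents, "a" and "c", end up first and last respectively.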
Context engineering for agents
For AI agent systems, context engineering includes:
- Managing growing conversation history (when to summarise versus when to keep verbatim)
- Formatting tool outputs for model consumption
- Maintaining planning and reasoning state across multiple steps
- Balancing task instructions with safety guidelines
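The first point, deciding when to summarise history, can be sketched as a simple budget check. This is a hypothetical helper, not any framework's API; the character budget stands in for a real token count, and `summarise` would normally be an LLM call:

```python
def trim_history(history, max_chars=4000, keep_recent=4, summarise=None):
    """Keep the most recent turns verbatim; compress older ones once the budget
    is exceeded."""
    total = sum(len(m["content"]) for m in history)
    if total <= max_chars or len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    # A real agent would summarise with a model call; fall back to a stub note.
    summary = summarise(old) if summarise else f"[{len(old)} earlier turns omitted]"
    return [{"role": "system", "content": f"Conversation summary: {summary}"}] + recent
```

Keeping the last few turns verbatim preserves the details the agent is most likely to need, while the summary retains the gist of earlier planning steps at a fraction of the token cost.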
Measuring context quality
Evaluate context engineering through:
- Response quality: Are answers accurate, complete, and well-structured?
- Token efficiency: Are you achieving good results with fewer tokens?
- Consistency: Do similar queries produce consistent responses?
- Edge case handling: Does the context support appropriate behaviour for unusual inputs?
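Token efficiency, the second criterion above, can be tracked with a simple ratio. A minimal sketch, where the quality score is assumed to come from whatever evaluation you already run (human rating, LLM judge, or exact-match accuracy):

```python
def token_efficiency(quality_score, prompt_tokens, completion_tokens):
    """Quality gained per thousand tokens spent; useful for comparing two
    context-engineering variants on the same task set."""
    total = prompt_tokens + completion_tokens
    return 1000 * quality_score / total if total else 0.0
```

On its own the number is meaningless; its value comes from comparing variants, e.g. confirming that a trimmed context holds quality steady while the efficiency figure rises.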
Why This Matters
Context engineering is arguably the highest-leverage skill for improving AI application quality. Understanding these principles helps you get dramatically better results from existing models without changing the model itself β often the most cost-effective improvement available.