
Retrieval-Augmented Fine-Tuning

Last reviewed: April 2026

A training approach that combines fine-tuning with retrieval capabilities, teaching the model both to distinguish relevant retrieved documents from distractors and to generate better responses from retrieved context.

Retrieval-augmented fine-tuning (RAFT) is a training approach that combines the benefits of fine-tuning and retrieval-augmented generation. Instead of choosing between teaching the model new knowledge (fine-tuning) or providing knowledge at query time (RAG), RAFT fine-tunes the model to make better use of retrieved information within a specific domain.

The motivation

Standard RAG has a limitation: the base model may not be optimally trained to use retrieved documents. It might ignore relevant retrieved information, over-rely on its parametric knowledge, or struggle to synthesize information from multiple retrieved passages. RAFT addresses this by fine-tuning the model specifically on the task of answering questions using retrieved context.

How RAFT works

The training data for RAFT includes questions, relevant documents (that contain the answer), distractor documents (that are related but do not contain the answer), and the correct answer with citations to the relevant document.
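
The training data described above can be sketched as a simple record type. This is a hypothetical schema for illustration; the field names and the sample content are assumptions, not a prescribed format.

```python
from dataclasses import dataclass

@dataclass
class RAFTExample:
    """One RAFT training example (illustrative schema)."""
    question: str
    oracle_docs: list[str]      # relevant documents that contain the answer
    distractor_docs: list[str]  # related documents that do not
    answer: str                 # gold answer, citing the oracle documents

example = RAFTExample(
    question="What context window does the model support?",
    oracle_docs=["The model supports a context window of 128k tokens."],
    distractor_docs=[
        "The model was released in 2024.",
        "Pricing starts at $0.50 per million input tokens.",
    ],
    answer='128k tokens, per the documentation: "The model supports '
           'a context window of 128k tokens."',
)
```

At training time each example is rendered into a prompt (question plus shuffled documents) and a target (the cited answer), so the loss directly rewards grounding in the oracle documents.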

During training, the model learns to identify which retrieved documents are relevant, extract the correct information from those documents, ignore distractors, and generate answers that are faithful to the source material. Critically, some training examples include the relevant document and some do not, teaching the model to know when to rely on retrieved context versus its own knowledge.
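
The mixing step described above can be sketched as follows. This is a minimal illustration, assuming each example is a dict with `"oracle"` and `"distractors"` keys; the function name and the 0.8 default for `p_oracle` are illustrative choices, not prescribed values.

```python
import random

def build_training_context(example, num_docs=4, p_oracle=0.8, rng=None):
    """Assemble the document set shown to the model for one example.

    With probability p_oracle the oracle document is included alongside
    distractors; otherwise only distractors are shown, so the model must
    learn when to fall back on its own parametric knowledge.
    """
    rng = rng or random.Random()
    docs = list(example["distractors"])[: num_docs - 1]
    if rng.random() < p_oracle:
        docs.append(example["oracle"])
    rng.shuffle(docs)  # the oracle's position should not be a cue
    return docs
```

Shuffling matters: if the relevant document always appeared in the same slot, the model could learn its position rather than learning to identify relevance.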

RAFT vs standard approaches

  • Standard RAG: Uses a base model with retrieved context. The model was not specifically trained to use retrieved documents effectively.
  • Standard fine-tuning: Teaches the model new knowledge baked into its weights. No retrieval mechanism.
  • RAFT: Fine-tunes the model to excel at using retrieved documents, combining the adaptability of RAG with the domain specialization of fine-tuning.

Practical benefits

RAFT produces models that are more accurate when working with domain-specific document collections. They hallucinate less often because they are trained to ground responses in retrieved evidence. They also handle "distractor" documents better — real-world retrieval often returns partially relevant results, and RAFT models learn to navigate this noise.

When to use RAFT

RAFT is most valuable when:

  • You have a specific domain with a defined document collection.
  • Accuracy and faithfulness to sources are critical.
  • Standard RAG produces too many hallucinations or misses important information from retrieved documents.


Why This Matters

Retrieval-augmented fine-tuning represents the cutting edge of making AI systems reliable for domain-specific applications. Understanding it helps you evaluate advanced AI deployment strategies and recognise when standard RAG or fine-tuning alone may not be sufficient.
