Comprehensive Guide

Generative AI and RAG: How Modern AI Creates Content

Generative AI creates new content — text, images, code, audio — rather than just analysing existing data. This guide covers the technologies that make it work: how models generate content, how RAG connects them to your data, and how fine-tuning customises them for your needs.

In this guide

How generative AI works
Retrieval-augmented generation (RAG)
Customising AI models
Understanding the infrastructure

How generative AI works

Generative AI models learn patterns from training data and use those patterns to create new content. They do not copy — they generate statistically likely output based on learned relationships. Understanding this mechanism explains both their remarkable capabilities and their tendency to hallucinate.
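To make "statistically likely output" concrete, here is a minimal sketch of a toy word-pair (bigram) model. It is far simpler than the neural networks behind real generative AI, and the function names `train_bigrams` and `generate` are invented for this illustration, but the principle is the same: learn which items tend to follow which, then sample.

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Record which word follows which in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate(followers, start, length, seed=0):
    """Repeatedly sample a plausible next word, starting from `start`."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        options = followers.get(output[-1])
        if not options:  # no known continuation, so stop early
            break
        output.append(rng.choice(options))
    return " ".join(output)

corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigrams(corpus)
sentence = generate(model, "the", 8)
print(sentence)
```

Depending on the seed, the toy model may produce a phrase such as "the dog sat on the mat", which never appears in the training text. Every word pair is plausible, yet the sentence as a whole is new, and nothing checks whether it is true. That is hallucination in miniature.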

Generative AI

AI that creates new content — text, images, code, audio, video — rather than just analysing or classifying existing data.

Hallucination

When AI generates confident but incorrect information. The AI is not lying — it is producing statistically plausible text that happens to be wrong.

Multi-Modal AI

AI that can process and generate multiple types of content — text, images, audio, and video — within a single model. Claude, GPT-5.4, and Gemini are all multi-modal.

Retrieval-augmented generation (RAG)

RAG combines AI generation with information retrieval. Instead of relying solely on training data, a RAG system searches your documents, databases, or knowledge base first, then generates a response grounded in the retrieved information. This reduces hallucination, because the model can draw on the retrieved text rather than on recall alone, and it keeps responses current without retraining the model.
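The retrieve-then-generate flow can be sketched in a few lines. This is a simplified illustration, not a production design: the word-overlap `score` function stands in for the embedding-based search a real system would typically use, every function name is hypothetical, and the final step of sending the prompt to a model is left out.

```python
def score(query, document):
    """Count the words a document shares with the query (a crude stand-in for semantic search)."""
    query_words = set(query.lower().split())
    return len(query_words & set(document.lower().split()))

def retrieve(query, documents, top_k=2):
    """Return the top_k documents that best match the query."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]

def build_prompt(query, documents):
    """Ground the model by placing the retrieved text ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Made-up knowledge base for illustration.
knowledge_base = [
    "Refunds are processed within 14 days of a return.",
    "Our office is closed on public holidays.",
    "Refunds go back to the original payment method.",
]
prompt = build_prompt("how long do refunds take", knowledge_base)
print(prompt)
```

The prompt that reaches the model now contains the two refund-related passages and omits the unrelated one, so the answer can be grounded in your own documents rather than in the model's general training.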

Retrieval-Augmented Generation (RAG)

A technique that connects AI to your own documents and data so it can answer questions using your specific information, not just its general training.

Embedding

A numerical representation of text (or images, audio, etc.) that captures its meaning. Embeddings let AI measure how similar two pieces of content are.
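One common way to measure how similar two embeddings are is cosine similarity, which compares the direction of two vectors. The sketch below uses hand-made three-number vectors purely for illustration; real embeddings are produced by a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented vectors: these values are assumptions, not output from a real embedding model.
embeddings = {
    "kitten": [0.9, 0.8, 0.1],
    "cat": [1.0, 0.7, 0.0],
    "invoice": [0.0, 0.1, 0.9],
}

sim_related = cosine_similarity(embeddings["kitten"], embeddings["cat"])
sim_unrelated = cosine_similarity(embeddings["kitten"], embeddings["invoice"])
print(sim_related > sim_unrelated)
```

With these made-up vectors, "kitten" scores much closer to "cat" than to "invoice", which is the behaviour an embedding model is trained to produce for real text.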

Vector Database

A specialised database designed to store and search embeddings — the numerical representations of text, images, or other data used in AI applications.
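At its core, a vector database does two things: it stores vectors alongside the content they represent, and it returns the stored items closest to a query vector. The class below is a hypothetical, brute-force sketch of that idea. Real vector databases typically use specialised indexes so they do not have to compare the query against every stored item.

```python
import math

class TinyVectorStore:
    """In-memory store that scans every item (illustrative only, not a real product's API)."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, text, vector):
        self._items.append((text, vector))

    def search(self, query_vector, top_k=1):
        """Return the texts whose vectors are most similar to the query."""
        def similarity(vector):
            dot = sum(q * v for q, v in zip(query_vector, vector))
            norm_q = math.sqrt(sum(q * q for q in query_vector))
            norm_v = math.sqrt(sum(v * v for v in vector))
            return dot / (norm_q * norm_v)

        ranked = sorted(self._items, key=lambda item: similarity(item[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]

store = TinyVectorStore()
store.add("Refund policy", [0.9, 0.1])   # invented two-number embeddings
store.add("Office hours", [0.1, 0.9])
results = store.search([0.8, 0.2])
print(results)
```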

Customising AI models

Fine-tuning trains a model on your specific data to improve performance on your tasks. It is more expensive and complex than prompting, but it can produce markedly better results in specialised domains. The trade-off: fine-tuning requires data, compute, and expertise that prompting does not.
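The "data" requirement usually means a set of example inputs paired with the outputs you want. As a rough sketch, the snippet below writes such pairs as JSON Lines (one JSON object per line). The `prompt` and `completion` field names and the example tickets are assumptions for illustration; each fine-tuning service defines its own required format.

```python
import json

# Hypothetical training examples: an input and the output we want the model to learn.
examples = [
    {"prompt": "Classify the ticket: 'My card was charged twice.'", "completion": "billing"},
    {"prompt": "Classify the ticket: 'The app crashes on login.'", "completion": "technical"},
]

def to_jsonl(records):
    """Serialise each record as one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)

training_file = to_jsonl(examples)
print(training_file)
```

In practice a useful dataset needs far more than two examples, and collecting and checking them is often where much of the effort goes.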

Fine-Tuning

Training an existing AI model on your specific data to improve its performance on your specific tasks. Like giving the AI specialised on-the-job training.

Understanding the infrastructure

Tokens are the units AI models process. Embeddings are numerical representations of meaning. Vector databases store and search these representations. Together, they form the infrastructure that makes modern AI applications possible.

Token

The smallest unit of text an AI model processes. In English, a token is roughly 3-4 characters, or about three-quarters of a word. AI pricing is typically measured in tokens.
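The rule of thumb above is enough for a back-of-the-envelope cost estimate. The sketch below assumes four characters per token, and the price is a placeholder rather than any provider's actual rate. For exact counts you would use the model's own tokeniser.

```python
import math

CHARS_PER_TOKEN = 4  # rule-of-thumb assumption; real counts vary by model and language

def estimate_tokens(text):
    """Approximate the token count from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def estimate_cost(text, price_per_million_tokens):
    """Approximate the cost of processing `text` at a given price per million tokens."""
    return estimate_tokens(text) * price_per_million_tokens / 1_000_000

sample = "Generative AI creates new content."
tokens = estimate_tokens(sample)
cost = estimate_cost(sample, price_per_million_tokens=2.00)  # hypothetical price
print(tokens, cost)
```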

Tokenizer (Tokeniser)

The component that splits text into tokens and maps each one to a numeric ID the model can process. Different models use different tokenisers, so the same text can produce a different number of tokens from one model to the next.
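A toy tokeniser shows the text-to-numbers step. This sketch splits on whitespace and assigns each known word an ID, which is a deliberate simplification: production tokenisers generally split text into sub-word pieces instead. The function names and the `<unk>` placeholder for unknown words are choices made for this example.

```python
def build_vocab(corpus):
    """Assign each distinct word an integer ID; ID 0 is reserved for unknown words."""
    vocab = {"<unk>": 0}
    for word in corpus.lower().split():
        vocab.setdefault(word, len(vocab))
    return vocab

def encode(text, vocab):
    """Convert text into a list of token IDs."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

def decode(ids, vocab):
    """Convert token IDs back into text."""
    id_to_word = {token_id: word for word, token_id in vocab.items()}
    return " ".join(id_to_word[token_id] for token_id in ids)

vocab = build_vocab("the cat sat on the mat")
ids = encode("the cat sat on the sofa", vocab)
print(ids)
print(decode(ids, vocab))
```

The word "sofa" was not in the vocabulary, so it maps to the unknown-word ID and cannot be recovered on decoding. Sub-word tokenisers largely avoid this problem by breaking unfamiliar words into smaller known pieces.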
