Core AI

Annotation

Last reviewed: April 2026

The process of adding labels or tags to raw data so AI models can learn from it during training.

Annotation is the process of labelling raw data (text, images, audio, or video) so that a machine learning model can learn from it. Without annotation, supervised learning, the approach behind most business applications of machine learning, would have no labelled examples to learn from.

How annotation works

Imagine you want to train a model to detect cats in photos. You need thousands of images where a human has marked which ones contain cats and which do not. That marking process is annotation. For more complex tasks, annotation might involve drawing bounding boxes around objects, highlighting specific words in a sentence, or transcribing spoken words from audio.
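As a rough illustration of the cat example, an annotated dataset can be thought of as raw items paired with human-assigned labels. The sketch below is minimal and hypothetical: the `LabelledImage` class, the file names, and the `label_counts` helper are invented for illustration and do not come from any particular annotation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelledImage:
    """One annotated example: a raw image path plus a human-assigned label."""
    path: str           # hypothetical file name, for illustration only
    contains_cat: bool  # the annotation itself


def label_counts(examples):
    """Count how many examples carry each label."""
    counts = {True: 0, False: 0}
    for example in examples:
        counts[example.contains_cat] += 1
    return counts


dataset = [
    LabelledImage("img_001.jpg", True),
    LabelledImage("img_002.jpg", False),
    LabelledImage("img_003.jpg", True),
]

print(label_counts(dataset))  # {True: 2, False: 1}
```

Checking label counts like this is a simple way to spot a dataset that is heavily skewed towards one label before training begins.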

Types of annotation

Text annotation includes labelling sentiment (positive, negative, neutral), tagging named entities (people, places, companies), and marking intent in customer queries
Image annotation includes bounding boxes, polygon outlines, key-point marking, and pixel-level segmentation
Audio annotation includes transcription, speaker identification, and emotion labelling
Video annotation tracks objects frame by frame for tasks like autonomous driving
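To make two of these annotation types more concrete, the sketch below shows one possible way to represent a bounding box on an image and a named-entity span in text. The class names, field layout, and sample values are assumptions made for illustration; real annotation platforms each define their own formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A rectangular image annotation, measured in pixels (assumed layout)."""
    label: str
    x: int       # left edge
    y: int       # top edge
    width: int
    height: int

    def area(self) -> int:
        return self.width * self.height


@dataclass(frozen=True)
class EntitySpan:
    """A text annotation marking characters start..end (end is exclusive)."""
    label: str
    start: int
    end: int

    def text(self, source: str) -> str:
        return source[self.start:self.end]


# Hypothetical sample annotations.
box = BoundingBox(label="cat", x=40, y=60, width=120, height=80)

sentence = "Ada Lovelace worked in London."
spans = [EntitySpan("PERSON", 0, 12), EntitySpan("PLACE", 23, 29)]

print(box.area())                         # 9600
print([s.text(sentence) for s in spans])  # ['Ada Lovelace', 'London']
```

Storing character offsets rather than copied text keeps each span tied to its exact position in the source sentence, which matters when the same word appears more than once.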

Who does the annotating

Annotation is often done by large teams of human workers, sometimes called data labellers. Companies like Scale AI and Labelbox provide annotation platforms and workforces. Increasingly, AI-assisted annotation speeds up the process: a model generates initial labels, and humans correct them.
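The AI-assisted workflow described above can be sketched as a merge of two label sets: the model's initial labels and the human corrections applied on top. This is a simplified sketch, not how any specific platform works; the function names, file names, and labels are hypothetical.

```python
def merge_labels(prelabels, corrections):
    """Return final labels: model pre-labels overridden by human corrections."""
    final = dict(prelabels)    # copy so the model's output is left untouched
    final.update(corrections)  # human decisions take priority
    return final


def correction_rate(prelabels, corrections):
    """Fraction of pre-labelled items whose label a human actually changed."""
    if not prelabels:
        return 0.0
    changed = sum(
        1 for item, label in corrections.items() if prelabels.get(item) != label
    )
    return changed / len(prelabels)


# Hypothetical model output and human review results.
prelabels = {
    "img_001.jpg": "cat",
    "img_002.jpg": "cat",
    "img_003.jpg": "no_cat",
    "img_004.jpg": "cat",
}
corrections = {"img_002.jpg": "no_cat"}

final_labels = merge_labels(prelabels, corrections)
print(final_labels["img_002.jpg"])              # no_cat
print(correction_rate(prelabels, corrections))  # 0.25
```

Tracking the correction rate gives a rough sense of how much human effort the pre-labelling model is saving: the lower the rate, the more often reviewers are simply confirming the model's suggestion.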

Quality matters enormously

The quality of annotations largely determines the quality of the trained model. Ambiguous labels, inconsistent standards, or careless annotation create noisy training data, which leads to unreliable models. This is why annotation guidelines, meaning clear rules for how to label each data point, are critical.
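One common way to check whether annotators are applying the guidelines consistently is to have two people label the same items and measure their agreement. The sketch below computes Cohen's kappa, a standard agreement statistic that corrects for agreement expected by chance. It is offered as an illustration rather than a method prescribed here, and the sample labels are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items in order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item list.")

    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical sentiment labels from two annotators on the same four texts.
annotator_1 = ["positive", "positive", "negative", "negative"]
annotator_2 = ["positive", "negative", "negative", "negative"]

print(cohens_kappa(annotator_1, annotator_2))  # 0.5
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. A low score usually points to guidelines that need clarifying rather than to careless annotators.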

The cost of annotation

Annotation is one of the most expensive and time-consuming parts of building AI. It is often the bottleneck in AI projects, especially for specialised domains like medical imaging where expert annotators are required.
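A back-of-the-envelope estimate can help when budgeting for annotation. The sketch below multiplies out the basic inputs. Every figure in the example is a made-up placeholder, not a market rate, and the function name and parameters are hypothetical.

```python
def annotation_cost(items, seconds_per_item, annotators_per_item, hourly_rate):
    """Estimate labour hours and cost for an annotation project."""
    hours = items * annotators_per_item * seconds_per_item / 3600
    return hours, hours * hourly_rate


# Placeholder inputs: 10,000 items, 36 seconds each, every item labelled
# by 2 annotators, at an assumed rate of 20 currency units per hour.
hours, cost = annotation_cost(
    items=10_000, seconds_per_item=36, annotators_per_item=2, hourly_rate=20.0
)

print(hours)  # 200.0
print(cost)   # 4000.0
```

Raising `hourly_rate` to reflect expert annotators, or `annotators_per_item` to allow for agreement checks, shows quickly how specialised domains drive up the total.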

Why This Matters

If your organisation is building or customising AI models, annotation quality will make or break the project. Understanding annotation helps you budget realistically, set quality standards, and recognise when a model's poor performance stems from bad training data rather than bad architecture.

Related Terms

Training Data

The dataset used to teach an AI model. The quality, size, and composition of training data directly determine what the AI can and cannot do well.

Supervised Learning

A machine learning approach where the model learns from labelled examples — input data paired with correct answers. The most common type of machine learning in business applications.

Machine Learning (ML)

A type of AI where systems learn patterns from data instead of following explicitly programmed rules. The system improves its performance through experience.

Deep Learning

A subset of machine learning that uses neural networks with many layers to learn complex patterns. The 'deep' refers to the number of layers, not the depth of understanding.
