Practical

Text Classification

Last reviewed: April 2026

An AI task that assigns predefined categories or labels to text, used for spam detection, sentiment analysis, topic tagging, and content moderation.

Text classification is the task of assigning a category or label to a piece of text. It is one of the most widely deployed AI applications in business, powering everything from email spam filters to customer service routing to content moderation.

Common examples

  • Spam detection: Classifying emails as spam or legitimate
  • Sentiment analysis: Classifying reviews as positive, negative, or neutral
  • Topic tagging: Classifying news articles by topic (politics, sports, technology)
  • Support ticket routing: Classifying customer queries by department (billing, technical, sales)
  • Content moderation: Classifying user-generated content as safe, flagged, or prohibited
  • Intent detection: Classifying chatbot messages by user intent (question, complaint, request)

Traditional approach

Before LLMs, text classification required:

  1. Collecting thousands of labelled examples
  2. Converting text into numerical features (bag of words, TF-IDF, embeddings)
  3. Training a classification model (Naive Bayes, SVM, or neural network)
  4. Evaluating on held-out test data
  5. Deploying and monitoring
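
As a rough illustration of steps 2 and 3, the sketch below turns text into bag-of-words counts and trains a small multinomial Naive Bayes classifier using only the Python standard library. The class name, the whitespace tokenizer, and the four training sentences are made up for this example; a real project would use thousands of labelled examples and usually an established library.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace: a deliberately crude tokenizer."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)        # label -> number of documents
        self.word_counts = defaultdict(Counter)    # label -> word -> count
        self.vocab = set()
        for text, label in zip(texts, labels):
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # log P(label) + sum over words of log P(word | label)
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, invented training data (illustration only)
train_texts = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "please review the project report",
]
train_labels = ["spam", "spam", "legitimate", "legitimate"]

model = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = model.predict("claim your free prize")
print(prediction)
```

Step 4 would then compare predictions like this one against labels the model never saw during training.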

This process was effective but time-consuming and required machine learning expertise.

The LLM revolution

Modern LLMs have dramatically simplified text classification. Instead of training a custom model, you can write a prompt:

"Classify the following customer email into one of these categories: billing, technical support, account management, general enquiry. Return only the category name."

This zero-shot approach requires no training data and often works well when the categories are distinct and clearly described. For higher accuracy, you can include a few labelled examples in the prompt (few-shot classification).
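
A minimal sketch of how such a prompt might be assembled and its reply checked is shown below. The `call_llm` argument is a hypothetical function (prompt string in, reply string out) standing in for whichever model API you use; `fake_llm` is a stub so the example runs offline, and falling back to "general enquiry" for unexpected replies is an assumption of this sketch, not a recommendation from the text above.

```python
CATEGORIES = ["billing", "technical support", "account management", "general enquiry"]


def build_prompt(email, examples=None):
    """Zero-shot prompt, or few-shot if (text, label) examples are supplied."""
    lines = [
        "Classify the following customer email into one of these categories: "
        + ", ".join(CATEGORIES)
        + ". Return only the category name.",
        "",
    ]
    for example_text, example_label in examples or []:
        lines += [f"Email: {example_text}", f"Category: {example_label}", ""]
    lines += [f"Email: {email}", "Category:"]
    return "\n".join(lines)


def parse_label(response, fallback="general enquiry"):
    """Normalise the reply and reject anything outside the allowed categories."""
    cleaned = response.strip().strip(".\"'").lower()
    return cleaned if cleaned in CATEGORIES else fallback


def classify(email, call_llm, examples=None):
    """call_llm is hypothetical: it takes a prompt string and returns a reply string."""
    return parse_label(call_llm(build_prompt(email, examples)))


def fake_llm(prompt):
    """Stand-in for a real model call so the sketch runs without network access."""
    return " Billing.\n"


label = classify("I was charged twice this month.", fake_llm)
print(label)
```

Validating the reply against the category list matters in practice, because a model can return extra punctuation, different casing, or a label that was never offered.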

When to use which approach

  • LLM prompting: Best for prototyping, low-volume tasks, or when categories change frequently. Simple to implement but more expensive per classification.
  • Fine-tuned models: Best for high-volume production use. Lower cost per classification, higher consistency, but requires labelled data and ML expertise.
  • Hybrid: Use LLMs to generate initial labels for training data, then train a lightweight model for production.

Evaluation metrics

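  • Accuracy: Percentage of all classifications that are correct. It can be misleading when one category is much more common than the others
  • Precision: Of the items classified as X, how many actually were X
  • Recall: Of all the items that actually were X, how many were classified as X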
  • F1 score: Harmonic mean of precision and recall
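
The four metrics above can be computed directly from a list of true labels and a list of predictions. The sketch below does so for one category at a time; the function name `evaluate` and the eight example labels are invented for illustration.

```python
def evaluate(y_true, y_pred, positive):
    """Accuracy over all items, plus precision, recall and F1 for one category."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented labels: 3 spam and 5 legitimate emails, with two mistakes
y_true = ["spam", "spam", "spam", "legitimate", "legitimate", "legitimate", "legitimate", "legitimate"]
y_pred = ["spam", "spam", "legitimate", "spam", "legitimate", "legitimate", "legitimate", "legitimate"]

metrics = evaluate(y_true, y_pred, positive="spam")
print(metrics)
```

Here accuracy is 0.75, while precision, recall and F1 for the spam category are each 2/3, which shows how a per-category view can look worse than the headline accuracy.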

Multi-label vs multi-class

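Multi-class classification assigns exactly one label from a set of mutually exclusive categories (a support ticket goes to billing, technical, or sales). Binary classification, such as spam versus legitimate, is the two-category case. Multi-label classification can assign several labels at once (an article can be tagged as both "technology" and "business").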
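
In code, the difference usually shows up in how per-label scores become a final answer. The sketch below assumes a model has already produced a score for each label; the scores, the function names, and the 0.5 threshold are all hypothetical.

```python
def predict_multi_class(scores):
    """Multi-class: return the single highest-scoring label."""
    return max(scores, key=scores.get)


def predict_multi_label(scores, threshold=0.5):
    """Multi-label: return every label whose score clears the threshold."""
    return sorted(label for label, score in scores.items() if score >= threshold)


# Hypothetical multi-class scores: the labels compete, so the scores sum to 1
ticket_scores = {"billing": 0.7, "technical": 0.2, "sales": 0.1}

# Hypothetical multi-label scores: each label is judged independently
article_scores = {"technology": 0.82, "business": 0.64, "sports": 0.03}

department = predict_multi_class(ticket_scores)
tags = predict_multi_label(article_scores)
print(department, tags)
```

A multi-label prediction can also be empty when no score clears the threshold, which is something the downstream workflow needs to handle.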

Why This Matters

Text classification is one of the most immediately applicable AI capabilities for any business. It automates the tedious work of categorising, routing, and prioritising text, saving hours of manual work and enabling consistent, scalable processing of customer communications, documents, and content.
