Named Entity Recognition (NER)
An NLP task that identifies and classifies proper nouns and specific terms in text into predefined categories like person, organisation, location, and date.
Named entity recognition (NER) is a natural language processing task that scans text and identifies mentions of specific entities: people, organisations, locations, dates, monetary values, and other defined categories. It is one of the foundational building blocks of text analytics.
How NER works
Given the text "Satya Nadella announced that Microsoft will invest $10 billion in OpenAI's San Francisco headquarters in January 2025," an NER system identifies:
- "Satya Nadella" → Person
- "Microsoft" → Organisation
- "$10 billion" → Money
- "OpenAI" → Organisation
- "San Francisco" → Location
- "January 2025" → Date
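In practice, NER output is usually stored as spans: the matched text, its label, and character offsets into the source. The sketch below shows one possible representation for the example sentence. The `Entity` class and `locate` helper are illustrative names, not part of any library, and the labels are copied from the list above rather than predicted by a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


TEXT = (
    "Satya Nadella announced that Microsoft will invest $10 billion in "
    "OpenAI's San Francisco headquarters in January 2025"
)

# Surface forms and labels taken from the worked example, in reading order.
EXPECTED = [
    ("Satya Nadella", "Person"),
    ("Microsoft", "Organisation"),
    ("$10 billion", "Money"),
    ("OpenAI", "Organisation"),
    ("San Francisco", "Location"),
    ("January 2025", "Date"),
]


def locate(text, pairs):
    """Attach character offsets to (surface, label) pairs listed in reading order."""
    entities = []
    cursor = 0
    for surface, label in pairs:
        start = text.index(surface, cursor)  # raises ValueError if absent
        end = start + len(surface)
        entities.append(Entity(surface, label, start, end))
        cursor = end
    return entities


entities = locate(TEXT, EXPECTED)
for e in entities:
    print(f"{e.start:>3}-{e.end:<3} {e.label:<12} {e.text}")
```

Keeping offsets alongside the text lets downstream code highlight entities in the original document or tell apart repeated mentions of the same name.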
NER approaches
- Rule-based: handcrafted patterns and dictionaries. Precise but inflexible and expensive to maintain.
- Statistical models: trained on annotated text using algorithms like conditional random fields (CRFs). The traditional ML approach.
- Deep learning models: transformer-based models fine-tuned on NER datasets. Current state of the art, especially for complex and ambiguous text.
- LLM-based: using large language models with prompting for entity extraction. Most flexible but slower and more expensive per query.
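To make the rule-based approach concrete, here is a minimal sketch that combines a dictionary (gazetteer) lookup with two regular expressions. The gazetteer entries, the patterns, and the function name `rule_based_ner` are assumptions chosen for illustration. A real system would need far larger dictionaries and many more patterns.

```python
import re

# Hypothetical hand-maintained dictionary of known names.
GAZETTEER = {
    "Satya Nadella": "Person",
    "Microsoft": "Organisation",
    "OpenAI": "Organisation",
    "San Francisco": "Location",
}

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Handcrafted patterns for entity types that follow a predictable shape.
PATTERNS = [
    ("Money", re.compile(r"\$\d+(?:\.\d+)?(?: (?:thousand|million|billion|trillion))?")),
    ("Date", re.compile(rf"\b(?:{MONTHS}) \d{{4}}\b")),
]


def rule_based_ner(text):
    """Return (start, end, text, label) tuples sorted by position."""
    found = []
    # Dictionary lookup: exact, case-sensitive matches on word boundaries.
    for name, label in GAZETTEER.items():
        for m in re.finditer(rf"\b{re.escape(name)}\b", text):
            found.append((m.start(), m.end(), m.group(), label))
    # Pattern matching for money amounts and month-year dates.
    for label, pattern in PATTERNS:
        for m in pattern.finditer(text):
            found.append((m.start(), m.end(), m.group(), label))
    return sorted(found)


sentence = (
    "Satya Nadella announced that Microsoft will invest $10 billion in "
    "OpenAI's San Francisco headquarters in January 2025"
)
for start, end, surface, label in rule_based_ner(sentence):
    print(label, "->", surface)
```

The trade-off described above shows up immediately: a sentence such as "Sundar Pichai joined Google" yields nothing, because neither name is in the dictionary.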
Standard entity types
Widely used NER categories, many of them drawn from the OntoNotes 5.0 annotation scheme (which defines 18 entity types in total), include: Person, Organisation, Location, Date, Time, Money, Percentage, Product, Event, and Law.
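Models trained on OntoNotes typically emit short tag codes rather than the category names used here. The mapping below is a small sketch that assumes the OntoNotes 5.0 tag set. Folding every other tag into "Other" is an illustrative choice, and `to_category` is a hypothetical helper name.

```python
# OntoNotes 5.0 tag codes mapped to the category names used in this article.
ONTONOTES_TO_CATEGORY = {
    "PERSON": "Person",
    "ORG": "Organisation",
    "GPE": "Location",   # countries, cities, states
    "LOC": "Location",   # non-GPE locations such as mountain ranges
    "DATE": "Date",
    "TIME": "Time",
    "MONEY": "Money",
    "PERCENT": "Percentage",
    "PRODUCT": "Product",
    "EVENT": "Event",
    "LAW": "Law",
}


def to_category(tag):
    """Translate a tag code; unmapped tags (e.g. CARDINAL) become 'Other'."""
    return ONTONOTES_TO_CATEGORY.get(tag, "Other")


print(to_category("GPE"))
print(to_category("CARDINAL"))
```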
Custom NER
Many business applications require custom entity types:
- Insurance: policy numbers, claim types, coverage categories
- Healthcare: drug names, symptoms, procedures, anatomical terms
- Legal: case citations, statute references, party names
- Finance: ticker symbols, fund names, regulatory references
Custom NER requires annotated training data specific to your domain.
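Annotated training data for sequence models is commonly stored as one tag per token in the BIO scheme (B = beginning of an entity, I = inside, O = outside). The sketch below converts character-span annotations into BIO tags. The insurance sentence, the labels `POLICY_NUMBER` and `CLAIM_TYPE`, and the simple regex tokeniser are all assumptions made for illustration.

```python
import re


def span(text, surface, label):
    """Build a (start, end, label) annotation for the first occurrence of surface."""
    start = text.index(surface)
    return (start, start + len(surface), label)


def spans_to_bio(text, spans):
    """Convert non-overlapping character spans into (token, BIO tag) pairs."""
    # Simple tokeniser: runs of word characters/hyphens, or single punctuation marks.
    tokens = [(m.group(), m.start(), m.end())
              for m in re.finditer(r"[\w-]+|[^\w\s]", text)]
    tagged = []
    for tok, t_start, t_end in tokens:
        tag = "O"
        for s_start, s_end, label in spans:
            # A token belongs to a span if it lies entirely inside it.
            if t_start >= s_start and t_end <= s_end:
                tag = ("B-" if t_start == s_start else "I-") + label
                break
        tagged.append((tok, tag))
    return tagged


text = "Claim filed under policy POL-48213 for water damage coverage."
annotations = [
    span(text, "POL-48213", "POLICY_NUMBER"),
    span(text, "water damage", "CLAIM_TYPE"),
]

tags = spans_to_bio(text, annotations)
for token, tag in tags:
    print(f"{token:<12} {tag}")
```

Annotators usually mark spans in a labelling tool, and a conversion step like this one produces the token-level format that a trainable model consumes.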
NER in the LLM era
Large language models have made NER more accessible. Instead of training a custom model, you can prompt an LLM to extract entities from text. This works well for prototyping and low-volume use cases, but dedicated NER models are faster and cheaper at scale.
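A prompt-based extractor can be sketched without committing to any particular provider. In the code below, `complete` stands for whatever function sends a prompt to your LLM and returns its text reply. It is a placeholder, not a real client API, and `fake_llm` is a canned stand-in so the example runs offline. The prompt wording and label list are likewise assumptions.

```python
import json

LABELS = ["Person", "Organisation", "Location", "Date", "Money"]


def build_prompt(text, labels=LABELS):
    """Assemble an extraction prompt that asks for a JSON-only reply."""
    return (
        "Extract named entities from the text below.\n"
        f"Allowed labels: {', '.join(labels)}.\n"
        'Respond with only a JSON array of objects like {"text": "...", "label": "..."}.\n\n'
        f"Text: {text}"
    )


def parse_entities(raw, text, labels=LABELS):
    """Parse the reply, keeping only well-formed entities that occur in the text."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    entities = []
    for item in items:
        if not isinstance(item, dict):
            continue
        surface, label = item.get("text"), item.get("label")
        # Drop unknown labels and entities that do not appear in the source text.
        if isinstance(surface, str) and surface and surface in text and label in labels:
            entities.append((surface, label))
    return entities


def extract_entities(text, complete):
    """Run one prompt through `complete` (a caller-supplied function) and parse it."""
    return parse_entities(complete(build_prompt(text)), text)


def fake_llm(prompt):
    # Canned reply for demonstration; "Seattle" is deliberately not in the input.
    return json.dumps([
        {"text": "Microsoft", "label": "Organisation"},
        {"text": "Seattle", "label": "Location"},
    ])


print(extract_entities("Microsoft will invest $10 billion in OpenAI", fake_llm))
```

Validating the reply against the source text is a cheap guard against entities the model invents, which matters more here than with a dedicated tagger that can only label tokens actually present.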
Why This Matters
NER transforms unstructured text into structured, actionable data. For organisations processing large volumes of documents, emails, or customer communications, NER automates information extraction that would otherwise require hours of manual work. It is often the first step in building intelligent document processing systems.