Practical

Vector Database

Last reviewed: April 2026

A specialised database designed to store and search embeddings — the numerical representations of text, images, or other data used in AI applications.

A vector database is a specialised database designed to store, index, and search embeddings: the numerical representations (vectors) that AI models use to capture the meaning of text, images, or other data. While a traditional database searches by exact matches or keywords, a vector database searches by similarity of meaning.

Why vector databases exist

Traditional databases are built for structured queries: "Find all customers in London with orders over £100." They work with exact matches, ranges, and filters.

But AI applications need to answer questions like: "Find documents similar in meaning to this question." That requires comparing the mathematical similarity of embedding vectors across potentially millions of entries. Traditional databases are not optimised for this kind of search. Vector databases are.
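To make "comparing the mathematical similarity of embedding vectors" concrete, here is a minimal sketch in Python. The three-number vectors and document names are invented for illustration; real embeddings come from an embedding model and are far longer. The function name cosine_similarity is our own, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional "embeddings" (assumption: illustrative values only).
documents = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.2, 0.8, 0.1],
    "office party": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # pretend embedding of "How do I get my money back?"

# Score the query against every document and sort from most to least similar.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
print(ranked)
```

This loop compares the query with every stored vector. That is fine for three documents, but the work grows with every entry added, which is the cost a vector database is built to reduce.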

How vector databases work

Storage: You store embedding vectors alongside their metadata (document title, source, date, etc.). Each vector is a fixed-length list of numbers whose length depends on the embedding model; 1,536 numbers is one common size.

Indexing: The database builds specialised indices that organise vectors for fast similarity search. Common indexing methods include HNSW (Hierarchical Navigable Small World) and IVF (Inverted File Index).

Querying: You provide a query vector (the embedding of a question or search term) and the database returns the most similar vectors from its store, ranked by similarity. This is called nearest-neighbour search.

Filtering: Most vector databases support combining similarity search with metadata filters — "find the most similar vectors, but only among documents tagged as 'engineering' from the last 6 months."
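The storage, querying, and filtering steps above can be sketched as a toy in-memory store. This is an illustration under stated assumptions, not how any particular product works: the class ToyVectorStore, its add and query methods, and the where parameter are hypothetical names, and the sketch skips indexing entirely by scanning every record.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class ToyVectorStore:
    """Hypothetical in-memory store; scans every vector instead of using an index."""

    def __init__(self):
        self.records = []  # list of (vector, metadata) pairs

    def add(self, vector, metadata):
        # Storage: keep the vector together with its metadata.
        self.records.append((vector, metadata))

    def query(self, query_vector, k=2, where=None):
        # Filtering: keep only records whose metadata matches every key in `where`.
        candidates = [
            (vec, meta)
            for vec, meta in self.records
            if where is None or all(meta.get(key) == value for key, value in where.items())
        ]
        # Querying: score every remaining candidate and return the k most similar.
        scored = [(cosine_similarity(query_vector, vec), meta) for vec, meta in candidates]
        return heapq.nlargest(k, scored, key=lambda pair: pair[0])


store = ToyVectorStore()
store.add([0.9, 0.1, 0.0], {"title": "Deploy guide", "tag": "engineering"})
store.add([0.8, 0.3, 0.1], {"title": "Expense policy", "tag": "finance"})
store.add([0.1, 0.9, 0.2], {"title": "On-call rota", "tag": "engineering"})

results = store.query([1.0, 0.0, 0.0], k=2, where={"tag": "engineering"})
for score, meta in results:
    print(f"{score:.3f}  {meta['title']}")
```

The "Expense policy" record is close to the query but is excluded because its tag does not match the filter. A real vector database replaces the full scan with an index such as HNSW or IVF.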

Popular vector databases

The vector database market has grown rapidly. Widely used options include:

Pinecone: Fully managed cloud service. Easy to set up, scales automatically.
Weaviate: Open-source with a cloud option. Includes built-in vectorisation.
Qdrant: Open-source, focused on performance. Written in Rust.
Chroma: Lightweight, developer-friendly. Popular for prototyping.
Milvus/Zilliz: Open-source (Milvus) with an enterprise cloud offering (Zilliz). Handles very large scale.
pgvector: An extension for PostgreSQL that adds vector search. Lets you use your existing database.

When you need a vector database

You need a vector database when you are building:

RAG systems: In RAG, the documents handed to the language model are typically retrieved with a vector database query.
Semantic search: Any search feature that should understand meaning, not just keywords.
Recommendation engines: Finding similar products, articles, or content.
Duplicate detection: Identifying semantically similar entries in large datasets.
AI chatbots with knowledge: Any chatbot that needs to reference your specific documentation.

You probably do NOT need a dedicated vector database when:

- Your dataset is small (under 10,000 entries). Simple in-memory search works fine.
- You only need keyword search. A traditional search engine (Elasticsearch) may be sufficient.
- You are prototyping. Start with a simpler solution and migrate when scale demands it.

Key concepts

Similarity metrics: Cosine similarity (most common), Euclidean distance, and dot product are different ways to measure how close two vectors are.
Approximate vs exact search: For large datasets, vector databases use approximate nearest neighbour (ANN) algorithms, which trade a small amount of accuracy (a few of the true nearest neighbours may be missed) for much faster search.
Hybrid search: Combining vector similarity search with traditional keyword search for better results.
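The trade-off between approximate and exact search can be illustrated with a simplified sketch in the spirit of an inverted file (IVF) index. It makes several assumptions: the data is random 2-dimensional points, the centroids are picked at random rather than learned (real IVF indexes usually use k-means), Euclidean distance is the metric, and all function names and the n_probe parameter are our own.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance; it ranks neighbours the same as Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_search(vectors, query, k):
    """Exact search: compare the query with every stored vector."""
    order = sorted(range(len(vectors)), key=lambda i: squared_distance(vectors[i], query))
    return order[:k]


def build_buckets(vectors, centroids):
    """Assign each vector's index to the bucket of its nearest centroid."""
    buckets = [[] for _ in centroids]
    for i, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: squared_distance(centroids[c], vec))
        buckets[nearest].append(i)
    return buckets


def approximate_search(vectors, centroids, buckets, query, k, n_probe):
    """Approximate search: scan only the n_probe buckets closest to the query."""
    by_closeness = sorted(range(len(centroids)), key=lambda c: squared_distance(centroids[c], query))
    candidates = [i for c in by_closeness[:n_probe] for i in buckets[c]]
    candidates.sort(key=lambda i: squared_distance(vectors[i], query))
    return candidates[:k], len(candidates)


rng = random.Random(0)  # fixed seed so the run is repeatable
vectors = [[rng.random(), rng.random()] for _ in range(1000)]
centroids = rng.sample(vectors, 16)  # assumption: random centroids stand in for k-means
buckets = build_buckets(vectors, centroids)

query = [0.5, 0.5]
exact = exact_search(vectors, query, k=5)
approx, scanned = approximate_search(vectors, centroids, buckets, query, k=5, n_probe=4)
overlap = len(set(exact) & set(approx))
print(f"scanned {scanned} of {len(vectors)} vectors; {overlap} of 5 exact neighbours found")
```

Raising n_probe scans more buckets, which makes the result closer to the exact answer but slower; setting it to the number of buckets scans everything and matches exact search. Production indexes expose similar tuning knobs, though the names and defaults vary by product.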

Why This Matters

Vector databases are the enabling infrastructure for enterprise AI search and RAG systems. If your organisation plans to build AI-powered knowledge retrieval — customer support, internal documentation search, or research tools — you will need to evaluate and choose a vector database. Understanding what they do helps you have informed conversations with technical teams and vendors, and prevents over-engineering (a simple solution may be sufficient) or under-engineering (a production system needs proper infrastructure).

Related Terms

Embedding

A numerical representation of text (or images, audio, etc.) that captures its meaning. Embeddings let AI measure how similar two pieces of content are.

Retrieval-Augmented Generation (RAG)

A technique that connects AI to your own documents and data so it can answer questions using your specific information, not just its general training.

API (Application Programming Interface)

A way for software to communicate with other software. APIs are how developers connect AI capabilities to websites, apps, and business tools.

Large Language Model (LLM)

A type of AI trained on vast amounts of text to understand and generate human language. The models behind ChatGPT, Claude, and Gemini are all LLMs.

Natural Language Processing (NLP)

The branch of AI focused on enabling computers to understand, interpret, and generate human language in useful ways.

Related Comparisons

RAG vs Fine-Tuning

RAG (Retrieval-Augmented Generation) vs fine-tuning compared across setup complexity, cost, data freshness, accuracy, customisation depth, and maintenance.
