Definition

Vector database

A vector database stores and searches high-dimensional embeddings using similarity metrics, powering semantic search, RAG, and recommendation.

A vector database stores high-dimensional embeddings and retrieves the closest matches to a query vector using approximate-nearest-neighbor (ANN) search. It's the storage layer most RAG pipelines sit on. Popular options include Pinecone, Qdrant, Weaviate, Milvus, and the pgvector extension for Postgres.

Why it matters

General-purpose databases handle exact lookups, range queries, and full-text search, but not "find the 10 most semantically similar rows to this vector." Doing that naively over millions of vectors is too slow, because each query computes the distance to every stored vector. ANN algorithms (HNSW, IVF, DiskANN) cut this to sublinear time (roughly logarithmic for graph indexes such as HNSW) at the cost of some recall, and vector databases package those algorithms with the usual database conveniences (indexing, filtering, replication, APIs).
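To make the cost of the naive approach concrete, here is a minimal sketch of exact (brute-force) nearest-neighbor search in plain Python. The function names and the tiny 2-dimensional dataset are illustrative assumptions, not any particular database's API. Every query touches every stored vector, which is the linear scan that ANN indexes are designed to avoid.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_search(query, vectors, k=10):
    """Exact top-k search: scores every stored vector, so work grows with N * d."""
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)  # list of (similarity, index), best first


# Toy data, made up for illustration
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
top = brute_force_search([1.0, 0.0], vectors, k=2)
print([index for _, index in top])
```

At a few thousand vectors this scan is fine. At millions of vectors with hundreds of dimensions, the per-query work is what pushes systems toward an ANN index.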

For AI developer tooling, vector databases usually sit one layer below what you see day to day. Users of Claude Code, Codex CLI, and Qwen Code typically reach them through MCP servers that handle the retrieval step of RAG, so the vector DB stays behind the server.

How it works

A typical workflow:

Compute embeddings for your documents
Insert them into the vector DB with optional metadata (source, timestamp, tags)
At query time, embed the user's query and call similarity_search(vector, k=10, filter={...})
The DB uses its ANN index to return the top-K neighbors fast
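The four steps above can be sketched with a toy in-memory store. Everything here is a stand-in under stated assumptions: `toy_embed` is a hypothetical hashed bag-of-words function rather than a real embedding model, `ToyVectorStore` is not a real library, and the search is an exact scan rather than an ANN index. Only the shape of the `similarity_search(vector, k, filter)` call mirrors the workflow described.

```python
import math
from dataclasses import dataclass, field


def toy_embed(text, dims=8):
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dims] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


@dataclass
class ToyVectorStore:
    rows: list = field(default_factory=list)  # (vector, text, metadata) tuples

    def insert(self, text, metadata=None):
        """Steps 1 and 2: embed the document and store it with optional metadata."""
        self.rows.append((toy_embed(text), text, metadata or {}))

    def similarity_search(self, vector, k=10, filter=None):
        """Step 4: return the top-k (score, text) pairs among rows matching the filter.

        The parameter is named `filter` only to mirror the call in the text.
        Vectors are unit length, so the dot product equals cosine similarity.
        """
        filter = filter or {}
        candidates = [
            (sum(a * b for a, b in zip(vector, vec)), text)
            for vec, text, meta in self.rows
            if all(meta.get(key) == value for key, value in filter.items())
        ]
        candidates.sort(key=lambda pair: pair[0], reverse=True)
        return candidates[:k]


store = ToyVectorStore()
store.insert("parse json config file", {"language": "rust"})
store.insert("parse json config file", {"language": "python"})
store.insert("render html template", {"language": "rust"})

# Step 3: embed the query with the same function used for the documents
hits = store.similarity_search(
    toy_embed("parse json config"), k=1, filter={"language": "rust"}
)
print(hits[0][1])
```

A real deployment would swap `toy_embed` for an embedding model and the list scan for the database's ANN index. The important constraint carries over unchanged: documents and queries must be embedded by the same model.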

Key features to compare:

Filtering — "nearest neighbors where language = 'rust'"
Hybrid search — combine keyword + vector scores
Scale — billions of vectors vs thousands
Deployment — hosted (Pinecone, managed Qdrant) vs self-hosted (pgvector, Qdrant OSS, Milvus)
Updates — how fast new vectors are searchable
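As a rough illustration of the hybrid search item, one simple way to combine signals is a weighted blend of a keyword score and a vector score. This is a sketch of that one approach only. The helper names, the `alpha` weight, and the similarity values are assumptions made up for the example, and production systems commonly use stronger keyword scoring (such as BM25) and other fusion methods.

```python
def keyword_score(query, text):
    """Toy keyword signal: fraction of query terms that appear in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def hybrid_score(vector_score, kw_score, alpha=0.5):
    """Weighted blend: alpha=1.0 is pure vector search, alpha=0.0 is pure keyword."""
    return alpha * vector_score + (1.0 - alpha) * kw_score


# (text, vector similarity) pairs; the similarity values are invented for illustration
candidates = [
    ("tokio async runtime setup", 0.80),
    ("configure the async executor", 0.85),
]
query = "tokio runtime"

ranked = sorted(
    candidates,
    key=lambda c: hybrid_score(c[1], keyword_score(query, c[0])),
    reverse=True,
)
print(ranked[0][0])
```

With these invented numbers, pure vector ranking would put the second document first, while the blend promotes the document that contains the exact query terms. That is the usual motivation for hybrid search: exact identifiers and rare terms that embeddings tend to blur.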

For small projects (under roughly 1M vectors), pgvector on a standard Postgres instance is usually enough. At web scale, a dedicated vector DB is usually the better fit.
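A back-of-the-envelope size estimate helps show why the 1M mark is comfortable for a single Postgres instance. The sketch below counts only raw vector storage, assuming 4-byte float32 values and an assumed dimension of 1536. Index structures and metadata add overhead that is not modeled here.

```python
def raw_vector_bytes(num_vectors, dims, bytes_per_value=4):
    """Raw storage for the vectors alone (float32 by default), excluding index overhead."""
    return num_vectors * dims * bytes_per_value


size = raw_vector_bytes(1_000_000, 1536)
print(f"{size / 1e9:.2f} GB")  # 6.14 GB
```

The same arithmetic at a billion vectors gives roughly a thousand times that figure, which is the point where sharding and disk-based indexes start to matter.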

How it's used

Vector DB patterns in dev tooling:

Codebase search — indexed functions/files, queried by semantic meaning
Docs search — product docs embedded and searched by user question
Long-term agent memory — conversations stored and recalled
Deduplication — find near-duplicate snippets

Related terms

Embedding — what's stored
RAG — the most common consumer
MCP — how agents typically access vector DBs
LLM — what generates the final answer
Token — a separate but often co-occurring concept
