Vector database
A vector database stores and searches high-dimensional embeddings using similarity metrics, powering semantic search, RAG, and recommendation.
A vector database stores high-dimensional embeddings and retrieves the closest matches to a query vector using approximate-nearest-neighbor (ANN) search. It's the storage layer most RAG pipelines sit on. Popular options include Pinecone, Qdrant, Weaviate, Milvus, and the pgvector extension for Postgres.
Why it matters
General-purpose databases handle exact lookups, range queries, and full-text search, but not "find the 10 most semantically similar rows to this vector." Doing that naively over millions of vectors is too slow, because each query has to compute the distance to every stored vector. ANN algorithms (HNSW, IVF, DiskANN) cut this to sublinear time at the cost of some recall, and vector databases package those algorithms with the usual database conveniences (indexing, filtering, replication, APIs).
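To make the cost of the naive approach concrete, here is a minimal sketch of exact (brute-force) search in plain Python. The function names and the sample vectors are illustrative assumptions, not part of any vector database's API; the point is only that every query touches every stored vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_top_k(query, vectors, k=10):
    """Exact search: score every stored vector, then keep the k best.

    The work grows linearly with len(vectors), which is the cost
    that ANN indexes are designed to avoid.
    """
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Hypothetical 2-dimensional "embeddings" for illustration only.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
nearest = brute_force_top_k([1.0, 0.05], vectors, k=2)
```

With four vectors this is instant; with millions of high-dimensional vectors, the same loop runs on every query, which is why an index is needed.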
For AI developer tooling, vector databases usually sit one layer below what you interact with day to day. Claude Code, Codex CLI, and Qwen Code users typically reach them through MCP servers that handle the retrieval step of RAG for them; the vector DB sits behind the server.
How it works
A typical workflow:
- Compute embeddings for your documents
- Insert them into the vector DB with optional metadata (source, timestamp, tags)
- At query time, embed the user's query and call similarity_search(vector, k=10, filter={...})
- The DB uses its ANN index to return the top-K neighbors fast
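The workflow above can be sketched with a toy in-memory store. Everything here is an assumption made for illustration: `toy_embed` is a hypothetical stand-in for a real embedding model, `ToyVectorStore` is not any product's client, and the search scans every row exactly rather than using an ANN index. Only the `similarity_search(vector, k, filter)` call shape follows the text.

```python
import math


def toy_embed(text, dims=8):
    """Hypothetical stand-in for an embedding model: hashed character counts."""
    vec = [0.0] * dims
    for ch in text.lower():
        vec[ord(ch) % dims] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Illustrative store: exact scan, no ANN index, no persistence."""

    def __init__(self):
        self._rows = []  # each row is (doc_id, vector, metadata)

    def insert(self, doc_id, vector, metadata=None):
        self._rows.append((doc_id, vector, metadata or {}))

    def similarity_search(self, vector, k=10, filter=None):
        # `filter` shadows the builtin; the name mirrors the call in the text.
        wanted = filter or {}
        # Apply the metadata filter first, then rank the survivors.
        candidates = [
            row for row in self._rows
            if all(row[2].get(key) == value for key, value in wanted.items())
        ]
        candidates.sort(key=lambda row: cosine(vector, row[1]), reverse=True)
        return [{"id": row[0], "metadata": row[2]} for row in candidates[:k]]


store = ToyVectorStore()
docs = [
    ("a", "parse json in rust", {"language": "rust"}),
    ("b", "parse json in python", {"language": "python"}),
    ("c", "open a tcp socket in rust", {"language": "rust"}),
]
for doc_id, text, metadata in docs:
    store.insert(doc_id, toy_embed(text), metadata)

hits = store.similarity_search(
    toy_embed("parse json in rust"), k=2, filter={"language": "rust"}
)
```

A real deployment would swap `toy_embed` for an actual embedding model and the linear scan for the database's index; the insert-then-filtered-search shape stays the same.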
Key features to compare:
- Filtering — "nearest neighbors where language = 'rust'"
- Hybrid search — combine keyword + vector scores
- Scale — billions of vectors vs thousands
- Deployment — hosted (Pinecone, managed Qdrant) vs self-hosted (pgvector, Qdrant OSS, Milvus)
- Updates — how fast new vectors are searchable
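As one possible reading of the hybrid search item, the sketch below blends keyword and vector scores with a weighted sum after min-max normalization. This is only one fusion scheme among several, and the function names, the `alpha` weight, and the sample scores are all assumptions for illustration; individual databases implement hybrid ranking in their own ways.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the 0..1 range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(keyword_scores, vector_scores, alpha=0.5):
    """Rank documents by alpha * vector score + (1 - alpha) * keyword score.

    A document missing from one result set contributes 0 for that side.
    """
    kw = min_max(keyword_scores) if keyword_scores else {}
    vs = min_max(vector_scores) if vector_scores else {}
    combined = {
        doc: alpha * vs.get(doc, 0.0) + (1 - alpha) * kw.get(doc, 0.0)
        for doc in set(kw) | set(vs)
    }
    # Sort by descending score, breaking ties by document id for determinism.
    return sorted(combined, key=lambda doc: (-combined[doc], doc))


# Hypothetical scores: keyword relevance and cosine similarities.
keyword_scores = {"a": 12.0, "b": 3.0, "c": 0.0}
vector_scores = {"a": 0.60, "b": 0.90, "d": 0.80}
ranked = hybrid_rank(keyword_scores, vector_scores, alpha=0.5)
```

Normalizing first matters because keyword and vector scores live on different scales; without it, whichever side has larger raw numbers dominates the blend.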
For small projects (under 1M vectors), pgvector on an ordinary Postgres instance is usually enough. At web scale, a dedicated vector DB is the better fit.
How it's used
Vector DB patterns in dev tooling:
- Codebase search — indexed functions/files, queried by semantic meaning
- Docs search — product docs embedded and searched by user question
- Long-term agent memory — conversations stored and recalled
- Deduplication — find near-duplicate snippets
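The deduplication pattern, for instance, reduces to flagging pairs of embeddings whose similarity exceeds a cutoff. The sketch below assumes cosine similarity and a threshold of 0.95, both chosen for illustration, and compares all pairs directly; a vector DB would instead run a nearest-neighbor query per snippet.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_near_duplicates(vectors, threshold=0.95):
    """Return (i, j) index pairs whose cosine similarity meets the threshold."""
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Hypothetical snippet embeddings: the first two point almost the same way.
snippets = [[1.0, 0.0, 0.0], [0.99, 0.05, 0.0], [0.0, 1.0, 0.0]]
duplicates = find_near_duplicates(snippets)
```

The right threshold depends on the embedding model and on how aggressive the deduplication should be, so it is worth tuning against known duplicate pairs.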
Related terms
- Embedding — what's stored
- RAG — the most common consumer
- MCP — how agents typically access vector DBs
- LLM — what generates the final answer
- Token — a separate but often co-occurring concept
FAQ
Do I need a vector database to use Claude Code?
No. If you need one, it's usually because you've built a custom MCP server for semantic search over your docs or codebase. Default Claude Code usage doesn't require any vector store.
Postgres with pgvector vs a dedicated vector DB?
Start with pgvector if you already run Postgres — one fewer service to operate. Move to a dedicated vector DB when query latency, write throughput, or recall becomes a bottleneck.