Case Study 1 — Semantic Search with pgvector (No Separate Vector Database)
A team built AI-powered "semantic search" — finding documents by meaning, not keywords — and assumed they needed a dedicated vector database. They got it working with
pgvectorinstead, keeping embeddings beside their relational data in the PostgreSQL they already ran.
Background
A knowledge-base product let users search thousands of help articles. Keyword search (PostgreSQL full-text, Chapter 16) worked for exact terms but missed meaning: a search for "I can't log in" didn't surface an article titled "Resetting your password," because they share no keywords despite being about the same thing. The team wanted semantic search — results ranked by conceptual similarity — to power both the search box and an AI assistant (RAG: feeding the most relevant articles to an LLM as context).
Semantic search needs embeddings and vector similarity (this chapter). The team's first assumption: "we need Pinecone (a dedicated vector database)." That would mean running and paying for a second system, and syncing article data into it and keeping it consistent with PostgreSQL (the polyglot cost, Chapter 33).
The implementation: pgvector
Before adopting a separate system, they tried pgvector, the PostgreSQL extension that adds vector storage and similarity search. The whole feature lived in the database they already had:
CREATE EXTENSION vector;
ALTER TABLE articles ADD COLUMN embedding vector(1536); -- embedding dimension of their model
CREATE INDEX ON articles USING hnsw (embedding vector_cosine_ops); -- fast ANN index
The pipeline:
1. For each article, call an embedding model (e.g., an embeddings API) to turn its text into a 1536-dimension vector, and store it in the embedding column (just an UPDATE).
2. At search time, embed the user's query the same way, then find the nearest article embeddings:
-- the k most semantically-similar articles to the query
SELECT article_id, title
FROM articles
ORDER BY embedding <=> :query_embedding -- <=> = cosine distance; smallest = most similar
LIMIT 5;
Now "I can't log in" returns "Resetting your password" — because their embeddings are close in meaning, even with no shared keywords. The HNSW index makes the nearest-neighbor search fast even over many thousands of articles. For the AI assistant, the same query retrieved the top-5 relevant articles to pass to the LLM as context (RAG).
The decisive advantage: the embeddings live right next to the articles, in the same PostgreSQL database. They could JOIN vector-search results to other tables (filter by product, by access level, by language) in one SQL query, wrap everything in transactions, and operate one database. No second system, no sync pipeline, no drift.
When they'd reach for a dedicated vector database
The team noted the boundary honestly: pgvector served their thousands-to-low-millions of vectors well. If they grew to hundreds of millions of vectors, needed specialized ANN index tuning, or wanted vector search scaled and operated independently of the primary database, a dedicated vector database (Pinecone, Weaviate, Milvus, Qdrant) would earn its place. Until then, pgvector kept the architecture simple and the data unified — the right default (theme #4).
The analysis
-
Vector search powers semantic features. Embeddings represent meaning; nearest-neighbor search finds conceptually-similar items — enabling semantic search, recommendations, and RAG that keyword search can't. This is the core AI-data pattern.
-
pgvector often replaces a dedicated vector database. It adds vector types, distance operators (
<=>), and ANN indexes (HNSW/IVFFlat) to PostgreSQL — so for many applications you build AI features in the database you already run, no second system. -
Co-locating embeddings with relational data is a big win. Because vectors live beside the articles, you can
JOIN, filter, and transact across both in one query — impossible (or requiring brittle syncing) when vectors live in a separate system. Unified data, unified operations. -
Index the vectors. Nearest-neighbor over high-dimensional vectors needs an ANN index (HNSW/IVFFlat) to be fast — the vector analog of Chapter 23's lesson. Without it, similarity search scans every vector.
-
Know the boundary. pgvector scales to many millions of vectors; dedicated vector databases earn their place at extreme scale or for specialized tuning. Try pgvector first (theme #4); adopt the specialist on demonstrated need — exactly the Part VI judgment.
Discussion questions
- Why does keyword search miss "Resetting your password" for the query "I can't log in," while vector search finds it?
- What is the role of the embedding model, and what does the
<=>operator compute? - Why is co-locating embeddings with relational data (pgvector) advantageous over a separate vector store?
- Why is an ANN index (HNSW) necessary, and what's the analogy to Chapter 23?
- ⭐ At what point would you move from pgvector to a dedicated vector database, and what would you weigh?