Chapter 36 — Quiz
12 questions. Answers at the bottom.
Multiple choice
Q1. Time-series data is characterized by: - A) Random access by key - B) Time-stamped, append-heavy points queried by time range/aggregation - C) Deep relationship traversal - D) Document flexibility
Q2. The PostgreSQL extension for time-series is: - A) PostGIS - B) pgvector - C) TimescaleDB - D) Citus
Q3. An embedding is: - A) A foreign key - B) A vector of numbers representing the meaning of text/image/audio - C) An index - D) A backup
Q4. Vector (nearest-neighbor) search finds: - A) Exact keyword matches - B) Items whose embeddings are closest (most similar in meaning) - C) The newest rows - D) Duplicate rows
Q5. pgvector lets PostgreSQL:
- A) Shard automatically
- B) Store embeddings and do similarity search (e.g., <=> distance)
- C) Do spatial queries
- D) Replicate
Q6. RAG (Retrieval-Augmented Generation) uses vector search to: - A) Train the model - B) Retrieve relevant documents to give an LLM context - C) Encrypt prompts - D) Cache results
Q7. PostGIS adds to PostgreSQL: - A) Time-series functions - B) Geometry/geography types, spatial functions, and spatial (GiST) indexes - C) Vector search - D) Full-text search
Q8. "All stores within 5 km of a point" needs:
- A) A B-tree on two numeric columns
- B) Spatial types/indexes (PostGIS ST_DWithin)
- C) A vector index
- D) Sharding
Q9. For most applications, advanced search is covered by: - A) Always Elasticsearch - B) PostgreSQL full-text search (Elasticsearch for extreme scale/features) - C) pgvector - D) TimescaleDB
Q10. The chapter's through-line (theme #4) is: - A) Always use a dedicated specialized database - B) A PostgreSQL extension often replaces a specialized system — try it first - C) Avoid PostgreSQL for specialized data - D) Specialized data can't be stored relationally
True/False
Q11. Vector search is exact matching, like a primary-key lookup. (True / False)
Q12. pgvector lets you keep embeddings alongside relational data, with transactions and joins. (True / False)
Short answer
Q13. You're building "find articles similar in meaning to this one." Outline how you'd implement it with PostgreSQL + pgvector, and when you'd instead use a dedicated vector database.
---
Answer key
Q1 — B. Time-stamped, append-heavy, time-range/aggregation queries.
Q2 — C. TimescaleDB (built on partitioning).
Q3 — B. A vector representing meaning.
Q4 — B. Closest embeddings = most similar.
Q5 — B. Vector type + similarity operators/indexes.
Q6 — B. Retrieve relevant context for an LLM.
Q7 — B. Spatial types, functions, GiST indexes.
Q8 — B. PostGIS spatial query (ST_DWithin).
Q9 — B. PostgreSQL FTS for most; Elasticsearch for extreme scale/features.
Q10 — B. Try the PostgreSQL extension first; it often suffices.
Q11 — False. It's approximate similarity (nearest-neighbor), not exact match.
Q12 — True. Embeddings live in PostgreSQL with the relational data, transactions, and joins intact.
Q13. Generate an embedding for each article (via an AI model) and store it in a vector column with pgvector; add an HNSW/IVFFlat index. To find similar articles, embed the query article and run ORDER BY embedding <=> :query_embedding LIMIT N (nearest by cosine/L2 distance). This keeps embeddings beside the article data, with SQL/joins/transactions. Use a dedicated vector database (Pinecone, Weaviate, Milvus) instead when you have very large vector volumes, need specialized ANN index tuning/scale, or want vector search decoupled from the primary database — i.e., when you genuinely exceed pgvector.
Scoring: 10–12 you know the specialized landscape; 7–9 review vectors and theme #4; below 7, redo Exercises B.