Chapter 36 — Quiz

DataField.Dev

Chapter 36 — Quiz

12 questions. Answers at the bottom.

Multiple choice

Q1. Time-series data is characterized by: - A) Random access by key - B) Time-stamped, append-heavy points queried by time range/aggregation - C) Deep relationship traversal - D) Document flexibility

Q2. The PostgreSQL extension for time-series is: - A) PostGIS - B) pgvector - C) TimescaleDB - D) Citus

Q3. An embedding is: - A) A foreign key - B) A vector of numbers representing the meaning of text/image/audio - C) An index - D) A backup

Q4. Vector (nearest-neighbor) search finds: - A) Exact keyword matches - B) Items whose embeddings are closest (most similar in meaning) - C) The newest rows - D) Duplicate rows

Q5. pgvector lets PostgreSQL: - A) Shard automatically - B) Store embeddings and do similarity search (e.g., <=> distance) - C) Do spatial queries - D) Replicate

Q6. RAG (Retrieval-Augmented Generation) uses vector search to: - A) Train the model - B) Retrieve relevant documents to give an LLM context - C) Encrypt prompts - D) Cache results

Q7. PostGIS adds to PostgreSQL: - A) Time-series functions - B) Geometry/geography types, spatial functions, and spatial (GiST) indexes - C) Vector search - D) Full-text search

Q8. "All stores within 5 km of a point" needs: - A) A B-tree on two numeric columns - B) Spatial types/indexes (PostGIS ST_DWithin) - C) A vector index - D) Sharding

Q9. For most applications, advanced search is covered by: - A) Always Elasticsearch - B) PostgreSQL full-text search (Elasticsearch for extreme scale/features) - C) pgvector - D) TimescaleDB

Q10. The chapter's through-line (theme #4) is: - A) Always use a dedicated specialized database - B) A PostgreSQL extension often replaces a specialized system — try it first - C) Avoid PostgreSQL for specialized data - D) Specialized data can't be stored relationally

True/False

Q11. Vector search is exact matching, like a primary-key lookup. (True / False)

Q12. pgvector lets you keep embeddings alongside relational data, with transactions and joins. (True / False)

Short answer

Q13. You're building "find articles similar in meaning to this one." Outline how you'd implement it with PostgreSQL + pgvector, and when you'd instead use a dedicated vector database.

---

Answer key

Q1 — B. Time-stamped, append-heavy, time-range/aggregation queries.

Q2 — C. TimescaleDB (built on partitioning).

Q3 — B. A vector representing meaning.

Q4 — B. Closest embeddings = most similar.

Q5 — B. Vector type + similarity operators/indexes.

Q6 — B. Retrieve relevant context for an LLM.

Q7 — B. Spatial types, functions, GiST indexes.

Q8 — B. PostGIS spatial query (ST_DWithin).

Q9 — B. PostgreSQL FTS for most; Elasticsearch for extreme scale/features.

Q10 — B. Try the PostgreSQL extension first; it often suffices.

Q11 — False. It's approximate similarity (nearest-neighbor), not exact match.

Q12 — True. Embeddings live in PostgreSQL with the relational data, transactions, and joins intact.

Q13. Generate an embedding for each article (via an AI model) and store it in a vector column with pgvector; add an HNSW/IVFFlat index. To find similar articles, embed the query article and run ORDER BY embedding <=> :query_embedding LIMIT N (nearest by cosine/L2 distance). This keeps embeddings beside the article data, with SQL/joins/transactions. Use a dedicated vector database (Pinecone, Weaviate, Milvus) instead when you have very large vector volumes, need specialized ANN index tuning/scale, or want vector search decoupled from the primary database — i.e., when you genuinely exceed pgvector.

Scoring: 10–12 you know the specialized landscape; 7–9 review vectors and theme #4; below 7, redo Exercises B.