Chapter 36 — Key Takeaways
The big idea
Some data has a specific shape (time-series, vectors, geography, search) best served by specialized databases — or, often, by a PostgreSQL extension. Theme #4 peaks here: one extended PostgreSQL frequently replaces a fleet of specialized systems.
The specialized families
| Data | Pattern | Dedicated tool | PostgreSQL extension |
|---|---|---|---|
| Time-series | time-stamped, append-heavy, range/aggregate, retention | InfluxDB, Prometheus | TimescaleDB (+ partitioning, date_trunc) |
| Vector / AI | embeddings (meaning), nearest-neighbor similarity | Pinecone, Weaviate, Milvus | pgvector (<=>, HNSW index) |
| Spatial | points/polygons, radius/contains, true geo distance | (GIS systems) | PostGIS (the standard) |
| Search | large-scale ranked/faceted/fuzzy text | Elasticsearch | built-in full-text (Ch. 16) |
Vectors & AI (the hot one)
- An embedding is a vector representing meaning; vector search = nearest-neighbor in high-dimensional space → semantic search, recommendations, RAG (retrieve context for an LLM).
- pgvector stores embeddings beside relational data and does similarity search in SQL — so you build AI features without a separate vector DB, with joins/transactions intact. Index with HNSW/IVFFlat. (Case Study 1.)
- Dedicated vector DBs earn their place at extreme scale/tuning.
Spatial
- Lat/long are angles on a sphere — flat-plane arithmetic is wrong (distorts with latitude) and unindexable for radius queries. PostGIS gives geography types, true spherical distance (
ST_DWithin/ST_Distance), and GiST spatial indexes — correct and fast. (Case Study 2.)
Theme #4: try the extension first
For each specialized need, a PostgreSQL extension (TimescaleDB/pgvector/PostGIS/FTS) often suffices — keeping data unified, with one operational footprint. Adopt the dedicated specialist only on genuine, demonstrated need (extreme scale, specialized features).
Common mistakes
Adopting a specialized DB before trying the extension; forcing specialized data into plain columns (geo radius via arithmetic, vector similarity in app code); over-collecting time-series without retention; treating vector search as exact match (it's approximate similarity).
You can now…
- ☐ Recognize time-series, vector, spatial, and search data and their patterns.
- ☐ Explain embeddings and vector (nearest-neighbor) search and their AI uses.
- ☐ Map each need to a PostgreSQL extension (TimescaleDB, pgvector, PostGIS, FTS).
- ☐ Judge when a dedicated specialized database is warranted.
Looking ahead
Chapter 37 — The Database Decision (closes Part VI). A framework tying together relational, NoSQL, warehouse, distributed, and specialized — how to choose the right database, and why PostgreSQL is the right default.
One sentence to carry forward: Specialized data (time-series, vectors, geography, search) has specialized tools — but a PostgreSQL extension (TimescaleDB, pgvector, PostGIS, built-in FTS) usually does the job, so try it before adding a whole new system.