Chapter 23 — Key Takeaways
The big idea
An index turns finding rows from an O(n) scan into an O(log n) lookup — the single most impactful performance tool, and (theme #5) basic competence, not wizardry. The anchor: 45 s → 12 ms from one index. (Case Study 1.)
What an index is
- A separate, sorted data structure (like a book's index) for finding rows by value without scanning the table.
- The B-tree (default) stays shallow (logarithmic height), serving equality, ranges, sorting, and prefix
LIKE.
When an index helps
Columns in WHERE (equality/range), JOIN conditions (index your foreign keys — #1 cause of slow joins), ORDER BY/GROUP BY, and UNIQUE constraints.
Diagnosing a missing index
EXPLAIN ANALYZE → a Seq Scan on a large table with Rows Removed by Filter huge while returning few rows = missing index. Add it → plan switches to Index Scan. (Chapter 24 covers plans fully.)
Index types
| Type | For |
|---|---|
| B-tree (default) | equality, ranges, sorting, prefix |
| GIN | JSONB @>, arrays, full-text (Ch. 16) |
| GiST | ranges, spatial, nearest-neighbor |
| BRIN | huge naturally-ordered tables (time-series) |
| multi-column | leftmost-prefix (order matters!) |
| partial | a subset of rows (WHERE ...) |
| expression | computed values (lower(email) — fixes function-on-column) |
covering/INCLUDE |
index-only scans (answer from the index alone) |
The cost — and when NOT to index
Indexes slow writes (every write maintains every index), use storage, and need maintenance. Don't index: small tables, low-cardinality columns (write-heavy), rarely-queried columns, or queries returning most of the table. Twenty indexes on a write-heavy table = twenty taxes per insert; drop unused ones (pg_stat_user_indexes, idx_scan = 0). (Case Study 2.)
The balance
- Too few indexes → slow reads (Case Study 1). Too many → slow writes + bloat (Case Study 2).
- Add an index in response to a measured slow query; review and prune periodically. Justify each index with a real query.
- Build with
CREATE INDEX CONCURRENTLYin production (no write lock).
You can now…
- ☐ Explain O(log n) vs O(n) and what an index is.
- ☐ Diagnose a missing index (
Seq Scansmell) and add the right one. - ☐ Choose among B-tree/GIN/GiST/BRIN/multi-column/partial/expression/covering.
- ☐ State index costs and when not to index; find and drop unused indexes.
- ☐ Build indexes safely (
CONCURRENTLY).
Looking ahead
Chapter 24 — Query Optimization. Reading EXPLAIN plans in depth — scan types, join algorithms, cost estimates — to diagnose why any query is slow and fix it.
One sentence to carry forward: Index the columns you filter, join, and sort on — diagnosing the missing one is the 45s→12ms skill — but every index taxes writes, so add them for measured reads and prune the ones no query uses.