Chapter 23 — Key Takeaways

The big idea

An index turns finding rows from an O(n) scan into an O(log n) lookup — the single most impactful performance tool, and (theme #5) basic competence, not wizardry. The anchor: 45 s → 12 ms from one index. (Case Study 1.)

What an index is

  • A separate, sorted data structure (like a book's index) for finding rows by value without scanning the table.
  • The B-tree (default) stays shallow (logarithmic height), serving equality, ranges, sorting, and prefix LIKE.

When an index helps

Columns in WHERE (equality/range), JOIN conditions (index your foreign keys — #1 cause of slow joins), ORDER BY/GROUP BY, and UNIQUE constraints.

Diagnosing a missing index

EXPLAIN ANALYZE → a Seq Scan on a large table with Rows Removed by Filter huge while returning few rows = missing index. Add it → plan switches to Index Scan. (Chapter 24 covers plans fully.)

Index types

Type For
B-tree (default) equality, ranges, sorting, prefix
GIN JSONB @>, arrays, full-text (Ch. 16)
GiST ranges, spatial, nearest-neighbor
BRIN huge naturally-ordered tables (time-series)
multi-column leftmost-prefix (order matters!)
partial a subset of rows (WHERE ...)
expression computed values (lower(email) — fixes function-on-column)
covering/INCLUDE index-only scans (answer from the index alone)

The cost — and when NOT to index

Indexes slow writes (every write maintains every index), use storage, and need maintenance. Don't index: small tables, low-cardinality columns (write-heavy), rarely-queried columns, or queries returning most of the table. Twenty indexes on a write-heavy table = twenty taxes per insert; drop unused ones (pg_stat_user_indexes, idx_scan = 0). (Case Study 2.)

The balance

  • Too few indexes → slow reads (Case Study 1). Too many → slow writes + bloat (Case Study 2).
  • Add an index in response to a measured slow query; review and prune periodically. Justify each index with a real query.
  • Build with CREATE INDEX CONCURRENTLY in production (no write lock).

You can now…

  • ☐ Explain O(log n) vs O(n) and what an index is.
  • ☐ Diagnose a missing index (Seq Scan smell) and add the right one.
  • ☐ Choose among B-tree/GIN/GiST/BRIN/multi-column/partial/expression/covering.
  • ☐ State index costs and when not to index; find and drop unused indexes.
  • ☐ Build indexes safely (CONCURRENTLY).

Looking ahead

Chapter 24 — Query Optimization. Reading EXPLAIN plans in depth — scan types, join algorithms, cost estimates — to diagnose why any query is slow and fix it.

One sentence to carry forward: Index the columns you filter, join, and sort on — diagnosing the missing one is the 45s→12ms skill — but every index taxes writes, so add them for measured reads and prune the ones no query uses.