Case Study 1 — 45 Seconds to 12 Milliseconds
The book's signature performance anchor, as a full incident. A dashboard query that quietly degraded as data grew finally became unusable — and one well-chosen index fixed it. This is the most satisfying skill in databases, and the most learnable.
Background
An internal "customer 360" page showed a single customer's recent orders. Early on, with thousands of orders, it was instant. Nobody thought about it. Two years later, with the orders table past 50 million rows, the page took 45 seconds to load — when it didn't time out. Support reps refreshed, gave up, and worked blind. The query was simple:
SELECT order_id, order_date, status
FROM orders
WHERE customer_id = 4267
ORDER BY order_date DESC
LIMIT 20;
It returns at most 20 rows. Why would 20 rows take 45 seconds?
Diagnosis
The on-call engineer ran the one command that answers "why is this slow" — EXPLAIN ANALYZE:
EXPLAIN ANALYZE SELECT order_id, order_date, status FROM orders
WHERE customer_id = 4267 ORDER BY order_date DESC LIMIT 20;
Limit (actual time=44210 ... )
-> Sort (actual time=44210 ... )
Sort Key: order_date DESC
-> Seq Scan on orders (actual time=0.3..43800 rows=312 ...)
Filter: (customer_id = 4267)
Rows Removed by Filter: 51000000 ← the smoking gun
Planning Time: 0.2 ms
Execution Time: 44231 ms
The plan said it plainly: a Seq Scan on orders — reading all 51 million rows, throwing away 51,000,000 that didn't match customer_id = 4267, to find the 312 that did, then sorting them. The database had no way to find customer 4267's rows except by examining every row. The 45 seconds was the cost of scanning 50M+ rows on every page load. (Rows Removed by Filter: 51000000 is the classic missing-index signature — Chapter 24.)
The fix
One index on the filtered column:
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);
(CONCURRENTLY so building it didn't lock the live table — Chapters 14, 22.) Re-running EXPLAIN ANALYZE:
Limit (actual time=0.05..0.12 rows=20 ...)
-> Index Scan using idx_orders_customer_id on orders (actual time=0.04..0.10 rows=312 ...)
Index Cond: (customer_id = 4267)
Execution Time: 12 ms
The plan switched to an Index Scan: it jumped straight to customer 4267's rows via the B-tree (O(log n)) instead of scanning 50M rows (O(n)). 45,000 ms → 12 ms — about 3,500× faster. The page became instant.
Making it even better
The engineer noticed the query also sorts by order_date DESC and takes the top 20. A composite index matching the access pattern lets the database skip the sort too:
CREATE INDEX CONCURRENTLY idx_orders_cust_date ON orders (customer_id, order_date DESC);
Now the index supplies customer 4267's rows already in date order, so the LIMIT 20 reads just the first 20 index entries — no sort, no extra work. The plan's Sort node disappears. For a "most recent N per X" query, an index ordered (X, date DESC) is the ideal shape.
The analysis
-
This is the anchor, and it's basic competence (theme #5). A 3,500× speedup from one index isn't exotic — it's the everyday payoff of knowing that
WHERE customer_id = ...on a big table needs an index. Every developer should be able to do this. -
EXPLAIN ANALYZEturns "it's slow" into "here's why." TheSeq Scan+Rows Removed by Filter: 51000000told the whole story in one line. Don't guess at performance — read the plan (Chapter 24). -
The smell is "scan a huge table, return few rows." Whenever a query fetches a small fraction of a large table and you see a sequential scan, suspect a missing index on the filter/join column.
-
Match the index to the whole access pattern. The single-column index fixed the filter; the composite
(customer_id, order_date DESC)index also eliminated the sort — index design follows how the query filters and orders. -
Build it
CONCURRENTLYin production. A plainCREATE INDEXwould have locked the liveorderstable while building;CONCURRENTLYavoided a second outage during the fix. -
It degraded silently. The query was fine at small scale and got slowly worse — nobody "broke" it. Performance problems often grow with data; profiling top queries periodically (not just when they catch fire) catches them earlier.
Discussion questions
- Why did a query returning 20 rows take 45 seconds before the index?
- What two pieces of the
EXPLAIN ANALYZEoutput pinpointed the problem? - Explain the before/after plans (
Seq ScanvsIndex Scan) in O(n) vs O(log n) terms. - Why does the composite
(customer_id, order_date DESC)index help beyond the single-column one? - ⭐ Why use
CONCURRENTLY, and what's its trade-off? Why is "degraded silently" an argument for periodic profiling?