Chapter 12 — Key Takeaways
The big idea
Window functions compute over a set of related rows (a window) without collapsing them — the analytics GROUP BY can't do. They replace whole categories of slow self-joins and correlated subqueries with single, fast, one-pass expressions.
The OVER clause
function(...) OVER (PARTITION BY <cols> ORDER BY <cols> <frame>)
PARTITION BY— groups without collapsing (the row-preservingGROUP BY).ORDER BY— orders within the partition (needed for running totals, ranking,LAG/LEAD).OVER ()— whole result is one window (grand total, % of total).
The function families
- Ranking:
ROW_NUMBER(unique),RANK(ties skip: 1,1,3),DENSE_RANK(ties no gap: 1,1,2),NTILE(n)(buckets). - Windowed aggregates:
SUM/AVG/COUNT OVER (...)— group totals alongside rows; withORDER BY→ running totals. - Offset:
LAG/LEAD— previous/next row (period-over-period). - Positional:
FIRST_VALUE/LAST_VALUE/NTH_VALUE.
Patterns to memorize
- Top-N-per-group:
ROW_NUMBER() OVER (PARTITION BY g ORDER BY x DESC)in a subquery/CTE, thenWHERE rn <= N. - % of total:
100.0 * x / SUM(x) OVER (). - Running total:
SUM(x) OVER (ORDER BY t). - Period-over-period:
LAG(x) OVER (ORDER BY t). - Moving average:
AVG(x) OVER (ORDER BY t ROWS BETWEEN n PRECEDING AND CURRENT ROW).
Two essential gotchas
- Can't filter a window function in
WHERE(it's computed afterWHERE). Wrap-and-filter: compute in a subquery/CTE, filter outside. (The top-N pattern.) - Default frame is cumulative (
... UNBOUNDED PRECEDING AND CURRENT ROW), soLAST_VALUEreturns the current row. Use an explicit full frame, orFIRST_VALUE(... ORDER BY t DESC). (Case Study 2.) ROWS(physical rows) vsRANGE(peer values) frames differ on ties.
Performance (theme #5)
One ordered pass instead of the O(n²) of a correlated-subquery running total (Case Study 1). A matching index can avoid the sort. When you reach for a self-join "to compare a row to its group," reach for a window function instead.
You can now…
- ☐ Compute per-group values without collapsing rows (
PARTITION BY). - ☐ Rank and do top-N-per-group (wrap-and-filter).
- ☐ Compute running totals, % of total, and moving averages (frames).
- ☐ Compare to adjacent rows with
LAG/LEAD. - ☐ Avoid the
WHERE-filter and default-frame/LAST_VALUEtraps.
Looking ahead
Chapter 13 — Data Modification. INSERT, UPDATE, DELETE, TRUNCATE, RETURNING, UPSERT. You've read data for eight chapters; now you change it — safely (an UPDATE without WHERE is a production horror story).
One sentence to carry forward: Whenever you want to compute something across rows while keeping every row, it's a window function — just mind the default frame and that you can't filter it in
WHERE.