Chapter 7 — Further Reading
References for aggregation, plus the statistical traps every analyst should know.
Official reference (everyone)
- PostgreSQL Docs: "Aggregate Functions." The full catalog —
count,sum,avg,min,max, plusstring_agg,array_agg,bool_and/or, and statistical aggregates (stddev,variance,percentile_cont). https://www.postgresql.org/docs/current/functions-aggregate.html - PostgreSQL Docs: "GROUP BY and HAVING," "GROUPING SETS, CUBE, and ROLLUP." The authoritative treatment of grouping, including the advanced summary syntax.
- PostgreSQL Docs: "Aggregate Expressions." The fine print on
DISTINCT,FILTER (WHERE ...)(a clean way to do conditional aggregates), and ordering within aggregates.
The FILTER clause (💻 Developer · 📊 Analyst)
- PostgreSQL's
FILTER (WHERE ...)is a hidden gem:COUNT(*) FILTER (WHERE status = 'delivered')counts only delivered rows within a grouped query — cleaner thanSUM(CASE WHEN ... THEN 1 ELSE 0 END). Standard SQL; look it up and use it.
Avoiding analytics traps (📊 Analyst · everyone)
- Simpson's paradox — read any clear explainer. Case Study 1 is a special case; understanding the general phenomenon will make you skeptical of grouped averages in the right way.
- "Don't average averages" / weighted-mean explainers. Reinforce why you compute ratios at their natural grain.
- Articles on "fan-out" and aggregation (dbt, BI-tool docs). The aggregation-side view of Chapter 6's fan trap (Case Study 2).
Build fluency (📊 Analyst)
- Mode SQL Tutorial — "Aggregate functions" and "GROUP BY." Practical, dataset-driven.
- pgexercises.com — the "Aggregates" section. Excellent graduated
GROUP BY/HAVINGdrills against a real schema. - Anthony DeBarros, Practical SQL — the aggregation and statistics chapters. Hands-on, PostgreSQL-based.
Reference cards (this book)
- Appendix C — SQL Quick Reference: aggregation syntax and the evaluation order.
- Appendix I — SQL Cookbook: recipes for "top N per group," conditional aggregates, and running/grouped summaries (window functions come in Chapter 12).
Do, don't just read
- Recreate Case Study 2's bug, then fix it both ways (
COUNT(DISTINCT)and separate CTEs). Compare toSELECT COUNT(*) FROM orders;. - Try
FILTER: rewrite aGROUP BY+CASEconditional count usingCOUNT(*) FILTER (WHERE ...). You'll never go back. - Pair every
AVGwith aCOUNTin your next ten reports. Watch how often the count changes how you read the average.
Next: Chapter 8 — Built-in Functions: strings, numbers, dates, and type conversion.