Chapter 7 — Further Reading

References for aggregation, plus the statistical traps every analyst should know.

Official reference (everyone)

  • PostgreSQL Docs: "Aggregate Functions." The full catalog — count, sum, avg, min, max, plus string_agg, array_agg, bool_and/or, and statistical aggregates (stddev, variance, percentile_cont). https://www.postgresql.org/docs/current/functions-aggregate.html
  • PostgreSQL Docs: "GROUP BY and HAVING," "GROUPING SETS, CUBE, and ROLLUP." The authoritative treatment of grouping, including the advanced summary syntax.
  • PostgreSQL Docs: "Aggregate Expressions." The fine print on DISTINCT, FILTER (WHERE ...) (a clean way to do conditional aggregates), and ordering within aggregates.

The FILTER clause (💻 Developer · 📊 Analyst)

  • PostgreSQL's FILTER (WHERE ...) is a hidden gem: COUNT(*) FILTER (WHERE status = 'delivered') counts only delivered rows within a grouped query — cleaner than SUM(CASE WHEN ... THEN 1 ELSE 0 END). Standard SQL; look it up and use it.

Avoiding analytics traps (📊 Analyst · everyone)

  • Simpson's paradox — read any clear explainer. Case Study 1 is a special case; understanding the general phenomenon will make you skeptical of grouped averages in the right way.
  • "Don't average averages" / weighted-mean explainers. Reinforce why you compute ratios at their natural grain.
  • Articles on "fan-out" and aggregation (dbt, BI-tool docs). The aggregation-side view of Chapter 6's fan trap (Case Study 2).

Build fluency (📊 Analyst)

  • Mode SQL Tutorial — "Aggregate functions" and "GROUP BY." Practical, dataset-driven.
  • pgexercises.com — the "Aggregates" section. Excellent graduated GROUP BY/HAVING drills against a real schema.
  • Anthony DeBarros, Practical SQL — the aggregation and statistics chapters. Hands-on, PostgreSQL-based.

Reference cards (this book)

  • Appendix C — SQL Quick Reference: aggregation syntax and the evaluation order.
  • Appendix I — SQL Cookbook: recipes for "top N per group," conditional aggregates, and running/grouped summaries (window functions come in Chapter 12).

Do, don't just read

  • Recreate Case Study 2's bug, then fix it both ways (COUNT(DISTINCT) and separate CTEs). Compare to SELECT COUNT(*) FROM orders;.
  • Try FILTER: rewrite a GROUP BY + CASE conditional count using COUNT(*) FILTER (WHERE ...). You'll never go back.
  • Pair every AVG with a COUNT in your next ten reports. Watch how often the count changes how you read the average.

Next: Chapter 8 — Built-in Functions: strings, numbers, dates, and type conversion.