Further Reading: Distributional Visualization
Tier 1: Essential Reading
The seaborn Distributions Tutorial. seaborn.pydata.org/tutorial/distributions.html
The official seaborn tutorial on distributional visualization. Covers histplot, kdeplot, ecdfplot, and displot with worked examples. Essential as a direct API reference alongside this chapter.
Wilkinson, Leland. The Grammar of Graphics. 2nd ed. Springer, 2005. Wilkinson's treatment of statistical visualization includes extensive coverage of distributional charts. The chapters on "distributions" and "summaries" are particularly relevant. Dense but foundational.
Silverman, B. W. Density Estimation for Statistics and Data Analysis. Chapman and Hall, 1986. The classic text on kernel density estimation. Read selectively: the introductory chapters cover bandwidth selection, the bias-variance trade-off, and rule-of-thumb bandwidths such as Silverman's rule, which seaborn accepts through kdeplot's bw_method parameter (seaborn's default is Scott's rule). Essential for understanding KDE's limitations.
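To make the rule-of-thumb bandwidths concrete, here is a minimal sketch of the Scott and Silverman factors as given in the scipy.stats.gaussian_kde documentation, written with the standard library only. The sample values and the helper names scott_factor and silverman_factor are illustrative assumptions, not part of any library.

```python
import statistics


def scott_factor(n: int, d: int = 1) -> float:
    """Scott's rule factor: n ** (-1 / (d + 4))."""
    return n ** (-1.0 / (d + 4))


def silverman_factor(n: int, d: int = 1) -> float:
    """Silverman's rule factor: (n * (d + 2) / 4) ** (-1 / (d + 4))."""
    return (n * (d + 2) / 4.0) ** (-1.0 / (d + 4))


# Hypothetical one-dimensional sample, for illustration only.
sample = [2.1, 2.4, 2.9, 3.3, 3.8, 4.0, 4.6, 5.2, 5.9, 7.5]
n = len(sample)
sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# The kernel bandwidth in data units is the factor times the sample spread.
bw_scott = scott_factor(n) * sd
bw_silverman = silverman_factor(n) * sd

print(f"Scott bandwidth:     {bw_scott:.3f}")
print(f"Silverman bandwidth: {bw_silverman:.3f}")
```

In one dimension the two factors differ only by a constant, (4/3) ** (1/5), so the choice between them matters far less than scaling the bandwidth up or down to suit the data.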
Tier 2: Recommended Specialized Sources
Wickham, Hadley. ggplot2: Elegant Graphics for Data Analysis. 2nd ed. Springer, 2016. Covers the same distributional chart types in ggplot2 syntax. The concepts transfer directly to seaborn. Useful as a comparative reference.
Wilke, Claus O. Fundamentals of Data Visualization. O'Reilly Media, 2019. Wilke's chapters "Visualizing distributions: Histograms and density plots" (Chapter 7), "Visualizing distributions: Empirical cumulative distribution functions and q-q plots" (Chapter 8), and "Visualizing many distributions at once" (Chapter 9) are directly relevant to this chapter. His design rationale complements seaborn's practical API. Freely available at clauswilke.com/dataviz.
Matejka, Justin, and George Fitzmaurice. "Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing." CHI 2017, 1290–1294. The paper behind the Datasaurus Dozen (Case Study 1 of this chapter and Chapter 11). Short and readable. Essential reading for understanding why visualization matters for distributional analysis.
Waskom, Michael L. "seaborn: statistical data visualization." Journal of Open Source Software 6, no. 60 (2021): 3021. The official seaborn publication. Brief but authoritative. Freely available at joss.theoj.org.
Claus, Caroline. "Joyplots with seaborn (and pandas)." Blog post, 2017. One of several online tutorials on building ridge plots (joy plots) with seaborn's FacetGrid. These tutorials preceded the recipe in the seaborn gallery and helped establish the pattern.
Tier 3: Tools and Online Resources
| Resource | URL / Source | Description |
|---|---|---|
| seaborn Examples Gallery | seaborn.pydata.org/examples/ | Visual gallery of seaborn plots with source code. Look for the histogram, KDE, ECDF, violin, and rug plot examples. |
| joypy (a ridgeplot library for Python) | github.com/leotac/joypy | A dedicated Python library for ridge plots (joy plots), built on matplotlib. An alternative to the FacetGrid recipe. |
| ggridges (the ggplot2 equivalent) | github.com/wilkelab/ggridges | R package that adds ridge plots to ggplot2. Useful as a comparison and for R users. |
| scipy.stats.gaussian_kde | docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html | The underlying Gaussian KDE implementation that seaborn wraps. Useful when you want fine control over KDE parameters. |
| Two-sample Kolmogorov-Smirnov test | docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ks_2samp.html | Tests whether two samples come from the same distribution, using the largest vertical distance between their ECDFs as the statistic. Natural companion to the visual ECDF comparison. |
| k-sample Anderson-Darling test | docs.scipy.org/doc/scipy/reference/generated/scipy.stats.anderson_ksamp.html | Another ECDF-based test for comparing distributions. It gives more weight to differences in the tails than Kolmogorov-Smirnov does. |
| The Datasaurus Dozen dataset | github.com/jumpingrivers/datasauRus | R package (datasauRus) containing the Datasaurus Dozen data. Python users can load a CSV or TSV export of the data with pandas; check PyPI before relying on any particular Python package name. |
| Python Graph Gallery — distributional charts | python-graph-gallery.com/distribution/ | Code examples for each distributional chart type in matplotlib and seaborn. |
| seaborn ridge plot example | seaborn.pydata.org/examples/kde_ridgeplot.html | seaborn's official ridge plot example using FacetGrid. A slightly different recipe from Section 17.7 of this chapter; both work. |
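
The link between the ECDF plot and the Kolmogorov-Smirnov test in the table above can be sketched in a few lines. This is a minimal standard-library illustration of the two-sample statistic only, not of the p-value; the helper names ecdf and ks_statistic and both samples are hypothetical. In practice scipy.stats.ks_2samp computes the statistic together with a p-value.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F, where F(x) is the fraction of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_statistic(a, b):
    """Largest vertical gap between the two ECDFs.

    Both ECDFs are step functions that only change at observed values,
    so it is enough to check the gap at every observation.
    """
    f_a, f_b = ecdf(a), ecdf(b)
    return max(abs(f_a(x) - f_b(x)) for x in set(a) | set(b))


# Hypothetical samples, for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [3.5, 4.5, 5.5, 6.5, 7.5]

d = ks_statistic(group_a, group_b)
print(f"KS statistic: {d:.2f}")
```

Reading the statistic off an ECDF plot amounts to finding the x position where the two curves are furthest apart vertically.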
A note on reading order: If you want one additional source, read Wilke's Fundamentals of Data Visualization, Chapters 7-9. It provides the design principles that guide the choice of distributional chart type, complementing seaborn's practical API. For deeper statistical understanding, read Silverman's book on density estimation — the theoretical foundations of KDE are worth knowing even if you will not implement them directly.