Key Takeaways: Statistical Visualization with seaborn

This is your reference card for Chapter 16 — the chapter where your visualizations gained statistical intelligence. Keep this nearby whenever you are exploring data and need to choose the right chart type.


The Three Figure-Level Functions

  1. displot() — Distribution plots. Histograms, KDEs, ECDFs, rug plots. Use when asking "What does the distribution of X look like?"

  2. catplot() — Categorical plots. Box, violin, swarm, strip, bar, count, point plots. Use when asking "How does Y vary across categories?"

  3. relplot() — Relational plots. Scatter and line plots. Use when asking "How are X and Y related?"

All three accept hue, col, and row for encoding additional variables.


Quick Reference: Which Plot When?

| Question | Plot Type | Code Pattern |
|---|---|---|
| Distribution of one variable | Histogram or KDE | sns.displot(data=df, x="col") |
| Comparing distributions across groups | KDE with hue, or faceted histogram | sns.displot(data=df, x="col", hue="group") |
| Median, quartiles, outliers by category | Box plot | sns.catplot(data=df, x="cat", y="num", kind="box") |
| Full distribution shape by category | Violin plot | sns.catplot(data=df, x="cat", y="num", kind="violin") |
| Every data point by category (small data) | Swarm plot | sns.catplot(data=df, x="cat", y="num", kind="swarm") |
| Mean comparison with confidence intervals | Bar plot | sns.catplot(data=df, x="cat", y="num", kind="bar") |
| Two-variable relationship | Scatter plot | sns.relplot(data=df, x="x", y="y") |
| Trend over time with uncertainty | Line plot | sns.relplot(data=df, x="time", y="val", kind="line") |
| Linear/nonlinear regression fit | Regression plot | sns.lmplot(data=df, x="x", y="y") |
| Correlation matrix | Heatmap | sns.heatmap(df.corr(numeric_only=True), annot=True) |
| All pairwise relationships | Pair plot | sns.pairplot(df, hue="group") |
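
The categorical rows above differ only in the kind argument, which the following sketch illustrates by looping over four kinds on the same data. The DataFrame is hypothetical toy data using the placeholder column names cat and num from the table.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical small dataset (60 rows) so the swarm plot stays readable.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "cat": np.repeat(["a", "b", "c"], 20),
    "num": rng.normal(loc=10, scale=2, size=60),
})

# Same call pattern each time; only `kind` changes.
grids = {}
for kind in ["box", "violin", "swarm", "bar"]:
    grids[kind] = sns.catplot(data=df, x="cat", y="num", kind=kind, height=3)
```

Keeping the data and axis arguments fixed makes it cheap to try several views of one question before settling on a chart.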

Encoding Variables

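| Parameter | What It Controls | Example |
|---|---|---|
| x, y | Position on axes | x="gdp", y="coverage" |
| hue | Color (within same panel) | hue="region" |
| size | Marker size in scatter plots, line width in line plots (relplot only) | size="population" |
| style | Marker shape in scatter plots, dash pattern in line plots (relplot only) | style="income_group" |
| col | Separate panels, side by side | col="year" |
| row | Separate panels, stacked vertically | row="region" |
| col_wrap | Max columns before wrapping (cannot be combined with row) | col_wrap=3 |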

Heuristic: Use hue first. If too crowded, switch to col. Use row sparingly (vertical scrolling is harder to read).
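
A minimal sketch of several encodings combined in one relplot() call follows. The data is hypothetical and simply reuses the example column names from the table above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical toy data; values are random and carry no real meaning.
rng = np.random.default_rng(2)
n = 80
df = pd.DataFrame({
    "gdp": rng.normal(50, 10, n),
    "coverage": rng.normal(70, 8, n),
    "population": rng.integers(1, 100, n),
    "region": rng.choice(["North", "South"], n),
    "income_group": rng.choice(["low", "high"], n),
    "year": np.tile([2018, 2019, 2020, 2021], n // 4),
})

g = sns.relplot(
    data=df, x="gdp", y="coverage",
    hue="region",          # color within each panel
    size="population",     # marker size
    style="income_group",  # marker shape
    col="year",            # one panel per year
    col_wrap=3,            # start a new row of panels after three columns
)
```

With four years and col_wrap=3, the grid wraps the fourth panel onto a second row instead of stretching one long strip.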


Themes and Contexts

Styles (background and grid)

| Style | Best For |
|---|---|
| "whitegrid" | Notebooks, exploration |
| "darkgrid" | Data-heavy plots with many reference lines |
| "ticks" | Publications, formal reports |
| "white" | Clean, minimal presentations |

Contexts (element scaling)

| Context | Best For |
|---|---|
| "paper" | Printed figures in papers |
| "notebook" | Jupyter notebooks (default) |
| "talk" | Slides and presentations |
| "poster" | Conference posters |

Set style, context, and palette in one call: sns.set_theme(style="ticks", context="talk", palette="colorblind")
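
To see what a context actually changes, this sketch reads the active font size back after switching contexts. It assumes only that contexts scale text elements, which is how seaborn describes them; the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import seaborn as sns

# Global setting: applies to every plot drawn afterwards.
sns.set_theme(style="ticks", context="talk", palette="colorblind")
talk_font = sns.plotting_context()["font.size"]

sns.set_theme(style="ticks", context="paper", palette="colorblind")
paper_font = sns.plotting_context()["font.size"]

# Temporary setting: applies only inside the with-block.
with sns.axes_style("whitegrid"), sns.plotting_context("poster"):
    poster_font = sns.plotting_context()["font.size"]

print(paper_font, talk_font, poster_font)
```

The context managers are handy when one figure in a notebook needs slide-sized text without changing the theme for everything else.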


Common Palettes

| Palette | Type | Use Case |
|---|---|---|
| "muted" | Qualitative | Softer variant of the default "deep" palette |
| "colorblind" | Qualitative | Accessible to color-vision-deficient viewers |
| "Set2" | Qualitative | Soft, distinctive categorical colors |
| "Blues" | Sequential | Ordered data, light-to-dark |
| "YlOrRd" | Sequential | Heat-style data |
| "coolwarm" | Diverging | Data centered around a meaningful midpoint |
| "RdBu" | Diverging | Positive/negative distinction |

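One way to inspect a palette before committing to it is sns.color_palette(), sketched here for one palette of each type. The choice of four and five colors is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import seaborn as sns

# Qualitative: a short list of distinct colors for categories.
qual = sns.color_palette("colorblind", n_colors=4)

# Sequential: five steps from light to dark for ordered data.
seq = sns.color_palette("Blues", n_colors=5)

# Diverging: a continuous colormap object, e.g. for heatmap(cmap=...).
div = sns.color_palette("coolwarm", as_cmap=True)

print(qual.as_hex())
```

The list forms can be passed as palette= to most seaborn plotting functions, while the colormap form suits continuous mappings.
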
Figure-Level vs. Axes-Level

| Feature | Figure-Level (displot, relplot, catplot) | Axes-Level (histplot, scatterplot, boxplot) |
|---|---|---|
| Creates new Figure | Yes | No |
| Returns | FacetGrid | matplotlib Axes |
| Supports col/row | Yes | No |
| Embeds in plt.subplots() | No (creates own Figure) | Yes (pass ax=) |
| Customization | Use FacetGrid methods | Use Axes methods directly |

Rule of thumb: Use figure-level when you want faceting. Use axes-level when you want to combine multiple seaborn plots on custom subplot layouts.


Common Gotchas

| Gotcha | Symptom | Fix |
|---|---|---|
| Using Axes methods on FacetGrid | AttributeError | Use g.set_axis_labels(), not g.set_xlabel() |
| Too many hue categories | Indistinguishable colors | Reduce categories or use col instead |
| Swarm plot on large data | Slow rendering, wide swarm | Switch to violin or box plot |
| KDE on tiny dataset | Over-smoothed, misleading curves | Use rug plot or histogram with few bins |
| Forgetting annot=True on heatmap | Numbers not visible in cells | Add annot=True, fmt=".2f" |
| Not centering diverging colormap | Misleading color assignment | Pass center=0 to heatmap() |
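
Two of these fixes can be sketched together: labeling a FacetGrid correctly, and drawing an annotated, centered correlation heatmap. The data is hypothetical, and numeric_only=True assumes a reasonably recent pandas (1.5 or later).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical toy data with one non-numeric column.
rng = np.random.default_rng(4)
df = pd.DataFrame({"x": rng.normal(size=50)})
df["y"] = 2 * df["x"] + rng.normal(size=50)
df["z"] = -df["x"] + rng.normal(size=50)
df["group"] = np.tile(["a", "b"], 25)

# Gotcha 1: a FacetGrid has no set_xlabel(); use its own method instead.
g = sns.relplot(data=df, x="x", y="y")
g.set_axis_labels("X label", "Y label")

# Gotchas 5 and 6: show the numbers and center the diverging colormap at 0.
corr = df.corr(numeric_only=True)  # skip the non-numeric "group" column
fig, ax = plt.subplots()
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
            center=0, vmin=-1, vmax=1, ax=ax)
```

Fixing vmin and vmax to the correlation range, in addition to center=0, keeps colors comparable across different heatmaps.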

Terms to Remember

| Term | Definition |
|---|---|
| seaborn | Python statistical visualization library built on matplotlib |
| relplot | Figure-level function for scatter and line plots |
| catplot | Figure-level function for categorical comparison plots |
| displot | Figure-level function for distribution plots |
| heatmap | Colored matrix visualization, often used for correlations |
| pairplot | Grid of all pairwise scatter plots with diagonal distributions |
| FacetGrid | Multi-panel layout engine behind figure-level functions |
| hue | Parameter encoding a variable as color within a panel |
| style | Parameter encoding a variable as marker shape |
| palette | Named set of colors used for categorical or continuous mapping |
| kde | Kernel density estimation: a smooth curve approximating a distribution |
| violin plot | Categorical plot showing KDE on each side of a central axis |
| box plot | Categorical plot showing median, IQR, whiskers, and outliers |
| swarm plot | Categorical plot showing non-overlapping individual data points |
| regression plot | Scatter plot with fitted line and confidence band |

What You Should Be Able to Do Now

  • [ ] Import seaborn with the standard sns alias
  • [ ] Set a theme with sns.set_theme() specifying style, palette, and context
  • [ ] Create distribution plots with displot() — histogram, KDE, ECDF, rug
  • [ ] Split distributions by group using hue and col
  • [ ] Create categorical comparisons with catplot() — box, violin, swarm, bar
  • [ ] Create scatter plots with relplot() encoding hue, size, and style
  • [ ] Create line plots with relplot(kind="line") showing aggregation and CI
  • [ ] Create regression plots with lmplot() — linear, polynomial, and LOWESS
  • [ ] Build correlation heatmaps with heatmap() on .corr() output
  • [ ] Build pair plots for multivariate overview
  • [ ] Use FacetGrid for custom multi-panel layouts
  • [ ] Choose the right plot based on question type and dataset size
  • [ ] Customize with palettes, themes, and matplotlib fine-tuning

If you checked every box, you are ready for Chapter 17, where your charts come alive with interactivity — tooltips, zoom, animation, and dashboards with plotly.