Key Takeaways — Chapter 16: seaborn Philosophy

1. seaborn Is a Statistical Visualization Layer on matplotlib

seaborn is not a replacement for matplotlib. It is a higher-level interface that handles common statistical operations (grouping, aggregation, confidence intervals, regression fits) automatically. Every seaborn plotting call draws onto matplotlib Figure and Axes objects, which can be further customized with matplotlib methods. Learning seaborn does not replace learning matplotlib; it builds on it.

2. Tidy Data Is the Preferred Format

seaborn works best with tidy (long-form) DataFrames: each variable is a column, each observation is a row. Many functions also accept wide-form data, but to get the full declarative API, convert it to tidy form with pd.melt(df, id_vars=..., var_name=..., value_name=...). Tidy data lets you map column names directly to visual channels: x="col", y="col", hue="col", col="col".
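
As a minimal sketch of the wide-to-tidy conversion, the snippet below melts a small made-up table. The DataFrame contents and the column names city, month, and temp_c are illustrative assumptions, not data from the chapter.

```python
import pandas as pd

# Hypothetical wide-form table: one row per city, one column per month.
wide = pd.DataFrame({
    "city": ["Oslo", "Lima"],
    "jan": [-4.3, 23.1],
    "feb": [-4.0, 23.6],
})

# Melt to tidy form: one row per (city, month) observation.
tidy = pd.melt(wide, id_vars="city", var_name="month", value_name="temp_c")
print(tidy)
```

After melting, each of city, month, and temp_c is a column that can be passed by name to parameters such as x, y, or hue.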

3. The Three Function Families

seaborn's plotting functions organize into three families. Relational (relplot, scatterplot, lineplot) for "how do two continuous variables relate?" Distributional (displot, histplot, kdeplot, ecdfplot) for "how is a single variable distributed?" Categorical (catplot, stripplot, boxplot, violinplot, barplot) for "how do values compare across categories?" Each family has a figure-level function (relplot, displot, catplot) and a set of axes-level functions.

4. Figure-Level vs. Axes-Level

Axes-level functions (scatterplot, histplot, boxplot) draw onto an existing Axes, either the one passed via the ax= parameter or, if none is given, the current Axes, and return that Axes. Use them when integrating seaborn into manual matplotlib layouts. Figure-level functions (relplot, displot, catplot) create their own figure, support faceting via col and row, and return a FacetGrid object. Use them when seaborn should handle the whole figure and you want automatic small multiples.

5. Encode Multiple Variables with hue, style, size

The threshold concept: imperative matplotlib ("for each group, plot in a color") becomes declarative seaborn ("map group to color via hue"). The hue, style, and size parameters map DataFrame columns to visual channels automatically. seaborn handles the iteration, the color assignment, and the legend construction.
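
To make the shift concrete, here is a hedged side-by-side sketch. The data, the column names (x, y, group, weight), and the two-panel layout are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for running as a script
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=60),
    "y": rng.normal(size=60),
    "group": np.repeat(["a", "b", "c"], 20),
    "weight": rng.uniform(1, 10, size=60),
})

fig, (ax_loop, ax_decl) = plt.subplots(1, 2, figsize=(9, 4))

# Imperative matplotlib: iterate over the groups and build the legend yourself.
for name, sub in df.groupby("group"):
    ax_loop.scatter(sub["x"], sub["y"], label=name)
ax_loop.legend(title="group")

# Declarative seaborn: name the columns and let hue, style, and size
# handle the iteration, the color assignment, and the legend.
sns.scatterplot(
    data=df, x="x", y="y",
    hue="group", style="group", size="weight",
    ax=ax_decl,
)
```

Both panels show the same points. The seaborn call additionally varies marker shape and marker size without any extra loop.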

6. Facet with col and row

Figure-level functions support col and row parameters that create small multiples automatically. col="category" produces one panel per category value. row="another" adds a second faceting dimension. col_wrap=N wraps a single col faceting into multiple rows. The height and aspect parameters control per-panel sizing (instead of a single figure-level figsize).

7. Themes Provide Better Defaults Than matplotlib

sns.set_theme(style="whitegrid", context="notebook") applies a coherent theme across all subsequent plots. Styles include darkgrid, whitegrid, dark, white, ticks. Contexts include paper, notebook, talk, poster (from smallest fonts to largest). The palette parameter sets the default categorical color palette. Apply a theme once at the top of your script, and every subsequent plot inherits it.
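
The following sketch applies a theme once and then inspects how the contexts scale the base font size. The palette name "colorblind" and the tiny bar chart data are illustrative choices, not taken from the chapter.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for running as a script
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Apply the theme once; later plots inherit it through matplotlib's rcParams.
sns.set_theme(style="whitegrid", context="notebook", palette="colorblind")

# Look up the base font size each context would use, smallest to largest.
font_sizes = {
    name: sns.plotting_context(name)["font.size"]
    for name in ["paper", "notebook", "talk", "poster"]
}

df = pd.DataFrame({"kind": ["a", "b", "c"], "count": [3, 5, 2]})
fig, ax = plt.subplots()
sns.barplot(data=df, x="kind", y="count", ax=ax)  # drawn with the theme above
```

Printing font_sizes is a quick way to check how much larger text will be when switching from notebook to talk or poster.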

8. Access matplotlib for Customization

Axes-level functions return the Axes directly: ax = sns.scatterplot(...). Figure-level functions return a FacetGrid with g.figure (the Figure; older seaborn releases expose it as g.fig) and g.axes (the array of Axes). Any matplotlib customization can be applied to these objects after the seaborn call. This hybrid pattern (seaborn for the statistical shortcuts, matplotlib for the customization) is how experienced practitioners use the libraries together.
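
A hedged sketch of the hybrid pattern follows. It assumes a recent seaborn in which the Figure is available as g.figure (older releases expose it as g.fig). The data, titles, and reference line are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for running as a script
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=60),
    "y": rng.normal(size=60),
    "group": np.repeat(["a", "b", "c"], 20),
})

# Axes-level: the return value is a plain matplotlib Axes.
ax = sns.scatterplot(data=df, x="x", y="y")
ax.set_title("Customized with matplotlib")
ax.axhline(0, color="gray", linewidth=1)

# Figure-level: reach the Figure and each Axes through the FacetGrid.
g = sns.relplot(data=df, x="x", y="y", col="group", height=2.5)
g.figure.suptitle("One panel per group", y=1.05)
for panel in g.axes.flat:
    panel.axhline(0, color="gray", linewidth=1)
```

Iterating over g.axes.flat works whether the grid has one row, several rows, or wrapped columns.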

9. Automatic Statistics in Plotting Calls

seaborn's lineplot aggregates automatically when there are multiple observations per x-value, computing the mean and a 95% confidence band via bootstrap. regplot fits a regression line and draws confidence intervals. histplot computes bin counts. kdeplot computes kernel density estimates. These automatic statistics are convenient for exploration but should be explicitly acknowledged in production code so readers understand what the chart represents.

10. Default to seaborn for Statistical Work; Drop Down to matplotlib for Control

Use seaborn as the default for exploratory data analysis, statistical visualization, and multi-panel comparisons. Drop down to matplotlib when you need precise layout control (GridSpec), typographic polish, or chart types seaborn does not support. The hybrid approach — seaborn first, matplotlib for refinement — combines the productivity of the higher-level API with the control of the lower-level one.