Quiz: seaborn Philosophy

20 questions. Aim for mastery (18+).

Multiple Choice (10 questions)

1. seaborn is:

(a) A replacement for matplotlib (b) A statistical visualization library built on top of matplotlib (c) An interactive visualization library like Plotly (d) A dashboard framework

Answer

**(b)** A statistical visualization library built on top of matplotlib. Every seaborn call produces matplotlib objects underneath, and advanced customization drops down to the matplotlib level.

2. What are the three seaborn function families?

(a) Line, bar, scatter (b) Relational, distributional, categorical (c) Figure, axes, artist (d) Simple, advanced, expert

Answer

**(b)** Relational, distributional, categorical. Each family has a figure-level function (relplot, displot, catplot) and a set of axes-level functions.

3. The tidy data format that seaborn prefers means:

(a) Each variable is a column and each observation is a row (b) The data is sorted and cleaned (c) The data has no missing values (d) The data is in CSV format

Answer

**(a)** Each variable is a column and each observation is a row. This long-form structure allows seaborn to map column names directly to visual channels via the `data=` parameter.

4. Which seaborn parameter maps a variable to color?

(a) `color` (b) `hue` (c) `palette` (d) `cmap`

Answer

**(b)** `hue`. The `hue` parameter takes a column name and maps the column's values to colors from the current palette. seaborn handles iteration and legend construction automatically.

5. The difference between figure-level and axes-level seaborn functions is:

(a) Figure-level functions create their own figure; axes-level functions target an existing Axes (b) Figure-level functions are faster (c) Axes-level functions cannot produce plots (d) They are the same thing

Answer

**(a)** Figure-level functions create their own figure; axes-level functions target an existing Axes. Figure-level functions (relplot, displot, catplot) support faceting via `col` and `row`. Axes-level functions (scatterplot, histplot, boxplot) accept an `ax` parameter and return the Axes for further customization.
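The distinction can be seen in the return values. The following is a minimal sketch, assuming a small made-up DataFrame (the column names `x`, `y`, and `category` are illustrative, not from the chapter) and a headless matplotlib backend:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data for illustration only
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 4, 3, 6, 5, 7],
    "category": ["a", "a", "a", "b", "b", "b"],
})

# Axes-level: draws on the Axes we pass in and returns that same Axes
fig, ax = plt.subplots()
returned = sns.scatterplot(data=df, x="x", y="y", ax=ax)
same_axes = returned is ax

# Figure-level: creates its own figure and returns a FacetGrid
g = sns.relplot(data=df, x="x", y="y", col="category", kind="scatter")
n_panels = len(g.axes.flat)

print(type(returned).__name__, same_axes)
print(type(g).__name__, n_panels)
```

The axes-level call hands back the Axes it was given, while the figure-level call owns the whole figure and exposes its panels through the returned grid.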

6. To convert wide-form data to tidy data in pandas, you use:

(a) `pd.pivot()` (b) `pd.melt()` (c) `pd.tidy()` (d) `pd.reshape()`

Answer

**(b)** `pd.melt()`. The `melt` function takes `id_vars` (columns to keep as-is), `var_name` (name for the new variable column), and `value_name` (name for the new value column). It is the standard wide-to-tidy conversion in pandas.
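As a small illustration of those three arguments, here is a sketch using a made-up wide table (the `widgets` and `gadgets` columns are hypothetical):

```python
import pandas as pd

# Hypothetical wide-form table: one revenue column per product
wide = pd.DataFrame({
    "year": [2023, 2024],
    "widgets": [100, 120],
    "gadgets": [80, 95],
})

# id_vars stay as-is; the remaining column names become values of "product"
tidy = pd.melt(wide, id_vars=["year"], var_name="product", value_name="revenue")
print(tidy)
```

Each row of `tidy` is now one (year, product) observation, which is the shape that `hue="product"` or `col="product"` expects.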

7. Which of the following is the correct syntax for creating a figure-level scatter plot with seaborn faceted by a column?

(a) `sns.scatterplot(data=df, x="x", y="y", col="category")` (b) `sns.relplot(data=df, x="x", y="y", col="category", kind="scatter")` (c) `sns.facet(df, "x", "y", col="category")` (d) `plt.scatter(df, col="category")`

Answer

**(b)** `sns.relplot(data=df, x="x", y="y", col="category", kind="scatter")`. The `col` parameter is only available on figure-level functions (relplot, displot, catplot). The axes-level `scatterplot` does not support faceting. The `kind="scatter"` argument selects the underlying function.

8. The `sns.set_theme(style="whitegrid", context="notebook")` call:

(a) Creates a new figure (b) Applies a visual style and font scaling context that affects all subsequent seaborn plots (c) Saves the current figure (d) Deletes all previous plots

Answer

**(b)** Applies a visual style and font scaling context that affects all subsequent seaborn plots. The `style` controls spines, backgrounds, and gridlines; the `context` controls font sizes for different output contexts (paper, notebook, talk, poster).
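One way to see both effects is to inspect the settings directly. This is a minimal sketch, assuming a headless backend. `set_theme` works by updating matplotlib's global `rcParams`, so plain matplotlib plots drawn afterwards pick up the theme too.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns

# Contexts scale font sizes: "paper" is smaller than "talk"
paper_font = sns.plotting_context("paper")["font.size"]
talk_font = sns.plotting_context("talk")["font.size"]

# The style controls background and gridlines via matplotlib rcParams
sns.set_theme(style="whitegrid", context="notebook")
grid_on = plt.rcParams["axes.grid"]

print(paper_font, talk_font, grid_on)
```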

9. What does `sns.lineplot(data=df, x="day", y="price")` do if there are multiple price observations per day?

(a) Error (b) Plots all individual points as a scatter (c) Computes the mean for each day and draws a line with a 95% confidence band (d) Plots only the first price for each day

Answer

**(c)** Computes the mean for each day and draws a line with a 95% confidence band. `sns.lineplot` aggregates automatically when there are multiple observations per x-value, bootstrapping a confidence interval. To disable aggregation, pass `estimator=None`.

10. The chapter's threshold concept is the shift from:

(a) matplotlib to pandas (b) Imperative code ("for each group, plot") to declarative code ("map group to color") (c) Static to interactive (d) Code to GUI

Answer

**(b)** Imperative code ("for each group, plot") to declarative code ("map group to color"). This is the conceptual leap from matplotlib's manual iteration to seaborn's parameter-based mapping.

True / False (5 questions)

11. "seaborn is a complete replacement for matplotlib."

Answer

**False.** seaborn is built on top of matplotlib and relies on it for rendering. Advanced customization requires dropping down to the matplotlib layer. seaborn is a higher-level interface, not a replacement.

12. "Axes-level seaborn functions return the matplotlib Axes object they plotted on."

Answer

**True.** This is one of seaborn's key design choices: axes-level functions return the Axes so you can continue to customize it with any matplotlib method. This is how you combine seaborn's statistical shortcuts with matplotlib's fine-grained control.

13. "seaborn automatically removes the top and right spines when you apply `set_theme(style='whitegrid')`."

Answer

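**False.** The whitegrid style keeps all four spines and draws them in light gray; `set_theme` alone does not remove any of them. To remove the top and right spines, call `sns.despine()`. Figure-level functions do this by default through `FacetGrid`, which is one reason their output already follows the [Chapter 6](../../part-02-design-principles/chapter-06-data-ink-ratio/index.md) declutter principles and looks more polished than matplotlib's defaults.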
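Spine behavior under a theme can be checked directly on an Axes rather than taken on faith. This is a minimal sketch, assuming a recent seaborn release and a headless backend. It records spine visibility after `set_theme(style="whitegrid")` and again after an explicit `sns.despine()` call:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(style="whitegrid")
fig, ax = plt.subplots()

# Visibility of each spine straight after applying the theme
before = {name: spine.get_visible() for name, spine in ax.spines.items()}

# despine() with default arguments hides the top and right spines
sns.despine(ax=ax)
after = {name: spine.get_visible() for name, spine in ax.spines.items()}

print(before)
print(after)
```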

14. "The `col` and `row` parameters work on both figure-level and axes-level functions."

Answer

**False.** The `col` and `row` parameters are only available on figure-level functions such as relplot, displot, catplot, and lmplot. Axes-level functions plot on a single Axes and cannot facet automatically. If you want faceting, use a figure-level function or build the faceting manually with matplotlib.
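The manual route can be sketched as follows, assuming a small made-up DataFrame with a `category` column: create one subplot per category with matplotlib and call the axes-level function once per panel.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data for illustration only
df = pd.DataFrame({
    "x": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "y": [2, 3, 5, 1, 4, 4, 3, 3, 6],
    "category": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
})

categories = sorted(df["category"].unique())
fig, axes = plt.subplots(
    1, len(categories),
    figsize=(4 * len(categories), 3),
    sharex=True, sharey=True, squeeze=False,
)

# One axes-level call per panel; this is the loop that col= would replace
for ax, category in zip(axes.flat, categories):
    subset = df[df["category"] == category]
    sns.scatterplot(data=subset, x="x", y="y", ax=ax)
    ax.set_title(f"category = {category}")

titles = [ax.get_title() for ax in axes.flat]
print(titles)
```

This is more code than `sns.relplot(..., col="category")`, but it leaves full control over the layout, which is useful when the panels are not all the same chart type.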

15. "Wide-form data works with every seaborn function."

Answer

**False.** Most seaborn functions (those using `hue`, `style`, `col`, `row`) expect tidy (long-form) data. Some functions (like `sns.heatmap`) prefer wide-form data because a matrix is naturally wide. Converting between forms with `pd.melt` and `pd.pivot_table` is a common preprocessing step.
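The reverse conversion, from long to wide for a heatmap, might look like the sketch below. The `region`, `quarter`, and `revenue` columns are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical long-form data: one row per (region, quarter) observation
long_df = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [10, 12, 7, 9],
})

# pivot_table builds the matrix: rows = region, columns = quarter
wide = long_df.pivot_table(index="region", columns="quarter", values="revenue")

fig, ax = plt.subplots()
sns.heatmap(wide, annot=True, ax=ax)
print(wide)
```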

Short Answer (3 questions)

16. In three to four sentences, explain why seaborn requires tidy data and what you would do if your data is in wide form.

Answer

seaborn relies on tidy data because its API maps DataFrame column names directly to visual channels (`x="year"`, `hue="category"`, `col="region"`), and this mapping only works when each variable has its own column. Wide-form data, which spreads one variable across several columns (e.g., a `year` column plus a separate value column for each category), does not fit this pattern. To convert, use `pd.melt(df, id_vars=["year"], var_name="category", value_name="value")`, which unpivots the data into a long form where each row represents one observation. After melting, you can pass the tidy DataFrame to any seaborn function and use the new column names in the parameters.

17. Explain the key difference between figure-level and axes-level seaborn functions. Give an example of when you would use each.

Answer

**Axes-level functions** (scatterplot, histplot, boxplot) target a specific matplotlib Axes via the `ax=` parameter and return that Axes for further customization. Use them when you want to place a seaborn chart in a specific subplot of a manually constructed figure, for example in a GridSpec layout where different panels use different chart types. **Figure-level functions** (relplot, displot, catplot) create their own figure and support faceting via `col` and `row` parameters, returning a FacetGrid object. Use them when you want seaborn to handle the faceting automatically and the whole figure is a single coherent visualization. For a single chart integrated into a custom matplotlib layout, use axes-level. For a faceted small-multiple display, use figure-level.

18. Describe how seaborn's `sns.lineplot` automatically handles multiple observations per x-value, and explain when this behavior is useful vs. when it is confusing.

Answer

When `sns.lineplot` receives a DataFrame with multiple y-values for the same x-value, it automatically computes the mean for each x and draws a central line with a shaded 95% confidence band (bootstrap-based). This is **useful** for time-series data from multiple experimental replicates, multiple stocks, or any dataset where you want the "typical" trajectory and uncertainty. It is **confusing** when you assumed each row was already one observation per x, because the aggregation may not match your mental model of the data. To disable aggregation, pass `estimator=None`, which draws every observation without summarization. To change the summary, pass `estimator=np.median` or similar. The behavior is convenient for exploratory work but should be explicitly acknowledged in production code so readers understand what the chart represents.

Applied Scenarios (2 questions)

19. You have a pandas DataFrame `sales` with columns `[date, product, revenue]`. You want to produce a faceted line chart showing revenue over time, with one panel per product. Write the seaborn code.

Answer

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme(style="whitegrid", context="notebook")

g = sns.relplot(
    data=sales,
    x="date",
    y="revenue",
    col="product",
    kind="line",
    col_wrap=3,  # adjust based on number of products
    height=3,
    aspect=1.5,
    facet_kws={"sharey": False},  # optional: free y-scales
)

g.figure.suptitle("Revenue by Product", y=1.02, fontsize=14)  # g.figure is the preferred name for g.fig
g.savefig("sales_by_product.png", dpi=300, bbox_inches="tight")

The `col="product"` creates one panel per product. `col_wrap=3` wraps the layout after 3 columns. `kind="line"` selects the line chart variant of relplot. The `facet_kws={"sharey": False}` gives each product its own y-scale, which matters if revenue ranges differ dramatically across products.

20. A colleague writes the following matplotlib code to plot data grouped by category. Rewrite it in seaborn, identifying three things seaborn handles automatically.

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 5))
for category in df["category"].unique():
    subset = df[df["category"] == category]
    ax.scatter(subset["x"], subset["y"], label=category)
ax.legend()
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_title("Scatter by Category")

Answer

import seaborn as sns
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 5))
sns.scatterplot(data=df, x="x", y="y", hue="category", ax=ax)
ax.set_title("Scatter by Category")

**Three things seaborn handles automatically:**

1. **The iteration over categories.** The matplotlib version has an explicit `for` loop over `df["category"].unique()`; seaborn handles this internally when `hue="category"` is passed.
2. **The color assignment.** matplotlib uses the default color cycle, but seaborn picks a qualitative palette and assigns one color per category automatically.
3. **The legend construction.** matplotlib needs `ax.legend()` plus the `label` parameter on each scatter call. seaborn builds the legend automatically from the `hue` mapping, including the title.

Other things seaborn handles: the axis labels (with `data=` and `x=`/`y=` it takes the labels from the column names, though here we still set the title manually), and the overall styling (background, grid, fonts) if a theme is applied.
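These automatic behaviors can be confirmed by inspecting the Axes after the call. This is a minimal sketch, assuming a made-up `df` with the same column names as the question:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data with the columns used in the question
df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [4, 3, 2, 1],
    "category": ["a", "a", "b", "b"],
})

fig, ax = plt.subplots(figsize=(8, 5))
sns.scatterplot(data=df, x="x", y="y", hue="category", ax=ax)

# Legend entries and axis labels were filled in without any explicit calls
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
axis_labels = (ax.get_xlabel(), ax.get_ylabel())
print(legend_labels, axis_labels)
```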

Review against the mastery thresholds. Chapter 17 introduces distributional visualization in seaborn — histograms, KDE, violin plots, and ridge plots.