Quiz: matplotlib Architecture

20 questions. Aim for mastery (18+). If you score below 14, revisit the relevant sections before moving to Chapter 11.


Multiple Choice (10 questions)

1. In matplotlib, a Figure is:

(a) A single plot with one set of axes (b) The top-level container that holds one or more Axes and represents the whole output image (c) A numerical axis (x-axis or y-axis) (d) A chart type

Answer **(b)** The top-level container that holds one or more Axes and represents the whole output image. A Figure is the whole image — the entire PNG, the entire PDF page, the entire Jupyter cell output. It contains one or more Axes (plotting areas), each of which in turn contains the actual plot elements. Figure is one level up from Axes in the hierarchy.

2. In matplotlib, an Axes (plural form with an 's') is:

(a) The numerical scale on one side of a chart (b) A single plotting area with its own data, title, and labels (c) The ticks on the x-axis and y-axis (d) A two-dimensional array

Answer **(b)** A single plotting area with its own data, title, and labels. An Axes is what most people mean by "a chart." It contains the plotting region where data is drawn, the x and y axes, the title, axis labels, legend, and all the Artists (lines, bars, etc.) that represent the data. "Axes" is matplotlib-specific terminology and is always spelled with an 's' even for a single plotting area.

3. The canonical object-oriented matplotlib pattern begins with:

(a) plt.plot(x, y) (b) fig = plt.figure() (c) fig, ax = plt.subplots() (d) import matplotlib

Answer **(c)** `fig, ax = plt.subplots()`. This is the canonical pattern: it creates a Figure and an Axes in one call, returns both, and lets you unpack them into named variables. Every subsequent method call uses `fig` or `ax` explicitly. Option (a) is the pyplot state-machine style, option (b) is an older pattern that creates only the Figure and leaves you to add an Axes separately, and option (d) is just an import.

4. The chapter recommends the object-oriented API over pyplot because:

(a) The OO API is faster to execute (b) The OO API makes the state explicit, avoiding bugs from "current" Axes confusion in multi-panel code (c) Pyplot is deprecated and will be removed (d) The OO API uses less memory

Answer **(b)** The OO API makes the state explicit, avoiding bugs from "current" Axes confusion in multi-panel code. Pyplot maintains hidden state — a "current Figure" and "current Axes" — and every pyplot call operates on whichever is currently active. In multi-figure or multi-panel code, tracking which one is current becomes a source of bugs. The OO API holds explicit references and makes every method call unambiguous. Pyplot is not deprecated; it is perfectly fine for simple one-line charts, but the OO API is preferred for anything more complex.
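
The hidden "current Axes" can be observed directly. The following is a minimal sketch, assuming a two-panel figure and the headless Agg backend (both chosen here only for illustration); the titles are placeholder strings.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, (ax_left, ax_right) = plt.subplots(1, 2)

# pyplot tracks a "current" Axes; after plt.subplots() it is the last one created.
current_is_right = plt.gca() is ax_right

# A pyplot call lands on whichever Axes is current, here the right panel.
plt.title("set through pyplot")

# An OO call names its target explicitly.
ax_left.set_title("set through the OO API")

print(current_is_right)
print(repr(ax_left.get_title()), repr(ax_right.get_title()))
plt.close(fig)
```

Nothing in `plt.title(...)` says which panel it will modify; the reader has to know the state. The `ax_left.set_title(...)` call carries that information in the variable name.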

5. The figsize parameter in plt.subplots(figsize=(10, 6)) specifies:

(a) The number of pixels (width × height) (b) The size of the figure in inches (width × height) (c) The data range (d) The aspect ratio as a fraction

Answer **(b)** The size of the figure in inches (width × height). Matplotlib uses inches because it was designed for publication output, where print sizes are traditionally specified in inches. The actual pixel count depends on the DPI: `figsize=(10, 6)` at `dpi=100` is 1000×600 pixels, at `dpi=300` is 3000×1800 pixels.
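
The inches-times-DPI arithmetic can be checked with a short sketch. The helper `pixel_size` is a hypothetical name introduced here for illustration; it ignores `bbox_inches="tight"`, which changes the saved dimensions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 6))
width_in, height_in = (float(v) for v in fig.get_size_inches())

def pixel_size(dpi):
    """Pixel dimensions of the full figure at a given DPI (hypothetical helper)."""
    return (int(round(width_in * dpi)), int(round(height_in * dpi)))

print(pixel_size(100))  # (1000, 600)
print(pixel_size(300))  # (3000, 1800)
plt.close(fig)
```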

6. For publication-quality raster output (e.g., for a journal submission), you should save with:

(a) fig.savefig("chart.png") (b) fig.savefig("chart.png", dpi=72) (c) fig.savefig("chart.png", dpi=300, bbox_inches="tight") (d) fig.savefig("chart.jpg", quality=50)

Answer **(c)** `fig.savefig("chart.png", dpi=300, bbox_inches="tight")`. `dpi=300` is a common print resolution that many journals ask for as a minimum for raster figures. `bbox_inches="tight"` crops the saved image to the actual content, removing excess whitespace. PNG is a lossless format suitable for raster publication. Option (a) uses the default DPI (100), which is too low for print. Option (b) is lower still. Option (d) uses JPEG with lossy compression, which degrades text and thin lines; in addition, recent matplotlib releases no longer accept a `quality` argument to `savefig()` (JPEG quality is passed through `pil_kwargs` instead).

7. For a 150-year time-series chart of temperature anomalies, which figsize is best based on the aspect-ratio principles from Chapter 8?

(a) (6.4, 4.8) — the default (b) (6, 6) — square (c) (10, 3) or (12, 4) — wide (d) (4, 12) — tall

Answer **(c)** `(10, 3)` or `(12, 4)` — wide. Time series benefit from wide aspect ratios because the time dimension needs horizontal room. Cleveland's "banking to 45 degrees" rule also suggests wider aspect ratios for long time series. Square or tall ratios would cram the time variation into a narrow horizontal range. The default (a) is not optimized for any specific chart type.

8. When you call ax.plot([1,2,3], [4,5,6]), matplotlib:

(a) Immediately draws a line on the screen (b) Creates a new Line2D Artist and adds it to the Axes's Artist tree, to be rendered later (c) Saves a file to disk (d) Prints the values to the console

Answer **(b)** Creates a new Line2D Artist and adds it to the Axes's Artist tree, to be rendered later. This is the chapter's threshold concept: everything in matplotlib is an object. The plot method does not draw; it creates a Line2D Artist and adds it to the Axes. Rendering happens later, when you call `savefig()` or display the figure. The Line2D can be modified after creation by changing its properties (color, linewidth, etc.) before rendering.
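
A short sketch of this behavior, assuming the headless Agg backend; the color and linewidth values are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

fig, ax = plt.subplots()

# ax.plot() returns a list of the Line2D Artists it created (one per line drawn).
lines = ax.plot([1, 2, 3], [4, 5, 6])
line = lines[0]

print(isinstance(line, Line2D))  # True
print(line in ax.lines)          # True: the Artist is now part of the Axes

# Nothing has been rendered yet, so the Artist can still be reconfigured.
line.set_color("crimson")
line.set_linewidth(2.5)

plt.close(fig)
```

Keeping the returned `Line2D` in a variable is what makes later changes possible without replotting.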

9. The matplotlib Agg backend is:

(a) An interactive backend that pops up a window on your screen (b) A raster backend that produces PNG and similar pixel-based formats (c) A vector backend for PDF and SVG output (d) A deprecated legacy backend

Answer **(b)** A raster backend that produces PNG and similar pixel-based formats. Agg stands for Anti-Grain Geometry, the rendering library it uses internally. It is the default non-interactive backend and is used whenever you save to a raster format like PNG or JPG. Vector backends (PDF, SVG, PS) are used for resolution-independent output. Interactive backends (such as the Qt and Tk variants) display charts in a window on screen; in Jupyter, the inline backend embeds static images in the notebook instead.
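
For scripts that only write files, the backend can be selected explicitly. This is a minimal sketch, assuming the selection happens before any figure is created; the exact capitalization of the reported backend name can differ between matplotlib versions.

```python
import matplotlib
matplotlib.use("Agg")  # select the non-interactive raster backend explicitly
import matplotlib.pyplot as plt

backend_name = matplotlib.get_backend()
print(backend_name)

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])

# The Figure's canvas is the backend-specific object that does the rendering.
canvas_class = type(fig.canvas).__name__
print(canvas_class)
plt.close(fig)
```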

10. The chapter's threshold concept is:

(a) Matplotlib is a drawing library; you draw on a canvas (b) In matplotlib, everything visible is a Python object in a tree, and you configure the tree rather than drawing (c) Pyplot is always better than the OO API (d) Every chart needs a title

Answer **(b)** In matplotlib, everything visible is a Python object in a tree, and you configure the tree rather than drawing. This is the conceptual shift that makes matplotlib make sense. Figures, Axes, Lines, Text, Legends, Ticks — all Python objects in a tree structure, all configurable by calling methods. You are not drawing; you are building a tree. Rendering happens automatically when the tree is ready. Once you internalize this, every matplotlib method makes sense as "setting a property on some object in the tree."

True / False (5 questions)

11. "A Figure can contain zero, one, or many Axes objects."

Answer **True.** A Figure is a container. In the simplest case, it contains one Axes (a single-chart figure). A small multiple might contain 50 Axes. An empty Figure (zero Axes) is valid but unusual — you would typically add Axes to it later with `fig.add_subplot()` or `fig.add_axes()`. The point is that Figure and Axes are separate concepts: a Figure is the image, an Axes is a plotting area within the image.
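
The zero, one, and many cases can be sketched by growing a Figure step by step. This assumes the headless Agg backend and a 1-by-2 grid chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure()            # a Figure with no Axes yet
counts = [len(fig.axes)]

fig.add_subplot(1, 2, 1)      # first plotting area
counts.append(len(fig.axes))

fig.add_subplot(1, 2, 2)      # second plotting area in the same Figure
counts.append(len(fig.axes))

print(counts)  # [0, 1, 2]
plt.close(fig)
```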

12. "An Axes (plural) and an Axis (singular) are the same thing."

Answer **False.** They are different matplotlib concepts with unfortunately similar names. An **Axes** (plural, with an 's') is a single plotting area, what most people call "a chart." An **Axis** (singular, no 's') is a single numerical axis (x-axis or y-axis) within an Axes. A standard 2D Axes has two Axis objects: `ax.xaxis` and `ax.yaxis`. The naming is confusing but the distinction is important: you work with Axes most of the time, and occasionally with Axis for advanced tick formatting.
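
The distinction shows up in the types. The sketch below assumes the headless Agg backend; the title text and the 0.25 tick spacing are arbitrary illustration values.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
from matplotlib.axis import Axis
from matplotlib.ticker import MultipleLocator

fig, ax = plt.subplots()

print(isinstance(ax, Axes))        # True: the plotting area
print(isinstance(ax.xaxis, Axis))  # True: one numerical axis inside it
print(isinstance(ax.xaxis, Axes))  # False: an Axis is not an Axes

# Axes-level call: applies to the whole plotting area.
ax.set_title("Axes-level setting")

# Axis-level call: controls ticks on the x-axis only.
ax.xaxis.set_major_locator(MultipleLocator(0.25))

plt.close(fig)
```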

13. "The plt.show() function should always be called at the end of a matplotlib script."

Answer **False.** With an interactive GUI backend, `plt.show()` opens a window and, in a script, blocks until the window is closed. In a script that only generates files, this is actively harmful: the script cannot finish until a human closes the window. (With a non-interactive backend such as Agg there is nothing to show, and matplotlib only emits a warning.) In Jupyter notebooks with the inline backend, `plt.show()` is usually unnecessary because the figure is displayed automatically at the end of the cell. Use `plt.show()` when you actually want to see the chart in a window, such as in an interactive session.

14. "The default matplotlib figure size is wide enough for any chart type."

Answer **False.** The default `figsize=(6.4, 4.8)` is a slightly-wider-than-tall rectangle that works for simple single charts but is not optimal for any specific chart type. Time series should be wider (Chapter 8). Horizontal bar charts with many categories should be taller. Scatter plots should be closer to square. Choosing figsize to match the chart type is one of the simplest and highest-value matplotlib customizations you can make.

15. "When you save a Figure as a PDF, matplotlib produces a vector file that can be scaled without quality loss."

Answer **True.** PDF is a vector format, meaning it stores drawing commands (lines, curves, text) rather than a grid of pixels. When you zoom in on a PDF, the vectors are re-rendered at the new zoom level without blurring. This is ideal for print publications where the figure might be resized. Raster formats (PNG, JPG) store pixels and become blurry when scaled. SVG is another vector format useful for editing, and PDF is the standard for print journal submissions.
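
The output format is chosen at save time, not at plot time. This sketch saves one Figure in both a vector and a raster format and inspects the file signatures; it writes to in-memory buffers (an assumption made so the example leaves no files behind) where a real script would pass filenames.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 1, 2], [0, 1, 4])

# Vector output: drawing commands, no fixed pixel grid.
pdf_buf = io.BytesIO()
fig.savefig(pdf_buf, format="pdf")

# Raster output: a pixel grid whose size depends on dpi.
png_buf = io.BytesIO()
fig.savefig(png_buf, format="png", dpi=300)

pdf_header = pdf_buf.getvalue()[:4]
png_header = png_buf.getvalue()[:8]
print(pdf_header)  # b'%PDF'
print(png_header)  # b'\x89PNG\r\n\x1a\n'
plt.close(fig)
```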

Short Answer (3 questions)

16. In three to four sentences, explain the difference between the Figure, Axes, and Axis (singular) concepts in matplotlib, and describe how many of each a typical single-chart figure contains.

Answer A **Figure** is the top-level container representing the whole output image (the PNG, the PDF page). An **Axes** (plural) is a single plotting area within a Figure — what most people call "a chart" — containing the data, title, labels, and plot elements. An **Axis** (singular) is one of the two numerical axes within an Axes (x-axis or y-axis), containing tick marks and tick labels. A typical single-chart figure contains 1 Figure, 1 Axes, and 2 Axis objects (the x-axis and y-axis of that Axes).

17. Explain the difference between the pyplot (state-machine) API and the object-oriented API in matplotlib. Give one specific reason the OO API is preferred for anything beyond simple exploratory charts.

Answer The **pyplot API** consists of functions like `plt.plot()`, `plt.title()`, `plt.xlabel()` that operate on a "current" Figure and Axes that matplotlib tracks internally. The **object-oriented API** uses explicit references: `fig, ax = plt.subplots()` creates objects, and subsequent calls use `ax.plot()`, `ax.set_title()`, `ax.set_xlabel()` on those specific objects. The OO API is preferred because in multi-panel or multi-figure code, the pyplot state machine can become confused about which Figure or Axes is "current," leading to bugs where the wrong chart gets modified. The OO API makes the target of every method call explicit in the variable name.

18. Describe the canonical fig, ax = plt.subplots() pattern and explain what the method call returns.

Answer `plt.subplots()` with no arguments creates a new Figure containing one Axes, and returns a tuple `(fig, ax)` where `fig` is the Figure object and `ax` is the Axes object. Unpacking the tuple gives you explicit references to both objects, which you can then use for subsequent method calls: `ax.plot(...)`, `ax.set_title(...)`, `fig.savefig(...)`. With arguments, `plt.subplots(nrows, ncols)` creates a grid of Axes and returns a tuple `(fig, axes)` where `axes` is a 2D (or 1D, for single-row/column grids) numpy array of Axes objects. This pattern is the foundation of all OO-style matplotlib code.
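
The return shapes described above can be verified directly. This is a small sketch assuming the headless Agg backend; the `squeeze=False` case is included as an aside for code that wants a 2D array regardless of grid size.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.axes import Axes

fig1, ax = plt.subplots()                            # a single Axes object
fig2, row = plt.subplots(1, 3)                       # 1D array of Axes
fig3, grid = plt.subplots(2, 3)                      # 2D array of Axes
fig4, always_2d = plt.subplots(1, 1, squeeze=False)  # 2D array even for one Axes

print(isinstance(ax, Axes))  # True
print(row.shape)             # (3,)
print(grid.shape)            # (2, 3)
print(always_2d.shape)       # (1, 1)

plt.close("all")
```

Indexing follows the array shape: `row[0]` for the 1D case, `grid[1, 2]` for the bottom-right panel of the 2-by-3 grid.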

Applied Scenarios (2 questions)

19. You are writing a Python script that needs to generate 50 different time-series charts (one per U.S. state) and save them as PNG files for a data journalism project. Each chart is a single line showing pandemic case counts over time.

(a) Should you use the pyplot API or the object-oriented API for this task? Justify your choice. (b) Write the skeleton of the loop that creates and saves all 50 charts. Include the import and the figsize choice. (c) What DPI should you use for the output, and why?

Answer **(a)** Use the object-oriented API. With 50 charts, you want explicit control over which Figure and Axes you are operating on in each iteration. Pyplot's "current Figure" state would make it unclear whether each method call is operating on the expected chart, especially if anything goes wrong or you need to debug a specific state's chart. The OO API is always preferred for loops that generate multiple charts. **(b)** Skeleton:
```python
import matplotlib.pyplot as plt
import pandas as pd

# Load data; parse the date column so the x-axis is a true time axis
# rather than one categorical tick per date string
data = pd.read_csv("pandemic_cases_by_state.csv", parse_dates=["date"])
states = data["state"].unique()

for state in states:
    # Select the rows for this state only
    state_data = data[data["state"] == state]

    fig, ax = plt.subplots(figsize=(12, 4))  # wide for time series
    ax.plot(state_data["date"], state_data["cases"])
    ax.set_title(f"COVID-19 Cases: {state}")
    ax.set_xlabel("Date")
    ax.set_ylabel("Cases")

    fig.savefig(f"cases_{state}.png", dpi=300, bbox_inches="tight")
    plt.close(fig)  # IMPORTANT: close the figure to free memory in loops
```
**(c)** Use `dpi=300` if the charts may appear in print, since 300 DPI is a common print resolution; it also leaves headroom for cropping or enlarging on the web. The `plt.close(fig)` at the end of the loop is important: pyplot keeps a reference to every Figure it creates, so without it each iteration leaves a Figure in memory, and matplotlib starts warning once more than 20 figures are open (the default `figure.max_open_warning` setting). Always close figures in long-running loops.
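
The effect of `plt.close(fig)` can be seen by counting the figures pyplot still has registered. This sketch assumes the headless Agg backend and uses five throwaway figures in place of the 50 state charts.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

plt.close("all")  # start from a clean slate

# Without closing: every Figure stays registered with pyplot.
for i in range(5):
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, i])
open_without_close = len(plt.get_fignums())

plt.close("all")

# With closing: each Figure is released at the end of its iteration.
for i in range(5):
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, i])
    plt.close(fig)
open_with_close = len(plt.get_fignums())

print(open_without_close, open_with_close)  # 5 0
```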

20. You have written the following code for a simple bar chart, but the x-axis tick labels overlap in the saved image. Diagnose the problem and propose a fix.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
categories = ["Enterprise Plan", "Professional Plan", "Starter Plan", "Growth Plan", "Legacy Plan"]
values = [450, 320, 180, 250, 95]
ax.bar(categories, values)
ax.set_title("Revenue by Product Line")
ax.set_ylabel("Revenue (USD millions)")
fig.savefig("revenue.png")
```

(a) What is causing the overlap? (b) Name at least two fixes. (c) Which fix would you recommend, and why?

Answer **(a)** The category labels ("Enterprise Plan", "Professional Plan", etc.) are too long to fit side by side on a 6-inch-wide figure. Matplotlib does not wrap, shrink, or truncate tick labels, so neighboring labels are drawn on top of each other, and the default layout does not adjust for them.

**(b) Fixes:**
1. **Increase figsize** so there is more horizontal space: `plt.subplots(figsize=(10, 4))`.
2. **Rotate the tick labels** so they do not overlap: `ax.tick_params(axis="x", rotation=45)` or `plt.setp(ax.get_xticklabels(), rotation=45, ha="right")`.
3. **Use a horizontal bar chart** so the category labels sit on the y-axis without crowding: `ax.barh(categories, values)` (note: the x and y roles swap, so the value label moves to `ax.set_xlabel`).
4. **Enable constrained_layout** at figure creation: `plt.subplots(figsize=(6, 4), constrained_layout=True)`. This makes room for decorations such as rotated labels inside the figure, but it does not by itself separate labels that collide with their neighbors, so pair it with another fix.
5. **Shorten the category labels**: `categories = ["Enterprise", "Professional", "Starter", "Growth", "Legacy"]`.
6. **Call `fig.tight_layout()` before saving** (less robust than `constrained_layout`, and with the same caveat as fix 4).

**(c) Recommended fix: the horizontal bar chart.** With five categories that have long labels, a horizontal bar chart solves the problem cleanly: the category labels fit comfortably on the y-axis, the bars extend horizontally, and sorting the bars by value (e.g., `ax.barh(sorted_categories, sorted_values)`) produces a clear ranking. This is the design recommendation from Chapter 5 and Chapter 8 for ranked comparison with many or long-label categories.
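
One way the recommended fix might look, using the data from the question. This is a sketch rather than the only correct answer: the ascending sort (so the largest bar ends up at the top), the `constrained_layout=True` setting, and saving to an in-memory buffer instead of `revenue.png` are choices made here for illustration.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

categories = ["Enterprise Plan", "Professional Plan", "Starter Plan", "Growth Plan", "Legacy Plan"]
values = [450, 320, 180, 250, 95]

# Sort ascending by value: barh draws the first item at the bottom,
# so the largest value ends up at the top of the chart.
pairs = sorted(zip(values, categories))
sorted_values = [v for v, _ in pairs]
sorted_categories = [c for _, c in pairs]

fig, ax = plt.subplots(figsize=(6, 4), constrained_layout=True)
ax.barh(sorted_categories, sorted_values)
ax.set_title("Revenue by Product Line")
ax.set_xlabel("Revenue (USD millions)")  # values now run along the x-axis

buf = io.BytesIO()  # stand-in for a filename such as "revenue.png"
fig.savefig(buf, format="png", dpi=300)
plt.close(fig)
```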

Review your results against the mastery thresholds at the top. If you scored below 14, revisit Sections 10.1 (architecture), 10.2 (Figure/Axes/Axis), and 10.3 (the fig/ax pattern) — those are the foundational concepts for all of Part III. Chapter 11 assumes you are comfortable with the fig/ax pattern and uses it in every example.