Key Takeaways: matplotlib Foundations
This is your reference card and cheat sheet for Chapter 15. Keep it open while building charts. The first section covers concepts; the second section is a quick-reference code guide you can copy from directly.
Key Concepts
- matplotlib has two interfaces; use the OO one. The pyplot interface (plt.plot(...)) is quick but implicit. The object-oriented interface (fig, ax = plt.subplots() followed by ax.plot(...)) is explicit, composable, and essential for multi-panel figures. We use OO throughout this book.
- Figure is the canvas; Axes is the chart. A Figure holds one or more Axes. Each Axes is a complete coordinate system with its own data, title, labels, and legend. Subplots are just multiple Axes on one Figure.
- The four workhorses: line, bar, scatter, histogram. ax.plot() for trends over time. ax.bar() / ax.barh() for categorical comparison. ax.scatter() for relationships between continuous variables. ax.hist() for distributions. These four cover the vast majority of data science visualization needs.
- Customization turns exploratory into explanatory. A default chart is fine for exploration. For communication, add: a descriptive title (finding, not topic), axis labels with units, appropriate axis ranges (zero baseline for bars), clean design (faint gridlines, removed spines), and annotations highlighting key findings.
- Multi-panel figures enable comparison. plt.subplots(rows, cols) creates a grid of panels. Use sharey=True or sharex=True to ensure honest comparison across panels. Always call fig.tight_layout() to prevent overlap.
- Save at the right quality. fig.savefig("chart.png", dpi=300, bbox_inches="tight") for print-quality PNG. Use SVG or PDF for vector formats that scale perfectly. Always use bbox_inches="tight" to trim excess whitespace.
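To make the first two concepts concrete, here is a minimal sketch that draws the same line with both interfaces. The numbers are invented for illustration, and the Agg backend line is only there so the sketch runs without a display; drop it in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook or GUI session
import matplotlib.pyplot as plt

# Made-up values, purely for illustration
years = [2020, 2021, 2022, 2023]
rates = [61, 64, 70, 73]

# pyplot interface: every call acts on whichever Axes is "current"
plt.figure()
plt.plot(years, rates)
plt.title("pyplot version")
implicit_ax = plt.gca()  # the Axes that pyplot created behind the scenes

# OO interface: the Figure and Axes are named, so there is no hidden state
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(years, rates, color="steelblue", marker="o")
ax.set_title("OO version")
ax.set_xlabel("Year")
ax.set_ylabel("Rate (%)")
```

Both versions produce a chart, but only the OO version gives you a handle (ax) that you can pass to a function or place in a grid of panels.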
matplotlib Cheat Sheet
Setup
import matplotlib.pyplot as plt
import numpy as np # often needed for positioning
Creating Figures and Axes
# Single panel
fig, ax = plt.subplots()
fig, ax = plt.subplots(figsize=(10, 6))
# Multiple panels
fig, axes = plt.subplots(1, 3) # 1 row, 3 cols
fig, axes = plt.subplots(2, 2) # 2x2 grid
fig, axes = plt.subplots(1, 3, sharey=True) # shared y-axis
fig, axes = plt.subplots(2, 1, sharex=True) # shared x-axis
Accessing Panels
# 1D array (single row or single column)
axes[0], axes[1], axes[2]
# 2D array (grid)
axes[0, 0] # top-left
axes[0, 1] # top-right
axes[1, 0] # bottom-left
axes[1, 1] # bottom-right
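When every panel gets the same treatment, indexing each one by hand becomes repetitive. A common pattern, sketched here with invented data, is to loop over axes.flat, which visits the panels row by row whether the grid is 1D or 2D.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook or GUI session
import matplotlib.pyplot as plt

# Invented series, one per panel
datasets = {
    "North": [3, 5, 4, 6],
    "South": [2, 2, 3, 5],
    "East": [4, 4, 5, 5],
    "West": [1, 3, 2, 4],
}

fig, axes = plt.subplots(2, 2, figsize=(8, 6), sharey=True)

# axes.flat iterates top-left, top-right, bottom-left, bottom-right
for ax, (name, values) in zip(axes.flat, datasets.items()):
    ax.plot(values, color="steelblue")
    ax.set_title(name)

fig.tight_layout()

# With a single panel, plt.subplots() returns one Axes rather than an array.
# Passing squeeze=False always returns a 2D array, which keeps loops uniform.
fig_single, axes_single = plt.subplots(1, 1, squeeze=False)
```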
Line Chart
ax.plot(x, y)
ax.plot(x, y, color="steelblue", linewidth=2,
marker="o", markersize=6, label="Label",
linestyle="--")
Common markers: "o" circle, "s" square, "^" triangle, "D" diamond, "x" x-mark
Common linestyles: "-" solid, "--" dashed, ":" dotted, "-." dash-dot
Bar Chart
# Vertical bars
ax.bar(categories, values, color="steelblue")
# Horizontal bars
ax.barh(categories, values, color="steelblue")
# Highlighting specific bars
colors = ["tomato" if c == "Target" else "steelblue"
for c in categories]
ax.bar(categories, values, color=colors)
Scatter Plot
ax.scatter(x, y, color="steelblue", s=60, alpha=0.7)
# Color-coded by third variable
scatter = ax.scatter(x, y, c=third_var, cmap="viridis",
s=60, alpha=0.8, edgecolors="gray")
fig.colorbar(scatter, ax=ax, label="Third Variable")
Common colormaps:
- Sequential: "viridis", "Blues", "YlOrRd", "Greens"
- Diverging: "coolwarm", "RdBu", "PiYG"
- Categorical: use explicit color lists
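A diverging colormap is only honest if its neutral midpoint lines up with a meaningful value, usually zero. One way to arrange that, sketched below with randomly generated data, is to set vmin and vmax symmetrically around zero. The variable names (change, limit) are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook or GUI session
import matplotlib.pyplot as plt
import numpy as np

# Random data for illustration only
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = rng.normal(size=50)
change = rng.normal(scale=5, size=50)  # positive and negative values

# Symmetric limits put the colormap's neutral color exactly at zero
limit = float(np.abs(change).max())

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=change, cmap="coolwarm",
                    vmin=-limit, vmax=limit,
                    s=60, edgecolors="gray")
fig.colorbar(points, ax=ax, label="Change (points)")
```

Without the symmetric limits, matplotlib scales the colormap to the data's own minimum and maximum, so the neutral color can land on a nonzero value.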
Histogram
ax.hist(data, bins=20, color="steelblue",
edgecolor="white", alpha=0.8)
# Fixed range (important for comparison)
ax.hist(data, bins=20, range=(0, 100),
color="steelblue", edgecolor="white")
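The reason a fixed range matters for comparison: without it, each histogram picks bin edges from its own data, so bars in two panels cover different intervals. The sketch below, using randomly generated groups, checks that a shared bins and range setting yields identical edges.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook or GUI session
import matplotlib.pyplot as plt
import numpy as np

# Two invented groups of scores, clipped to the 0-100 range
rng = np.random.default_rng(42)
group_a = rng.normal(60, 10, 500).clip(0, 100)
group_b = rng.normal(75, 8, 500).clip(0, 100)

fig, axes = plt.subplots(1, 2, sharey=True, figsize=(10, 4))

# ax.hist returns (counts, bin_edges, patches)
counts_a, edges_a, _ = axes[0].hist(group_a, bins=20, range=(0, 100),
                                    color="steelblue", edgecolor="white")
counts_b, edges_b, _ = axes[1].hist(group_b, bins=20, range=(0, 100),
                                    color="tomato", edgecolor="white")

axes[0].set_title("Group A")
axes[1].set_title("Group B")
fig.tight_layout()
```

Because both calls use the same bins and range, each bar covers the same 5-point interval in both panels, and sharey=True keeps the heights comparable.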
Titles and Labels
ax.set_title("Descriptive Finding Title", fontsize=13,
fontweight="bold")
ax.set_xlabel("X Label (units)")
ax.set_ylabel("Y Label (units)")
fig.suptitle("Figure-Level Title", fontsize=14, y=1.02)
Axis Control
ax.set_xlim(min_val, max_val)
ax.set_ylim(0, 100) # Zero baseline for bar charts!
ax.set_xticks([2015, 2018, 2021, 2024])
ax.set_xticklabels(labels, rotation=45, ha="right")
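Tick labels are assigned to tick positions in order, so set_xticklabels() is safest when it immediately follows a set_xticks() call with the same number of entries. A small sketch with invented values and a hypothetical "FY" label format:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook or GUI session
import matplotlib.pyplot as plt

# Invented values for illustration
years = [2015, 2018, 2021, 2024]
values = [52, 58, 66, 71]

fig, ax = plt.subplots()
ax.plot(years, values, marker="o", color="steelblue")

# Fix the tick positions first, then label exactly those positions
ax.set_xticks(years)
tick_labels = ax.set_xticklabels([f"FY{y}" for y in years],
                                 rotation=45, ha="right")

ax.set_ylim(0, 100)
fig.tight_layout()
```

Calling set_xticklabels() on its own relabels whatever ticks matplotlib happened to choose, which can silently attach labels to the wrong positions.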
Legend
# Labels set during plotting
ax.plot(x, y, label="Series A")
ax.plot(x, y2, label="Series B")
ax.legend() # automatic placement
ax.legend(frameon=False) # no border
ax.legend(loc="upper left") # specific location
ax.legend(frameon=False, fontsize=11) # clean + readable
Gridlines
ax.grid(True, alpha=0.3) # faint
ax.grid(True, alpha=0.2, linestyle="--") # dashed, very faint
ax.grid(True, alpha=0.2, axis="y") # y-axis only
Spines (Borders)
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
# Keep bottom and left for axes
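In a multi-panel figure, the gridline and spine settings above have to be repeated for every Axes. One option is to wrap them in a small function. The helper name declutter is hypothetical, not part of matplotlib, and the data is invented.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook or GUI session
import matplotlib.pyplot as plt


def declutter(ax):
    """Hypothetical helper: hide top/right spines and add a faint y-axis grid."""
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    ax.grid(True, alpha=0.2, axis="y")
    return ax


fig, axes = plt.subplots(1, 2, figsize=(9, 4), sharey=True)
for ax in axes:
    ax.plot([1, 2, 3], [2, 4, 3], color="steelblue")  # invented data
    declutter(ax)

fig.tight_layout()
```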
Annotations
# Text with arrow
ax.annotate("Important point",
xy=(x_data, y_data), # arrow points here
xytext=(x_text, y_text), # text placed here
fontsize=10, color="tomato",
arrowprops=dict(arrowstyle="->", color="tomato"))
# Horizontal reference line
ax.axhline(y=value, color="gray", linestyle="--", alpha=0.5,
label="Reference")
# Vertical reference line
ax.axvline(x=value, color="gray", linestyle=":", alpha=0.5)
# Plain text (no arrow)
ax.text(x, y, "Text here", fontsize=10, ha="center",
va="bottom", color="gray")
Value Labels on Bars
bars = ax.bar(categories, values, color="steelblue")
for bar, val in zip(bars, values):
    ax.text(bar.get_x() + bar.get_width() / 2,
            bar.get_height() + 1,
            f"{val}%", ha="center", va="bottom", fontsize=10)
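As an alternative to the manual loop, matplotlib 3.4 and later provide ax.bar_label(), which places a label on each bar for you. This sketch assumes a recent enough matplotlib and uses invented values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook or GUI session
import matplotlib.pyplot as plt

# Invented values for illustration
categories = ["A", "B", "C"]
values = [45, 62, 78]

fig, ax = plt.subplots()
bars = ax.bar(categories, values, color="steelblue")

# fmt is applied to each bar's height; padding is the gap in points
value_labels = ax.bar_label(bars, fmt="%d%%", padding=3, fontsize=10)

ax.set_ylim(0, 100)  # zero baseline, with headroom for the labels
```

The manual ax.text() loop is still worth knowing, since it works on older versions and gives full control over placement.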
Grouped Bar Chart
import numpy as np
x = np.arange(len(categories))
width = 0.35
ax.bar(x - width/2, values_a, width, label="Group A",
color="steelblue")
ax.bar(x + width/2, values_b, width, label="Group B",
color="tomato")
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.legend(frameon=False)
Saving
fig.tight_layout() # ALWAYS call this first
# Raster (for screens, slides, notebooks)
fig.savefig("chart.png", dpi=150, bbox_inches="tight")
# High-res raster (for print)
fig.savefig("chart.png", dpi=300, bbox_inches="tight")
# Vector (for publications, web scaling)
fig.savefig("chart.svg", bbox_inches="tight")
fig.savefig("chart.pdf", bbox_inches="tight")
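To export one figure in several formats at once, a loop over extensions keeps the options consistent. The sketch below writes to a temporary directory so it can run anywhere; the file name "chart" and the data are placeholders.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook or GUI session
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot([1, 2, 3], [2, 4, 3], color="steelblue")  # invented data
fig.tight_layout()

out_dir = Path(tempfile.mkdtemp())  # placeholder output location
saved = []

# dpi only matters for raster output, so it is passed for PNG alone
for ext, extra in [("png", {"dpi": 300}), ("svg", {}), ("pdf", {})]:
    path = out_dir / f"chart.{ext}"
    fig.savefig(path, bbox_inches="tight", **extra)
    saved.append(path)

plt.close(fig)  # free the figure's memory once it is saved
```

savefig() infers the format from the file extension, so no separate format argument is needed here.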
Display
plt.show() # In scripts, shows the figure window
# In Jupyter, figures display inline automatically
Common Mistakes Checklist
- [ ] Using pyplot (plt.title()) when you should use OO (ax.set_title())
- [ ] Forgetting fig.tight_layout() before save/show -- labels get cut off
- [ ] Bar chart y-axis not starting at zero -- visually misleading
- [ ] Overlapping x-axis labels -- fix with rotation or horizontal bars
- [ ] Rainbow colors on bars that represent the same variable -- use one color
- [ ] Title says the topic, not the finding -- "Bar Chart" tells the reader nothing
- [ ] Missing axis labels or units -- "Rate" could mean anything; "Vaccination Rate (%)" is clear
Design Principles (from Chapter 14, applied here)
- Title states the finding: Not "Vaccination Rates" but "Africa Trails the Global Average by 27 Points"
- Y-axis at zero for bar charts. Always. No exceptions.
- Faint gridlines: alpha=0.2 or 0.3. Grid should support, not compete.
- Remove unnecessary spines: Top and right spines are almost never needed.
- Legend without border: frameon=False. The border is non-data ink.
- Use color intentionally: Color should encode a variable or highlight a data point, not decorate.
- Annotate the story: Reference lines, text labels, and arrows guide the reader to your finding.
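The principles above can be combined in a single chart. The following is one possible sketch, not a canonical recipe. The regional rates and the reference value are invented so that the gap matches the 27-point example in the title.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove in a notebook or GUI session
import matplotlib.pyplot as plt

# Invented numbers, chosen only to illustrate the principles
regions = ["Africa", "Asia", "Europe", "Americas"]
rates = [45, 70, 78, 74]
global_avg = 72
gap = global_avg - rates[0]

fig, ax = plt.subplots(figsize=(8, 5))

# Color used intentionally: one highlight, everything else the same
colors = ["tomato" if r == "Africa" else "steelblue" for r in regions]
ax.bar(regions, rates, color=colors)

# Title states the finding; axis label carries the unit
ax.set_title(f"Africa Trails the Global Average by {gap} Points",
             fontsize=13, fontweight="bold")
ax.set_ylabel("Vaccination Rate (%)")

# Zero baseline, faint grid, no top/right spines
ax.set_ylim(0, 100)
ax.grid(True, alpha=0.2, axis="y")
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

# Annotate the story with a reference line and a borderless legend
ax.axhline(y=global_avg, color="gray", linestyle="--", alpha=0.5,
           label="Global average")
ax.legend(frameon=False)

fig.tight_layout()
```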
What You Should Be Able to Do Now
- [ ] Create line charts, bar charts, scatter plots, and histograms using fig, ax = plt.subplots()
- [ ] Customize with titles, labels, colors, gridlines, and spine removal
- [ ] Build multi-panel figures with plt.subplots(rows, cols) and shared axes
- [ ] Annotate charts with annotate(), axhline(), axvline(), and text()
- [ ] Work with pandas DataFrames directly in matplotlib
- [ ] Save figures as PNG (at 300 DPI), SVG, or PDF with savefig()
- [ ] Avoid the five common mistakes listed in this chapter
If you can do all of this, you have a solid matplotlib foundation. In Chapter 16, seaborn will make many of these operations simpler and add statistical visualization capabilities. But you'll always be able to drop down to matplotlib when you need full control — and you'll understand what seaborn is doing under the hood.
Next: Chapter 16 — Statistical Visualization with seaborn