> "The five essential chart types answer eighty percent of visualization questions. The specialized chart types answer the rest."
Learning Objectives
- Create heatmaps with ax.imshow() and ax.pcolormesh() including colorbars, annotations, and appropriate colormaps
- Create contour plots with ax.contour() and ax.contourf() for continuous 2D data
- Create polar plots with add_subplot(projection='polar') for cyclical data
- Create error bar plots with ax.errorbar() including asymmetric errors and confidence bands
- Create filled area charts with ax.fill_between() for ranges, confidence bands, and difference highlighting
- Select the appropriate specialized chart type for a given data scenario
- Combine specialized charts with standard charts in multi-panel figures
In This Chapter
- 14.1 Heatmaps: ax.imshow() and ax.pcolormesh()
- 14.2 Contour Plots: ax.contour() and ax.contourf()
- 14.3 Polar Plots: Cyclical Data in Its Natural Form
- 14.4 Error Bars and Confidence Bands: Making Uncertainty Visible
- 14.5 Vector Fields: quiver and streamplot
- 14.6 Less-Common but Useful Chart Types
- 14.7 Specialized Charts and the Chapter 5 Framework
- 14.8 Heatmaps and Colormaps: A Practical Guide
- 14.8 Common Pitfalls with Specialized Charts
- 14.8 Choosing the Right Specialized Chart Type
- 14.8 Annotated Heatmaps in Detail
- 14.8 Colorbar Customization
- 14.8 Date and Time Axes
- 14.9 Log and Symlog Scales
- 14.10 Combining Specialized Charts in Multi-Panel Figures
- Chapter Summary
- Spaced Review: Concepts from Chapters 1-13
Chapter 14: Specialized matplotlib Charts
"The five essential chart types answer eighty percent of visualization questions. The specialized chart types answer the rest." — Approximately, the working consensus in data visualization education.
Chapter 11 taught the five essential chart types — line, bar, scatter, histogram, and box plot — that cover most real-world visualization work. Chapter 12 taught you how to customize them. Chapter 13 taught you how to compose them into multi-panel figures. But some data shapes do not fit any of the five essentials, and for those you need specialized chart types: heatmaps for 2D tabular data, contours for continuous 2D surfaces, polar plots for cyclical data, and error bars for uncertainty visualization.
This chapter introduces these specialized types. Unlike the essential chart types, which each answer a general question (comparison, distribution, relationship, trend), specialized types each answer a specific data-shape question. The chart selection framework from Chapter 5 expands to accommodate them: if your data is a 2D table with rows and columns of numerical values, the right chart is often a heatmap. If your data is a continuous 2D surface (like a topographic elevation map or a density estimate), contours are the right answer. If your data is cyclical (time of day, day of week, compass direction), polar plots visualize it in its natural form.
The chapter is structured around chart types rather than concepts, because each specialized type has its own API and its own set of parameters to learn. After a brief introduction to each, we apply it to a specific climate or public-health example to show the pattern in context. By the end of the chapter, you will know when to reach for each specialized type and how to implement it in matplotlib.
A note on organization: this chapter does not have a single threshold concept because the material is additive — each specialized type extends your vocabulary without requiring a conceptual shift. The chapter is shorter and less dense than some earlier chapters for the same reason. Focus on learning the specific chart types, not on building a new mental model.
14.1 Heatmaps: ax.imshow() and ax.pcolormesh()
A heatmap is a 2D grid where each cell's color encodes a numerical value. Heatmaps are the right chart type for 2D tabular data: correlation matrices, confusion matrices, gene expression tables, monthly temperature by year, and any other situation where you have rows × columns of numbers.
imshow for Regular Grids
matplotlib's simplest heatmap function is ax.imshow, which displays a 2D numpy array as a grid of colored cells:
import numpy as np
import matplotlib.pyplot as plt
# Create a 2D array (rows = years, columns = months)
data = np.random.randn(10, 12)
fig, ax = plt.subplots(figsize=(10, 6))
im = ax.imshow(data, cmap="RdBu_r", aspect="auto")
fig.colorbar(im, ax=ax, label="Value")
ax.set_xticks(range(12))
ax.set_xticklabels(["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"])
ax.set_yticks(range(10))
ax.set_yticklabels(range(2015, 2025))
ax.set_title("Monthly Anomalies by Year")
Key parameters of imshow:
- First argument: the 2D array. Each element becomes one cell in the heatmap.
- cmap: the colormap. For data with a meaningful midpoint (like temperature anomalies), use a diverging colormap (RdBu_r, coolwarm, BrBG). For sequential data (counts, densities), use a sequential colormap (viridis, plasma).
- aspect: controls the aspect ratio of cells. "equal" makes cells square and is imshow's default, which suits images. "auto" stretches the cells to fill the Axes. Use "auto" for most heatmaps.
- vmin, vmax: the value range that maps to the colormap's extremes. For diverging colormaps around zero, set vmin=-abs_max, vmax=abs_max so the neutral color aligns with zero.
- interpolation: how cells are resampled when the array is drawn at a different pixel resolution. Use "nearest" for discrete cell values (which is what you want for a heatmap, not a blurred version).
- extent: a 4-tuple [left, right, bottom, top] giving the data coordinates of the heatmap's edges. Useful when the rows and columns correspond to specific numerical values rather than integer indices.
pcolormesh for Irregular Grids
ax.imshow assumes the grid is regular — all cells the same size. For irregular grids (unequal row heights or column widths), use ax.pcolormesh:
# x and y are the edge coordinates of the cells
x_edges = np.linspace(0, 10, 13)
y_edges = np.linspace(0, 5, 11)
fig, ax = plt.subplots(figsize=(10, 6))
pcm = ax.pcolormesh(x_edges, y_edges, data, cmap="viridis", shading="auto")
fig.colorbar(pcm, ax=ax)
pcolormesh takes the edge coordinates explicitly, which means you can have cells of different sizes. It is more flexible than imshow but slightly more verbose. For most regular-grid heatmaps, imshow is simpler.
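The evenly spaced edges above produce cells of equal size, so here is a minimal sketch with genuinely unequal cells. The edge values and the random `values` array are hypothetical placeholders; the only requirement being illustrated is that each edge array has one more entry than the number of cells along that direction.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical unequal bin edges: 4 columns and 3 rows of different sizes
x_edges = np.array([0.0, 1.0, 2.0, 5.0, 10.0])   # 5 edges -> 4 columns
y_edges = np.array([0.0, 0.5, 2.0, 4.0])         # 4 edges -> 3 rows

rng = np.random.default_rng(0)
values = rng.random((len(y_edges) - 1, len(x_edges) - 1))  # shape (3, 4)

fig, ax = plt.subplots(figsize=(8, 4))
# shading="flat" treats x_edges and y_edges as cell boundaries
pcm = ax.pcolormesh(x_edges, y_edges, values, cmap="viridis", shading="flat")
fig.colorbar(pcm, ax=ax, label="Value")

# Put the ticks on the cell boundaries so the unequal widths are visible
ax.set_xticks(x_edges)
ax.set_yticks(y_edges)
```

An imshow call on the same array would draw twelve equal cells, which would misrepresent the bin widths.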
Annotated Heatmaps
For small heatmaps (say, 10×12 or smaller), you often want to show the numerical value in each cell. This is done with a loop that calls ax.text:
fig, ax = plt.subplots(figsize=(12, 8))
im = ax.imshow(data, cmap="RdBu_r", aspect="auto")
fig.colorbar(im, ax=ax)
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        ax.text(j, i, f"{data[i, j]:.2f}", ha="center", va="center", fontsize=8,
                color="white" if abs(data[i, j]) > 1 else "black")
The color of the text adapts based on the background — white on dark cells, black on light cells — so the annotations remain readable. The conditional "white" if abs(data[i, j]) > 1 else "black" is a simple heuristic; for more sophisticated color-aware annotation, look at seaborn's heatmap function (Chapter 17).
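One way to make the text color follow the actual cell color, rather than a fixed data threshold, is to ask the image for the color it drew and estimate that color's brightness. The sketch below is an illustration, not seaborn's implementation: `text_color_for` is a hypothetical helper, and the 0.5 cutoff and the luma weights are assumptions you may want to tune.

```python
import numpy as np
import matplotlib.pyplot as plt

def text_color_for(rgba):
    """Return "white" for dark backgrounds and "black" for light ones."""
    r, g, b = rgba[:3]
    luminance = 0.299 * r + 0.587 * g + 0.114 * b  # rough perceived brightness
    return "white" if luminance < 0.5 else "black"

rng = np.random.default_rng(1)
data = rng.standard_normal((4, 5))  # placeholder values

fig, ax = plt.subplots(figsize=(6, 4))
im = ax.imshow(data, cmap="RdBu_r", aspect="auto")
fig.colorbar(im, ax=ax)

for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        # Map the value through the image's own norm and colormap
        rgba = im.cmap(im.norm(data[i, j]))
        ax.text(j, i, f"{data[i, j]:.2f}", ha="center", va="center",
                fontsize=8, color=text_color_for(rgba))
```

Because the color is looked up through im.norm and im.cmap, the same loop keeps working if you change the colormap or set vmin and vmax.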
Climate Heatmap: Years × Months
The canonical climate heatmap shows monthly temperature anomalies arranged as a years × months grid, with warmer years showing as rows of red cells and cooler years as rows of blue:
# Assume climate_monthly is a pandas DataFrame with columns year, month, anomaly
pivot = climate_monthly.pivot(index="year", columns="month", values="anomaly")
fig, ax = plt.subplots(figsize=(12, 8))
abs_max = max(abs(pivot.min().min()), abs(pivot.max().max()))
im = ax.imshow(pivot.values, cmap="RdBu_r", aspect="auto", vmin=-abs_max, vmax=abs_max)
ax.set_xticks(range(12))
ax.set_xticklabels(["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"])
ax.set_yticks(range(len(pivot)))
ax.set_yticklabels(pivot.index)
ax.set_title("Monthly Temperature Anomalies, 1880-2024")
fig.colorbar(im, ax=ax, label="Anomaly (°C)")
The symmetric vmin=-abs_max, vmax=abs_max ensures the neutral midpoint of the diverging colormap aligns with zero (the baseline temperature). The result is a heatmap where warm anomalies are red, cool anomalies are blue, and the progression from mostly-blue (early 20th century) to mostly-red (recent decades) tells the warming story in a single image.
14.2 Contour Plots: ax.contour() and ax.contourf()
A contour plot shows a continuous 2D surface using lines of equal value (contour lines) or filled regions between contours. This is the standard way to visualize topographic maps, weather maps, density estimates, and any other continuous field defined on a 2D grid.
Basic Contour Plot
import numpy as np
# Create a 2D grid and a function defined on it
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X**2 + Y**2)) # a 2D Gaussian
fig, ax = plt.subplots(figsize=(8, 6))
cs = ax.contour(X, Y, Z, levels=10, cmap="viridis")
ax.clabel(cs, inline=True, fontsize=8) # label contours with their values
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Contour Plot of a 2D Gaussian")
ax.contour(X, Y, Z, levels=N) asks matplotlib to choose the contour values automatically, aiming for roughly N evenly rounded levels; pass an explicit list to levels when you need exact values. X and Y come from np.meshgrid and define the grid coordinates, and Z is the function value at each grid point. ax.clabel(cs, inline=True) labels the contours with their numerical values directly on the lines.
Filled Contour Plot
ax.contourf (note the f at the end) fills the regions between contours with colors from a colormap:
fig, ax = plt.subplots(figsize=(8, 6))
cf = ax.contourf(X, Y, Z, levels=20, cmap="viridis")
fig.colorbar(cf, ax=ax, label="Value")
ax.set_title("Filled Contour Plot")
contourf creates a dense, smoothly-colored display similar to a heatmap. The difference is that contour plots interpolate between grid points to create smooth level curves, while heatmaps display discrete grid cells. For continuous data where smoothness matters, contour is better; for discrete tabular data, heatmap is better.
When to Use Which
- imshow / pcolormesh (heatmap): your data is a 2D table with discrete cells (month × year, category × category, gene × sample). Each cell has a specific value.
- contour / contourf (contour plot): your data is a smooth continuous function defined on a 2D grid (topography, temperature field, density estimate). You can interpolate between grid points.
If you are not sure, try both. The heatmap preserves the grid structure; the contour plot smooths it. Whichever tells the story more clearly is the right choice.
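A quick way to try both is to draw them side by side from the same arrays. This sketch reuses the 2D Gaussian from earlier on a coarser grid so the individual cells are visible; the explicit levels and the black line overlay are illustrative choices, not requirements.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 40)
y = np.linspace(-3, 3, 40)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X**2 + Y**2))

fig, (ax_heat, ax_cont) = plt.subplots(1, 2, figsize=(12, 5))

# Heatmap: one colored cell per grid point; extent maps the array
# (approximately) onto data coordinates
im = ax_heat.imshow(Z, cmap="viridis", origin="lower",
                    extent=[x.min(), x.max(), y.min(), y.max()],
                    interpolation="nearest", aspect="equal")
fig.colorbar(im, ax=ax_heat, label="Value")
ax_heat.set_title("imshow: discrete cells")

# Filled contours with thin line contours drawn on top
levels = np.linspace(0, 1, 11)
cf = ax_cont.contourf(X, Y, Z, levels=levels, cmap="viridis")
cs = ax_cont.contour(X, Y, Z, levels=levels[1:-1], colors="black", linewidths=0.5)
fig.colorbar(cf, ax=ax_cont, label="Value")
ax_cont.set_aspect("equal")
ax_cont.set_title("contourf + contour: interpolated levels")
```

Passing the same levels array to both contour calls keeps the lines on the boundaries between filled bands.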
14.3 Polar Plots: Cyclical Data in Its Natural Form
Some data is cyclical (time of day, day of week, direction, seasons), and visualizing it on a linear axis hides the cyclic structure. A polar plot uses polar coordinates (angle and radius) to display cyclical data in its native circular form.
Creating a Polar Axes
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(111, projection="polar")
# 24 hours of data
theta = np.linspace(0, 2 * np.pi, 24, endpoint=False)
r = np.random.rand(24) * 10 + 5
ax.bar(theta, r, width=2 * np.pi / 24, alpha=0.6, color="steelblue")
ax.set_theta_zero_location("N") # 0 degrees at the top
ax.set_theta_direction(-1) # clockwise (like a clock)
ax.set_title("Activity by Hour of Day")
The projection="polar" argument to add_subplot (or plt.subplots(..., subplot_kw={"projection": "polar"})) creates an Axes with polar coordinates. The x-axis becomes the angular coordinate (theta), measured in radians, and the y-axis becomes the radial coordinate.
ax.set_theta_zero_location("N") puts 0 degrees at the top (North), which is what you want for clock-style or compass-style displays. ax.set_theta_direction(-1) makes the angles go clockwise rather than the default counterclockwise.
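The plt.subplots form mentioned above can be sketched as follows. The activity values are random placeholders, and relabeling the angular ticks as clock hours is an optional extra added here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 6), subplot_kw={"projection": "polar"})

hours = np.arange(24)
theta = hours / 24 * 2 * np.pi            # hour 0 -> angle 0, hour 6 -> pi/2
rng = np.random.default_rng(0)
activity = rng.random(24) * 10 + 5        # placeholder values

ax.bar(theta, activity, width=2 * np.pi / 24, alpha=0.6, color="steelblue")
ax.set_theta_zero_location("N")           # midnight at the top
ax.set_theta_direction(-1)                # clockwise, like a clock face

# Label the angular axis in hours rather than degrees
ax.set_xticks(theta[::3])
ax.set_xticklabels([f"{h:02d}:00" for h in hours[::3]])
```

With subplot_kw, the same call can create a grid of polar Axes, which is harder to do with repeated add_subplot calls.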
Common Polar Chart Types
Wind rose (distribution of wind direction):
directions = np.random.choice(16, 1000) # 16 compass directions
counts = np.bincount(directions, minlength=16)
theta = np.linspace(0, 2 * np.pi, 16, endpoint=False)
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(111, projection="polar")
ax.bar(theta, counts, width=2 * np.pi / 16, alpha=0.7, color="steelblue")
ax.set_theta_zero_location("N")
ax.set_theta_direction(-1)
ax.set_title("Wind Direction Frequency")
Radar chart (multi-dimensional comparison):
categories = ["Speed", "Strength", "Agility", "Endurance", "Skill", "Strategy"]
values = [8, 6, 9, 7, 8, 7]
theta = np.linspace(0, 2 * np.pi, len(categories), endpoint=False)
values = np.append(values, values[0]) # close the loop
theta = np.append(theta, theta[0])
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(111, projection="polar")
ax.plot(theta, values, linewidth=2)
ax.fill(theta, values, alpha=0.25)
ax.set_xticks(theta[:-1])
ax.set_xticklabels(categories)
ax.set_title("Player Profile")
Radar charts are controversial — they can distort comparisons because the perceived area is not a meaningful quantity — but they are a common convention in sports, military, and competitive analysis. Use them with awareness of the perceptual issues.
When Polar Is the Wrong Choice
Polar plots are powerful for truly cyclical data but should not be used for linear data. A polar plot of year-over-year revenue growth would wrap the years in a circle for no reason, and the reader would lose the natural left-to-right reading of time. Use polar only when the cyclic structure is meaningful: time of day, day of week, month of year, compass direction, angle.
14.4 Error Bars and Confidence Bands: Making Uncertainty Visible
Chapter 4 established that hiding uncertainty is a form of visualization dishonesty. Real measurements have noise; real estimates have confidence intervals; real forecasts have ranges. Specialized matplotlib methods make uncertainty visible.
Error Bars on Individual Points
fig, ax = plt.subplots(figsize=(10, 6))
x = np.arange(10)
y = np.random.randn(10) + 3
yerr = np.random.rand(10) * 0.5 + 0.2 # different error for each point
ax.errorbar(
x, y,
yerr=yerr,
fmt="o",
color="#1f77b4",
ecolor="gray",
elinewidth=0.8,
capsize=4,
markersize=8,
label="Data with error",
)
ax.legend()
Key parameters of ax.errorbar:
- yerr: error in the y direction. Can be a single number (same error for every point), a 1D array (per-point error), or a 2-element list [lower_errors, upper_errors] for asymmetric errors.
- xerr: error in the x direction (rarely needed).
- fmt: format string combining marker and line style. "o" is circles with no line; "o-" is circles connected by a line.
- ecolor: the color of the error bars (usually a neutral gray to distinguish from the data markers).
- elinewidth: the thickness of the error bar lines.
- capsize: the size of the little caps at the ends of the error bars. 0 removes them; 3-5 is typical.
Asymmetric Error Bars
For confidence intervals that are not symmetric around the estimate:
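# Illustrative values: five points, so each error list also needs five entries
x = np.arange(5)
y = np.array([2.0, 2.5, 3.1, 2.8, 3.4])
lower_errors = [0.2, 0.3, 0.5, 0.1, 0.4]
upper_errors = [0.5, 0.4, 0.3, 0.6, 0.3]
ax.errorbar(x, y, yerr=[lower_errors, upper_errors], fmt="o")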
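Pass a two-row sequence (shape 2 × N): the first row holds the distance below each point and the second row the distance above it. Both rows are non-negative offsets from y, not the interval bounds themselves, and each row needs one entry per data point.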
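Confidence intervals usually arrive as lower and upper bounds, while errorbar expects distances from the point. A minimal sketch of the conversion, using made-up estimates and bounds (`estimate`, `ci_low`, and `ci_high` are hypothetical names and values):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(5)
estimate = np.array([2.0, 3.5, 3.0, 4.2, 5.0])
ci_low = np.array([1.8, 3.0, 2.2, 4.1, 4.4])    # hypothetical lower bounds
ci_high = np.array([2.6, 4.1, 3.3, 5.0, 5.3])   # hypothetical upper bounds

# errorbar wants distances from the point, not the bounds themselves
yerr = np.vstack([estimate - ci_low, ci_high - estimate])  # shape (2, N)

fig, ax = plt.subplots(figsize=(8, 5))
ax.errorbar(x, estimate, yerr=yerr, fmt="o", ecolor="gray", capsize=4)
ax.set_ylabel("Estimate")
```

Passing the bounds directly as yerr is a common mistake: the bars would then extend by the bound's value instead of ending at it.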
Confidence Bands with fill_between
For continuous confidence bands (a range around a smooth function), use ax.fill_between:
fig, ax = plt.subplots(figsize=(10, 5))
x = np.linspace(0, 10, 100)
y = np.sin(x)
lower = y - 0.2
upper = y + 0.2
ax.plot(x, y, color="#1f77b4", linewidth=1.5, label="Estimate")
ax.fill_between(x, lower, upper, alpha=0.2, color="#1f77b4", label="95% CI")
ax.legend()
ax.fill_between(x, lower, upper) fills the vertical region between the lower and upper arrays. Combined with a central line, it produces the "line with shaded confidence band" pattern standard for forecasts, climate reconstructions, and model fits.
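In practice the band usually comes from data rather than a fixed offset. The sketch below assumes you have many simulated or resampled curves (here, 200 synthetic noisy replicates of a sine wave) and derives nested 50% and 95% bands from their percentiles; the replicate count and noise level are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 100)
# 200 hypothetical noisy replicates of the same underlying curve
samples = np.sin(x) + rng.normal(0, 0.3, size=(200, x.size))

center = np.median(samples, axis=0)
lo95, lo50, hi50, hi95 = np.percentile(samples, [2.5, 25, 75, 97.5], axis=0)

fig, ax = plt.subplots(figsize=(10, 5))
# Wider band first and fainter, narrower band on top and darker
ax.fill_between(x, lo95, hi95, color="#1f77b4", alpha=0.15, label="95% interval")
ax.fill_between(x, lo50, hi50, color="#1f77b4", alpha=0.35, label="50% interval")
ax.plot(x, center, color="#1f77b4", linewidth=1.5, label="Median")
ax.legend()
```

Layering two bands with different alpha values shows both the likely range and the plausible extremes without adding a second color.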
Highlighting Regions with fill_between and where
The where parameter lets you fill only certain regions — for example, coloring positive and negative values differently:
fig, ax = plt.subplots(figsize=(12, 4))
x = np.linspace(0, 20, 1000)
y = np.sin(x) + 0.2 * np.sin(3 * x)
ax.plot(x, y, color="black", linewidth=0.8)
ax.fill_between(x, y, 0, where=(y >= 0), color="#d62728", alpha=0.6, label="Positive")
ax.fill_between(x, y, 0, where=(y < 0), color="#1f77b4", alpha=0.6, label="Negative")
ax.axhline(0, color="black", linewidth=0.5)
ax.legend()
The where=(y >= 0) parameter creates a boolean mask and fills only the regions where the mask is True. This is the standard technique for warming-stripes-style displays and for highlighting above/below threshold regions.
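With densely sampled data the mask works as is, but with coarse sampling the fill can stop short of the point where the curve crosses the threshold. fill_between has an interpolate option for that case; the sketch below uses a deliberately sparse x array to make the effect matter.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 20, 25)   # deliberately coarse sampling
y = np.sin(x)

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(x, y, color="black", linewidth=0.8)
# interpolate=True extends each fill to the computed zero crossing
ax.fill_between(x, y, 0, where=(y >= 0), interpolate=True,
                color="#d62728", alpha=0.6, label="Positive")
ax.fill_between(x, y, 0, where=(y < 0), interpolate=True,
                color="#1f77b4", alpha=0.6, label="Negative")
ax.axhline(0, color="black", linewidth=0.5)
ax.legend()
```

Removing interpolate=True from both calls should leave small unfilled wedges next to each crossing.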
14.5 Vector Fields: quiver and streamplot
For data that has both a magnitude and a direction at every point — wind, flow, gradient, force fields — matplotlib provides ax.quiver and ax.streamplot.
Quiver Plots
# Create a grid and a vector field
x = np.linspace(-2, 2, 10)
y = np.linspace(-2, 2, 10)
X, Y = np.meshgrid(x, y)
U = -Y # x-component of the vector
V = X # y-component of the vector
fig, ax = plt.subplots(figsize=(8, 8))
ax.quiver(X, Y, U, V, color="#1f77b4")
ax.set_aspect("equal")
ax.set_title("Rotational Vector Field")
ax.quiver(X, Y, U, V) draws an arrow at each (X, Y) point with horizontal component U and vertical component V. The arrow length is proportional to the vector magnitude, and the direction is the vector direction. This is the standard way to visualize wind fields, gradient fields, and other directional data.
Streamplots
ax.streamplot is similar but draws continuous streamlines instead of individual arrows, which is often more readable for complex flow patterns:
fig, ax = plt.subplots(figsize=(8, 8))
ax.streamplot(X, Y, U, V, density=1.5, color="#1f77b4")
ax.set_aspect("equal")
ax.set_title("Streamlines of the Same Field")
The density parameter controls how many streamlines are drawn. Higher density = more streamlines = denser visualization.
Both quiver and streamplot are specialized for vector data. You will use them rarely, but when you need them, there is no good substitute.
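Both functions can also encode magnitude through color, which helps when arrow lengths alone are hard to compare. This is a sketch using the same rotational field; the `speed` array and the linewidth scaling are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-2, 2, 15)
y = np.linspace(-2, 2, 15)
X, Y = np.meshgrid(x, y)
U, V = -Y, X
speed = np.hypot(U, V)   # vector magnitude at each grid point

fig, (ax_q, ax_s) = plt.subplots(1, 2, figsize=(12, 6))

# quiver: a fifth positional array colors the arrows through a colormap
q = ax_q.quiver(X, Y, U, V, speed, cmap="viridis")
fig.colorbar(q, ax=ax_q, label="Speed")
ax_q.set_aspect("equal")
ax_q.set_title("quiver colored by speed")

# streamplot: color and linewidth can both be arrays shaped like U and V
lw = 0.5 + 2.5 * speed / speed.max()
strm = ax_s.streamplot(X, Y, U, V, color=speed, linewidth=lw,
                       cmap="viridis", density=1.2)
fig.colorbar(strm.lines, ax=ax_s, label="Speed")
ax_s.set_aspect("equal")
ax_s.set_title("streamplot colored by speed")
```

Note that streamplot expects an evenly spaced grid, which np.linspace with np.meshgrid provides.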
14.6 Less-Common but Useful Chart Types
A few additional specialized chart types are worth knowing about, even though they appear less often in practice.
Violin Plots
ax.violinplot shows distribution shapes — similar to box plots but with the full distribution visualized as a symmetric shape around a central axis:
fig, ax = plt.subplots(figsize=(8, 5))
data = [np.random.normal(0, 1, 100), np.random.normal(2, 1, 100), np.random.normal(0, 2, 100)]
ax.violinplot(data, showmeans=True, showmedians=True)
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(["Group A", "Group B", "Group C"])
Violin plots are useful when you want to see the full shape of the distribution (not just summary statistics) while comparing across groups. They are richer than box plots but more complex to read. seaborn's sns.violinplot has a simpler API.
Stem Plots
ax.stem draws vertical stems from a baseline to each data point, with a marker at the top:
x = np.arange(20)
y = np.random.randn(20)
fig, ax = plt.subplots(figsize=(10, 5))
ax.stem(x, y)
Stem plots are a discrete alternative to line charts — they emphasize individual data points rather than the continuity between them. Useful for signal processing, time-series with distinct events, or any context where each data point is a standalone value.
Stack Plots
ax.stackplot is like a stacked area chart — showing how multiple series stack up to a total over time:
years = np.arange(2015, 2025)
category_a = np.random.rand(10) * 10 + 20
category_b = np.random.rand(10) * 10 + 15
category_c = np.random.rand(10) * 10 + 10
fig, ax = plt.subplots(figsize=(10, 5))
ax.stackplot(years, category_a, category_b, category_c, labels=["A", "B", "C"], alpha=0.8)
ax.legend(loc="upper left")
ax.set_title("Stacked Components Over Time")
Stack plots show composition and totals simultaneously. The same caveats apply as to stacked bars: the first (bottom) series is easy to read, but the middle and top series are hard to compare because they do not share a common baseline.
14.7 Specialized Charts and the Chapter 5 Framework
Chapter 5 introduced the chart selection matrix mapping question types to chart types. The specialized types in this chapter extend that framework. For each question type, here are the standard choices plus the specialized alternatives.
Comparison across categories: bar chart is still the default. Specialized alternatives include lollipop charts (stem plots), radar charts (polar) for multi-dimensional comparison, and heatmaps for two-category comparisons.
Distribution of one variable: histogram and box plot from Chapter 11 are standard. Specialized alternatives include violin plots (show full shape), density plots (smoother than histograms), and rug plots (show individual observations). For multi-variable distributions, use 2D density plots via contourf or hexbin.
Relationship between two variables: scatter plot from Chapter 11 is standard. Specialized alternatives include heatmaps with both dimensions binned (useful for very large datasets where individual points overplot), contour density plots (same data but smoothed), and 2D histograms.
Change over time: line chart is standard. Specialized alternatives include area charts for emphasis (fill_between), stacked area charts for composition over time, and warming-stripes-style bar charts for simple year-by-year categorical displays.
Composition: stacked bar and pie charts from Chapter 11 are standard. Specialized alternatives include stackplot for time-indexed composition, treemap (not built into matplotlib; see the squarify package), and sunburst charts.
Spatial pattern: choropleth maps (covered in Chapter 23). Contour maps (topographic-style) for continuous spatial fields. Hexbin maps for spatial density.
Cyclical pattern: polar plots, wind roses, radial clocks. These have no standard non-specialized alternative — polar is the native form for cyclical data.
The overall pattern: specialized chart types do not replace the Chapter 11 essentials; they extend the vocabulary for specific data shapes that the essentials cannot handle gracefully. When you are picking a chart type, start with the Chapter 5 matrix for the general question. Then check whether a specialized type better fits the specific shape of your data. If yes, use the specialized type; if no, stick with the essentials.
14.8 Heatmaps and Colormaps: A Practical Guide
Chapter 3 established the theory of color palettes; this section applies that theory to the specific decision of "which colormap should I use for my heatmap?"
The Sequential-Diverging-Qualitative Decision
Before choosing a specific colormap, classify your data:
Sequential data has a natural low-to-high ordering with no meaningful midpoint. Examples: count of events, density, population size, temperature (when not compared to a baseline), rainfall, concentration. Use a sequential colormap — darker means more.
Diverging data has a meaningful midpoint with values extending in both directions. Examples: temperature anomaly (positive and negative from baseline), correlation coefficient (between -1 and +1), profit and loss, rate change relative to last year. Use a diverging colormap — the midpoint is neutral, and the two sides are different hues.
Qualitative data has distinct categories with no inherent order. Examples: product types, country names, experimental conditions. You usually do not use a continuous colormap for qualitative data; you use a categorical palette where each category gets its own distinct color.
Specific Colormap Recommendations
Sequential colormaps:
- "viridis" — matplotlib's modern default. Perceptually uniform, colorblind-safe, prints well in grayscale. Use this when you have no other preference.
- "plasma" — similar to viridis but with more red/purple. Good for warmer aesthetic.
- "cividis" — designed for extreme colorblind safety. Slightly less vivid than viridis.
- "Blues", "Greens", "Oranges", "Greys" — single-hue sequential palettes from ColorBrewer. Good when you want a specific color identity.
- "YlOrRd", "BuPu" — multi-hue sequential palettes from ColorBrewer. More variety than single-hue but still sequential.
Diverging colormaps:
- "RdBu_r" — red to blue (reversed so warm is red). Classic for anomaly data.
- "coolwarm" — cool to warm with smooth transitions. Similar to RdBu_r but with different hue choices.
- "BrBG", "PiYG" — ColorBrewer diverging palettes with different hue pairs for non-temperature data.
- "PuOr" — purple to orange, colorblind-safer than red/blue for some types of color blindness.
Qualitative colormaps (for categorical data):
- "tab10", "tab20" — matplotlib default qualitative palettes.
- "Set2", "Set3" — ColorBrewer qualitative palettes.
- "Paired" — colors in pairs, useful when categories come in related groups.
Avoid These
- "jet": the legacy rainbow colormap. Perceptually non-uniform (the green region looks the same at multiple values), not colorblind-safe, and misleading for sequential data. Do not use.
- "hsv": full rainbow. Same problems as jet.
- "rainbow": another variant of the same broken concept.
- Any colormap you picked because it "looked nice" without checking perceptual uniformity. If the colormap is not on matplotlib's "Perceptually Uniform Sequential" or "Diverging" lists, check its luminance profile before committing.
The Grayscale Test
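Chapter 3 introduced the grayscale test: take your colormap, convert it to grayscale, and check whether it still reads correctly. Sequential colormaps should go from light to dark (or dark to light) monotonically, with no reversals. Diverging colormaps should change monotonically from each extreme toward the midpoint; for RdBu_r and similar maps the midpoint is the lightest value and both extremes are dark, so in grayscale the magnitude of a value stays readable even though its sign does not.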
You can apply the grayscale test in matplotlib by converting the chart to grayscale after rendering, or by looking at the colormap's "L" channel in a color picker. For serious publication work, always run the test. For casual work, trust that viridis and RdBu_r are well-tested and move on.
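A rough version of the test can also be scripted. The sketch below samples a colormap, converts each color to an approximate brightness, and checks whether that brightness ever reverses direction. The helper names are hypothetical, and the luma weights are a quick proxy rather than a true perceptual lightness, so treat a borderline result as a prompt to look more closely.

```python
import numpy as np
import matplotlib.pyplot as plt

def approx_lightness(cmap_name, n=256):
    """Approximate grayscale brightness (0 to 1) along a colormap."""
    rgba = plt.get_cmap(cmap_name)(np.linspace(0, 1, n))
    # Rec. 601 luma weights: a quick proxy, not a true perceptual L* value
    return rgba[:, :3] @ np.array([0.299, 0.587, 0.114])

def is_monotonic(values):
    """True if the values never reverse direction."""
    steps = np.diff(values)
    return bool(np.all(steps >= 0) or np.all(steps <= 0))

for name in ["viridis", "Greys", "jet"]:
    print(name, "monotonic:", is_monotonic(approx_lightness(name)))
```

A sequential colormap that fails this check deserves a closer look; diverging colormaps are expected to fail it as a whole, so check each half separately.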
Custom Colormaps for Brand or Publication
If you need a specific brand color in your heatmap:
from matplotlib.colors import LinearSegmentedColormap
# A custom sequential from white to a brand blue
brand_cmap = LinearSegmentedColormap.from_list("brand_blue", ["white", "#0066cc"])
ax.imshow(data, cmap=brand_cmap)
LinearSegmentedColormap.from_list creates a smooth gradient between the specified colors. For a diverging palette, specify three colors: ["#cc0000", "white", "#0066cc"].
For multi-color gradients:
brand_diverging = LinearSegmentedColormap.from_list(
"brand_div",
["#8b0000", "#cc5500", "#ffffff", "#0066cc", "#003366"],
)
The resulting colormap can be used anywhere a cmap name can be used: ax.imshow(data, cmap=brand_diverging).
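To confirm that a custom colormap does what you intended, you can sample it at its ends and middle before plotting. This sketch builds the three-color diverging example, checks the sampled colors, and then uses it with symmetric limits; the name "brand_div_demo" and the random data are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap, to_rgba

brand_div = LinearSegmentedColormap.from_list(
    "brand_div_demo", ["#cc0000", "white", "#0066cc"]
)

# Calling a colormap with a number in [0, 1] returns an RGBA tuple
low, mid, high = brand_div(0.0), brand_div(0.5), brand_div(1.0)
print(low, mid, high)

rng = np.random.default_rng(3)
data = rng.standard_normal((6, 8))
abs_max = np.abs(data).max()

fig, ax = plt.subplots(figsize=(8, 5))
# Symmetric limits keep the white midpoint on zero
im = ax.imshow(data, cmap=brand_div, vmin=-abs_max, vmax=abs_max, aspect="auto")
fig.colorbar(im, ax=ax, label="Value")
```

Note that with this color order the low end is red and the high end is blue; reverse the list if warm colors should mean high values.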
14.8 Common Pitfalls with Specialized Charts
Each specialized chart type has its own failure modes. This section catalogs the most common ones and their fixes.
Heatmap Pitfalls
Wrong colormap for the data type. Using a sequential colormap for diverging data hides the midpoint. Using a diverging colormap for sequential data wastes the "neutral" color on a meaningless region. Fix: match the colormap type to the data (sequential for ordered, diverging for around-a-midpoint, qualitative for categorical).
Asymmetric vmin/vmax on diverging data. If your data ranges from -1 to +3 and you let matplotlib autoscale the colorbar, the neutral color (white for RdBu_r) ends up at +1, not at zero. The eye reads the wrong value as "neutral." Fix: always set symmetric vmin and vmax for diverging colormaps: vmin=-abs_max, vmax=abs_max.
Interpolation blurring discrete cells. Default matplotlib interpolation can blur the edges of heatmap cells, making them look fuzzy. Fix: set interpolation="nearest" in the imshow call to force sharp cell boundaries.
Too many cells to annotate. Annotating a 50×50 heatmap produces unreadable text. Fix: do not annotate large heatmaps. The color pattern is the signal; numbers are too small to read at that scale.
Wrong aspect ratio. Default aspect="equal" makes cells square regardless of the row/column counts, which can stretch the heatmap into odd shapes. Fix: use aspect="auto" for tabular data where the cell shape should adapt to the Axes.
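The asymmetric-limits pitfall is easy to verify numerically, because a norm maps each data value to a position between 0 and 1 on the colormap, with 0.5 being the neutral middle. The sketch below uses a small made-up array spanning -1 to +3 and compares autoscaled limits, symmetric limits, and matplotlib's TwoSlopeNorm as an alternative that pins the center without forcing a symmetric range.

```python
import numpy as np
from matplotlib.colors import Normalize, TwoSlopeNorm

data = np.array([[-1.0, 0.0, 1.0], [2.0, 3.0, 0.5]])  # made-up values
abs_max = np.abs(data).max()

autoscaled = Normalize(vmin=data.min(), vmax=data.max())
symmetric = Normalize(vmin=-abs_max, vmax=abs_max)
two_slope = TwoSlopeNorm(vmin=data.min(), vcenter=0.0, vmax=data.max())

# 0.5 is the middle of the colormap, i.e. the neutral color
print(float(autoscaled(0.0)))  # 0.25: zero is drawn well off-center
print(float(autoscaled(1.0)))  # 0.5: the neutral color lands on +1
print(float(symmetric(0.0)))   # 0.5
print(float(two_slope(0.0)))   # 0.5
```

Either fix can be passed to imshow: vmin and vmax directly, or norm=two_slope in place of them. TwoSlopeNorm uses the full color range on both sides, at the cost of a different color scale above and below zero.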
Contour Plot Pitfalls
Too many levels. Closely packed contour lines are hard to read, and the levels matplotlib chooses automatically are not always right for your data. Fix: specify levels=N explicitly (typically 10-20 for filled contours, 5-10 for line contours).
Missing colorbar. Filled contour plots without a colorbar leave the reader unable to read values. Fix: always add a colorbar to contourf plots.
Sparse data. Contour plots assume smooth underlying functions. If your data is noisy or sparse, contours can produce misleading shapes. Fix: smooth the data first (e.g., with scipy.ndimage.gaussian_filter) or use a heatmap instead.
Polar Plot Pitfalls
Wrong zero location. The default set_theta_zero_location is "E" (East / right), which feels wrong for compass or clock-style data. Fix: explicitly set ax.set_theta_zero_location("N") for compass and clock data.
Wrong direction. The default is counterclockwise, which is wrong for clock-style data. Fix: ax.set_theta_direction(-1) for clockwise.
Area distortion in radar charts. A radar chart's apparent "area" is not a meaningful quantity — it depends on the specific choice of categories and their order. Fix: be aware of the distortion and do not interpret area; use radar charts as general shape indicators, not precise measurements.
Linear data in polar form. Polar plots are for cyclical data only. Using them for non-cyclical data (year-over-year growth, age, etc.) wraps the data inappropriately. Fix: use a Cartesian plot instead.
Error Bar Pitfalls
Overlapping error bars. In dense scatter plots with error bars, the bars can overlap and clutter the display. Fix: reduce capsize or elinewidth, use alpha on error bars, or use confidence bands (fill_between) instead.
Asymmetric errors drawn as symmetric. If your errors are asymmetric (e.g., a log-normal distribution) but you pass a single yerr array, the bars will be symmetric, misrepresenting the data. Fix: pass yerr=[lower_errors, upper_errors] as a 2-element list for asymmetric errors.
Missing capsize. Default errorbar has no caps, so the error bar ends blend with the data markers and are hard to see. Fix: add capsize=3 or capsize=5 for visible caps.
Confidence bands without the central line. A fill_between without the central estimate is ambiguous — the reader does not know where the actual estimate is. Fix: always plot the central line along with the band.
General Pitfalls
Wrong chart type for the question. The most common pitfall is using a heatmap when a contour plot would be clearer (or vice versa), or using a polar plot for non-cyclical data, or using error bars when a box plot would be more appropriate. Fix: think about the data shape and the question before picking the chart type. The decision guide in the next section helps.
Insufficient annotation. Specialized charts often need more explanatory annotation than standard charts because they are less familiar to readers. Fix: add a subtitle explaining what the chart shows ("Each cell represents a year × month combination"), label the colorbar with units, and include a source attribution.
Inconsistent styling. Specialized charts are easy to produce but easy to leave in default matplotlib style. Fix: apply the same declutter and typography principles from Chapters 6, 7, and 12 to every specialized chart. The style function pattern from Chapter 12 works for these too.
14.8 Choosing the Right Specialized Chart Type
Before writing code, identify which specialized type fits your data and question. This section is a decision guide.
By Data Shape
2D tabular data (rows × columns of numerical values):
- Small table (< 20 rows × 20 cols): annotated heatmap with both colors and numbers.
- Large table: unannotated heatmap; the color pattern is the signal.
- Diverging around a midpoint: diverging colormap (RdBu_r, coolwarm).
- Sequential (non-negative): sequential colormap (viridis, Blues).
Continuous 2D surface (a function defined over a 2D grid):
- Smooth function: contour or contourf.
- Discrete grid: imshow.
- With both levels and density: contourf with a contour overlay.
Cyclical data (time of day, day of week, month, compass direction):
- Distribution of angles: polar bar chart (wind rose).
- Categorical comparison along a cycle: polar bar or polar line.
- Multi-dimensional comparison: radar chart (with awareness of the perceptual caveats).
Data with uncertainty:
- Discrete points with error: errorbar with yerr.
- Continuous function with bounds: fill_between with a central line.
- Multiple error bands: multiple fill_between calls with different alphas.
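The last option, nested bands, might look like the following sketch. The sine curve and the widening standard deviation are invented purely for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
center = np.sin(x)                     # synthetic central estimate
sd = 0.2 + 0.03 * x                    # synthetic uncertainty that widens with x

fig, ax = plt.subplots(figsize=(8, 4))
# Wider band first with a lighter alpha, narrower band on top of it
ax.fill_between(x, center - 2 * sd, center + 2 * sd,
                alpha=0.15, color="C0", label="±2 SD")
ax.fill_between(x, center - sd, center + sd,
                alpha=0.30, color="C0", label="±1 SD")
ax.plot(x, center, color="C0", label="Estimate")
ax.legend()
```

Because the two bands share a color and overlap, the inner region appears darker, which reads naturally as "more likely".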
Directional data (vectors at each point):
- Discrete arrows: quiver.
- Continuous flow: streamplot.
Wide-range data:
- Data spanning many orders of magnitude: log scale via set_yscale("log").
- Data crossing zero with wide range: symlog via set_yscale("symlog").
By Question Type
"Where are the patterns in my 2D data?" → Heatmap or contourf.
"How does a quantity vary smoothly across a 2D plane?" → Contour or contourf.
"What is the distribution of this cyclical variable?" → Polar bar chart.
"How uncertain is my estimate?" → errorbar or fill_between with a central line.
"How does this vector field flow?" → quiver or streamplot.
"How does my exponential data trend?" → Line chart with log y-axis.
These are heuristics, not rules. A specific dataset may fit multiple categories, in which case try each and see which communicates best. The goal is to match the chart type to the data shape so that the natural structure of the data is visible in the visualization.
14.8 Annotated Heatmaps in Detail
Heatmaps with numerical annotations are one of the most information-dense chart types. They appear in correlation matrices, confusion matrices, gene expression plots, and any table where the numerical values matter in addition to the color pattern. This section expands on Section 14.1 with practical details.
The Annotation Loop
import matplotlib.pyplot as plt

def plot_annotated_heatmap(data, row_labels, col_labels, cmap="RdBu_r", figsize=(10, 8)):
    fig, ax = plt.subplots(figsize=figsize)
    im = ax.imshow(data, cmap=cmap, aspect="auto")
    fig.colorbar(im, ax=ax)
    ax.set_xticks(range(len(col_labels)))
    ax.set_xticklabels(col_labels, rotation=45, ha="right")
    ax.set_yticks(range(len(row_labels)))
    ax.set_yticklabels(row_labels)
    # Annotation loop
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            value = data[i, j]
            # Choose text color based on background luminance:
            # look up the cell's actual RGBA color, then estimate its brightness
            r, g, b, _ = im.cmap(im.norm(value))
            luminance = 0.299 * r + 0.587 * g + 0.114 * b
            color = "white" if luminance < 0.5 else "black"
            ax.text(j, i, f"{value:.2f}", ha="center", va="center",
                    fontsize=9, color=color)
    return fig, ax
The im.norm(value) call converts the data value to its normalized position on the colormap (0 to 1), and im.cmap(...) turns that position into the RGBA color actually drawn in the cell. The weighted sum of the red, green, and blue channels approximates perceived luminance: dark cells (luminance below 0.5) get white text, and light cells get black text. Working from the cell color rather than from the data value keeps the annotations readable with diverging colormaps such as RdBu_r, where both extremes are dark, and with colormaps such as viridis that run from dark to light.
Correlation Matrix Example
import numpy as np
# Create a correlation matrix from random data
data = np.random.randn(100, 6)
corr = np.corrcoef(data.T)
variables = ["Age", "Income", "Education", "Health", "Happiness", "Mobility"]
fig, ax = plot_annotated_heatmap(corr, variables, variables, cmap="RdBu_r")
ax.set_title("Correlation Matrix")
# For correlation matrices, set vmin/vmax symmetrically
im = ax.get_images()[0]
im.set_clim(-1, 1)
For correlation matrices, you always want vmin=-1, vmax=1 because correlations are bounded. The diverging colormap (RdBu_r) naturally handles the meaningful midpoint (zero correlation = white). The diagonal will always be 1.0 (perfect self-correlation), so it will show as the darkest red cells.
Confusion Matrix Example
from sklearn.metrics import confusion_matrix
y_true = [0, 1, 2, 0, 1, 2, 0, 1, 2] * 10
y_pred = [0, 1, 1, 0, 1, 2, 0, 2, 2] * 10
cm = confusion_matrix(y_true, y_pred)
classes = ["Class A", "Class B", "Class C"]
fig, ax = plt.subplots(figsize=(7, 6))
im = ax.imshow(cm, cmap="Blues", aspect="auto")
fig.colorbar(im, ax=ax)
ax.set_xticks(range(len(classes)))
ax.set_xticklabels(classes)
ax.set_yticks(range(len(classes)))
ax.set_yticklabels(classes)
ax.set_xlabel("Predicted")
ax.set_ylabel("True")
for i in range(cm.shape[0]):
    for j in range(cm.shape[1]):
        ax.text(j, i, str(cm[i, j]), ha="center", va="center",
                color="white" if cm[i, j] > cm.max() / 2 else "black",
                fontsize=12)
ax.set_title("Confusion Matrix")
For confusion matrices, a sequential colormap (Blues, Greens, Greys) is appropriate because the counts are non-negative. Darker cells represent higher counts. Annotating the exact counts is essential because the eye cannot distinguish nuances of color as precisely as it can read numbers.
14.8 Colorbar Customization
When a chart uses a colormap (heatmaps, scatters with color encoding, contour plots), the colorbar is how the reader decodes the color-to-value mapping. Default colorbars work but look generic; customization produces a more polished result.
Colorbar Basics
fig, ax = plt.subplots(figsize=(8, 6))
im = ax.imshow(data, cmap="viridis")
cbar = fig.colorbar(im, ax=ax)
The fig.colorbar(im, ax=ax) call attaches a colorbar to the plot. The first argument is the object returned by imshow, pcolormesh, contourf, or scatter; each of these is a ScalarMappable, which carries the colormap and normalization the colorbar needs. The ax=ax argument tells matplotlib which Axes the colorbar is attached to.
Colorbar Orientation and Position
cbar = fig.colorbar(im, ax=ax, orientation="horizontal", location="bottom")
orientation="horizontal" draws the colorbar horizontally, and location="bottom" places it below the parent Axes. Because location also determines the orientation, passing location="bottom" alone has the same effect. The defaults are vertical and to the right of the plot.
Colorbar Label and Ticks
cbar = fig.colorbar(im, ax=ax)
cbar.set_label("Temperature Anomaly (°C)", fontsize=11)
cbar.set_ticks([-2, -1, 0, 1, 2])
cbar.ax.tick_params(labelsize=9)
The set_label method adds a title to the colorbar. set_ticks sets specific tick positions (useful when you want evenly-spaced or meaningful values). cbar.ax.tick_params configures the colorbar's own tick labels.
Shrinking and Padding
cbar = fig.colorbar(im, ax=ax, shrink=0.8, pad=0.02)
shrink=0.8 makes the colorbar 80% of the Axes height, which looks less dominating than a full-height bar. pad=0.02 reduces the space between the Axes and the colorbar.
Extend Arrows
For data that is clipped at the colormap extremes (values below vmin or above vmax), add arrows at the ends of the colorbar to indicate the clipping:
cbar = fig.colorbar(im, ax=ax, extend="both")
extend="both" adds arrows at both ends. extend="min" or "max" adds only one. The arrows tell the reader that some data points are below/above the displayed range.
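As a minimal sketch of the clipping case, the example below draws synthetic values with a spread wide enough that some fall outside the chosen limits of ±2. The data, the limits, and the label text are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
data = rng.normal(0, 2, size=(12, 12))   # synthetic values, some beyond ±2

fig, ax = plt.subplots(figsize=(6, 5))
# vmin/vmax clip the color mapping, so values outside ±2 saturate
im = ax.imshow(data, cmap="RdBu_r", vmin=-2, vmax=2)
# extend="both" draws pointed ends on the colorbar to signal the saturation
cbar = fig.colorbar(im, ax=ax, extend="both")
cbar.set_label("Synthetic anomaly")
```

Without extend, a reader would have no way to tell that the most saturated cells may represent values well beyond the labeled range.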
Colorbar for a Specific Range
To pin the colormap, and therefore the colorbar, to a specific data range:
im = ax.imshow(data, cmap="viridis", vmin=0, vmax=100)
cbar = fig.colorbar(im, ax=ax)
Setting vmin and vmax on the imshow call determines both the displayed color range and the colorbar range. This is how you enforce consistent color mapping across multiple plots that share a colorbar — set the same vmin and vmax on all of them.
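A shared colorbar across two panels could be sketched as follows. The two arrays are synthetic and deliberately cover different ranges; the panel titles and label are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
panel_a = rng.uniform(0, 60, size=(8, 8))     # synthetic data, lower range
panel_b = rng.uniform(20, 100, size=(8, 8))   # synthetic data, higher range

fig, axes = plt.subplots(1, 2, figsize=(10, 4), constrained_layout=True)
# The same vmin/vmax on both panels means a given color encodes the same value
im_a = axes[0].imshow(panel_a, cmap="viridis", vmin=0, vmax=100)
im_b = axes[1].imshow(panel_b, cmap="viridis", vmin=0, vmax=100)
axes[0].set_title("Panel A")
axes[1].set_title("Panel B")
# Passing both Axes lets one colorbar serve the pair
cbar = fig.colorbar(im_b, ax=axes, shrink=0.8)
cbar.set_label("Shared scale")
```

If the limits were left to autoscale, each panel would stretch the colormap over its own range and the same color would mean different values in each.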
14.8 Date and Time Axes
matplotlib has special support for date and time axes, but the default behavior is often not ideal. Knowing the date-handling methods is worth a section.
Basic Date Axis
import pandas as pd
dates = pd.date_range("2020-01-01", "2024-12-31", freq="D")
values = np.cumsum(np.random.randn(len(dates)))
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(dates, values)
matplotlib recognizes the pandas DatetimeIndex and formats the x-axis as dates automatically. The default tick spacing adapts to the data range.
Custom Date Formatting
For better control, use matplotlib.dates:
import matplotlib.dates as mdates
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(dates, values)
# Set tick spacing
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
# Minor ticks at each month
ax.xaxis.set_minor_locator(mdates.MonthLocator())
YearLocator() places major ticks at the start of each year. DateFormatter("%Y") formats the ticks as 4-digit years. MonthLocator() places minor ticks at the start of each month (shown as smaller marks between the year labels).
The strftime-style format strings work the same as in Python's date formatting: %Y for year, %m for month number, %b for month name, %d for day, %H:%M for time, etc.
Auto Date Formatting
AutoDateLocator and ConciseDateFormatter (matplotlib 3.1+) produce reasonable defaults automatically:
locator = mdates.AutoDateLocator()
formatter = mdates.ConciseDateFormatter(locator)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)
The concise formatter avoids redundant year labels and produces cleaner output than the default. Worth using as the default for any date-heavy chart.
Date Axis Pitfalls
Pitfall 1: overlapping date labels. Long date strings overlap on narrow charts. Fix with fig.autofmt_xdate() (auto-rotate) or plt.setp(ax.get_xticklabels(), rotation=45, ha="right").
Pitfall 2: timezone issues. If your dates are timezone-aware but matplotlib is not expecting it, the plot can shift. Convert to UTC or naive datetimes before plotting if you see unexpected offsets.
Pitfall 3: string dates. If your dates are strings like "2024-01-01" rather than actual datetime objects, matplotlib will treat them as categorical and space them evenly regardless of actual date spacing. Convert them with pd.to_datetime before plotting.
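Pitfall 3 is easiest to see with irregularly spaced dates. The sketch below plots the same hypothetical readings twice, once with raw strings and once after parsing; the dates and values are made up for the example.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical readings with irregular gaps between dates
raw_dates = ["2024-01-01", "2024-01-02", "2024-03-15", "2024-09-01"]
readings = [10, 12, 9, 15]

fig, (ax_str, ax_dt) = plt.subplots(1, 2, figsize=(11, 3.5))

# Strings are treated as categories: points land at positions 0, 1, 2, 3
ax_str.plot(raw_dates, readings, marker="o")
ax_str.set_title("String dates (evenly spaced)")

# Parsed datetimes are placed according to the real elapsed time
parsed = pd.to_datetime(raw_dates)
ax_dt.plot(parsed, readings, marker="o")
ax_dt.set_title("Parsed dates (true spacing)")
fig.autofmt_xdate()
```

In the left panel the one-day gap and the multi-month gaps look identical; in the right panel the first two points nearly coincide, which is what the data actually says.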
14.9 Log and Symlog Scales
For data that spans many orders of magnitude, linear scales hide variation at the low end. Log scales compress the range and make exponential trends appear as straight lines.
Basic Log Scale
fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(x, y)
ax.set_yscale("log")
ax.set_yscale("log") changes the y-axis to a base-10 logarithmic scale. Values of 1, 10, 100, 1000 become equally-spaced on the axis. Exponential growth (y = a * exp(kx)) becomes a straight line. Power-law data becomes a straight line when both axes are log.
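The straight-line claim can be checked numerically. Taking the base-10 log of y = a * exp(kx) gives log10(y) = log10(a) + (k / ln 10) x, which is linear in x. The sketch below uses illustrative constants a = 3.0 and k = 0.7 (assumptions, not values from this chapter) and fits a line to the logged data.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 50)
a, k = 3.0, 0.7                       # illustrative constants
y = a * np.exp(k * x)

fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(10, 4))
ax_lin.plot(x, y)
ax_lin.set_title("Linear y-axis")
ax_log.plot(x, y)
ax_log.set_yscale("log")
ax_log.set_title("Log y-axis")

# On a base-10 log axis the line has slope k / ln(10) and intercept log10(a)
slope, intercept = np.polyfit(x, np.log10(y), 1)
```

The fitted slope and intercept recover k / ln(10) and log10(a), so the growth rate can be read directly off the slope of the line on the log-scaled panel.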
Log-Log Plots
ax.set_xscale("log")
ax.set_yscale("log")
Setting both axes to log produces a log-log plot. It is useful for data that follows a power law, for plotting over many orders of magnitude, or for astronomical data where both quantities span huge ranges.
Symlog for Data That Includes Zero and Negatives
Log scales cannot handle zero or negative values (log of zero is undefined). For data that includes these, use symlog — a symmetric log scale that is linear near zero and logarithmic elsewhere:
ax.set_yscale("symlog", linthresh=1)
linthresh is the threshold that bounds the linear region. Within ±1 the axis is linear; beyond ±1 it is logarithmic. This lets you visualize data that crosses zero without losing information.
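A small sketch of symlog in use, assuming a synthetic curve (a hyperbolic sine, chosen only because it crosses zero and grows quickly in both directions):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 201)
y = np.sinh(2 * x)                    # crosses zero, grows rapidly in both directions

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, y)
# Linear between -1 and 1, logarithmic outside that interval
ax.set_yscale("symlog", linthresh=1)
ax.axhline(0, color="gray", linewidth=0.8)
```

A plain log scale would drop the entire negative half of this curve, while a linear scale would flatten everything near zero against the axis.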
Custom Log Tick Formatting
By default, log axes label ticks as powers of ten (10^0, 10^1, 10^2). For reader-friendly labels:
from matplotlib.ticker import ScalarFormatter
ax.yaxis.set_major_formatter(ScalarFormatter())
ax.ticklabel_format(axis="y", style="plain")
This shows "1, 10, 100, 1000" instead of "10^0, 10^1, 10^2, 10^3".
When Log Scales Work (and When They Do Not)
Log scales are right when:
- Data spans 2+ orders of magnitude.
- Growth is exponential or multiplicative (stock prices, bacterial growth, pandemic cases).
- Power laws are the underlying relationship.
Log scales are wrong when:
- Data is contained within a narrow range (log just distorts it).
- The audience is not trained to read log scales (Chapter 5's FT pandemic chart case study argues for log scales in public contexts only when the designer is willing to teach the reader).
- The data includes zero or negative values and you are not using symlog.
14.10 Combining Specialized Charts in Multi-Panel Figures
Specialized chart types often appear alongside standard chart types in a single multi-panel figure. The Chapter 13 layout techniques apply directly.
fig = plt.figure(figsize=(14, 8), constrained_layout=True)
gs = fig.add_gridspec(2, 2)
# Panel A: heatmap
ax_a = fig.add_subplot(gs[0, 0])
im = ax_a.imshow(heatmap_data, cmap="RdBu_r", aspect="auto")
fig.colorbar(im, ax=ax_a)
ax_a.set_title("(a) Monthly Anomalies")
# Panel B: line chart with confidence band
ax_b = fig.add_subplot(gs[0, 1])
ax_b.plot(years, means, color="#1f77b4")
# 95% CI under a normal approximation: mean ± 1.96 × standard error
ax_b.fill_between(years, means - 1.96 * sems, means + 1.96 * sems, alpha=0.2, color="#1f77b4")
ax_b.set_title("(b) Annual Mean with 95% CI")
# Panel C: contour plot
ax_c = fig.add_subplot(gs[1, 0])
cf = ax_c.contourf(X, Y, Z, levels=20, cmap="viridis")
fig.colorbar(cf, ax=ax_c)
ax_c.set_title("(c) 2D Density")
# Panel D: polar plot
ax_d = fig.add_subplot(gs[1, 1], projection="polar")
ax_d.bar(theta, hourly_counts, width=2 * np.pi / 24, alpha=0.7)
ax_d.set_title("(d) Hourly Activity")
Notice that fig.add_subplot(gs[1, 1], projection="polar") creates a polar-projected Axes inside a specific GridSpec cell. You can mix projections across panels in the same figure — some Cartesian, some polar — and they all live within the same GridSpec structure.
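The figure code above refers to several arrays it does not define. To try it end to end, one option is to generate placeholder inputs first. Everything in the sketch below is synthetic and assumed for illustration: the grid sizes, the year range, and the Gaussian surface are not from any real dataset, and only the variable names are taken from the code above.

```python
import numpy as np

rng = np.random.default_rng(14)       # fixed seed for reproducibility

# Panel A: a years × months grid of synthetic anomalies
heatmap_data = rng.normal(0, 1, size=(20, 12))

# Panel B: annual means and their standard errors
years = np.arange(2005, 2025)
means = heatmap_data.mean(axis=1)
sems = heatmap_data.std(axis=1, ddof=1) / np.sqrt(heatmap_data.shape[1])

# Panel C: a smooth surface sampled on a regular grid
xs = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(xs, xs)
Z = np.exp(-(X**2 + Y**2) / 2)

# Panel D: 24 hourly bins mapped to angles around the circle
theta = np.linspace(0, 2 * np.pi, 24, endpoint=False)
hourly_counts = rng.integers(5, 50, size=24)
```

Run this block before the figure code, along with import matplotlib.pyplot as plt, and the four panels should render without further changes.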
Chapter Summary
This chapter extended the Chapter 11 chart vocabulary with specialized types for specific data shapes. Heatmaps (ax.imshow, ax.pcolormesh) visualize 2D tabular data with color. Contour plots (ax.contour, ax.contourf) visualize continuous 2D surfaces with level curves. Polar plots (via projection="polar") visualize cyclical data in its natural circular form. Error bars (ax.errorbar) and confidence bands (ax.fill_between) visualize uncertainty. Vector fields (ax.quiver, ax.streamplot) visualize directional data.
Each specialized type has a specific data shape it serves best. Knowing which type to use for which shape is the skill this chapter builds. The matplotlib syntax varies, but the underlying logic is the same: pick the chart type that matches the data shape, then apply the customization principles from Chapter 12.
Specialized charts often appear alongside standard charts in multi-panel figures, using the GridSpec techniques from Chapter 13. Mixing chart types in the same figure is common and useful when a single story requires multiple perspectives on the same data.
Next in Chapter 15: animation and interactivity. Static charts are the baseline; moving and interactive charts add dimensions that static charts cannot express. The chapter covers matplotlib's FuncAnimation API, exporting to GIF and MP4, and the limitations of matplotlib interactivity compared to dedicated interactive tools.
Spaced Review: Concepts from Chapters 1-13
- Chapter 3: The choice of colormap for a heatmap depends on the data type. Which cmap would you use for a correlation matrix? For a count matrix? For temperature anomalies?
- Chapter 4: Error bars and confidence bands make uncertainty visible. How does this connect to the ethical principles from Chapter 4?
- Chapter 5: When would you choose a heatmap over a grid of small-multiple line charts? What question does each answer better?
- Chapter 8: Polar plots visualize cyclical data. What happens to the Chapter 8 Z-pattern reading order when the panel is polar rather than Cartesian?
- Chapter 12: Specialized chart types still need customization (titles, colorbars, labels). Which Chapter 12 techniques apply directly to heatmaps and contour plots?
- Chapter 13: Multi-panel figures can mix chart types. Why does fig.add_subplot(gs[1, 1], projection="polar") work, and what does it produce?