Case Study 1: The Climate Stripes and the Monthly Anomaly Heatmap

DataField.Dev


Ed Hawkins's warming stripes (covered in Chapter 11 Case Study 2) are a minimalist bar chart of annual temperature anomalies. The next natural step — a heatmap of monthly anomalies — adds a dimension that the stripes cannot show. This case study walks through how a heatmap reveals patterns that a line chart or a bar chart cannot.

The Situation

Chapter 11's Case Study 2 covered Ed Hawkins's 2018 "warming stripes": a horizontal sequence of colored bars, each representing one year's average temperature anomaly, colored from blue (cold) to red (warm). The chart became one of the most widely shared climate visualizations ever. Its power was in its minimalism: no axes, no labels, just a sequence of colors that told the warming story in about a second.

But warming stripes show only annual averages. Each year is one color. The within-year variation — seasonal patterns, specific warm or cold months, the timing of extreme events — is compressed away. For some audiences, this is the right trade-off: the annual signal is what matters, and the compression makes the chart instantly readable. For other audiences, the month-level detail matters: climate scientists want to know whether the warming is uniform across seasons or concentrated in specific months, whether extreme events are clustering in particular parts of the year, whether the annual mean masks significant changes in variability.

For those audiences, a monthly anomaly heatmap is the right chart. Instead of one color per year, it shows 12 colors per year, one for each month, arranged as a grid with years on one axis and months on the other. The chart has 12 cells per year of data (years × 12 months in total, so 1,740 cells for a 145-year record), and the pattern across cells tells a much richer story than the annual-average stripes.

Heatmaps of monthly anomaly data are not new. They appear in climate science papers, IPCC reports, and scientific data journalism. What makes them worth including as a case study here is the direct connection to the warming stripes — they are the "more detailed" version of the same idea, and they use specialized matplotlib methods (ax.imshow with a diverging colormap) that we covered in this chapter.

The Data

The underlying data for a monthly anomaly heatmap is the monthly global temperature anomaly record from NASA GISS, NOAA, or Berkeley Earth. For each year from ~1880 to the present and each month from January to December, the dataset contains a single number: the temperature anomaly for that month relative to a baseline period (often 1951-1980 or 1901-2000).

The data is typically distributed as a CSV or text file with columns year, month, anomaly, or as a wide-form table with years as rows and months as columns. For the heatmap, the wide form is closer to what matplotlib needs — a 2D array with dimensions (years, months).

import pandas as pd
import numpy as np

# Assume the data is in long form
climate = pd.read_csv("monthly_anomalies.csv")  # columns: year, month, anomaly

# Pivot to wide form
pivot = climate.pivot(index="year", columns="month", values="anomaly")

# Now pivot is indexed by year with columns 1-12 for months
# Shape: (num_years, 12)

For this case study, assume you have pivot as a pandas DataFrame of shape (145, 12) covering 1880-2024.
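If the CSV is not at hand, a stand-in with the same shape lets you run the plotting code that follows. The sketch below is only that: the trend and noise values are invented for illustration and are not real temperature data, and the names rng, years, months, and long_form are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real record: NOT actual temperature data.
rng = np.random.default_rng(0)
years = np.arange(1880, 2025)   # 145 years, 1880 through 2024
months = np.arange(1, 13)       # months 1 through 12

# Long form: one row per (year, month), like the CSV described above.
long_form = pd.DataFrame({
    "year": np.repeat(years, len(months)),
    "month": np.tile(months, len(years)),
})

# Invented upward trend plus noise, only to give the grid some structure.
trend = (long_form["year"] - 1965) * 0.01
long_form["anomaly"] = trend + rng.normal(0.0, 0.15, len(long_form))

# Same pivot call as above: years become rows, months become columns.
pivot = long_form.pivot(index="year", columns="month", values="anomaly")
print(pivot.shape)  # (145, 12)
```

Note that pivot raises a ValueError if any (year, month) pair appears more than once, so duplicated rows in a real file need to be resolved before this step.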

The Visualization

The core code for a monthly anomaly heatmap:

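import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(figsize=(12, 10), constrained_layout=True)

# Diverging colormap, symmetric around zero.
# nanmax ignores missing months (for example, an incomplete final year).
abs_max = np.nanmax(np.abs(pivot.values))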
im = ax.imshow(
    pivot.values,
    cmap="RdBu_r",
    aspect="auto",
    vmin=-abs_max,
    vmax=abs_max,
    interpolation="nearest",
)

# Colorbar with label
cbar = fig.colorbar(im, ax=ax, shrink=0.8)
cbar.set_label("Temperature Anomaly (°C)", fontsize=11)

# Month labels on the x-axis (columns)
month_names = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
ax.set_xticks(range(12))
ax.set_xticklabels(month_names)

# Year labels on the y-axis (every 10 years to avoid clutter)
year_ticks = range(0, len(pivot), 10)
ax.set_yticks(year_ticks)
ax.set_yticklabels(pivot.index[year_ticks])

# Title
ax.set_title(
    "Monthly Temperature Anomalies, 1880-2024",
    fontsize=14,
    loc="left",
    fontweight="semibold",
    pad=12,
)

# Axis labels
ax.set_xlabel("Month", fontsize=11)
ax.set_ylabel("Year", fontsize=11)

# Declutter
ax.tick_params(axis="both", labelsize=9)

# Source attribution
fig.text(0.125, 0.02, "Source: NASA GISS", fontsize=8, color="gray", style="italic")

fig.savefig("climate_heatmap.png", dpi=300, bbox_inches="tight")

The call is straightforward: ax.imshow(pivot.values) with a diverging colormap, symmetric vmin/vmax, and the standard matplotlib styling pattern. Every line implements a specific principle from earlier chapters:

cmap="RdBu_r": a diverging palette from Chapter 3, reversed so warm is red.
vmin=-abs_max, vmax=abs_max: symmetric around zero so the neutral midpoint of the colormap aligns with the baseline.
aspect="auto": cells stretch to fill the Axes (appropriate for tabular data, not image-like data).
interpolation="nearest": keeps cell boundaries sharp rather than blurring between them.
Colorbar with label: essential for decoding color to value.
Month and year labels: without them, the axes are meaningless.

What the Heatmap Shows

The monthly anomaly heatmap reveals several patterns that annual-average charts cannot:

1. The warming is across all months. Looking at the heatmap, the blue cells (cool months) are concentrated in the early rows (earlier years), and the red cells (warm months) dominate the later rows. This pattern is visible for every month column, not just for the annual average. The warming is systematic across seasons, not concentrated in a specific time of year.

2. The warming accelerated in specific decades. The bottom rows (most recent years) are almost entirely red for every month. The transition from mixed colors to dominant red happens in the late 20th century, after the 1951-1980 baseline period. The transition is visible in the heatmap as a horizontal "boundary" where the color shifts.

3. Individual extreme months are visible. The 2015-2016 El Niño event shows as a cluster of dark red cells in that region of the heatmap. The 1997-98 El Niño shows similarly. These single-month extremes stand out as individual dark cells; a line chart of annual means would smooth them away.

4. Seasonal asymmetries become visible. Some months (like winter months in the Northern Hemisphere) warm more than others. The heatmap shows this as columns (months) that are more intensely red than other columns. A line chart of annual means cannot show asymmetry between months.

5. The baseline period is visible. Rows in the 1951-1980 range (the baseline) are close to the neutral color, because the baseline is defined as zero. Rows before and after are colored relative to this neutral period. This visual baseline is automatic because of the symmetric vmin/vmax.

Each of these patterns is a story that the heatmap tells and that annual-average warming stripes cannot. The trade-off is complexity: the heatmap takes longer to read, requires labeled axes, and is harder to share on social media. For a climate scientist or a serious reader, the trade-off is worth it. For a viral social media post, warming stripes win.
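The patterns above can also be checked numerically rather than by eye. The following is a minimal sketch on a synthetic stand-in for pivot (invented values, not real data); the two 30-year windows and the choice of five extremes are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for `pivot` (NOT real data): years as rows, months 1-12 as columns.
rng = np.random.default_rng(1)
years = np.arange(1880, 2025)
values = (years[:, None] - 1965) * 0.01 + rng.normal(0.0, 0.15, (len(years), 12))
pivot = pd.DataFrame(
    values,
    index=pd.Index(years, name="year"),
    columns=pd.Index(range(1, 13), name="month"),
)

# Patterns 1 and 4: change per month between an early and a recent 30-year window.
early = pivot.loc[1880:1909].mean()    # label-based slice, both ends included
recent = pivot.loc[1995:2024].mean()
change_by_month = recent - early       # one value per month column
print(change_by_month.round(2))

# Pattern 3: the warmest individual cells, indexed by (month, year).
extremes = pivot.unstack().nlargest(5)
print(extremes)
```

If change_by_month is positive in every column, the warming is present in all months; differences between its entries are the seasonal asymmetry that shows up as more intensely red columns.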

The Design Decisions

Several choices shape the effectiveness of the chart.

The colormap. RdBu_r (reversed Red-Blue) is a diverging palette with red for warm and blue for cold. Some climate scientists prefer other diverging palettes (like BrBG for brown to blue-green, or PuOr for purple-orange) for specific aesthetic or colorblind-safety reasons. RdBu_r is the usual choice here (it is not matplotlib's built-in default colormap) because it matches the intuitive "warm = red, cool = blue" semantics.

The symmetric vmin/vmax. Without this, the colormap midpoint does not align with zero (the baseline), and the reader misinterprets the neutral color. Symmetric scaling is non-negotiable for diverging data.
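An alternative way to get the same alignment is to hand the centering to a norm object instead of computing abs_max yourself. This is a sketch of that option, not the approach used in the code above; it assumes matplotlib 3.4 or later, where matplotlib.colors.CenteredNorm is available, and it uses invented, deliberately lopsided values.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import CenteredNorm

# Synthetic, deliberately lopsided values (NOT real data): roughly -0.5 to +1.5.
rng = np.random.default_rng(2)
data = rng.uniform(-0.5, 1.5, size=(145, 12))

fig, ax = plt.subplots()

# vcenter pins the colormap midpoint to zero; the half-range is taken
# from the data, so the limits come out symmetric about zero.
norm = CenteredNorm(vcenter=0.0)
im = ax.imshow(data, cmap="RdBu_r", norm=norm,
               aspect="auto", interpolation="nearest")
fig.colorbar(im, ax=ax)

print(norm.vmin, norm.vmax)  # equal magnitude, opposite sign
```

When a norm is passed, vmin and vmax should not also be passed to imshow. The explicit vmin=-abs_max, vmax=abs_max form has the advantage that the limit is visible in the code and can be reused across several charts.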

The year labels every 10 years. Showing every year's label would clutter the y-axis unreadably. Showing only every 10th year gives the reader enough anchor points to locate specific years without overwhelming the chart.
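The range(0, len(pivot), 10) ticks land on round decades only because this record starts in 1880. A hedged variant for a record that starts elsewhere is to pick the rows whose year is a multiple of 10; the 1884 start year below is hypothetical, chosen only to show the difference.

```python
import numpy as np
import pandas as pd

# Hypothetical year index that does not start on a round decade.
years = pd.Index(np.arange(1884, 2025), name="year")

# Row positions whose year is a multiple of 10 (the step of 10 is a choice).
tick_positions = np.flatnonzero(years % 10 == 0)
tick_labels = years[tick_positions]

print(tick_positions[:3].tolist())  # [6, 16, 26]
print(tick_labels[:3].tolist())     # [1890, 1900, 1910]
```

These would then be passed as ax.set_yticks(tick_positions) and ax.set_yticklabels(tick_labels). For an index that starts in 1880 the result is the same as the every-10th-row version.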

The aspect ratio. A 12 (months) × 145 (years) grid is tall and narrow at natural cell sizes. Setting aspect="auto" lets matplotlib stretch the cells to fill the Axes, producing a more reasonable shape. Without aspect="auto", the chart would be a long thin vertical strip.

The figure size. figsize=(12, 10) is tall enough to show the year range clearly without squeezing the month columns. For shorter time ranges (50 years, 100 years), a smaller figsize would work.

The colorbar label. "Temperature Anomaly (°C)" tells the reader what the colors mean. Without this, the colorbar shows numbers without context.

The source attribution. Always include it.

Why This Chart Matters

The monthly anomaly heatmap is not as viral as the warming stripes, but it is more informative. It is the form of the chart that appears in scientific papers, that climate researchers use in their own work, and that tells the full warming story rather than just the headline. The stripes are for the public; the heatmap is for the people who want to know more.

The existence of both forms — a minimalist version for one audience and a detailed version for another — is itself a lesson about data visualization. The same underlying data can be shown at different levels of compression, and the right level depends on what the reader is trying to learn. A chart maker who produces only one form leaves some readers unserved. A chart maker who produces both — stripes for viral sharing, heatmap for serious analysis — reaches a wider audience with the right level of detail for each.
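As a sketch of how both forms can come from one table, the code below derives the stripes by averaging each row of pivot and draws them above the monthly grid with a shared color scale. It uses synthetic values (not real data). Transposing the grid so that years run left to right under the stripes is a layout choice made here for alignment, not the orientation used earlier in this case study, and the stripes are a simplified imitation rather than Hawkins's exact design.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for `pivot` (NOT real data): years as rows, months 1-12 as columns.
rng = np.random.default_rng(3)
years = np.arange(1880, 2025)
values = (years[:, None] - 1965) * 0.01 + rng.normal(0.0, 0.15, (len(years), 12))
pivot = pd.DataFrame(values, index=years, columns=range(1, 13))

annual = pivot.mean(axis=1)                     # stripes: one value per year
abs_max = np.nanmax(np.abs(pivot.to_numpy()))   # one symmetric limit for both panels
style = dict(cmap="RdBu_r", vmin=-abs_max, vmax=abs_max,
             aspect="auto", interpolation="nearest")

fig, (ax_stripes, ax_grid) = plt.subplots(
    2, 1, figsize=(10, 6), sharex=True,
    gridspec_kw={"height_ratios": [1, 4]}, constrained_layout=True,
)

# Top panel: a 1 x N image, one colored bar per year.
stripes_im = ax_stripes.imshow(annual.to_numpy()[np.newaxis, :], **style)
ax_stripes.set_yticks([])

# Bottom panel: the monthly grid, transposed so years run left to right.
grid_im = ax_grid.imshow(pivot.to_numpy().T, **style)
fig.colorbar(grid_im, ax=[ax_stripes, ax_grid], label="Temperature Anomaly (°C)")
```

Sharing vmin and vmax between the panels matters: a given color then means the same anomaly in both, so the reader can move between the compressed and the detailed view without re-reading the scale.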

Lessons for Practice

1. Heatmaps reveal patterns that averages hide. Whenever you have 2D tabular data (rows × columns of numbers), consider whether a heatmap would show patterns that a summary statistic or a line chart would miss. For climate data, the heatmap shows within-year patterns that annual means hide. For business data, a heatmap of weekly revenue by category might show patterns that monthly totals hide.

2. Diverging colormaps require symmetric vmin/vmax. This is the single most important rule for heatmaps of diverging data. Get this wrong and the chart is misleading.

3. Label the axes carefully. Heatmaps are dense with information and need clear axis labels, tick labels at reasonable spacing, and a colorbar with a clear label. Unlabeled heatmaps are almost useless because the color pattern alone does not tell the reader what the cells represent.

4. Consider the trade-off between precision and memorability. The warming stripes are more memorable but less precise. The monthly heatmap is more precise but less memorable. Know which audience and purpose each serves, and do not try to force one form to do the other's job.

5. The same data can support multiple chart types. Climate anomaly data supports line charts (annual mean over time), stripes (annual mean as colored bars), heatmaps (monthly detail), and many others. The right chart depends on the question. The Chapter 5 framework applies — start with the question, not the chart type.

Discussion Questions

On the warming stripes vs. heatmap trade-off. Which form would you show to a general public audience, a policy maker, a climate scientist, and a child? Justify each choice.
On the diverging colormap. RdBu_r is standard, but some scientists argue that a sequential palette (only warming colors) is more honest because it does not imply a symmetric scale when the data is asymmetric. What do you think?
On the visible baseline. The 1951-1980 baseline period appears as a neutral-colored band in the heatmap. Is this baseline visible enough in the chart for readers who do not know what it is? Should it be annotated explicitly?
On seasonal asymmetry. The heatmap can show that some months warm more than others. How would you annotate this finding on the chart itself? What specific annotation would add the most value?
On El Niño and extreme events. The 2015-16 and 1997-98 El Niño events appear as dark clusters. Should these be annotated directly on the heatmap as context markers, or left for the reader to discover?
On the two-form strategy. The case study argues that producing both the warming stripes and the monthly heatmap is better than producing just one. Is this "two charts for two audiences" strategy always appropriate, or only for specific high-stakes topics?

Heatmaps of 2D temporal data are a specific but important specialization of the chart vocabulary. The monthly climate anomaly heatmap is the canonical example: more detailed than the warming stripes, more familiar than custom flow maps, and directly supported by matplotlib's ax.imshow method. Every pattern in this chart is visible because of specific design decisions: diverging colormap, symmetric scaling, clear labeling. Apply the same decisions to your own 2D data and the results will be equally legible.