Case Study 2: GridSpec in Scientific Publications

Scientific papers rely on a specific figure format: complex multi-panel figures, one or two of which often carry most of the paper's findings. These figures are read slowly, studied carefully, and cited for years. Building them is one of the most common uses of matplotlib GridSpec in professional practice. This case study walks through the pattern.


The Situation

A scientific paper in a journal like Nature, Science, PNAS, or PLOS ONE typically includes 4-10 figures. Most of those figures are multi-panel — a single figure with several related sub-panels labeled (a), (b), (c), (d) that together tell one aspect of the research story. The multi-panel format lets authors pack a lot of information into a small number of figures, which matters because journals often limit figure counts and because readers benefit from seeing related findings together.

A typical multi-panel scientific figure might have:

  • Panel (a): The main result (a scatter plot, a line chart, or a heatmap showing the key finding).
  • Panel (b): A subset analysis (the same finding broken down by group).
  • Panel (c): A statistical test (a box plot or violin plot showing the distribution).
  • Panel (d): A visualization of the method (a diagram of the experimental setup or the analytical pipeline).

Each panel is a distinct chart with its own axes, title, and labels. But they share a caption and are referenced as "Figure 1a," "Figure 1b," and so on. The reader is expected to view the whole figure at once and understand how the panels connect.

Scientific figures are produced almost exclusively in matplotlib (in the Python world) or ggplot2 (in R). The multi-panel layout is usually built with GridSpec or plt.subplots. The specific constraints of scientific publication — single-column vs. double-column widths, journal-specific style guides, black-and-white compatibility requirements — shape the design in ways that general-purpose data journalism does not have to worry about.
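As a minimal sketch of the two layout routes just mentioned, the snippet below builds one figure with plt.subplots and one with GridSpec. The spanning top panel in the second figure is an illustrative assumption, not part of this case study's figure; it is there only to show what GridSpec adds over plt.subplots.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Option 1: plt.subplots, convenient when every panel is the same size.
fig1, axes = plt.subplots(2, 2, figsize=(8, 6), constrained_layout=True)

# Option 2: GridSpec, which also lets one panel span several grid cells.
fig2 = plt.figure(figsize=(8, 6), constrained_layout=True)
gs = fig2.add_gridspec(2, 2)
ax_top = fig2.add_subplot(gs[0, :])    # one wide panel across the top row
ax_left = fig2.add_subplot(gs[1, 0])   # bottom-left panel
ax_right = fig2.add_subplot(gs[1, 1])  # bottom-right panel
```

For a plain 2×2 grid either route works; GridSpec becomes the natural choice once panels need unequal sizes or spans.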

This case study examines the scientific multi-panel figure as a specific use case, walks through how to build one in matplotlib, and identifies the conventions that scientific figures follow.

The Pattern

Scientific multi-panel figures follow a consistent structural pattern:

1. Panel labels (a), (b), (c), (d). Each panel gets a letter label in the upper-left corner, typically in bold. Readers can reference "Figure 2c" and know exactly which panel is being discussed.

2. Small, dense fonts. Scientific figures are often printed at one-third or one-half of a page, so font sizes are small (8-10 pt for most text) and information density is high.

3. Grayscale-compatible design. Many readers still print papers in black and white. Charts should be interpretable without color — using line styles, marker shapes, or patterns as well as color for categorical encoding.

4. Error bars and confidence intervals. Scientific figures almost always show uncertainty. Estimates such as means and effect sizes are accompanied by some indicator of their variability.

5. Statistical annotations. p-values, asterisks indicating significance levels (*p < 0.05, **p < 0.01, ***p < 0.001), and sometimes test statistics are annotated directly on the charts.

6. Captions that describe the method. The figure caption describes exactly what each panel shows, what statistical test was used, how many data points are included, and any specific notes. The caption does the work that axis labels alone cannot do.

7. Consistency within the figure. All panels in a figure typically use the same color scheme, the same font sizes, and the same style conventions. This creates visual unity across the panels.
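The asterisk convention in item 5 can be sketched as a small helper. The function name significance_stars and the "n.s." label for non-significant results are assumptions made for illustration; the thresholds are the ones listed above.

```python
def significance_stars(p_value):
    """Map a p-value to the asterisk notation described in item 5."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "n.s."  # assumed label for "not significant"; conventions vary


# Example p-values chosen for illustration only.
for p in (0.0004, 0.003, 0.03, 0.2):
    print(p, significance_stars(p))
```

Deriving the annotation from the computed p-value, rather than typing the asterisks by hand, keeps the figure in step with the analysis if the data change.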

The Data

For this case study, imagine a hypothetical paper analyzing the effect of a new drug on cognitive performance in mice. The data includes:

  • Control group: 40 mice, measured before and after a placebo.
  • Treatment group: 40 mice, measured before and after the drug.
  • Time series: Cognitive scores measured weekly for 8 weeks.
  • Subgroup analysis: Male vs. female mice, old vs. young mice.

The paper's main figure needs to show: (a) the main effect as a box plot, (b) the time course as a line chart with error bars, (c) the subgroup analysis as a forest plot, and (d) the distribution of individual changes as a histogram.

The Code

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)

# Synthetic data
n_mice = 40
control_before = np.random.normal(10, 2, n_mice)
control_after = control_before + np.random.normal(0.1, 0.5, n_mice)
treatment_before = np.random.normal(10, 2, n_mice)
treatment_after = treatment_before + np.random.normal(2.0, 1.0, n_mice)

weeks = np.arange(1, 9)
control_time = 10 + np.cumsum(np.random.randn(len(weeks)) * 0.1)
treatment_time = 10 + np.cumsum(np.random.randn(len(weeks)) * 0.1 + 0.3)

# Set publication-friendly rcParams
plt.rcParams.update({
    "font.size": 9,
    "axes.titlesize": 10,
    "axes.labelsize": 9,
    "xtick.labelsize": 8,
    "ytick.labelsize": 8,
    "legend.fontsize": 8,
    "figure.titlesize": 11,
    "pdf.fonttype": 42,  # embed fonts as TrueType in the saved PDF
})

# Create the figure with GridSpec
fig = plt.figure(figsize=(8, 6), constrained_layout=True)
gs = fig.add_gridspec(2, 2, hspace=0.35, wspace=0.3)

# Panel (a): Box plot of before-after changes
ax_a = fig.add_subplot(gs[0, 0])
control_change = control_after - control_before
treatment_change = treatment_after - treatment_before
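bp = ax_a.boxplot(
    [control_change, treatment_change],
    patch_artist=True,
    medianprops={"color": "black"},
)
# Label the boxes via the tick labels; this avoids boxplot's labels= keyword,
# which was renamed to tick_labels= in matplotlib 3.9.
ax_a.set_xticklabels(["Control", "Treatment"])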
for patch, color in zip(bp["boxes"], ["#7f7f7f", "#d62728"]):
    patch.set_facecolor(color)
    patch.set_alpha(0.6)
ax_a.set_ylabel("Change in score")
ax_a.axhline(0, color="black", linewidth=0.5, linestyle="--")
ax_a.text(-0.15, 1.05, "(a)", transform=ax_a.transAxes, fontweight="bold", fontsize=11)
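# Significance annotation. The "***" is hard-coded for illustration; in a real
# figure it must come from the statistical test reported in the caption.
y_top = max(treatment_change)
ax_a.plot([1, 2], [y_top + 0.3, y_top + 0.3], color="black", linewidth=0.8)
ax_a.text(1.5, y_top + 0.5, "***", ha="center", fontsize=10)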

# Panel (b): Time series with error bars
# (the constant yerr=0.5 is a placeholder for per-week standard errors)
ax_b = fig.add_subplot(gs[0, 1])
ax_b.errorbar(weeks, control_time, yerr=0.5, fmt="o-", color="#7f7f7f", label="Control", capsize=3, linewidth=1.2)
ax_b.errorbar(weeks, treatment_time, yerr=0.5, fmt="s-", color="#d62728", label="Treatment", capsize=3, linewidth=1.2)
ax_b.set_xlabel("Week")
ax_b.set_ylabel("Cognitive score")
ax_b.legend(loc="upper left", frameon=False)
ax_b.text(-0.15, 1.05, "(b)", transform=ax_b.transAxes, fontweight="bold", fontsize=11)

# Panel (c): Forest plot of subgroup effects (using synthetic effect sizes and CIs)
ax_c = fig.add_subplot(gs[1, 0])
subgroups = ["Male (young)", "Male (old)", "Female (young)", "Female (old)", "Overall"]
effects = [1.8, 2.1, 2.3, 1.9, 2.0]
ci_lower = [1.1, 1.2, 1.5, 1.0, 1.6]
ci_upper = [2.5, 3.0, 3.1, 2.8, 2.4]
y_pos = np.arange(len(subgroups))
ax_c.errorbar(
    effects,
    y_pos,
    xerr=[[e - lo for e, lo in zip(effects, ci_lower)],
          [up - e for e, up in zip(effects, ci_upper)]],
    fmt="s",
    color="#d62728",
    capsize=3,
    markersize=7,
)
ax_c.axvline(0, color="black", linewidth=0.5, linestyle="--")
ax_c.set_yticks(y_pos)
ax_c.set_yticklabels(subgroups)
ax_c.set_xlabel("Effect size (95% CI)")
ax_c.invert_yaxis()
ax_c.text(-0.3, 1.05, "(c)", transform=ax_c.transAxes, fontweight="bold", fontsize=11)

# Panel (d): Histogram of individual changes
ax_d = fig.add_subplot(gs[1, 1])
ax_d.hist(control_change, bins=15, alpha=0.6, label="Control", color="#7f7f7f", edgecolor="white")
ax_d.hist(treatment_change, bins=15, alpha=0.6, label="Treatment", color="#d62728", edgecolor="white")
ax_d.set_xlabel("Change in score")
ax_d.set_ylabel("Frequency")
ax_d.legend(frameon=False)
ax_d.text(-0.15, 1.05, "(d)", transform=ax_d.transAxes, fontweight="bold", fontsize=11)

# Declutter every panel
for ax in [ax_a, ax_b, ax_c, ax_d]:
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)

# The caption lives in the paper itself, not on the figure.
# For vector PDF output, dpi only affects any rasterized elements.
fig.savefig("figure_1.pdf", dpi=300, bbox_inches="tight")

The Decomposition

Walk through the layout decomposition:

1. Four panels in a 2×2 grid. This is the standard arrangement for a four-panel figure, created with fig.add_gridspec(2, 2), that is, nrows=2 and ncols=2.

2. Equal-sized panels. No width_ratios or height_ratios needed.

3. Panel labels (a), (b), (c), (d). Each panel has a bold letter label in its upper-left corner, placed with ax.text(..., transform=ax.transAxes). The transform=ax.transAxes makes the coordinates axes-relative, so the label is placed at the same relative position on every panel.

4. Small fonts. plt.rcParams sets global font sizes appropriate for a scientific figure (8-10 pt).

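5. Figure size. figsize=(8, 6) is slightly wider than a typical double-column figure (roughly 7.2 inches wide, versus roughly 3.5 inches for single-column), so it would be scaled down in print and the fonts would shrink with it. For final submission, set the width to the value in the journal's style guide so that the 8-10 pt fonts print at their intended size.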

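6. PDF output. Scientific publications usually require PDF or EPS vector output. Save as PDF with fig.savefig("figure_1.pdf", dpi=300, bbox_inches="tight"); in a vector file the dpi value applies only to rasterized elements. Setting mpl.rcParams["pdf.fonttype"] = 42 before saving (as Chapter 12 discussed, with matplotlib imported as mpl) embeds fonts as TrueType rather than matplotlib's default Type 3, which helps them embed correctly.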

7. Consistent declutter. A loop removes the top and right spines from every panel, keeping the declutter consistent across the figure.
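The panel-label step (item 3) lends itself to the same loop treatment as the declutter step. The sketch below is one way to do it; label_panels is a hypothetical helper name, and the default offsets simply reuse the values from the listing above.

```python
import string

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def label_panels(axes, x=-0.15, y=1.05, fontsize=11):
    """Add bold (a), (b), (c), ... labels in axes-relative coordinates."""
    labels = []
    for ax, letter in zip(axes, string.ascii_lowercase):
        text = ax.text(x, y, f"({letter})", transform=ax.transAxes,
                       fontweight="bold", fontsize=fontsize)
        labels.append(text)
    return labels


fig = plt.figure(figsize=(8, 6), constrained_layout=True)
gs = fig.add_gridspec(2, 2)
# Row-major order, so the letters run (a), (b) on top and (c), (d) below.
axes = [fig.add_subplot(gs[r, c]) for r in range(2) for c in range(2)]
panel_labels = label_panels(axes)
```

A panel with long tick labels, such as the forest plot in panel (c), may still need its own x offset; the listing above uses -0.3 there.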

Why It Works

The scientific multi-panel figure pattern succeeds because:

1. It packs many findings into one figure. One figure with four panels presents four findings that the reader absorbs together. This is more efficient than four separate figures and lets readers see the relationships between the panels.

2. Panel labels enable precise citation. The paper text can reference "Figure 1a" and readers know exactly where to look. This is a small but important convention that scientific communication depends on.

3. Consistent style within the figure creates unity. All panels use the same colors (gray for control, red for treatment), the same fonts, the same decluttering. The figure reads as one coherent object rather than four separate charts.

4. Statistical elements are woven into the charts. Error bars, confidence intervals, and significance asterisks are all on the charts, not just in the caption. Readers can assess the statistical claims directly.

5. The figure is self-contained for informed readers. A reader who has read the methods section can understand the figure without going back to the text. The labels, legends, and annotations carry the context the reader needs.

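6. Black-and-white compatibility. The gray and red differ somewhat in lightness, and panel (b) also distinguishes the groups by marker shape (circles for control, squares for treatment), so the groups stay distinguishable when printed in grayscale. Where the lightness gap is too small, add hatching or distinct line styles. This is a Chapter 3 principle applied to scientific work.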

7. PDF with embedded fonts keeps the figure portable. With pdf.fonttype set to 42, the fonts travel inside the file, so the saved figure should render the same way on other systems, preserving the fonts, the spacing, and the visual details.
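A rough way to check the grayscale point in item 6 is to estimate how light each color appears once the hue is removed. The sketch below uses the Rec. 601 luma weights as an approximation; approx_gray_level is a hypothetical helper name, and this is a quick sanity check rather than a colorimetric calculation.

```python
def approx_gray_level(hex_color):
    """Approximate grayscale level (0 = black, 1 = white) of a hex color.

    Applies the Rec. 601 luma weights directly to the gamma-encoded
    channel values, which is a common quick approximation.
    """
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return 0.299 * r + 0.587 * g + 0.114 * b


control_gray = approx_gray_level("#7f7f7f")    # the control color
treatment_gray = approx_gray_level("#d62728")  # the treatment color
print(f"control: {control_gray:.2f}, treatment: {treatment_gray:.2f}")
```

The two values differ, with the red coming out darker than the gray, but the gap is modest. That is why the marker shapes in panel (b) matter as a second encoding.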

Lessons for Practice

1. Scientific figures are dense. Use small fonts (8-10 pt) and pack as much information as possible while keeping the charts readable. This is the opposite of news graphics (which use large fonts) but appropriate for the scientific context.

2. Every panel gets a letter label. Use ax.text(-0.15, 1.05, "(a)", transform=ax.transAxes, fontweight="bold") or similar. The specific position varies, but the convention is unmistakable.

3. Error bars are not optional. Scientific claims require uncertainty visualization. Use ax.errorbar or ax.fill_between on every panel that shows estimates or measurements.

4. Use the journal's style guide. Every journal has specific requirements for figure sizes (single-column width, double-column width), font sizes, color requirements, and file formats. Check the guide before finalizing any figure.

5. PDF is the preferred output. For publication submission, PDF with mpl.rcParams["pdf.fonttype"] = 42 produces figures that journals can process. PNG at 300 DPI is an acceptable alternative if the journal requires raster.

6. Consistency across panels creates unity. Use the same colors, fonts, and styles across all panels in the same figure. Declutter with a loop. This is the Chapter 8 similarity principle applied to scientific publication.

7. The caption does work the chart cannot. Scientific figure captions are long (often 5-10 lines) because they describe what each panel shows, what tests were performed, what sample sizes are involved, and what the specific markings mean. Do not try to put all of this on the chart itself.
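Lesson 4 can be made concrete with a small sizing helper. The sketch below assumes the approximate column widths quoted earlier in this case study (3.5 and 7.2 inches); journal_figsize and COLUMN_WIDTHS_IN are hypothetical names, and the real widths must come from the target journal's style guide.

```python
# Assumed column widths in inches, taken from the approximate figures quoted
# in this case study; replace them with the target journal's actual values.
COLUMN_WIDTHS_IN = {"single": 3.5, "double": 7.2}


def journal_figsize(column="double", aspect=0.75):
    """Return (width, height) in inches for the given column type.

    aspect is height divided by width; 0.75 matches the 8-by-6 shape
    used for the figure in this case study.
    """
    if column not in COLUMN_WIDTHS_IN:
        raise ValueError(f"unknown column type: {column!r}")
    width = COLUMN_WIDTHS_IN[column]
    return (width, width * aspect)


double_size = journal_figsize("double")
single_size = journal_figsize("single", aspect=1.0)
```

The result can be passed straight to plt.figure(figsize=journal_figsize("double"), constrained_layout=True), so the fonts set in rcParams print at their nominal size instead of being rescaled by the journal.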


Discussion Questions

  1. On information density. Scientific figures are much denser than news graphics. Is this appropriate for scientific audiences, or should scientific communication adopt the lower-density conventions of data journalism?

  2. On panel labels. The (a), (b), (c), (d) convention is universal in scientific publication. Is this a useful convention for non-scientific multi-panel figures, or is it specific to the citation needs of academic writing?

  3. On statistical annotations. Asterisks for significance levels are standard in scientific figures. Some authors argue they should be deprecated in favor of explicit p-values or confidence intervals. What do you think?

  4. On the relationship to Chapter 7. Chapter 7 advocated action titles that state findings. Scientific figures typically use descriptive panel titles. Is this a legitimate exception to the action title rule?

  5. On color vs. grayscale. Scientific figures should work in black and white because many readers still print papers. Should the matplotlib defaults encourage grayscale-compatible design more explicitly?

  6. On reproducibility. Scientific figures should ideally be reproducible from code. Many published figures are edited in Illustrator after matplotlib produces the initial version. Is this acceptable, or should the final figure be pure matplotlib?


Scientific multi-panel figures are one of the most common uses of matplotlib GridSpec in professional practice. The pattern — 2×2 or 2×3 grid, panel letter labels, small fonts, error bars, consistent styling — is an application of the design-to-code translation skill to a specific set of conventions. Once you can build a four-panel scientific figure fluently, you can produce any multi-panel layout a paper or report needs.