Case Study 1: Reproducing a Pudding-Style Dashboard Layout

DataField.Dev

The Pudding publishes visual essays with distinctive multi-panel dashboards. The layouts are elaborate by design journalism standards, but every element decomposes into matplotlib primitives once you know how to read them. This case study walks through the reproduction.

The Situation

The Pudding is a visual essay publication that specializes in data-driven stories about culture, society, and the arts. Their pieces routinely feature multi-panel figures with unusual layouts: a hero chart at the top, several supporting panels of different sizes, inline small multiples, and annotations woven through the composition. The layouts are designed for web display, often with scroll-triggered animation, but the underlying structure is something any matplotlib user can reproduce as a static figure.

For this case study, imagine a Pudding-style dashboard about global music streaming trends. The story has several parts: a headline chart showing total streams over time, a small multiple showing trends for five top genres, a ranked bar chart of the top artists, and a regional breakdown chart. A single figure containing all of these components is not a small multiple (the chart types vary), not a pure dashboard (the panels tell one connected story), and not a uniform grid (the panel sizes are intentionally unequal). It is a composed layout that expresses the narrative through structure.

Reproducing this kind of layout in matplotlib is the central exercise of this case study. The specific content is less important than the process: decompose the layout into GridSpec primitives, write the code, and verify that the result matches the sketch. This is the "layout is code" threshold concept from Section 13 applied to a specific realistic example.

The Data

For the case study, we will use synthetic data that approximates what a music streaming story might contain:

Years 2015 to 2024 with total streaming numbers (in billions).
Five genres (pop, hip-hop, rock, electronic, country) with annual streams for each.
Top 10 artists ranked by total streams in 2024.
Geographic data for five regions with total streams.

import numpy as np
import pandas as pd

np.random.seed(42)

years = np.arange(2015, 2025)
total_streams = np.cumsum(np.random.randn(len(years)) * 50 + 150) + 1000

genres = ["Pop", "Hip-Hop", "Rock", "Electronic", "Country"]
genre_data = {g: np.cumsum(np.random.randn(len(years)) * 30 + 80) + 200 for g in genres}

top_artists = pd.DataFrame({
    "artist": [f"Artist {i}" for i in range(1, 11)],
    "streams": sorted(np.random.randint(500, 3000, 10), reverse=True),
})

regions = ["North America", "Europe", "Asia", "Latin America", "Africa"]
region_streams = [450, 380, 620, 210, 150]

This data is deliberately synthetic. Real data for a story like this would come from the Spotify API, Billboard charts, or industry reports. For the layout exercise, the specific numbers do not matter.
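Even when the numbers do not matter, their ranges do, because they drive any axis limits chosen later. The following is a minimal sketch of a range check, rebuilding the same synthetic series so it runs on its own; the variable name genre_max is an assumption introduced here for illustration.

```python
import numpy as np

# Same synthetic recipe as above, rebuilt so this snippet is self-contained.
np.random.seed(42)
years = np.arange(2015, 2025)
total_streams = np.cumsum(np.random.randn(len(years)) * 50 + 150) + 1000
genres = ["Pop", "Hip-Hop", "Rock", "Electronic", "Country"]
genre_data = {g: np.cumsum(np.random.randn(len(years)) * 30 + 80) + 200 for g in genres}

# Every series should hold exactly one value per year.
assert total_streams.shape == years.shape
assert all(series.shape == years.shape for series in genre_data.values())

# Print the range of each genre series; a shared axis limit must cover all of them.
for genre, series in genre_data.items():
    print(f"{genre:<10} min={series.min():7.1f}  max={series.max():7.1f}")

# genre_max is a hypothetical name for the largest value across all genre series.
genre_max = max(series.max() for series in genre_data.values())
print(f"Largest genre value: {genre_max:.1f}")
```

Running a check like this before fixing any y-limits helps avoid clipping a series that grows more than expected.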

The Layout

The target composition:

Top row: A hero line chart showing total streams over time, spanning the full width of the figure.
Middle row: A 1×5 small multiple showing each genre's trajectory — five small panels side by side.
Bottom left: A horizontal bar chart of the top 10 artists.
Bottom right: A regional breakdown bar chart.

Visualizing this as a grid: three rows, with the top row tall (hero), the middle row shorter (small multiple), and the bottom row split unevenly (artists bar chart in the wider left portion, regional bar chart in the narrower right portion).
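The grid described above can also be written down as a labeled cell map, which makes the sketch itself executable. This is a minimal sketch using Figure.subplot_mosaic (available in Matplotlib 3.3 and later) rather than the GridSpec slicing this case study relies on; the panel labels such as "hero" and "g0" are hypothetical names chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Each string names the panel that owns that cell of the 3x5 grid.
# The labels are hypothetical; any unique strings would work.
mosaic = [
    ["hero"] * 5,
    ["g0", "g1", "g2", "g3", "g4"],
    ["artists"] * 3 + ["regions"] * 2,
]

fig = plt.figure(figsize=(16, 12), constrained_layout=True)
axes = fig.subplot_mosaic(mosaic, gridspec_kw={"height_ratios": [2, 1, 2]})

# Write each label in the middle of its panel so the empty layout
# can be compared against the paper sketch.
for label, ax in axes.items():
    ax.text(0.5, 0.5, label, ha="center", va="center", transform=ax.transAxes)
    ax.set_xticks([])
    ax.set_yticks([])
```

Calling fig.savefig on this empty figure gives a quick visual check of the proportions before any data is plotted.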

The Code

import matplotlib.pyplot as plt

fig = plt.figure(figsize=(16, 12), constrained_layout=True)

# Outer GridSpec: 3 rows with unequal heights
gs = fig.add_gridspec(
    nrows=3, ncols=5,
    height_ratios=[2, 1, 2],
    hspace=0.35,
    wspace=0.3,
)

# Hero chart spanning the top row
ax_hero = fig.add_subplot(gs[0, :])
ax_hero.plot(years, total_streams, color="#1f77b4", linewidth=2.5)
ax_hero.fill_between(years, total_streams, alpha=0.2, color="#1f77b4")
ax_hero.set_title("Global Streaming Growth, 2015-2024", fontsize=16, loc="left", fontweight="semibold", pad=12)
ax_hero.set_ylabel("Total Streams (billions)", fontsize=11)
ax_hero.spines["top"].set_visible(False)
ax_hero.spines["right"].set_visible(False)

# Middle row: small multiple of genres (5 panels)
# Shared upper limit derived from the data so that no series is clipped
genre_ymax = max(series.max() for series in genre_data.values()) * 1.05
for i, (genre, data) in enumerate(genre_data.items()):
    ax = fig.add_subplot(gs[1, i])
    ax.plot(years, data, color="#d62728" if genre == "Pop" else "#999999", linewidth=1.5)
    ax.set_title(genre, fontsize=10, loc="left")
    ax.set_ylim(0, genre_ymax)  # shared limit for fair comparison
    ax.tick_params(axis="both", labelsize=8)
    if i > 0:
        ax.tick_params(labelleft=False)  # hide y-labels on all but the first panel
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)

# Bottom left: top artists bar chart (spans 3 of the 5 columns in the bottom row)
ax_artists = fig.add_subplot(gs[2, 0:3])
top_artists_sorted = top_artists.sort_values("streams")  # ascending for barh so biggest is at top
ax_artists.barh(top_artists_sorted["artist"], top_artists_sorted["streams"], color="#2ca02c")
ax_artists.set_title("Top 10 Artists by Total Streams", fontsize=12, loc="left", fontweight="semibold")
ax_artists.set_xlabel("Streams (millions)", fontsize=10)
ax_artists.spines["top"].set_visible(False)
ax_artists.spines["right"].set_visible(False)

# Bottom right: regional breakdown bar chart (spans 2 columns)
ax_regions = fig.add_subplot(gs[2, 3:5])
ax_regions.bar(regions, region_streams, color="#ff7f0e")
ax_regions.set_title("Streams by Region", fontsize=12, loc="left", fontweight="semibold")
ax_regions.set_ylabel("Total Streams (billions)", fontsize=10)
ax_regions.tick_params(axis="x", rotation=30)
ax_regions.spines["top"].set_visible(False)
ax_regions.spines["right"].set_visible(False)

# Figure-level title (optional — the hero chart already has an action title)
# fig.suptitle("Streaming Trends Dashboard", fontsize=18, fontweight="bold", y=1.02)

fig.savefig("pudding_dashboard.png", dpi=300, bbox_inches="tight")

The Decomposition

Walk through how the layout was decomposed into GridSpec primitives:

Grid structure. The layout has three rows with two distinct heights. The top row (hero) and the bottom row (bar charts) are tall, and the middle row (small multiple) is half their height. This suggests nrows=3, height_ratios=[2, 1, 2].

Columns. The middle row needs 5 panels side by side, so at minimum 5 columns. The top and bottom rows can span those 5 columns using slice notation. ncols=5.

Hero spanning cell. The hero spans the top row across all columns. Use gs[0, :] to create an Axes at row 0, all columns.

Small multiple row. Five individual Axes at row 1, each in a separate column. Use a loop: for i in range(5): fig.add_subplot(gs[1, i]).

Bottom asymmetric split. The bottom row is split into two uneven parts. The artists chart gets the left portion (3 columns), and the regions chart gets the right portion (2 columns). Use gs[2, 0:3] and gs[2, 3:5].

Shared y-axis on the small multiple. Each genre panel should use the same y-range for fair comparison. Call ax.set_ylim with identical limits on every panel, choosing an upper limit that covers the largest series, and hide y-tick labels on all but the first panel to reduce clutter.

Declutter. Every panel gets the top and right spines removed. This is the Chapter 6 declutter principle applied at scale across the dashboard.
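The slices named in the steps above can be inspected before any Axes is created. This is a minimal sketch, assuming a Matplotlib version where SubplotSpec exposes rowspan and colspan (3.2 or later); the dictionary keys are hypothetical labels, not part of the dashboard code.

```python
from matplotlib.gridspec import GridSpec

# A GridSpec can be built without a figure, which is enough to inspect slices.
gs = GridSpec(nrows=3, ncols=5, height_ratios=[2, 1, 2])

# Hypothetical labels mapped to the slices used in the decomposition.
panels = {
    "hero": gs[0, :],
    "genre 0": gs[1, 0],
    "artists": gs[2, 0:3],
    "regions": gs[2, 3:5],
}

# rowspan and colspan are ranges of the grid rows and columns a slice covers.
for name, spec in panels.items():
    rows = list(spec.rowspan)
    cols = list(spec.colspan)
    print(f"{name:<8} rows={rows} cols={cols}")
```

Printing the spans makes off-by-one mistakes in slice notation visible without rendering the figure.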

Why It Works

The reproduction succeeds because every element is a known primitive:

plt.figure + fig.add_gridspec creates the structural container.
Slice notation (gs[0, :], gs[1, i], gs[2, 0:3]) selects specific regions of the grid for individual Axes.
height_ratios control the vertical emphasis (hero is tall, small multiple is short).
Loops with add_subplot create the small multiple panels.
ax.set_ylim on the small multiple enforces a shared scale for comparison.
The decluttering loop removes the top and right spines on every panel.
constrained_layout=True handles spacing between the panels automatically.

Every one of these is a tool you now know. The only difficulty is the sequence: read the sketch, decompose the structure, write the code in the right order. With practice, this translation becomes automatic.
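The per-panel spine removal can be collected into a single pass over the figure. This is a minimal sketch on a small stand-in layout, not the dashboard itself; declutter is a hypothetical helper name, and it relies only on fig.axes and the spines mapping.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def declutter(fig, sides=("top", "right")):
    """Hide the given spines on every Axes in the figure (hypothetical helper)."""
    for ax in fig.axes:
        for side in sides:
            ax.spines[side].set_visible(False)


# Stand-in layout: one wide panel above two smaller panels.
fig = plt.figure(figsize=(8, 6), constrained_layout=True)
gs = fig.add_gridspec(nrows=2, ncols=2)
fig.add_subplot(gs[0, :])
fig.add_subplot(gs[1, 0])
fig.add_subplot(gs[1, 1])

# One call after all panels exist replaces the repeated spine lines.
declutter(fig)
```

Calling the helper once, after every panel has been added, keeps the styling consistent even when panels are added or removed later.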

Lessons for Practice

1. Start with a paper sketch. Do not start coding until you have a sketch of the layout on paper. Once you have the sketch, the matplotlib code writes itself through the five-step decomposition.

2. Use constrained_layout=True for any multi-panel figure. Without it, you will fight with spacing problems. With it, matplotlib handles most of the spacing automatically.

3. The small multiple is a loop. Whenever you have repeated panels with the same chart type and different data, use a loop over the panels and the data. Do not hand-code each panel.

4. Share scales explicitly. For comparison to work, panels with the same metric need the same scale. Set ax.set_ylim or use sharex/sharey when creating the subplots.

5. Declutter every panel. A dashboard with 10 panels means 10 opportunities for chart-junk. Apply the decluttering loop consistently across every panel.

6. One action title is enough. A dashboard figure does not need an action title on every panel. Usually one figure-level action title (via fig.suptitle or on the hero panel) is sufficient. Other panels can have simple descriptive titles.

7. Test with real-looking data. Synthetic data is fine for layout testing, but the final dashboard should be tested with real data to check for edge cases (wide ranges, missing values, outliers) that synthetic data does not have.
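Lesson 4 mentions sharey as an alternative to setting limits by hand. The following is a minimal sketch of that option with GridSpec, passing sharey to add_subplot so that later panels follow the first; the series names and random data are placeholders, not the case study data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data: three hypothetical series over ten years.
rng = np.random.default_rng(0)
years = np.arange(2015, 2025)
series = {
    name: np.cumsum(rng.normal(80, 30, len(years))) + 200
    for name in ["A", "B", "C"]
}

fig = plt.figure(figsize=(9, 3), constrained_layout=True)
gs = fig.add_gridspec(nrows=1, ncols=len(series))

first_ax = None
for i, (name, values) in enumerate(series.items()):
    # sharey=None on the first pass creates an independent axis;
    # later panels share their y-axis with the first one.
    ax = fig.add_subplot(gs[0, i], sharey=first_ax)
    ax.plot(years, values)
    ax.set_title(name, loc="left")
    if first_ax is None:
        first_ax = ax
    else:
        ax.tick_params(labelleft=False)  # keep y tick labels on the first panel only

# Limits set on one shared axis propagate to the others.
first_ax.set_ylim(bottom=0)
```

With shared axes, autoscaling considers every panel's data, so no upper limit has to be hard-coded.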

Discussion Questions

On decomposing designs. Pick a published multi-panel figure from a news outlet and decompose it using the five-step process from Section 13.8. Can every layout be decomposed this way?
On the hero chart. The hero spans the full width in this layout. Would it be more effective as a larger square chart in the top-left with a smaller row of supporting charts next to it? How does the composition affect the story?
On small multiples within a dashboard. The middle row is a small multiple (5 genres). Is this an appropriate use of small multiples in an otherwise heterogeneous dashboard? Where does the boundary fall?
On the declutter loop. Every panel gets the same declutter treatment. Should any panel be treated differently — for example, should the hero panel have a different style from the supporting panels? What is the argument for and against consistency?
On static vs. interactive. The Pudding publishes interactive versions of these layouts with scroll-triggered animation. Does the interactive version communicate something the static matplotlib version cannot? What are the trade-offs?
On reproducibility. The matplotlib version of this dashboard is static and reproducible. The Pudding's interactive version is tied to specific web technologies. Which is more valuable for long-term preservation of the data story?

Building complex multi-panel figures is one of those skills that seems intimidating until you see it done once, and then becomes straightforward. The threshold concept is that layout is code. Once you internalize it, any published dashboard becomes a reference you can reproduce.