Chapter 7: Typography, Annotation, and the Words on Your Chart

Learning Objectives

Select appropriate fonts for data visualization: sans-serif for screen, limited font families, legible sizes
Apply a title hierarchy that makes a chart self-explanatory — main title (the insight), subtitle (the context), axis labels (the data)
Write effective chart titles that state the finding, not just the topic
Add annotations — callout arrows, text boxes, highlighted regions — that direct the reader's eye to the key finding
Format axis labels, tick labels, and legends for maximum readability
Explain why direct labeling is often superior to a legend, and identify when each is appropriate
Apply source attribution and footnotes as a standard practice for published visualizations
Distinguish between an action title that states the finding and an editorialized title that overstates the evidence

In This Chapter

7.1 The Self-Explanatory Chart Standard
7.2 Typography Basics for Data Visualization
7.3 The Title: Action Titles vs. Descriptive Titles
7.4 Annotation: The Text That Does the Most Work
7.5 Axis Labels, Tick Labels, and Number Formatting
7.6 Direct Labeling vs. Legends
7.7 Source Attribution and Context Notes
7.8 Bringing It Together: The Climate Plot with Words
Chapter Summary
Spaced Review: Concepts from Chapters 1-6

"A chart is not self-explanatory unless it explains itself." — Dona Wong, The Wall Street Journal Guide to Information Graphics

The chart you made in Chapter 6 is cleaner than it was. The top spine is gone. The gridlines have been lightened. The 3D effects are deleted. The decorative borders are gone. What remains is the data — bars, lines, dots, cells — on a clean plotting area with minimal structural noise.

And that clean chart, sitting alone on a page with no words, is almost unreadable.

This is the paradox of decluttering. When you remove chartjunk, you do not make the chart self-explanatory. You make it readable, in the sense that the eye can find the data without fighting through visual noise. But "readable" and "understandable" are not the same thing. A viewer who sees a clean time series line with no title, no axis labels, no units, no source, and no annotation knows only one thing: some quantity changed over time. What quantity? On what scale? From what baseline? During what period? With what consequence? These are questions the decluttered chart cannot answer by itself. They are the work of words.

This chapter is about those words. Not just any words, but the specific kinds of text that turn a correct chart into a self-explanatory one. The title that states the finding. The subtitle that provides context. The axis labels that name the quantities and their units. The annotation that points at the most important feature and explains why it matters. The source attribution that lets the viewer trust the data. Each of these elements is non-data ink, and each survives the declutter procedure from Chapter 6 for exactly one reason: the chart is incomprehensible without it.

The words on a chart are not decoration. They are not ornamentation. They are not editorial flair. They are the bridge between the data and the reader's understanding, and when they are missing or poorly chosen, the chart fails to communicate even when every bar and line is drawn perfectly. This chapter teaches you how to choose and place those words so that your charts can be read, understood, and remembered by people who have never seen them before.

No code in this chapter. Part II is still library-agnostic. The matplotlib function calls — set_title, set_xlabel, annotate, FuncFormatter, legend — are waiting for you in Chapters 10 and 12. The principles are what matter here, and the principles are what will survive when you move between matplotlib, seaborn, Plotly, and Altair in Parts III through V.

7.1 The Self-Explanatory Chart Standard

What It Means

A self-explanatory chart is one that can be read and understood without any surrounding text. Drop it into an email. Paste it onto a slide. Share it on social media. Remove it from the context of the article or report it was designed for. The chart still communicates. A reader who has never seen the chart before can look at it for five to ten seconds and come away knowing what it is about, what the data shows, where the data came from, and what the main finding is.

This is a high bar. Most charts do not meet it. The typical published chart relies on surrounding prose to provide context: a sentence in the accompanying article that explains what the chart is showing, a footnote that states the source, a caption that clarifies the units. Strip the chart from that context and it becomes ambiguous. The reader no longer knows whether the y-axis is measured in thousands or millions, whether the time range is calendar year or fiscal year, whether the numbers represent gross or net, or which line is the actual data versus the forecast.

The self-explanatory standard asks you to build all that context directly into the chart. The title explains what the chart is about. The subtitle adds the specifics — time range, geographic scope, important caveats. The axis labels state the units inline. The annotation explains the key feature. The source attribution cites the data provenance. Every piece of information a reader needs to interpret the chart is on the chart itself.

Why is this standard worth pursuing? Because charts travel. A chart you make for a business report will end up in someone's slide deck, which will end up in a meeting summary, which will end up forwarded in an email thread to people who never read the original report. A chart you make for a news article will be screenshotted and shared on social media without the surrounding prose. A chart you put on a dashboard will be glanced at by dozens of people who do not have time to read the accompanying documentation. If your chart cannot stand alone, it fails at every step of this journey — and the reader, seeing only a decontextualized image, either misinterprets it or dismisses it.

The 5-Second Test

A useful heuristic: show your chart to someone who has not seen it before, and ask them to describe it in their own words after exactly five seconds. If they can state (1) what the chart is about, (2) what the main finding is, and (3) what the units are, the chart has passed the 5-second test. If they cannot, something is missing, and the missing thing is almost always in the words.

Five seconds is a deliberately short window. It matches the attention budget that most charts actually receive in their real-life use cases: a glance at a slide, a scroll through a dashboard, a flick past a chart in a social media feed. A chart that requires thirty seconds of careful study to understand may still be a good chart for a reader who has thirty seconds to give it. But most readers do not. Design for the realistic attention budget, which is short.

The 5-second test also clarifies what the words are for. They are not there to provide comprehensive documentation. They are there to enable a fast, correct, first-impression reading. The title tells the reader what the chart is about. The subtitle adds essential context. The axis labels state the units. The annotation highlights the finding. Everything else — the details, the methodology, the data lineage — can live in a caption or an appendix. The words on the chart itself are optimized for the five-second reading, not the five-minute one.

The Four Categories of Chart Words

The text that belongs on a chart falls into four categories, each with a specific job:

1. The title. One line at the top of the chart. States what the chart is about — or, better, what the reader should conclude. The single most important piece of text. We will cover titles in depth in Section 7.3.

2. The subtitle. One or two lines immediately below the title. Provides essential context: time range, geographic scope, units if they are not in the axis labels, important caveats. Rarely present on default plots, almost always worth adding.

3. The axis labels and tick labels. Words and numbers on the axes themselves. Name the quantities, state the units, format the values. Essential for quantitative reading. Default plotting libraries usually produce these automatically but rarely format them well.

4. The annotations. Text placed directly on the data — callout arrows pointing at specific features, shaded regions with labels, text overlays explaining an outlier or a turning point. Annotations are the text that does the most work per word. They appear only where they are needed and carry specific information about specific parts of the chart.

A fifth category, source attribution, is not a "chart word" in the narrative sense but belongs to the chart's text layer — a line at the bottom of the chart naming the data source. It is short, factual, and essential for trust.

Every category has its own design considerations, its own common mistakes, and its own role in making the chart self-explanatory. The rest of the chapter goes through each category in turn.

Check Your Understanding — Pick a chart you made recently. Imagine the chart appearing as a standalone image on social media, with no surrounding text. Can a viewer who has never seen it before answer these questions in five seconds: what is the chart about, what is the main finding, and what are the units? If not, list the words that are missing.

7.2 Typography Basics for Data Visualization

Before we talk about what the words should say, we need to talk about how they look. Typography — the choice of font, size, weight, and style — affects whether the words can be read at all and, if they can, whether they fight the data for attention or recede quietly into their supporting role.

The good news is that you do not need a typography degree to make reasonable choices for data visualization. A small number of principles cover the vast majority of situations. This section gives you those principles in the form of specific, actionable guidance.

Principle 1: Use a Single, Legible, Sans-Serif Font Family

Data visualization uses very little text compared to a book or an article. A typical chart might have a title, a subtitle, four to six axis labels, ten to twenty tick labels, and two or three annotations — thirty to fifty words total. The legibility of those words matters more than their stylistic flourish, because there are so few of them and each one carries disproportionate weight.

For screen and print data visualization, the default choice is a sans-serif font. Sans-serif fonts (Helvetica, Arial, Inter, Roboto, IBM Plex Sans, Source Sans Pro, Open Sans) are easier to read at small sizes on screens than serif fonts (Times New Roman, Georgia, Charter). The historical reasoning — that serifs help the eye follow a line of body text — applies to books and newspapers, not to short labels on a chart. On a chart, serifs are visual noise.

The matplotlib default font (DejaVu Sans) is acceptable but not beautiful. Most professional chart designers override it with something cleaner. A personal style file that sets the font to Inter, Source Sans Pro, or IBM Plex Sans is one of the single highest-value additions you can make to your visualization workflow. The chart does not need to use the same font as the surrounding document; it is its own visual object with its own typographic identity.

Use one font family per chart. Mixing sans-serif with serif, or mixing two different sans-serif fonts, looks amateurish and is almost never justified. If you want to create hierarchy, use different weights and sizes of the same font, not different fonts. The chart is not a fashion show.

Principle 2: Establish a Size Hierarchy

The text on a chart serves different purposes at different levels of importance. The title is more important than the subtitle, which is more important than the axis labels, which are more important than the tick labels, which are more important than the source attribution. A good chart makes this hierarchy visible through type size.

A typical hierarchy for a standalone chart, at normal resolution:

Title: 16–20 pt, bold or semi-bold weight
Subtitle: 12–14 pt, regular weight
Axis labels: 11–12 pt, regular weight
Tick labels: 9–10 pt, regular weight
Annotations: 9–11 pt, regular or italic
Source attribution: 8–9 pt, regular, often muted color

The specific numbers depend on the target size of the chart. For a slide (where the chart will be projected), every size goes up. For a dashboard tile (where the chart is small), every size goes down. But the ratios between sizes (the relative hierarchy) stay roughly the same. The title is always noticeably larger than the tick labels. The tick labels are always noticeably smaller than the axis labels.

The matplotlib default hierarchy is often too flat — titles, labels, and tick labels are all roughly the same size, which means the eye cannot tell what is important. Establishing an explicit hierarchy is another high-value addition to your personal style file.
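The idea that sizes scale while ratios stay fixed can be sketched in plain Python, with no plotting library involved, so it carries over to whichever tool you use later. The base sizes below are simply midpoints of the ranges listed above; the role names, the scaling factors (1.5 for a slide, 0.8 for a dashboard tile), and both helper functions are illustrative assumptions, not values prescribed by this chapter.

```python
# Base sizes in points for a standalone chart.
# Assumption: midpoints of the ranges listed in the hierarchy above.
BASE_SIZES_PT = {
    "title": 18.0,
    "subtitle": 13.0,
    "axis_label": 11.5,
    "tick_label": 9.5,
    "annotation": 10.0,
    "source": 8.5,
}


def scale_hierarchy(base, factor):
    """Scale every text role by the same factor so the ratios are (nearly) preserved.

    Sizes are rounded to one decimal place, so ratios can shift very slightly.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return {role: round(size * factor, 1) for role, size in base.items()}


def is_clear_hierarchy(sizes):
    """Return True if title > subtitle > axis label > tick label > source."""
    order = ["title", "subtitle", "axis_label", "tick_label", "source"]
    values = [sizes[role] for role in order]
    return all(bigger > smaller for bigger, smaller in zip(values, values[1:]))


# Hypothetical scaling factors for two target contexts.
slide_sizes = scale_hierarchy(BASE_SIZES_PT, 1.5)
tile_sizes = scale_hierarchy(BASE_SIZES_PT, 0.8)

# A deliberately flat set of sizes, similar in spirit to many library defaults.
flat_sizes = {"title": 12, "subtitle": 10, "axis_label": 10,
              "tick_label": 10, "source": 10}

print(is_clear_hierarchy(slide_sizes))
print(is_clear_hierarchy(tile_sizes))
print(is_clear_hierarchy(flat_sizes))
```

The scaled sets keep a strict ordering from title down to source line, while the flat set fails the check because three roles share one size. A dictionary like this is the kind of thing a personal style file would eventually encode.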

Principle 3: Use Weight, Not Color, for Emphasis

When you want a word to stand out, the natural instinct is to make it a different color — red for emphasis, blue for importance, green for positive. Resist this instinct. Color is precious in data visualization because it carries encoding information (Chapter 3), and spending color on text emphasis means you have less color available for the data itself.

Instead, use weight (bold, semi-bold, regular, light) to create emphasis within the text layer. A bold title against a regular-weight subtitle creates hierarchy without using any color. A bold annotation on an otherwise regular-weight label makes the annotation stand out without competing with the data. Weight is a free emphasis channel that does not consume your color budget.

If you must use color for text emphasis — for example, to match an annotation to a specific data series — reserve it for the annotation itself and use a muted version of the data color rather than a saturated one. The text should feel related to the data, not competing with it.

Principle 4: Align Text Meaningfully

Default chart text placement is often haphazard: titles floating above the plotting area with random horizontal alignment, axis labels centered on the axis regardless of where the data actually is, annotations placed wherever the software decided to put them. Haphazard alignment is a form of visual noise that the reader's eye has to navigate.

Good typographic alignment is purposeful:

Titles and subtitles should left-align with the plotting area, not with the figure. A left-aligned title creates a consistent visual edge with the y-axis and the leftmost data, which anchors the chart. Centered titles float above the plotting area with no visual relationship to it.
Axis labels should sit cleanly on the axis they describe, with appropriate padding from the tick labels.
Annotations should align with the data they point to, not with some arbitrary grid position. A callout about the 2016 data point should be near the 2016 data point, with an arrow connecting them if necessary.
Tick labels should rotate only when overlap forces it. Rotated labels are harder to read than horizontal labels. If rotation is necessary, 45 degrees is more legible than 90. If 45-degree rotation is still not enough, consider abbreviating the labels or using horizontal bars instead of vertical bars.

Small alignment decisions compound into visible polish. A chart whose elements are aligned to clear vertical and horizontal lines looks professional; a chart whose elements float randomly looks like it was produced in a hurry by a computer.

Principle 5: Leave Air Around the Text

Every piece of text on a chart needs a margin of whitespace around it. A title crammed against the top of the plotting area looks claustrophobic. Axis labels touching the tick labels blur together. Annotations pressed against the data lines become hard to read. The solution is simple: add space.

Whitespace around text is not wasted space. It is structural space that separates the text visually from its neighbors and gives the reader's eye breathing room. Default plotting libraries often do not add enough whitespace, because they are optimized to fit the chart into the smallest possible pixel area. For publication-quality output, you typically want more padding than the default provides.

These five principles — single font family, size hierarchy, weight for emphasis, meaningful alignment, whitespace around text — are the typographic foundation. They are not controversial. They are not hard to apply. They are ignored by default plotting libraries, which is why your charts will look better than the defaults as soon as you start applying them consciously.

Check Your Understanding — Look at a default matplotlib chart. How many font sizes does it use? Is there a clear hierarchy between title, labels, and tick labels? Are any elements bolded for emphasis? What would you change to improve the typography?

7.3 The Title: Action Titles vs. Descriptive Titles

The title is the most important piece of text on the chart. It is the first thing the reader sees after glancing at the data itself. It is the one element that will survive every kind of decontextualization — a social media share, a slide embed, an email forward — because it sits directly above the chart, inseparable from the image. The title does more work per word than any other text on the chart, and the difference between a good title and a bad title is usually the difference between a chart that communicates and a chart that does not.

The central question of this section is: what should the title say?

The Descriptive Title: Topic Without Argument

The default title on most charts is a descriptive title: a short phrase that states what the chart is about. "Global Temperature Over Time." "Quarterly Revenue, 2020–2024." "Vaccination Rates by Country." These titles tell the reader what the chart is showing, but they do not tell the reader what to conclude. They are topic labels.

Descriptive titles are not wrong. They are appropriate in some contexts — exploratory charts in a Jupyter notebook, charts in a reference table where the reader will scan many titles, charts in an academic paper where the title is deliberately neutral and the analysis lives in the surrounding text. For these contexts, a descriptive title is enough: it identifies the chart without making claims.

The problem with descriptive titles is that they are the default. Most chart makers write a descriptive title without thinking about whether the context calls for something more. And for the majority of real-world use cases — business reports, news graphics, dashboards, social media posts — the descriptive title is insufficient. The reader wants to know what the chart is telling them, not just what the chart is about.

The Action Title: Finding as Headline

An action title states the finding. "Global Temperatures Have Risen 1.2 Degrees Since 1900." "Quarterly Revenue Grew 18% in 2024, the Best Year Since 2019." "Most Countries Now Exceed 80% Vaccination Coverage." These titles tell the reader what to conclude. They do not ask the reader to figure it out from the chart; they spell it out in words and let the chart serve as the evidence.

The difference is enormous. A reader who sees an action title and then looks at the chart will find the finding immediately — because the title has told them what to look for. A reader who sees a descriptive title and then looks at the chart has to work out the finding on their own, and in the 5-second reading window, they often do not. The action title converts the chart from a puzzle into a presentation.

Cole Nussbaumer Knaflic, in Storytelling with Data, calls this "the Big Idea" — the single sentence that states what the chart is saying, written as a declarative claim. She argues that every explanatory chart should have a Big Idea, and that the Big Idea should appear as the title. The practice is not universally adopted — academic charts still tend to use descriptive titles, and some corporate style guides prohibit "editorial" titles — but in the traditions of business communication and data journalism, the action title is now the standard.

Here is a specific transformation. A chart shows Meridian Corp's quarterly revenue for the past three years. Revenue has been growing slowly each quarter, with a noticeable acceleration in 2024.

Descriptive title: "Meridian Corp Quarterly Revenue, 2022–2024"
Action title: "Meridian Corp Revenue Growth Accelerated in 2024, Reaching $5.4B in Q4"

The action title tells an executive glancing at the slide what the chart is saying. They can then look at the chart to verify the claim — and the chart, which shows the bars growing taller toward the right, does verify it. The reader's attention is directed to the finding immediately, and the chart provides the visual confirmation.

Writing a Good Action Title

An action title is harder to write than a descriptive one because it requires you to know what the chart is telling you before you write the title. That is not always obvious. Sometimes a chart shows several findings at once, and you have to choose which one is the most important. Sometimes the finding depends on a comparison, a threshold, or a time window that is not obvious from the data alone. Writing a good action title is a disciplined act of data interpretation.

Some specific guidelines:

1. State the finding as a complete sentence, then shorten it. Start by writing what the chart shows as a sentence: "Global temperatures have risen by 1.2 degrees Celsius since 1900, with most of the increase occurring after 1980." Then shorten to the essential claim: "Global Temperatures Have Risen 1.2 Degrees Since 1900." Shortening should drop detail, never change the number or the baseline. The shortened version is the title. The longer version becomes the subtitle or the surrounding text.

2. Prefer specific numbers to vague words. "Revenue grew significantly" is vague. "Revenue grew 18%" is specific. A specific number gives the reader something concrete to anchor on and makes the action title feel earned. If you cannot cite a specific number, the finding may be less clear than you thought.

3. Use verbs of change for time-series charts. "Rose," "fell," "doubled," "accelerated," "plateaued," "reversed." These verbs carry the direction of the finding in a single word and make the title feel alive. Avoid flat verbs like "show" and "depict" — they turn an action title back into a descriptive one.

4. Keep the title to one line. A title that runs to two lines loses visual hierarchy and starts to compete with the subtitle. If your finding is genuinely too complex for one line, break it into a one-line title plus a subtitle that completes the thought.

5. Avoid jargon unless your audience is technical. "Y-over-Y Growth Decelerated by 240 BPS in Q3" is a valid finding for a finance audience but is incomprehensible to most readers. "Revenue Growth Slowed in Q3" makes the same claim in plain language. Match the vocabulary to the audience.
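Some of these guidelines are mechanical enough to check automatically. The sketch below is a rough title "linter" in plain Python. Everything in it is an assumption made for illustration: the function name `review_title`, the list of vague words, and the 80-character limit standing in for "one line" (the real limit depends on chart width and font size). It cannot judge whether a title states a true or important finding; it only flags surface patterns.

```python
import re

# From guideline 3: flat verbs that turn an action title back into a description.
FLAT_VERBS = {"show", "shows", "depict", "depicts"}
# Assumed, non-exhaustive list of vague intensifiers (guideline 2).
VAGUE_WORDS = {"significantly", "substantially", "dramatically"}
# Assumed stand-in for "one line" (guideline 4); tune to your chart width.
MAX_CHARS = 80


def review_title(title):
    """Return a list of warnings; an empty list means no heuristic fired."""
    words = set(re.findall(r"[a-z]+", title.lower()))
    warnings = []
    if words & FLAT_VERBS:
        warnings.append("flat verb: prefer a verb of change")
    if words & VAGUE_WORDS:
        warnings.append("vague word: prefer a specific number")
    if not re.search(r"\d", title):
        warnings.append("no number: consider citing a specific figure")
    if len(title) > MAX_CHARS:
        warnings.append("too long: move detail to the subtitle")
    return warnings


good = review_title("Revenue Grew 18% in 2024, the Best Year Since 2019")
weak = review_title("This Chart Shows Revenue Grew Significantly")

print(good)
print(weak)
```

The first title triggers no warnings; the second triggers three (flat verb, vague word, no number). A purely descriptive title that happens to contain a date range, such as "Meridian Corp Quarterly Revenue, 2022–2024," also passes, which is a reminder that heuristics like these supplement the interpretive work rather than replace it.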

The Ethical Line: Finding vs. Editorial

The most common objection to action titles is that they are "editorial": stating a finding in the title, the argument goes, introduces bias where a neutral descriptive title would not. Answering this objection honestly means returning to Chapter 4.

Chapter 4 established that every chart is an editorial. There is no neutral chart. The choices of time range, baseline, color, and chart type are all editorial choices that shape the reader's interpretation. A descriptive title does not make a chart neutral; it only hides the editorial stance. An action title makes the editorial stance explicit, which is arguably the more honest approach, because the reader can now evaluate the claim directly.

The ethical line is not between action titles and descriptive titles. It is between action titles that state a defensible finding and action titles that overstate the evidence. "Global Temperatures Have Risen 1.2 Degrees Since 1900" is a defensible finding — the number is in the data and the time range is clearly specified. "Catastrophic Climate Emergency Destroys Planet" is an overstatement — it adds interpretation the data does not directly support.

Write action titles that your chart actually supports. If your chart shows a 5% revenue decline, the action title is "Revenue Fell 5% in Q3," not "Revenue Collapsed." If your chart shows a correlation of 0.4 between two variables, the action title is "Variables A and B Are Moderately Correlated," not "A Causes B." The discipline is to match the title to the evidence — not to suppress the finding, but to state it precisely.

When Descriptive Titles Are Still Right

Despite the push toward action titles, there are contexts where a descriptive title is the correct choice:

Exploratory charts in a Jupyter notebook. You are not sure what the finding is yet. The chart is a tool for you, not a communication to an external audience. A descriptive title ("Histogram of Session Durations") is appropriate because you do not yet know the story.
Reference tables and galleries. A table of charts meant to be browsed quickly benefits from consistent descriptive titles that let the reader scan. Adding an action title to every chart in a gallery is exhausting.
Academic publications with neutral conventions. Some journals explicitly require descriptive figure captions and reserve the finding for the prose. Follow the journal's style.
Dashboards where the chart is updated dynamically. A dashboard chart whose data changes with every refresh cannot carry a static action title, because the finding may no longer hold after the next update. A descriptive title with the current value displayed as a separate number works better in this context.

The rule is not "always use action titles." The rule is "choose deliberately between descriptive and action titles based on context, and when in doubt, try the action title first and see whether it serves the reader."

Check Your Understanding — For each of the following descriptive titles, write an action title version that states a specific finding: (1) "Quarterly Revenue," (2) "Vaccination Rates by Country," (3) "Website Traffic by Source," (4) "Stock Price Over Time." You may need to invent plausible data to write the finding — that is fine, the exercise is to practice the transformation.

7.4 Annotation: The Text That Does the Most Work

If the title is the most important text on the chart, the annotation is the most useful. Titles tell the reader what the chart is about. Annotations tell the reader what specific parts of the chart mean. An annotation is a small piece of text, placed directly on the data, that explains a feature the chart maker wants to highlight.

Consider a line chart of Meridian Corp's stock price over the past five years. Without annotation, the reader sees a line going up and down. The line shows everything, but it does not explain anything. With annotation, the chart becomes a story: an arrow pointing at the drop in March 2020 with the label "COVID market crash," a small shaded region in 2022 labeled "Supply chain disruption," a callout at the peak in late 2024 labeled "All-time high: $186.30." The same data, the same chart type, the same line — but now the chart tells the viewer what to see.

Annotation is the single most underused technique in default plotting. Most charts in most reports have no annotation at all. The chart maker drew the data and stopped. The result is a chart that shows everything and explains nothing. Adding even one or two annotations transforms such a chart into something that communicates.

What Annotations Do

Annotations serve several specific purposes, each with its own design considerations:

1. Identifying outliers and inflection points. The data point that is unusually high, unusually low, or marks a change in trend. "2016: warmest year on record." "January 2022: peak subscribers." "March 2020: 10-year low." These annotations draw the eye to specific features and label them with specific meaning.

2. Providing context for unusual events. When the data reflects a known external event — a pandemic, a policy change, a product launch, a merger — the annotation names the event so the reader does not have to guess. "COVID-19 lockdown." "Brexit vote." "New product launch." Without these annotations, the reader sees a data anomaly without context and may draw wrong conclusions.

3. Marking thresholds and targets. A horizontal line at a goal value, labeled "Target: 80%." A vertical line at a deadline, labeled "Go-live." A shaded region for an acceptable range, labeled "Normal range." These annotations tell the reader what to compare the data against.

4. Highlighting the finding. When the chart's action title states a specific finding, an annotation in the body of the chart reinforces it. If the title says "Revenue Grew 18% in 2024," a callout on the 2024 bar saying "+18%" makes the finding impossible to miss.

5. Explaining methodology briefly. "Data excludes Q4 seasonal adjustments." "Values indexed to 2020 = 100." "Error bars show 95% confidence intervals." These annotations clarify how the data was processed, preventing misinterpretation.

Annotation Design Principles

An effective annotation is small, specific, and clearly connected to the feature it describes. A few principles:

1. Place annotations near the data they describe. A callout about the 2016 temperature spike should be near the 2016 data point. If necessary, use an arrow to connect the text to the data — but the connection should be obvious without any cognitive effort. A reader should not have to scan the chart to figure out which data point an annotation refers to.

2. Keep annotation text short. An annotation is not a paragraph. It is a phrase or a short sentence. "2016: warmest year on record" is a good annotation. "In 2016, the global average temperature anomaly reached +1.0 degrees Celsius, making it the warmest year in the instrumental record at the time, a record that was later broken by subsequent warm years." is a paragraph that belongs in a caption or the surrounding text, not on the chart.

3. Use leader lines or arrows sparingly. A leader line (a thin line connecting the annotation text to the data point) is useful when the annotation cannot be placed immediately adjacent to the data. Arrows are useful when direction matters — for example, pointing at a specific trend. But too many arrows turn the chart into a diagram with a forest of overlapping lines. If you find yourself drawing more than three or four arrows, the chart is probably trying to say too much at once.

4. Match annotation color to the data it describes. If the annotation refers to a specific line in a multi-line chart, draw the annotation text in a muted version of that line's color. This visual association helps the reader connect the annotation to the right data without any cognitive effort.

5. Distinguish annotations from decoration. Every annotation should carry information. If you find yourself adding text to a chart because the chart "looks empty," you are probably adding decoration rather than annotation, and the text will not earn its place. The declutter procedure from Chapter 6 applies to annotations as well as to structural elements.

6. Use shaded regions for ranges, arrows for points, callouts for specific labels. Different annotation types fit different features. A shaded vertical band is good for marking a time period (a recession, a campaign, a season). An arrow is good for pointing at a single dramatic point. A text callout is good for labeling a data feature with a name. Mixing these types is fine when each use is clearly serving a different purpose.

The Annotation Budget

Like the ink budget from Chapter 6, annotations have an implicit limit. A chart with one annotation has one very prominent feature. A chart with three annotations has three points of focus. A chart with ten annotations has no focus at all — the reader cannot tell which annotations are most important, and the chart becomes a wall of text competing with itself.

For most charts, one to three annotations is the right range. A single annotation calls out the main finding; a few additional annotations provide context for the most important secondary features. More than three annotations usually means the chart is trying to tell multiple stories at once, and the right response is often to split it into two charts, each focused on one story, each with its own annotations.

The Financial Times, The Economist, and The New York Times graphics teams tend to use annotations with discipline: usually one or two per chart, placed precisely, written tightly. Studying published charts from these outlets is a fast way to build intuition for how to use annotations effectively. Their charts read easily not merely because the annotations are few but because each one is doing specific work.

Check Your Understanding — Look at a chart you made recently. Identify the most important feature of the data — the finding, the outlier, the turning point. Imagine adding a single annotation that explains that feature in under fifteen words. Write the annotation. Would the chart be more effective with that annotation than without it?

7.5 Axis Labels, Tick Labels, and Number Formatting

The axes are where quantitative data meets words. Even in a chart with no title, no subtitle, and no annotations, the axes almost always carry tick labels by default, and often an axis label as well, depending on the tool. Default axis text, however, is almost always inadequate: not because it is wrong, but because it is minimal, unformatted, and often confusing. Many "bad-looking" default charts look bad mainly because their axis text was left on its defaults, and fixing the axis text is one of the highest-return small changes you can make.

Axis Labels: Name the Quantity, State the Unit

An axis label should answer two questions: what is this axis measuring, and what are the units. Most default axis labels only answer the first question. A chart of Meridian Corp revenue might have a y-axis labeled "Revenue," which tells the reader the dimension but not the scale. Is it dollars? Euros? Thousands of dollars? Millions? The unit matters for interpretation, and omitting it invites misreading.

Put the unit in the label itself, inline. "Revenue (USD millions)." "Temperature Anomaly (degrees Celsius)." "Session Duration (minutes)." "Vaccination Coverage (%)." The parenthetical unit is a convention that most readers recognize, and it eliminates any ambiguity. It also means the tick labels themselves can be simple numbers without unit symbols, which keeps the tick area clean.

If the unit is obvious from context (time on the x-axis of a time series, for example), you can sometimes omit the explicit unit — "Year" is clearer than "Year (calendar)." But when in doubt, put the unit in the label. The cost is a few extra characters; the benefit is eliminated ambiguity.

Tick Labels: Format for Human Reading

Default tick labels on a y-axis showing revenue might look like: 5000000, 10000000, 15000000, 20000000. These are the raw numerical values, but they are almost unreadable. The reader has to count digits to tell whether the axis runs to 5 million or 50 million or 500 million. Formatting is what turns these raw numbers into something humans can read at a glance.

A few specific formatting improvements:

1. Use thousands separators. 5,000,000 is more readable than 5000000. This single change dramatically improves legibility. Most plotting libraries support it but do not use it by default.

2. Scale to natural units and note the scaling. Instead of 5,000,000, show 5M on the tick, or show a plain 5 and state the scaling once in the axis label ("USD millions"). Instead of 0.045, show 4.5% if the quantity is a proportion. The goal is to put the tick labels in the form in which the reader naturally thinks about the quantity.

3. Use the minimum number of digits needed. If the y-axis ranges from 5.0 to 10.0, show the ticks as "5, 6, 7, 8, 9, 10" — not "5.00, 6.00, 7.00, 8.00, 9.00, 10.00." Unnecessary decimal places are visual noise.

4. Right-align numerical tick labels on the y-axis. Numbers with the same number of digits should line up so the reader can compare them at a glance. Most plotting libraries do this by default, but a few do not.

5. Format dates sensibly on time axes. "2020-01-01" is technical and hard to read. "Jan 2020" or "2020" is more human. For a time range spanning years, show year labels. For a time range spanning months, show month labels. For a time range spanning days, show date labels with abbreviated months. The tick labels should reflect the time resolution the chart actually cares about.

6. Rotate tick labels only when necessary. Vertical x-axis tick labels (rotated 90 degrees) are harder to read than horizontal ones. If your labels are too long to fit horizontally, first try abbreviating them or using a horizontal bar chart. If rotation is still needed, 45 degrees is easier to read than 90 degrees.

The details are tedious, but each one makes a measurable difference. A chart with well-formatted tick labels looks professional; a chart with default tick labels looks amateurish. The difference is usually a matter of using a formatter from matplotlib.ticker (which we will cover in Chapter 12) to apply the correct format to the axis.
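As a small preview of that Chapter 12 material, here is a minimal sketch of three formatters from matplotlib.ticker. The revenue figures are invented placeholders, and the choice to show plain numbers with the unit in the axis label is just one of the options described above.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, PercentFormatter, StrMethodFormatter

# Thousands separators: 5000000 -> "5,000,000"
thousands = StrMethodFormatter("{x:,.0f}")

# Scale to millions and drop needless decimals: 5000000 -> "5"
millions = FuncFormatter(lambda value, pos: f"{value / 1e6:g}")

# Proportions as percentages: 0.045 -> "4.5%"
percent = PercentFormatter(xmax=1, decimals=1)

# Placeholder data, invented for illustration.
fig, ax = plt.subplots()
ax.plot([2021, 2022, 2023, 2024], [5e6, 10e6, 15e6, 20e6])

# Plain numbers on the ticks, unit and scaling stated once in the label.
ax.yaxis.set_major_formatter(millions)
ax.set_ylabel("Revenue (USD millions)")
```

Swapping `millions` for `thousands` in the `set_major_formatter` call keeps the raw values but adds separators; `percent` is meant for an axis whose data are proportions between 0 and 1.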

Tick Density: Fewer Ticks Than You Think

Plotting libraries' defaults often put too many ticks on an axis. A y-axis from 0 to 100 might get a tick at every 10 (0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100), for eleven tick labels. This is more than the reader needs. Five or six ticks are usually enough: 0, 20, 40, 60, 80, 100. The reader interpolates between the ticks.

Fewer ticks means less visual noise on the axis. The chart looks cleaner, the tick labels have more room to breathe, and the data is not competing with a forest of tick marks. When in doubt, cut the number of ticks in half and see if the chart still reads correctly.

For time axes, the principle is the same. A chart showing five years of data does not need a tick label for every month — one label per year is usually enough, with optional minor ticks at quarters. A chart showing five days of data can label every day. Match the tick density to the time resolution that matters for the chart's story.
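In matplotlib, tick density is controlled by locators. The following is a minimal sketch, using an invented monthly series, of the two cases just described: a 0 to 100 axis with a tick every 20, and a five-year time axis with one label per year and minor ticks at quarters.

```python
import datetime as dt
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

# Placeholder series: one invented value per month for five years.
months = [dt.date(2020 + i // 12, i % 12 + 1, 1) for i in range(60)]
values = list(range(20, 80))

fig, ax = plt.subplots(figsize=(7, 3))
ax.plot(months, values)

# Y-axis: a tick every 20 instead of every 10.
ax.set_ylim(0, 100)
ax.yaxis.set_major_locator(MultipleLocator(20))

# X-axis: one labeled tick per year, unlabeled minor ticks at quarters.
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonth=(1, 4, 7, 10)))
```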

The Axis Does Not Always Need Labels

One of the more radical ideas from the Chapter 6 declutter discipline is that the axis itself is sometimes optional. If the chart has a small number of data points, each directly labeled with its value, the y-axis becomes redundant — the reader can read the values from the labels instead. If the chart is a sparkline (a small inline chart meant to show a trend), axes are distracting and are typically omitted entirely.

This does not mean axes are usually optional. For most charts with more than a handful of data points, the y-axis carries essential quantitative information and must be present. But for specific designs — small multiples, sparklines, heavily annotated figures — consider whether the axis can be reduced or removed to clean up the chart.
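A minimal sketch of the first case, a handful of bars that each carry their own value label, might look like the following. The categories and counts are invented placeholders, and `bar_label` assumes matplotlib 3.4 or later.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Placeholder data, invented for illustration.
teams = ["North", "South", "West"]
tickets = [12, 30, 21]

fig, ax = plt.subplots(figsize=(5, 3))
bars = ax.bar(teams, tickets, color="0.6")

# Each bar carries its own value, so the y-axis no longer adds information.
value_labels = ax.bar_label(bars, fmt="%.0f", padding=3)
ax.yaxis.set_visible(False)
for side in ("left", "top", "right"):
    ax.spines[side].set_visible(False)
ax.tick_params(axis="x", length=0)
```

With three bars this works; with thirty, the labels would crowd each other and the axis should come back.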

Check Your Understanding — Look at a default matplotlib or Excel chart of a continuous variable. Are the tick labels formatted for human reading (thousands separators, appropriate decimal places, natural units)? Are there more tick marks than the reader needs? What formatting changes would improve the chart without removing any data?

7.6 Direct Labeling vs. Legends

When a chart has multiple categorical series, the reader needs a way to tell which series is which. The default solution is a legend: a small box somewhere on or near the chart that pairs each color with a category name. The alternative is direct labeling: placing the category name directly next to its line, bar, or point in the chart itself.

Legends are the default, but direct labeling is usually better. This section explains why and when.

Why Legends Are the Default (and Why That Is a Problem)

Legends are the default because they are the easiest solution for a plotting library to implement. The library knows the category names from the data. It puts them in a box. It places the box somewhere reasonable (usually "best" or "upper right"). The chart has a legend, the categories are identified, and the library has done its job.

The problem is what happens in the reader's head. When the reader sees a line chart with a legend, their eye has to move between the chart and the legend repeatedly:

Look at the chart. See a red line going up and a blue line going down.
Move eyes to the legend. Read: "Red = Product A, Blue = Product B."
Move eyes back to the chart. Remember which line was red.
Form the conclusion: Product A is going up, Product B is going down.

That sequence takes time, and every eye movement is a small cost. For a chart with two series, the cost is manageable. For a chart with five or six series — especially if the series are similar colors, or if the lines cross each other — the reader's working memory gets overwhelmed and they start losing track of which line is which.

Direct Labeling: The Eye Stays on the Data

Direct labeling eliminates the eye movement. Instead of a legend box, each series gets its name placed directly next to the line, at the end of the line, or at a point where the line is most visually prominent. The reader sees a red line with "Product A" written next to it and a blue line with "Product B" written next to it, and the identification happens without the eye ever leaving the data.

The benefits are significant:

1. The chart becomes more readable in the first glance. The reader can identify each series immediately without a lookup step. The 5-second test is much easier to pass.

2. The chart becomes self-explanatory when decontextualized. A chart that relies on a legend can lose that legend when it is cropped for social media or embedded in a slide, leaving the series unidentified. A chart with direct labels survives cropping because the labels are inside the plotting area, attached to the data.

3. The color burden is reduced. If each line is directly labeled, the reader does not need to distinguish all the colors from each other — they only need to match each label to its line. This means you can use muted colors or even shades of gray for secondary series without losing comprehensibility.

4. The chart feels cleaner. Legend boxes are rectangular artifacts that sit awkwardly within or next to the plotting area. Direct labels flow with the data. The chart looks less like a database output and more like a designed piece of communication.

When Direct Labeling Fails

Direct labeling is not always the right choice. There are specific cases where a legend is better:

1. Many series. With fifteen or twenty series, direct labeling becomes impossible because the labels overlap each other. A dense spaghetti chart with twenty lines cannot be directly labeled; you need a legend — or, better, you need a different chart type (small multiples, or a highlight pattern where most series are grayed out and one is emphasized).

2. Short or crossing series. If the lines are so short that there is no clean place to put a label, or if they cross each other at multiple points, direct labeling gets tangled. A legend off to the side is cleaner.

3. Bar charts with many categories. For a horizontal bar chart with many bars, the category labels are direct labels — they sit on the y-axis next to their bars. A separate legend would be redundant. But for a grouped or stacked bar chart, a small legend identifying the groupings may be necessary.

4. Interactive charts with hover. If the chart is interactive and the reader can hover over a line to see its name, permanent direct labels are less critical, and a compact legend is usually enough. Hover is a form of direct labeling that only appears on demand.

The Hybrid: Direct Labels for the Important Series

A powerful middle ground is to directly label only the series that matter and use a legend (or nothing) for the rest. If your chart has fifteen lines but the story is about three of them, directly label those three and let the other twelve fade into a background of muted gray lines. This technique, sometimes called the "highlight strategy," focuses the reader's attention on the relevant series without hiding the context.

The highlight strategy is common in news graphics. A chart of pandemic cases across fifty U.S. states might color only the reader's state in a bright color with a direct label, while the other forty-nine states appear as muted gray lines providing context. The reader sees their state immediately; the other states provide the comparison without demanding identification.
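A rough sketch of the highlight strategy in matplotlib follows. The fifteen series are generated from an arbitrary formula and the region names are hypothetical; the point is only the pattern of gray context lines plus a few colored, directly labeled ones.

```python
import math
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Placeholder data: fifteen invented series; three of them carry the story.
x = list(range(24))
n_series = 15
highlighted = {3: "Region D", 8: "Region I", 12: "Region M"}  # hypothetical names

fig, ax = plt.subplots(figsize=(7, 4))
for i in range(n_series):
    ys = [math.sin(t / 4 + i) + 0.01 * i * t for t in x]
    if i in highlighted:
        # Story series: colored, thicker, drawn on top, labeled at the end.
        (line,) = ax.plot(x, ys, linewidth=2.2, zorder=3)
        ax.text(x[-1] + 0.3, ys[-1], highlighted[i],
                va="center", color=line.get_color())
    else:
        # Context series: thin muted gray, no label, no legend entry.
        ax.plot(x, ys, color="0.8", linewidth=0.8, zorder=1)

# Leave room on the right for the labels.
ax.set_xlim(0, x[-1] + 4)
```

Nothing here prevents two end labels from colliding; with real data the label positions usually need a manual nudge.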

The Default: Try Direct Labels First

As a working rule, try direct labels first. If the chart has five or fewer series and they have clean endpoints, direct labels almost always look better than a legend. Delete the legend, label the series directly, and see if the chart improves. In most cases, it does. Treat the cases listed under "When Direct Labeling Fails" as exceptions, not as the starting point.

The matplotlib library makes direct labeling slightly harder than legends, because there is no single function for it — you have to call ax.text or ax.annotate for each label, calculate appropriate positions, and handle overlap manually. This friction is one reason legends remain the default in many workflows. In Chapter 12, we will cover the matplotlib code for direct labeling in detail. For now, the important point is the principle: direct labels are usually better, and you should reach for them whenever the chart allows.
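To make the friction concrete, here is a minimal preview of the per-series approach, using `ax.annotate` with a small offset from each line's last point. The two product series are invented placeholders, and the sketch makes no attempt to resolve overlapping labels.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Placeholder data, invented for illustration.
quarters = [1, 2, 3, 4, 5, 6]
series = {
    "Product A": [10, 12, 15, 19, 24, 30],
    "Product B": [28, 26, 23, 21, 18, 15],
}

fig, ax = plt.subplots(figsize=(7, 4))
labels = []
for name, ys in series.items():
    (line,) = ax.plot(quarters, ys, linewidth=2)
    # Put the name just right of the last point, in the line's own color.
    label = ax.annotate(
        name,
        xy=(quarters[-1], ys[-1]),
        xytext=(6, 0),
        textcoords="offset points",
        va="center",
        color=line.get_color(),
    )
    labels.append(label)

# Widen the x-range so the labels are not clipped; no legend is created.
ax.set_xlim(quarters[0], quarters[-1] + 1.2)
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
```

Because each label takes its color from its line, the reader can match them even if the colors are later muted.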

Check Your Understanding — Look at a chart you have made that uses a legend. Could you replace the legend with direct labels? What would you have to change about the chart's layout to make room for the labels?

7.7 Source Attribution and Context Notes

The final category of chart text is the one that is easiest to forget and most important for trust: the source attribution, a single line at the bottom of the chart naming the data source. Without it, the reader has no way to evaluate whether the chart's claims are credible, whether the data is recent, or whether the methodology is defensible. With it, the reader has a foothold for skepticism and verification, which is, paradoxically, what makes the chart more trustworthy.

What a Source Attribution Includes

A source attribution is short and factual. At minimum, it names the organization or publication that provided the data. For many charts, that is enough: "Source: NOAA" or "Source: U.S. Bureau of Labor Statistics" tells the reader where to look if they want to verify the claim.

For higher-stakes contexts — academic publications, journalism, charts for policy audiences — the attribution should include:

The source organization (NOAA, World Bank, a specific journal, a specific report)
The specific dataset or report title, if relevant
The data vintage (the date the data was collected or the as-of date)
Any processing notes ("adjusted for inflation," "3-year rolling average")

A typical attribution for a climate chart might read: "Source: NASA GISS Surface Temperature Analysis, 1880–2024. Baseline: 1951–1980 average."

The attribution sits at the bottom of the chart, typically in small type (8–9 pt), often in a muted color that does not compete with the data or the title. It is non-data ink, but it is essential non-data ink — it is exactly the kind of ink that Chapter 6 said should survive decluttering.

Why Attribution Is More Than Citation

Source attribution is not just a citation convention — it is a trust mechanism. When the reader sees the source, they can judge whether the source is reputable. They can look up the original data if they want to verify. They can see whether the chart is based on official statistics or a single researcher's blog post. The attribution converts the chart from an unverifiable claim into a piece of evidence.

This is also why attribution should be on the chart itself, not in a caption or footnote. Captions and footnotes get separated from the chart when it is shared, screenshotted, or embedded. The attribution should travel with the chart, which means it has to be inside the image. A chart without an on-image attribution is one that can be shared in a way that strips its provenance, and a chart without provenance is one that can be casually distrusted.

Major data journalism outlets (The New York Times, the Financial Times, The Economist) routinely include source attribution on the chart itself. So do scientific visualizations in reputable outlets. The practice is a mark of credible visualization, and adopting it is a cheap way to make your charts more trustworthy.

Context Notes: When the Data Needs Explanation

Sometimes the data has a wrinkle that the reader needs to know about. The metric was redefined in 2018. The geographic boundaries changed in 2015. The measurement methodology was improved starting in 2020. Excluding these facts would not be a lie, but including them makes the chart honest in a way that raw data alone cannot be.

Context notes are short lines of text near the source attribution, explaining whatever the reader needs to know to interpret the data correctly. Examples:

"Methodology updated in 2018; earlier data estimated by back-calculation."
"Data excludes Alaska and Hawaii for consistency with pre-1950 records."
"Vaccination rates include only primary series; booster doses not counted."
"Revenue figures in constant 2020 dollars, adjusted for inflation."

These notes are not apologies. They are transparency statements. A chart that includes them is doing the work of responsible communication; a chart that omits them is leaving the reader to potentially draw the wrong conclusions.

The ethical connection to Chapter 4 is direct. Chapter 4 argued that context omission is one of the most common forms of visualization dishonesty — not because the omitted context is a lie, but because its absence lets the reader form an impression the full data would not support. Context notes are the antidote. They take the most important caveats and put them on the chart so the reader cannot miss them.

The Discipline of Always Including Attribution

Make source attribution non-negotiable. Every chart you produce for any audience beyond your own notebook should include a source attribution. The discipline is cheap to maintain — one line of text, often a simple fig.text or plt.figtext call in matplotlib — and the payoff is a consistent baseline of trust across all of your work.

As a practical matter, build attribution into the wrapper function you use to create or save charts, so it appears automatically on every one. (A matplotlib style file cannot do this on its own: it sets defaults such as fonts and colors, but it cannot add text.) Make it part of the default, not an extra step. Charts that are created without attribution are charts that will be shared without attribution, and that is how decontextualized data misinformation spreads.
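One way to build that habit is a small helper that every chart passes through. The sketch below is an assumption about how such a helper could look: the name `add_source`, its parameters, the placement, and the example organization are all hypothetical, and only `fig.text` is matplotlib's own.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt


def add_source(fig, source, note=None):
    """Write a small, muted source line at the bottom left of the figure.

    Hypothetical helper: `source` names the data provider and `note` is an
    optional context note (methodology change, inflation adjustment, ...).
    """
    text = f"Source: {source}."
    if note:
        text += f" {note}"
    return fig.text(0.01, 0.01, text, ha="left", va="bottom",
                    fontsize=8, color="0.4")


# Placeholder chart, invented for illustration.
fig, ax = plt.subplots()
ax.plot([2016, 2018, 2020], [2, 4, 3])
fig.subplots_adjust(bottom=0.18)  # leave room under the axes for the line

credit = add_source(fig, "Example Statistics Office",
                    note="Methodology updated in 2018.")
```

Because the text is attached to the figure rather than to a caption, it is saved inside the image and travels with the chart when it is shared.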

Check Your Understanding — Look at the charts you have made in the past month. How many of them have on-image source attribution? If the number is less than "all of them," what would it take to make attribution part of your standard workflow?

7.8 Bringing It Together: The Climate Plot with Words

Throughout Part I and now Part II, we have been carrying the climate plot — the default matplotlib time series chart of global temperature anomalies from 1880 to 2024. In Chapter 6, we decluttered it: removed the top spine, lightened the gridlines, deleted the figure border, simplified the color palette. The result was a clean chart with no noise. This chapter gives that clean chart a voice.

The final version, after applying every principle from this chapter, looks like this (in description, since we are still in no-code Part II):

Title (action, left-aligned, 18pt semi-bold): "Global Temperatures Are Now 1.2°C Above the Mid-20th-Century Average"

Subtitle (14pt regular, medium gray): "NASA GISS Surface Temperature Analysis, 1880–2024. Baseline: 1951–1980 average."

The plotting area: A clean line chart. The temperature anomaly line in a strong color (a warm red-orange for the recent period, muted gray for the historical stable period — or a single color throughout if preferred). A horizontal reference line at zero (the baseline). Faint horizontal gridlines at 0.5-degree intervals. No top spine, no right spine. Bottom and left spines in medium gray. Light tick marks.

Y-axis label (12pt regular, left-aligned to the axis): "Temperature Anomaly (°C)"

X-axis label: (usually omitted for time axes — year is self-explanatory)

Tick labels: Years at 20-year intervals (1880, 1900, 1920, …, 2020) along the x-axis. Temperature values at 0.5-degree intervals (–0.5, 0, 0.5, 1.0, 1.5) along the y-axis.

Annotation 1 (callout, 10pt, muted color, aligned to the data point): "2016: +1.01°C — record year at the time"

Annotation 2 (callout, 10pt, muted color): "2023: +1.18°C — currently the warmest year on record"

Annotation 3 (shaded horizontal band, labeled at the right): "1951–1980 baseline: 0°C"

Source attribution (8pt regular, muted color, bottom left): "Source: NASA GISS. Processed by the author. Last updated: December 2024."

No legend: the chart has a single series. No decorative borders. No drop shadows. Every word on the chart is doing specific work: the title states the finding, the subtitle provides the dataset and baseline, the axis label states the units, the annotations call out the two most important data points and mark the baseline, and the source attribution enables verification.

A reader seeing this chart for the first time, with no surrounding context, can understand it in under five seconds. The title tells them what to conclude. The chart shows them the evidence. The annotations direct their attention to the most striking details. The source tells them where to verify it. The chart is self-explanatory. That is the goal of this chapter, and it is achievable, not with elaborate design tools or expert training, but with a disciplined application of the principles we have covered.

The code to produce this chart in matplotlib is waiting in Chapter 12. For now, the principles are what matter. Every chart you make from this point forward should meet the self-explanatory standard. The title should state the finding. The annotations should direct the eye. The source should enable trust. The typography should create a hierarchy that the reader can scan without effort. These are the words on your chart, and they are as important as the chart itself.
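For readers who want a preview before Chapter 12, the following is a rough sketch of how the text hierarchy described above could be laid out in matplotlib. It is not the book's final code. The temperature curve is an invented placeholder, not the NASA GISS record; only the 2016 and 2023 values are taken from the description. The colors, positions, and slightly shortened annotation wording are assumptions, and the source line is only legitimate once real data replaces the placeholder.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

# PLACEHOLDER DATA: an invented smooth curve, NOT the NASA GISS record.
# Only the 2016 and 2023 values are the ones quoted in the description.
years = list(range(1880, 2025))
anoms = [-0.2 + 1.2 * ((y - 1880) / 144) ** 3 for y in years]
anoms[years.index(2016)] = 1.01
anoms[years.index(2023)] = 1.18

fig, ax = plt.subplots(figsize=(10, 6))
fig.subplots_adjust(left=0.08, right=0.95, top=0.80, bottom=0.12)

ax.plot(years, anoms, color="#d1495b", linewidth=1.8)
ax.axhline(0, color="0.5", linewidth=0.8)  # baseline reference line
ax.set_xticks(list(range(1880, 2021, 20)))
ax.set_yticks([-0.5, 0, 0.5, 1.0, 1.5])
ax.set_ylim(-0.6, 1.6)
ax.grid(axis="y", color="0.9", linewidth=0.6)
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
for side in ("bottom", "left"):
    ax.spines[side].set_color("0.5")
ax.set_ylabel("Temperature Anomaly (°C)", fontsize=12)

# Title and subtitle as figure-level text, left-aligned with the axes.
title = fig.text(0.08, 0.92,
                 "Global Temperatures Are Now 1.2°C Above the Mid-20th-Century Average",
                 fontsize=18, fontweight="semibold", ha="left")
subtitle = fig.text(0.08, 0.86,
                    "NASA GISS Surface Temperature Analysis, 1880–2024. "
                    "Baseline: 1951–1980 average.",
                    fontsize=14, color="0.4", ha="left")

# Two point callouts with thin leader lines, plus a label for the baseline.
callouts = [
    (2016, "2016: +1.01°C (record year at the time)", (-230, -5)),
    (2023, "2023: +1.18°C (warmest year on record)", (-200, 25)),
]
for year, text, offset in callouts:
    ax.annotate(text, xy=(year, anoms[years.index(year)]),
                xytext=offset, textcoords="offset points",
                fontsize=10, color="0.3",
                arrowprops=dict(arrowstyle="-", color="0.5", linewidth=0.6))
ax.text(2024, 0.03, "1951–1980 baseline: 0°C", ha="right", va="bottom",
        fontsize=10, color="0.4")

# Source line inside the image. Valid only with the real dataset loaded.
source = fig.text(0.08, 0.02,
                  "Source: NASA GISS. Processed by the author. "
                  "Last updated: December 2024.",
                  fontsize=8, color="0.4", ha="left")
```

Each call maps to one line of the description: two `fig.text` calls for the title and subtitle, `set_ylabel` for the unit, `annotate` for the callouts, and a final `fig.text` for the source.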

Chapter Summary

This chapter argued that a decluttered chart is not the same as a self-explanatory chart. Removing chart-junk clears the visual noise; adding words tells the reader what the chart means. The five categories of chart text — title, subtitle, axis labels, annotations, source attribution — each do specific work that the decluttered chart alone cannot do.

The central idea is the action title: a title that states the finding rather than describing the topic. "Revenue Grew 18% in 2024" is an action title. "Quarterly Revenue" is a descriptive title. For most real-world communication contexts, the action title is better because it enables the reader to grasp the main message in the 5-second reading window. Writing a good action title is a disciplined act of interpretation, and it is probably the single highest-impact change most practitioners can make to their chart output.

Annotations are the next most important category. An annotation is a small piece of text placed directly on the data, calling out a specific feature and explaining why it matters. One to three annotations per chart is the usual range; more becomes overwhelming. Annotations are underused in default plotting because the libraries do not make them easy, but the cost of learning to use them is repaid many times over in reader comprehension.

Typography — font choice, size hierarchy, weight, alignment, whitespace — is the supporting craft. The five principles covered in Section 7.2 (single font family, size hierarchy, weight for emphasis, meaningful alignment, whitespace around text) are the typographic foundation for every chart you make.

Axis labels and tick labels deserve explicit attention. Default formatting is almost always inadequate — unit missing, tick values unformatted, tick density too high. A few specific formatting moves (thousands separators, natural units, human-readable dates, fewer ticks) turn default axes into professional ones without any design expertise.

Direct labeling is usually preferable to a separate legend, because it eliminates the eye movement between the data and the legend box. Legends remain appropriate for charts with many series, short or crossing lines, and interactive contexts where hover provides on-demand labeling.

Source attribution is non-negotiable. Every published chart should include an on-image source line that names the data provider, the vintage, and any essential methodology notes. Attribution is a trust mechanism, not just a citation convention.

The threshold concept is that the title states the finding. An action title is worth more than any other single design change, and once you internalize the distinction between descriptive and action titles, you will find that most of your charts can be improved in under a minute by rewriting the title.

Next in Chapter 8: layout, composition, and small multiples. Once individual charts are clean and self-explanatory, the next question is how to arrange multiple charts on a page or screen to tell a coherent story. The principles we have built so far — declutter, typography, annotation — set the stage for composition. Small multiples are the answer to most questions that one chart alone cannot answer.

Spaced Review: Concepts from Chapters 1-6

These questions reinforce ideas from earlier chapters. If any feel unfamiliar, revisit the relevant chapter before proceeding.

Chapter 1: The "visualization as argument" framework says every explanatory chart makes a claim. How does the action title relate to that framework? Is the action title the place where the claim becomes explicit?
Chapter 2: Pre-attentive processing happens in under 250 milliseconds. How does typography support or fight pre-attentive processing? What typographic choices help the reader form a correct first impression in that window?
Chapter 3: The luminance-first principle says a chart should be interpretable in grayscale. How does this principle apply to text on a chart? Should titles, annotations, and source attributions have the same luminance as the data?
Chapter 4: Chapter 4 argued that context omission is a form of visualization dishonesty. How do source attribution and context notes address this concern? Why is on-chart attribution more honest than attribution in a caption?
Chapter 5: The chart selection matrix tells you which chart type to use. Does the choice of title, annotations, and axis labels depend on the chart type? Are there typography moves that work for one chart type but not another?
Chapter 6: The declutter procedure removes non-data ink that does not earn its place. The words in this chapter are non-data ink. Why do they earn their place? Apply the maximal deletion test to them: what happens if you delete the title? The subtitle? The annotations? The source attribution?