Exercises: Why Visualization Matters

These exercises do not require any programming. They require thinking, observing, and writing. That is deliberate — the skills you build here are the foundation for every chart you will ever make.


Part A: Conceptual (8 problems)

A.1 ★☆☆ | Recall

Explain in two to three sentences what Anscombe's Quartet demonstrates. Why was it significant when it was published in 1973?

Guidance Focus on the relationship between summary statistics and the actual structure of data. What did all four datasets share, and what was different about them? Think about what someone relying only on statistics would conclude versus what someone looking at plots would see.

A.2 ★☆☆ | Recall

Define the term cognitive amplifier as used in this chapter. Give one example from everyday life (not related to data visualization) that fits the same definition.

Guidance The key idea is extending human cognitive capacity — not replacing it, but making it capable of more than it could do unaided. Think about tools that let you reason about things that would be impossible to hold in your head alone.

A.3 ★☆☆ | Recall

List three pre-attentive visual features described in the chapter. For each one, describe a specific situation in a chart where that feature would help a viewer spot a pattern.

Guidance Pre-attentive features are processed before conscious attention engages. Think about what "pops out" when you glance at a visual scene. The chapter lists color hue, size, orientation, and position.

A.4 ★★☆ | Understand

The chapter describes the "5-second rule." In your own words, explain why five seconds is the benchmark, and describe what happens cognitively in those five seconds when a viewer looks at a new chart.

Guidance Connect this to pre-attentive processing (which happens in under a second) and the additional time needed to read titles, orient to axes, and integrate the visual pattern into a coherent message. Why is this time constraint important for chart design?

A.5 ★★☆ | Understand

Explain the difference between exploratory and explanatory visualization. For each mode, identify the audience, the primary goal, and one design principle that matters most.

Guidance Think about who is looking at the chart and why. Consider what "success" looks like in each mode. For exploratory, success is discovering something new. For explanatory, success is communicating something specific to someone else.

A.6 ★★☆ | Understand

The chapter states that "summary statistics are lossy compressions of data." Explain what this metaphor means. What is being "compressed"? What is "lost"?

Guidance Think about what information a mean, standard deviation, and correlation preserve about a dataset, and what information they discard. The Datasaurus Dozen is a vivid example — what properties of the dinosaur-shaped dataset are invisible in its summary statistics?

A.7 ★★☆ | Understand

The chapter introduces the "visual argument" framework with three components: claim, evidence, and design choices as rhetoric. Explain why the word "rhetoric" is used here. Is it a negative term?

Guidance Rhetoric, in its classical sense, is the art of persuasion — making a case effectively. It is not inherently manipulative. Consider how a chart designer makes choices that direct attention, emphasize certain comparisons, and guide the viewer toward a conclusion. How is this similar to how a writer constructs a written argument?

A.8 ★★★ | Understand

The chapter claims that "visualization is not a way to present findings — it IS a way to think." Construct an argument for this position in four to five sentences, using at least one specific example from the chapter.

Guidance This is the threshold concept of the chapter. Think about how an analyst who only uses visualization at the end of their workflow differs from one who uses it at the beginning. What do they discover? What do they miss? The health insurance example from Section 1.1 is one concrete illustration you could use.

Part B: Applied Analysis (5 problems)

B.1 ★★☆ | Apply

You are given the following summary statistics for two datasets:

Property    Dataset X    Dataset Y
Mean        50.0         50.0
Std Dev     12.3         12.3
Median      49.5         50.2

A colleague says, "These datasets are essentially the same." Write a response explaining why this conclusion may be premature. What would you recommend as the next step?

Guidance Apply the lesson of Anscombe's Quartet directly. What kinds of structures (clusters, outliers, nonlinear patterns, gaps) could exist in data with these summary statistics? What specific type of visualization would you suggest, and why?
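
Optional, and not needed to answer the problem: readers who want to check the idea numerically could try a sketch like the one below. It rescales two made-up shapes (one single hump, one pair of separated clusters) to the mean and standard deviation in the table, then bins them. The raw values, the helper names `rescale` and `text_histogram`, and the bin edges are all assumptions for illustration. Only the mean and standard deviation are matched here, not the medians from the table.

```python
import statistics

def rescale(values, mean=50.0, stdev=12.3):
    """Linearly rescale values to a target mean and sample standard deviation."""
    m = statistics.fmean(values)
    s = statistics.stdev(values)
    return [mean + (v - m) * stdev / s for v in values]

def text_histogram(values, lo=15, hi=85, bins=7):
    """Count how many values fall into each equal-width bin between lo and hi."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(max(int((v - lo) // width), 0), bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical raw shapes (not from the chapter).
one_hump = [-2, -1, -1, 0, 0, 0, 0, 1, 1, 2]
two_clusters = [-1, -1, -1, -1, -1, 1, 1, 1, 1, 1]

dataset_x = rescale(one_hump)
dataset_y = rescale(two_clusters)

for name, data in (("X", dataset_x), ("Y", dataset_y)):
    print(name,
          round(statistics.fmean(data), 1),   # 50.0 for both
          round(statistics.stdev(data), 1),   # 12.3 for both
          text_histogram(data))
# X bins: [0, 1, 2, 4, 2, 1, 0]
# Y bins: [0, 0, 5, 0, 5, 0, 0]
```

The two summary numbers agree, yet the bin counts show one dataset peaking in the middle and the other with nothing in the middle at all.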

B.2 ★★☆ | Apply

Consider the following scenario: A news article states, "Crime in City X has increased by 40% over the last year." The article includes a bar chart comparing this year's crime count to last year's.

Identify at least three questions you would ask before accepting the chart's visual message at face value.

Guidance Think about the chart design choices (does the y-axis start at zero?), the data choices (what counts as "crime"? is the population of City X changing?), and the framing choices (why compare only two years?). What additional context would make the chart more informative or more honest?

B.3 ★★★ | Apply

Your manager asks you to create a dashboard showing 15 different metrics for the company's quarterly business review. Based on what you learned in this chapter, write a brief (150-200 word) memo explaining why showing all 15 metrics as charts may not be the best approach, and propose an alternative.

Guidance Apply the "not everything needs a chart" principle and the "dashboard of everything" pitfall. Think about which metrics genuinely benefit from visualization (trends, comparisons, distributions) versus which might be better served by a simple number or table. Also consider the cognitive load on the viewer.

B.4 ★★☆ | Apply

Classify each of the following as exploratory or explanatory visualization. Justify each answer in one sentence.

  1. A scatter plot matrix of all variables in a new dataset, generated as the first step of analysis.
  2. A polished bar chart in a company's annual report showing revenue by region.
  3. A histogram of model residuals checked during regression diagnostics.
  4. An infographic posted on a public health agency's website about vaccination rates.
  5. A quick box plot comparing distributions across groups before choosing a statistical test.

Guidance Ask yourself: Who is the audience? Is the goal discovery or communication? Is the chart rough and iterative or polished and purposeful?

B.5 ★★★ | Apply

Choose a chart you have seen in the news, in a textbook, or on social media within the past month. Apply the visual argument framework: (1) State the chart's claim in one sentence. (2) Identify the evidence (what data does the chart present?). (3) Describe at least two design choices and how they reinforce or undermine the claim.

Guidance If you cannot find a recent chart, search for "misleading chart" or "chart of the day" online. Be specific about the design choices — axis range, color, annotations, aspect ratio, chart type. Are these choices helping the viewer understand the data or distorting their perception?

Part C: Real-World Scenarios (5 problems)

C.1 ★★☆ | Analyze

You work for a hospital, and the CEO asks you to "make a chart showing that patient satisfaction has improved." When you look at the data, you find that satisfaction scores have gone from 7.2 to 7.4 on a 10-point scale over the past year.

Describe two different chart designs: one that would make the improvement look dramatic and one that would represent it honestly. What ethical considerations are at play?

Guidance Think about axis range, baseline, and visual proportion. What does a truncated axis do to the viewer's perception? What is your professional obligation when someone asks you to make data "look good"?
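
Optional: the effect of the baseline can be put into numbers. The sketch below compares how tall the second bar looks relative to the first when the axis starts at zero and when it starts just below the data. The function name `apparent_ratio` and the truncated baseline of 7.0 are assumptions chosen for illustration; the scores are the ones given in the problem.

```python
def apparent_ratio(before, after, baseline):
    """Ratio of drawn bar heights when the value axis starts at `baseline`."""
    return (after - baseline) / (before - baseline)

# Axis starting at zero: the bars differ by about 3 percent in height.
honest = apparent_ratio(7.2, 7.4, baseline=0.0)

# Axis starting at 7.0 (assumed): the second bar is drawn about twice as tall.
truncated = apparent_ratio(7.2, 7.4, baseline=7.0)

print(round(honest, 2), round(truncated, 2))  # 1.03 2.0
```

The data changed by the same 0.2 points in both cases; only the drawn proportion differs.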

C.2 ★★★ | Analyze

A public health department wants to communicate the risk of a new disease to the general public. They have data showing that the disease affects 3 out of every 10,000 people. A colleague suggests expressing this as "a 0.03% risk" in text rather than creating a visualization.

Argue for and against using a visualization in this case. Under what circumstances would a number be better? Under what circumstances might a chart (or icon array) be more effective?

Guidance Consider the audience (general public, not statisticians), the goal (accurate risk perception), and the known biases in how people interpret small probabilities. An icon array showing 3 highlighted figures among 10,000 can make a tiny probability *feel* appropriately small. But it can also make it feel larger than it is. What matters most: accuracy or impact?
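
Optional: the different ways of stating this risk can be checked with a few lines of arithmetic. The sketch below converts the rate from the problem into a percentage and a "1 in N" figure, and notes one possible icon-array layout. The 100-by-100 grid is an assumption, not a layout prescribed by the chapter.

```python
from fractions import Fraction

risk = Fraction(3, 10_000)        # 3 affected out of every 10,000 people

percent = float(risk * 100)       # 0.03 (percent)
one_in = round(1 / risk)          # roughly 1 person in 3333

# One possible icon array (assumed layout): a 100 x 100 grid of figures.
grid_side = 100
total_icons = grid_side * grid_side
highlighted = int(risk * total_icons)

print(percent, one_in, highlighted, total_icons)  # 0.03 3333 3 10000
```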

C.3 ★★☆ | Apply

A climate change skeptic shows you a chart of global temperatures from 2016 to 2022 and says, "See? Temperatures aren't rising." The chart shows a relatively flat line with some up-and-down variation.

Using concepts from this chapter, explain what is misleading about this argument. What would you show in response, and why?

Guidance This is the cherry-picked time frame problem. What happens when you show the full temperature record from 1880 to 2025? What does the short time frame hide? Also consider natural variability — short-term fluctuations are expected even within a long-term trend.

C.4 ★★★ | Analyze

A social media company publishes a chart showing "user engagement" increasing by 300% over three years. The chart uses a 3D area chart with perspective effects, making the recent growth appear enormous.

Identify at least three specific design problems with this chart. For each, explain how the design choice distorts the viewer's perception and what an honest alternative would be.

Guidance Apply the concepts of area distortion, 3D effects, and chart junk. Think about how perspective changes the perceived area of shapes in the front versus the back of a 3D chart. Consider also whether "engagement" is clearly defined and whether 300% growth from a small base is meaningful.
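
Optional: a short calculation can help with the "small base" question. A 300% increase means the final value is four times the starting value, so the absolute change depends entirely on where the series began. The two starting values below are hypothetical; the problem does not state the actual engagement numbers.

```python
def after_increase(base, percent_increase):
    """Value after growing `base` by `percent_increase` percent."""
    return base * (1 + percent_increase / 100)

# Hypothetical starting values, for illustration only.
for base in (500, 500_000):
    final = after_increase(base, 300)
    print(base, final, final - base)
# 500 2000.0 1500.0
# 500000 2000000.0 1500000.0
```

Both rows are "300% growth", but one adds 1,500 units and the other adds 1.5 million.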

C.5 ★★☆ | Apply

You are preparing a report for your company's board of directors. You have five key findings from your quarterly analysis. For each finding below, recommend whether to use a chart, a number, or a table — and briefly explain why.

  1. Overall revenue is $12.4 million, up 8% from last quarter.
  2. Revenue by product line shows that two of five lines are declining while three are growing.
  3. Customer satisfaction scores by region show significant variation across 12 regions.
  4. The exact budget breakdown across 30 cost categories.
  5. Web traffic over the past 12 months shows a seasonal pattern with a sharp spike in November.
Guidance Apply the "when to visualize" and "when not to visualize" guidelines from Section 1.5. For each item, ask: Is there a pattern to show? A comparison to make? A distribution to reveal? Or is a simple number or precise lookup table the right tool?

Part D: Synthesis (4 problems)

D.1 ★★★ | Evaluate

The chapter presents Tufte's data-ink ratio and chart junk principles, but also notes that some researchers have found that decorative elements can aid memory and engagement. Take a position: should chart designers prioritize minimalism (Tufte's view) or allow decorative elements that might increase memorability? Support your argument with at least two specific reasons.

Guidance There is no single correct answer here. Consider the context: Who is the audience? What is the purpose of the chart? A chart in a scientific journal may need different design principles than a chart in a newspaper or on social media. Think about the trade-offs between clarity, memorability, and engagement.

D.2 ★★★ | Evaluate

The chapter introduces the idea that "every chart is an argument." A colleague pushes back: "I'm not arguing anything. I'm just showing the data objectively." Write a response (150-200 words) explaining why even a "neutral" chart involves choices that shape the message.

Guidance Think about all the decisions that go into making even a "simple" chart: what data to include, what chart type to use, what axis range to set, what to title it, what colors to use. Each of these choices could be made differently, and the resulting chart would tell a different visual story. "Objectivity" in chart-making is not the absence of choices but the presence of *good* choices made transparently.

D.3 ★★★ | Create

Write a one-paragraph description of a fictional scenario in which a well-designed visualization leads to a better decision than a table or summary statistic would have. Then write a second paragraph describing a scenario where a visualization is unnecessary and a simple number is the better communication tool. Make both scenarios as specific and realistic as possible.

Guidance For the first scenario, think about situations where patterns, outliers, or trends are the key information. For the second, think about situations where a single metric or comparison is all that matters. Real-world contexts (business, healthcare, education, government) make the scenarios more compelling.

D.4 ★★★ | Evaluate

Florence Nightingale used visualization to argue for military hospital reform. Some historians have noted that her charts, while effective, also exaggerated some differences through design choices. Does the fact that her charts served a humanitarian purpose justify design choices that might have amplified the visual message beyond what a strictly proportional chart would show? Discuss the ethical tensions in 200-250 words.

Guidance This is a genuine ethical dilemma. On one hand, strict proportionality is a principle of graphical integrity. On the other hand, Nightingale was trying to save lives, and a perfectly "accurate" chart might not have been compelling enough to move politicians to action. Consider whether the ends justify the means in data visualization, and what precedent this sets.

Part M: Mixed / Interleaved

This section is intentionally empty for Chapter 1. Interleaved review problems — which mix concepts from multiple chapters — will begin in Chapter 2. These problems are among the most valuable for long-term retention, as research on interleaving shows that mixing practice across topics strengthens the ability to discriminate between concepts and select appropriate strategies. Look forward to them.


Part E: Research / Extension (3 problems)

E.1 ★★★ | Research

Find the original Anscombe (1973) paper, "Graphs in Statistical Analysis," published in The American Statistician. Read it (it is short, only a few pages). Then answer: What was Anscombe's stated purpose in creating the quartet? What specific audience was he addressing, and what behavior was he trying to change?

Guidance The paper is freely available through many university libraries and can be found with a web search. Pay attention to Anscombe's tone — he is writing for practicing statisticians and is making a case that they systematically undervalue graphical methods. How does his argument compare to the one made in this chapter, fifty years later?

E.2 ★★★ | Research

Look up the original Datasaurus Dozen paper by Matejka and Fitzmaurice (2017): "Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics" from the ACM CHI Conference. How did the authors generate datasets with identical statistics but different shapes? What algorithm did they use?

Guidance The paper describes a simulated annealing approach — an optimization technique that iteratively adjusts data points to match target statistics while moving toward a target shape. Understanding how the datasets were constructed deepens your appreciation of how much information summary statistics lose.
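
Optional: the idea in the guidance above can be sketched in a few dozen lines. This is a simplified illustration, not the authors' code. It nudges one point at a time, keeps a move only if the summary statistics still match to two decimal places, and prefers moves that bring the point closer to a target shape, while a falling "temperature" occasionally allows a worse move. The circular target, the linear cooling schedule, the step size, and all function names are assumptions made for this sketch.

```python
import math
import random
import statistics

def summary(points):
    """Means, standard deviations, and Pearson r, each rounded to 2 decimals."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in points) / (len(points) - 1)
    return tuple(round(v, 2) for v in (mx, my, sx, sy, cov / (sx * sy)))

def circle_distance(point, radius=1.5):
    """Distance from a point to a circle of the given radius centred at the origin."""
    return abs(math.hypot(point[0], point[1]) - radius)

def anneal(points, shape_distance, steps=5000, step_size=0.05, seed=0):
    """Move points toward a shape while the rounded statistics stay fixed."""
    rng = random.Random(seed)
    points = list(points)
    target_stats = summary(points)
    for i in range(steps):
        temperature = 0.4 * (1 - i / steps)  # assumed linear cooling schedule
        k = rng.randrange(len(points))
        old = points[k]
        new = (old[0] + rng.gauss(0, step_size),
               old[1] + rng.gauss(0, step_size))
        closer = shape_distance(new) < shape_distance(old)
        if not closer and rng.random() >= temperature:
            continue  # worse move, and the temperature did not allow it
        points[k] = new
        if summary(points) != target_stats:
            points[k] = old  # undo: the rounded statistics changed
    return points

# Hypothetical starting cloud: 40 points drawn from a standard normal.
start_rng = random.Random(1)
start = [(start_rng.gauss(0, 1), start_rng.gauss(0, 1)) for _ in range(40)]
result = anneal(start, circle_distance)

print(summary(start) == summary(result))  # True
```

Because every accepted move is checked against the original rounded statistics, the final cloud reports the same five numbers as the starting one, however far the individual points have drifted.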

E.3 ★★★ | Research

Find three examples of data visualizations from the COVID-19 pandemic (2020-2023) that were widely shared on social media. For each, identify: (1) What claim was the chart making? (2) Was the chart exploratory or explanatory? (3) Did any design choices make the chart misleading or confusing? Write a brief (100-word) analysis of each.

Guidance The COVID-19 pandemic produced an unprecedented volume of public data visualization — from the Johns Hopkins dashboard to the Financial Times tracker to countless charts shared on Twitter/X. Look for examples that were praised for clarity, criticized for misleading design, or went viral because of their visual impact. Consider how the urgency of the situation affected the trade-off between speed and design quality.

Exercises completed. Return to these after finishing Chapter 2 — you will find that concepts from perception science deepen your answers to several of these problems.