Quiz: Why Visualization Matters
20 questions. Aim for mastery (18+). If you score below 14, revisit the relevant sections before moving to Chapter 2.
Multiple Choice (10 questions)
1. Anscombe's Quartet demonstrates that:
(a) Four datasets cannot share the same mean (b) Correlation always reveals the true relationship between variables (c) Datasets with identical summary statistics can have fundamentally different structures (d) Scatter plots are always superior to line charts
Answer
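The claim tested by this question can be checked numerically. The following is a minimal sketch, not code from the chapter: it uses only Python's standard library, the values are Anscombe's published 1973 data, and the names `QUARTET`, `pearson_r`, `summarize`, and `SUMMARIES` are illustrative choices.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's four datasets; sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(x, y):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def summarize(x, y):
    """Return the summary statistics that the quartet is known for sharing."""
    return {
        "mean_x": mean(x),
        "mean_y": mean(y),
        "var_x": variance(x),  # sample variance
        "var_y": variance(y),
        "r": pearson_r(x, y),
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimals, the four rows should be indistinguishable, which is exactly why the numbers alone cannot tell the datasets apart. Plotting each pair is what exposes the differences.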
**(c)** Datasets with identical summary statistics can have fundamentally different structures. All four of Anscombe's datasets share nearly identical means, variances, correlations, and regression lines, yet when plotted they reveal four completely different patterns: linear, curved, linear-with-outlier, and clustered-with-outlier. The point is that summary statistics are "lossy compressions" that discard structural information.
2. The Datasaurus Dozen extended Anscombe's work by:
(a) Proving that all scatter plots look the same (b) Showing that 13 visually distinct datasets can share the same summary statistics (c) Demonstrating that summary statistics are always useless (d) Replacing Anscombe's Quartet with a more accurate version
Answer
**(b)** Showing that 13 visually distinct datasets can share the same summary statistics. Matejka and Fitzmaurice (2017) created 13 datasets, including one shaped like a Tyrannosaurus rex, all sharing the same mean, standard deviation, and correlation to two decimal places. This dramatically reinforced Anscombe's point at a much larger scale.
3. The term "cognitive amplifier" in this chapter refers to:
(a) Software that automates data analysis (b) A tool that extends human cognitive capacity beyond its unaided limits (c) A type of chart that uses animation to enhance understanding (d) The tendency for people to overinterpret visual patterns
Answer
**(b)** A tool that extends human cognitive capacity beyond its unaided limits. Visualization is a cognitive amplifier in the same way that writing, mathematics, and maps are: it lets you perceive and reason about patterns that would be invisible or extremely difficult to detect from raw numbers alone. It transforms data from a format the brain handles poorly (tables) into one it handles superbly (spatial patterns).
4. Pre-attentive processing in the context of data visualization means:
(a) The viewer must carefully study the chart before understanding it (b) Certain visual features are detected automatically, before conscious attention is engaged (c) Charts should be viewed before reading any accompanying text (d) The designer plans the chart layout before choosing the data
Answer
**(b)** Certain visual features are detected automatically, before conscious attention is engaged. Pre-attentive processing occurs in roughly 200-250 milliseconds and includes detection of color hue, size differences, orientation changes, and position anomalies. Effective charts exploit this mechanism; ineffective charts fight against it with visual clutter.
5. William Playfair is significant in the history of data visualization because he:
(a) Invented the computer (b) Published the first peer-reviewed study on chart perception (c) Invented the line chart, bar chart, and pie chart (d) Developed the first statistical software
Answer
**(c)** Invented the line chart, bar chart, and pie chart. Playfair published his inventions in the *Commercial and Political Atlas* (1786) and *Statistical Breviary* (1801). Before Playfair, quantitative data was communicated almost exclusively through tables. His insight, using visual length and area to represent quantities, created the foundational chart types still used today.
6. Florence Nightingale's coxcomb diagrams were important because:
(a) They were the first charts ever created (b) They demonstrated that artistic skill is required for effective visualization (c) They visually proved that preventable disease, not battlefield wounds, was the main cause of soldier deaths, driving policy reform (d) They showed that pie charts are the best way to display proportions
Answer
**(c)** They visually proved that preventable disease, not battlefield wounds, was the main cause of soldier deaths, driving policy reform. Nightingale's charts showed that deaths from preventable diseases dwarfed deaths from battle wounds, often by a factor of five or more. The visual impact convinced Parliament and Queen Victoria to establish a Royal Commission, leading to reforms that dramatically reduced mortality.
7. Which of the following is the MOST common form of chart deception described in the chapter?
(a) Using 3D effects (b) Truncating the y-axis so it does not start at zero (c) Using too many colors (d) Omitting the chart title
Answer
**(b)** Truncating the y-axis so it does not start at zero. While all of these can be problematic, the chapter specifically identifies the truncated y-axis as "the single most common form of visual deception." It exaggerates the visual magnitude of differences, making small changes appear dramatic. This is sometimes deliberate manipulation but is more often a thoughtless acceptance of software defaults.
8. According to the chapter, when is it appropriate NOT to create a visualization?
(a) When the audience is not technical (b) When the data is about a boring topic (c) When a single number communicates the message more efficiently than a chart could (d) When you do not have access to Python
Answer
**(c)** When a single number communicates the message more efficiently than a chart could. The chapter argues that charts earn their existence by showing patterns, comparisons, distributions, or relationships, which are things that numbers and sentences alone cannot efficiently convey. If the message is a single quantity ("Revenue grew 12%"), a number is often sufficient. Visualization is a tool, not a requirement.
9. The "5-second rule" for chart design states that:
(a) You should spend no more than five seconds creating a chart (b) The viewer should grasp the main message of the chart within five seconds (c) Charts should contain no more than five data points (d) Five is the maximum number of colors that should be used in any chart
Answer
**(b)** The viewer should grasp the main message of the chart within five seconds. Five seconds accounts for pre-attentive processing (which happens in under a second), plus time to read the title, orient to the axes, and integrate the visual pattern into a coherent message. If a chart fails this test, it needs to be redesigned for clarity.
10. The chapter describes Minard's chart of Napoleon's Russian campaign as encoding how many variables?
(a) Two (b) Four (c) Six (d) Eight
Answer
**(c)** Six. Minard's chart encodes army size (band width), location (position on map), direction of movement (color: tan for advance, black for retreat), temperature during retreat (line chart), dates, and geographic features (rivers). Tufte wrote that it "may well be the best statistical graphic ever drawn."
True/False with Justification (4 questions)
For each statement, indicate True or False and provide a one- to two-sentence justification.
11. "Exploratory and explanatory visualizations should follow the same design standards."
Answer
**False.** Exploratory charts are private thinking tools where speed matters more than polish: default settings, missing labels, and rough formatting are acceptable. Explanatory charts are communication products for external audiences and require deliberate design: clear titles, labeled axes, purposeful color choices, and a single clear message. Applying explanatory standards to exploratory work wastes time; applying exploratory standards to explanatory work produces bad communication.
12. "A high correlation coefficient (e.g., r = 0.82) guarantees that the relationship between two variables is linear."
Answer
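A constructed counterexample makes this statement easy to test. The sketch below uses hypothetical data that is not from the chapter: `Y` is exactly the square of `X`, so the relationship is curved by construction. The names `pearson_r`, `slope`, `intercept`, and `residuals` are illustrative, and only the standard library is assumed.

```python
from math import sqrt
from statistics import mean

# Hypothetical data: a perfectly quadratic (nonlinear) relationship.
X = list(range(11))        # 0, 1, ..., 10
Y = [x ** 2 for x in X]    # y = x^2 exactly


def pearson_r(x, y):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(X, Y)

# Least-squares line through the same points.
mx, my = mean(X), mean(Y)
slope = sum((a - mx) * (b - my) for a, b in zip(X, Y)) / sum((a - mx) ** 2 for a in X)
intercept = my - slope * mx

# Residuals of the straight-line fit: a systematic U shape signals curvature.
residuals = [y - (intercept + slope * x) for x, y in zip(X, Y)]

if __name__ == "__main__":
    print(f"r = {r:.3f}")
    print("residuals:", [round(e, 1) for e in residuals])
```

Here the correlation is higher than the 0.82 in the statement even though no straight line describes the data. The residuals are positive at both ends and negative in the middle, a pattern that a scatter plot would reveal at a glance.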
**False.** Anscombe's Quartet proves this directly: all four datasets have a Pearson correlation of 0.816, but only Dataset I shows a truly linear relationship. Dataset II is curved, Dataset III is linear except for a single outlier, and Dataset IV has no meaningful linear pattern at all. The correlation coefficient measures the strength of linear association but cannot detect nonlinearity, outliers, or other structural features.
13. "Visualization should typically be one of the first steps in data analysis, not the last."
Answer
**True.** The chapter explicitly warns against the common practice of saving visualization for the end, after cleaning, analysis, and modeling are complete. Plotting data early catches data quality issues, reveals unexpected patterns, generates hypotheses, and builds intuitions that guide the entire analysis. If you only visualize at the end, you risk confirming what you already believe rather than discovering what the data actually contains.
14. "Edward Tufte argued that all non-data elements in a chart (gridlines, labels, borders) should be removed."
Answer
**False: this overstates Tufte's position.** Tufte argued for maximizing the data-ink ratio, the proportion of ink devoted to representing data versus structural and decorative elements. He advocated removing "chartjunk" (purely decorative elements) but not all non-data elements. Labels, axis titles, and minimal gridlines serve a communicative purpose and are part of responsible chart design. The principle is to remove what adds no information, not to strip a chart bare.
Short Answer (3 questions)
15. In three to four sentences, describe the three components of the "visual argument" framework introduced in this chapter.
Answer
The visual argument framework has three components. The **claim** is the single main assertion the chart makes, that is, what the viewer should take away (e.g., "global temperatures have risen sharply since 1980"). The **evidence** is the data rendered visually: the points, lines, bars, or other marks that support the claim. The **design choices as rhetoric** are the decisions about axis range, color, scale, annotation, and chart type that direct the viewer's attention and shape how convincingly the evidence supports the claim.
16. Explain why the chapter describes summary statistics as "lossy compressions." What analogy is being drawn, and what does it mean for how analysts should work with data?
Answer
The analogy is to data compression in computing. A lossy compression (like JPEG for images or MP3 for audio) reduces file size by discarding information that is considered less important. Similarly, summary statistics like the mean and standard deviation "compress" an entire distribution into a few numbers, discarding information about shape, clusters, gaps, outliers, and nonlinear patterns. The implication is that analysts should not rely solely on summary statistics to understand their data; they should visualize the data to recover the structural information that the statistics throw away.
17. Name two historical figures discussed in the chapter (besides Tufte) and explain each person's specific contribution to data visualization in one to two sentences.
Answer
**William Playfair** (1759-1823) invented the line chart, bar chart, and pie chart, first publishing them in his *Commercial and Political Atlas* (1786). He established the foundational principle of using visual length and position to represent quantities. **Florence Nightingale** (1820-1910) created coxcomb diagrams (polar area charts) showing that preventable diseases, not battle wounds, were the primary cause of death among soldiers in the Crimean War. Her charts were sent to Parliament and Queen Victoria, directly contributing to military hospital reforms that saved lives. (Charles Joseph Minard is also acceptable; he created the famous 1869 chart of Napoleon's Russian campaign encoding six variables in a single image.)
Applied Scenario (2 questions)
18. You are a data analyst at a retail company. Your manager shows you a bar chart from a competitor's annual report claiming "Customer satisfaction at an all-time high!" The chart shows satisfaction scores for the last three years: 8.1, 8.3, 8.5 on a scale of 1-10. The y-axis runs from 8.0 to 8.6.
(a) Identify the specific visualization technique that makes the improvement look dramatic. (b) Explain what the chart would look like if the y-axis started at zero. (c) Is the competitor's chart necessarily wrong? Under what circumstances might the truncated axis be justified?
Answer
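The arithmetic behind this scenario can be sketched in a few lines. This is an illustrative calculation rather than code from the chapter: `drawn_height_ratio` is a hypothetical helper that compares how tall the last bar is drawn relative to the first for a given axis baseline.

```python
# Satisfaction scores from the scenario, in chronological order.
SCORES = [8.1, 8.3, 8.5]


def drawn_height_ratio(scores, baseline):
    """Ratio of the last bar's drawn height to the first bar's drawn height.

    A bar is drawn from the axis baseline up to its value, so its visible
    height is (value - baseline), not the value itself.
    """
    heights = [s - baseline for s in scores]
    return heights[-1] / heights[0]


# Axis starts at 8.0, as in the competitor's chart.
ratio_truncated = drawn_height_ratio(SCORES, baseline=8.0)

# Axis starts at zero.
ratio_zero = drawn_height_ratio(SCORES, baseline=0.0)

if __name__ == "__main__":
    print(f"baseline 8.0: last bar is {ratio_truncated:.1f}x the first")
    print(f"baseline 0.0: last bar is {ratio_zero:.2f}x the first")
```

The same three numbers yield a roughly fivefold visual difference on the truncated axis and a difference of about 5% on a zero-based axis. In matplotlib, the baseline corresponds to the lower limit passed to `Axes.set_ylim`, so the choice is a single line of plotting code.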
**(a)** The chart uses a **truncated y-axis** that starts at 8.0 instead of 0, which makes the bars appear to grow dramatically (the 8.5 bar is drawn about five times taller than the 8.1 bar, when the actual difference is only about 5%).
**(b)** With a y-axis starting at zero, all three bars would be nearly identical in height, at about 81-85% of the full 10-point scale. The differences would be barely perceptible, which accurately reflects the small magnitude of the change.
**(c)** The chart is not necessarily wrong, but it is misleading without proper context. A truncated axis *can* be justified when (1) the audience understands the scale and knows the axis is truncated, (2) the chart includes clear labels showing the actual values, and (3) the goal is to highlight meaningful variation within a narrow range. However, using it in marketing material with the claim "all-time high!" is designed to exaggerate, not inform.
19. A colleague is preparing for a presentation to the company's executive team. She has completed a thorough analysis of sales data and has 47 charts in her Jupyter notebook. She plans to include all 47 in her slide deck "so the executives can see everything."
Using concepts from this chapter, advise your colleague. In 100-150 words, explain what is wrong with this approach and what she should do instead.
Answer
Your colleague is confusing exploratory and explanatory visualization. The 47 notebook charts are **exploratory**: they were tools for her own analysis and discovery. Presenting all of them to executives treats exploratory output as explanatory communication, which will overwhelm the audience and obscure the key findings. She should instead identify the **three to five most important findings** from her analysis and create **new, purpose-built explanatory charts** for each. Each chart should have a clear claim (the one-sentence takeaway), a descriptive title, clean design, and should pass the 5-second rule. The remaining 42+ charts can live in an appendix or technical report for anyone who wants the details. The notebook is where the thinking happened. The presentation is where the communication happens. They require different charts with different design standards.
Analysis (1 question)
20. The chapter's Progressive Project section argues that "the average global temperature rose 1.1 degrees Celsius" as a number is less compelling than the same information shown as a time series chart.
(a) Identify at least three specific types of information that the time series chart reveals which the single number does not. (b) Can you think of a scenario where the single number "1.1 degrees" would actually be more effective than a chart? Describe that scenario and explain why. (c) Connect this example to the chapter's threshold concept about visualization and thinking.
Answer
**(a)** The time series chart reveals: (1) **The pattern of change over time**: the warming is not gradual and steady but accelerates sharply after ~1980; (2) **Year-to-year variability**: natural fluctuations from El Niño events, volcanic eruptions, etc. are visible; (3) **The long baseline of relative stability**: decades of near-zero anomaly before the rise, which contextualizes how unusual the recent warming is; (4) **The rate of acceleration**: the steepening slope in recent decades; (5) **Individual outlier years** that stand out from the trend.
**(b)** The single number would be more effective in a context where brevity is paramount and the audience needs a quick reference point, for example in a tweet, a policy brief summary, or the executive summary of a report. If the audience already understands the trend and just needs the bottom line, "1.1 degrees since pre-industrial times" is efficient and memorable. It also works better in conversational contexts ("Did you know global temperature has risen 1.1 degrees?") where you cannot show a chart.
**(c)** This connects directly to the threshold concept: "Visualization is not a way to present findings — it IS a way to think." The number 1.1 tells you a fact, but the chart changes how you *understand* that fact. Seeing the acceleration, the variability, and the long stable baseline generates new questions and insights that the number alone does not provoke. The chart is not just a prettier version of the number; it is a different cognitive experience that shapes what you discover and how you reason about the problem.
Scoring Guide
| Score | Level | Recommendation |
|---|---|---|
| 18-20 | Mastery | Proceed to Chapter 2 with confidence. |
| 14-17 | Proficient | Proceed to Chapter 2. Review any missed questions. |
| 10-13 | Developing | Re-read Sections 1.1, 1.6, and 1.7 before continuing. |
| Below 10 | Review Needed | Return to the full chapter. Focus on the Check Your Understanding prompts. |