Chapter 14 Quiz: The Grammar of Graphics
Instructions: This quiz tests your understanding of Chapter 14. Answer all questions before checking the solutions. For multiple choice, select the best answer — some options may be partially correct. For short answer questions, aim for 2-4 clear sentences. Total points: 100.
Section 1: Multiple Choice (10 questions, 4 points each)
Question 1. Which of the following best describes the grammar of graphics?
- (A) A set of rules for making charts look professional
- (B) A framework that describes any chart as a combination of data, aesthetic mappings, geometric objects, scales, coordinate systems, and facets
- (C) A specific Python library for creating visualizations
- (D) A ranking of chart types from best to worst
Answer
**Correct: (B)**
- **(A)** is too narrow: the grammar is about structure, not just appearance.
- **(B)** captures the key idea: the grammar decomposes charts into composable components.
- **(C)** is incorrect: the grammar of graphics is a conceptual framework, not a library. (ggplot2 in R implements it, but the grammar itself is independent of any tool.)
- **(D)** is incorrect: the grammar describes how charts are built, not which are "better."

Question 2. In the grammar of graphics, an "aesthetic mapping" connects:
- (A) A color palette to a chart's background
- (B) A variable in the data to a visual property of the chart (position, color, size, etc.)
- (C) Two geometric objects together (e.g., bars and lines)
- (D) A chart title to its subtitle
Answer
**Correct: (B)** An aesthetic mapping is the bridge between data and visual representation. When you say "map GDP to the x-axis and vaccination rate to the y-axis and region to color," you are specifying three aesthetic mappings. This is the heart of the grammar: it is what makes a chart encode data rather than just being a picture.

Question 3. Elena wants to show how vaccination rates compare across six WHO regions. Which chart type is most appropriate?
- (A) Line chart
- (B) Scatter plot
- (C) Bar chart
- (D) Histogram
Answer
**Correct: (C)** A bar chart is the standard choice for comparing a numerical value across a small number of categories. Line charts imply continuity between points, which is inappropriate for categorical data like regions. Scatter plots require two continuous variables. Histograms show the distribution of a single continuous variable, not comparisons across categories.

Question 4. According to Cleveland and McGill's research on graphical perception, which visual encoding do humans judge most accurately?
- (A) Color saturation
- (B) Angle (as in pie charts)
- (C) Area (as in bubble charts)
- (D) Position along a common scale (as in scatter plots)
Answer
**Correct: (D)** Cleveland and McGill's 1984 experiments ranked visual encodings from most to least accurately perceived: position along a common scale > position along non-aligned scales > length > angle > area > volume > color saturation. This is why scatter plots and dot plots are perceptually superior to pie charts and bubble charts for precise comparison.

Question 5. Tufte's "data-ink ratio" principle recommends:
- (A) Using as many colors as possible to make charts visually appealing
- (B) Maximizing the proportion of visual elements that represent actual data
- (C) Adding gridlines, borders, and backgrounds to provide context
- (D) Using 3D effects to make bars and lines stand out
Answer
**Correct: (B)** The data-ink ratio = data ink / total ink. Tufte argues this ratio should be maximized by removing non-data elements (heavy gridlines, backgrounds, borders, decorative effects) that don't convey information. Options A, C, and D all describe additions that *decrease* the data-ink ratio.

Question 6. What is the critical difference between a bar chart and a histogram?
- (A) Bar charts are vertical; histograms are horizontal
- (B) Bar charts show named categories; histograms show binned ranges of a continuous variable
- (C) Histograms can only show counts; bar charts can show any value
- (D) Bar charts use color; histograms do not
Answer
**Correct: (B)** A bar chart represents named, discrete categories (like "North America" and "Europe") that could be reordered without changing meaning. A histogram divides a continuous variable into bins (like 0-10, 10-20, 20-30) that have a fixed, meaningful order. The bars in a histogram are adjacent (no gaps) to convey the continuous nature of the underlying variable.

Question 7. Which of the following is an example of "chartjunk" as defined by Tufte?
- (A) A descriptive title that states the chart's main finding
- (B) A 3D bevel effect applied to bar chart bars
- (C) Data point labels on the most important values
- (D) A single faint horizontal gridline at the average value
Answer
**Correct: (B)** Chartjunk consists of visual elements that do not convey data and may distract or distort. A 3D bevel effect adds visual complexity without information and can distort the perceived height of bars through perspective effects. Options A, C, and D all serve informational purposes: they help the reader understand the data.

Question 8. You create a scatter plot showing the relationship between hours of study (x-axis) and exam score (y-axis) for 100 students, with each point colored by major (STEM vs. Humanities). How many aesthetic mappings does this chart have?
- (A) 1
- (B) 2
- (C) 3
- (D) 4
Answer
**Correct: (C)** The three aesthetic mappings are: (1) hours of study mapped to x-position, (2) exam score mapped to y-position, and (3) major mapped to color. Each connection between a data variable and a visual property counts as one aesthetic mapping.

Question 9. Faceting (small multiples) works by:
- (A) Stacking multiple chart types on top of each other
- (B) Splitting data into subgroups and creating a separate panel for each group, with shared axes
- (C) Using color to distinguish different groups within a single chart
- (D) Animating a chart to show changes over time
Answer
**Correct: (B)** Faceting creates multiple mini-charts (one per subgroup), all sharing the same axis scales so they can be directly compared. This is distinct from using color within a single chart (C), which keeps everything in one panel. Tufte argued that, for a wide range of problems in data presentation, small multiples are the best design solution.

Question 10. Which misleading technique is being used when a bar chart's y-axis starts at 47 instead of 0, making a difference between 48% and 52% look visually dramatic?
- (A) Cherry-picking the time range
- (B) Dual y-axes
- (C) Truncated y-axis
- (D) Area distortion
Answer
**Correct: (C)** A truncated y-axis starts at a value other than zero in a bar chart, visually amplifying small differences. Since bars encode values as *lengths*, a bar that appears twice as tall should represent a value that is twice as large. When the axis starts at 47, the bar at 52 appears five times as tall as the bar at 48 (5 units versus 1 unit above the baseline), despite being only about 8% larger in actual value.

Section 2: True or False (4 questions, 4 points each)
Question 11. True or False: A pie chart is technically a bar chart plotted in polar coordinates.
Answer
**True.** In the grammar of graphics framework, a pie chart can be described as a stacked bar chart where the coordinate system has been changed from Cartesian to polar. The "slices" are bars wrapped around a circle, with angles replacing lengths as the visual encoding. This insight, that changing the coordinate system transforms one chart type into another, is one of the powerful ideas in the grammar of graphics.

Question 12. True or False: Line charts should always have a y-axis starting at zero.
Answer
**False.** The "start at zero" rule applies to bar charts, because bars encode values as *lengths*: a bar twice as tall must represent a value twice as large. Line charts encode values as *positions and slopes*, and the trend (the shape of the line) is often more important than the absolute distance from zero. A line chart of stock prices from $150 to $155 is perfectly fine with a y-axis from $148 to $157.

Question 13. True or False: Exploratory visualization should be polished and carefully designed before being shared with an audience.
Answer
**False.** Exploratory visualization is created *for yourself* during analysis; it's meant to be quick, rough, and disposable. Speed matters more than beauty because you're making many charts to understand the data, most of which will be discarded. It's *explanatory* visualization that should be polished and carefully designed, because that's what you share with an audience.

Question 14. True or False: If two variables show a strong pattern in a scatter plot, one must be causing the other.
Answer
**False.** Scatter plots reveal *associations* (correlations), not causes. Two variables can show a strong visual pattern because (a) one causes the other, (b) both are caused by a third variable (confounding), or (c) the pattern is coincidental. The principle "correlation does not imply causation" is one of the most important ideas in data science, and we'll explore it thoroughly in Chapter 24.

Section 3: Short Answer (4 questions, 6 points each)
Question 15. In 2-3 sentences, explain the difference between exploratory and explanatory visualization. Give one example of each in the context of Elena's vaccination rate project.
Answer
**Exploratory visualization** is created during analysis for the analyst's own understanding: it's quick, rough, and meant for discovery. **Explanatory visualization** is designed for an audience to communicate a specific finding: it's polished, focused, and tells a clear story.

Example exploratory: Elena makes a quick histogram of vaccination rates for all countries to check whether the distribution is normal or skewed. She doesn't add a title or adjust colors because she's just looking.

Example explanatory: Elena creates a bar chart comparing WHO regions for her final report, with a descriptive title ("Sub-Saharan Africa Trails Other Regions by 30 Points"), clear axis labels, a reference line at the global average, and consistent colors matching her report's design.

Question 16. Explain why a chart plan (sketching on paper before coding) is valuable. What specific problems does it prevent?
Answer
A chart plan separates *design thinking* from *coding*. When you sketch on paper, you focus on what the chart should communicate: which variables map to which visual properties, what chart type best answers your question, and what annotations are needed. This prevents two common problems: (1) fighting with syntax while simultaneously making design decisions, resulting in neither being done well, and (2) starting to code without a clear goal, leading to aimless plotting that wastes time. A plan also forces you to think about your audience and message before getting lost in technical details.

Question 17. Name three common ways that charts can mislead viewers. For each, briefly describe the technique and explain how a viewer can defend against it.
Answer
(1) **Truncated y-axis**: Starting the y-axis at a value other than zero in a bar chart, making small differences look large. Defense: check where the axis starts and mentally recalibrate.

(2) **Cherry-picked time range**: Starting or ending a time series at a strategically chosen point to support a desired narrative. Defense: ask "why does the chart start/end here?" and look for longer-range context.

(3) **Dual y-axes**: Using two independently scaled y-axes to make unrelated variables appear correlated. Defense: check whether both axes are present and whether the scales could be manipulated to create a false visual relationship.

(Other valid answers include area distortion, inconsistent bin widths, and omitted context/baselines.)

Question 18. In 2-3 sentences, explain what Tufte means by "data-ink ratio" and give one concrete example of how to improve it in a typical chart.
Answer
The **data-ink ratio** is the proportion of a chart's visual content ("ink") that represents actual data, as opposed to non-data elements like borders, backgrounds, heavy gridlines, and decorative effects. Tufte argues this ratio should be maximized: remove anything that doesn't convey data, so the reader's attention goes straight to the information.

Concrete example: A default Excel bar chart often includes a dark border around the plot area, a colored background, and heavy gridlines. Removing the border, changing the background to white, and fading the gridlines to light gray eliminates non-data ink without losing any information, directing the eye to the bars themselves.

Section 4: Applied Scenarios (2 questions, 8 points each)
Question 19. Jordan is analyzing grade distributions at their university. They have data for 2,000 students across 4 departments (Biology, English, Computer Science, History) with columns for: department, course_level (100/200/300/400), grade (A/B/C/D/F), and GPA.
Jordan wants to answer three questions. For each, recommend a chart type, specify the aesthetic mappings (what maps to x, y, color, etc.), and briefly justify your choice.
(a) "How do average GPAs compare across the four departments?"
(b) "Is there a relationship between course level and GPA?"
(c) "What does the overall distribution of grades look like?"
Answer
**(a)** Chart type: Bar chart. x = department (4 categories), y = mean GPA. Color: single color (no variable mapped to color, since the comparison is across departments and that's already encoded in x-position). Justification: comparison across a small number of categories is a bar chart's primary strength.

**(b)** Chart type: Box plot (or bar chart of means). x = course level (4 levels: 100, 200, 300, 400), y = GPA. Color: optionally by department if Jordan wants to see whether the pattern differs across departments. Justification: course level has a natural ordering but is discrete (4 levels), and the distribution of GPA within each level matters, so a box plot is a good fit because it shows both the central tendency and spread. A scatter plot would also work if individual points are plotted with jitter.

**(c)** Chart type: Bar chart (of grade counts). x = grade (A/B/C/D/F, ordered), y = count. Justification: grades are ordinal categories (not continuous), so a histogram is technically inappropriate; a bar chart with the grades in order is the right choice. Note: if the question were about GPA (a continuous value), a histogram would be appropriate instead.

Question 20. You are reviewing a colleague's chart for a quarterly business report. The chart is a 3D pie chart with 8 slices, no data labels, a gradient fill on each slice, a dark background, and a title that reads "Revenue Breakdown." Using concepts from this chapter, write a critique of at least four specific issues with this chart and recommend specific improvements for each.
Answer
**Issue 1: 3D effect.** The 3D perspective distorts slice sizes: slices in the front appear larger than equally sized slices in the back. This is chartjunk that actively misleads. **Improvement:** Remove the 3D effect entirely. Better yet, replace the pie chart with a horizontal bar chart.

**Issue 2: Too many slices (8).** Pie charts become difficult to read with more than 3-4 slices because the human eye cannot accurately compare angles for similar-sized slices. **Improvement:** If keeping a pie chart, group the smallest categories into "Other" to reduce to 3-4 slices. Better: use a horizontal bar chart where all 8 categories can be compared easily via length.

**Issue 3: No data labels.** Without labels showing the percentage or dollar value of each slice, the reader cannot determine precise values and must try to estimate from angles. **Improvement:** Add percentage labels to each slice (or, in the bar chart alternative, show values at the end of each bar).

**Issue 4: Gradient fill.** Gradient fills create visual ambiguity: it's unclear where the "top" of a visual element is. They also reduce the data-ink ratio by adding non-data visual complexity. **Improvement:** Use solid, flat colors from a well-chosen categorical palette.

**Issue 5: Dark background.** Tufte's principle is that non-data ink should be minimized. A dark background draws attention to itself rather than the data and can reduce contrast. **Improvement:** Use a white or very light background.

**Issue 6: Uninformative title.** "Revenue Breakdown" describes the chart type, not the finding. **Improvement:** Use a descriptive title that states the key takeaway, e.g., "Enterprise Software Accounts for 45% of Revenue, Up from 32% Last Year."

Scoring Guide
| Section | Points |
|---|---|
| Multiple Choice (10 x 4) | 40 |
| True/False (4 x 4) | 16 |
| Short Answer (4 x 6) | 24 |
| Applied Scenarios (2 x 8) | 16 |
| Total | 100 |
- 90-100: Excellent command of visualization principles. You're ready for Chapter 15.
- 80-89: Strong understanding with minor gaps. Review any questions you missed before moving on.
- 70-79: Adequate understanding. Revisit the sections related to questions you missed, especially the grammar components and misleading chart techniques.
- Below 70: Review the chapter more carefully before proceeding. The grammar of graphics framework is foundational for everything in Part III.
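
Optional follow-up to Question 10: the exaggeration caused by a truncated axis can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the chapter; the function name `apparent_ratio` and its `baseline` parameter are hypothetical. It compares how much longer one bar is drawn than another when the axis starts at zero versus at 47.

```python
def apparent_ratio(smaller, larger, baseline=0.0):
    """Ratio of drawn bar lengths when the value axis starts at `baseline`.

    Hypothetical helper for illustration: a bar's drawn length is its
    value minus the axis baseline.
    """
    if smaller <= baseline or larger <= baseline:
        raise ValueError("both values must be above the axis baseline")
    return (larger - baseline) / (smaller - baseline)


# Values from Question 10: 48% and 52%.
true_ratio = apparent_ratio(48, 52)        # axis starts at 0
drawn_ratio = apparent_ratio(48, 52, 47)   # axis starts at 47

print(f"ratio with zero baseline: {true_ratio:.3f}")   # about 1.083
print(f"ratio with baseline 47:   {drawn_ratio:.1f}")  # 5.0
```

With a zero baseline the taller bar is about 8% longer; with the axis starting at 47 it is drawn five times as long, which is the distortion a viewer should mentally correct for.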
End of Chapter 14 Quiz