Exercises — Chapter 5: Choosing the Right Chart

DataField.Dev

Part A: Conceptual (8 exercises)

These exercises test your understanding of the data type classification, question type classification, and chart selection framework. No tools or code required.

Exercise A-1: Data Type Classification

For each of the following variables, classify it as categorical, continuous/quantitative, temporal, spatial, or network. If a variable could be classified in more than one way, state both and explain when each classification would apply.

a) Customer satisfaction rating on a 1-to-5 star scale
b) Latitude and longitude of weather stations
c) Invoice date for each transaction
d) Department name (Engineering, Sales, Marketing, Legal, HR)
e) Blood pressure reading in mmHg
f) Twitter follower/following relationships between 500 accounts
g) U.S. state FIPS codes (01 = Alabama, 02 = Alaska, ...)
h) Revenue in dollars
i) Day of the week (Monday through Sunday)
j) IP address of a website visitor

Exercise A-2: Question Type Classification

Classify each of the following analytical questions into one of the six question types (comparison, distribution, relationship, composition, change over time, spatial pattern). Some questions may involve more than one type — identify the primary type and explain why.

a) "Has employee satisfaction improved since we launched the wellness program in January?"
b) "Which of our five product lines accounts for the largest share of total revenue?"
c) "Is there a relationship between advertising spend and new customer acquisition?"
d) "Are salary levels normally distributed in our company, or is there a long right tail?"
e) "Which U.S. states have the highest rates of childhood obesity?"
f) "How does this quarter's revenue compare to the same quarter last year for each region?"
g) "What percentage of our website traffic comes from organic search versus paid versus direct?"
h) "How has the age distribution of our customer base shifted over the past five years?"

Exercise A-3: Matrix Lookup Practice

For each scenario below, identify (1) the data type(s) involved, (2) the question type, (3) the chart types recommended by the selection matrix, and (4) your recommended chart type with a one-sentence justification.

a) A hospital wants to compare average wait times across six departments.
b) A climate scientist wants to show how global CO2 concentration has changed from 1960 to 2024.
c) A marketing analyst wants to understand whether session duration is correlated with purchase amount.
d) A nonprofit wants to show what fraction of its annual budget goes to programs, administration, and fundraising.
e) A public health researcher wants to visualize the geographic distribution of malaria case rates across African countries.
f) An HR department wants to understand the distribution of employee tenure in years across the organization.

Exercise A-4: Same Data, Different Questions

The Meridian Corp dataset contains columns for: product_line (5 categories), region (4 categories), quarterly_revenue (continuous), order_date (temporal), and customer_satisfaction_score (continuous, 1-10 scale).

Write six different analytical questions about this dataset — one for each question type (comparison, distribution, relationship, composition, change over time, spatial pattern). For each question, specify which chart type you would use and identify which columns from the dataset are involved.

Exercise A-5: The Threshold Concept Test

A junior analyst presents you with the following statement: "I have a dataset with 50,000 rows of transaction data. It has revenue, product category, date, and region columns. I want to make a visualization. What chart should I use?"

a) Explain why this question is unanswerable as stated.
b) Write three different questions the analyst could ask about this data, each requiring a different chart type.
c) For each question, walk through the two-input framework (data type + question type) to arrive at the recommended chart.

Exercise A-6: Chart Type Distinctions

Explain the difference between each pair of chart types below. For each pair, state what question type each serves, when you would choose one over the other, and what would go wrong if you used the wrong one.

a) Bar chart vs. histogram
b) Line chart vs. area chart
c) Scatter plot vs. bubble chart
d) Pie chart vs. treemap
e) Box plot vs. violin plot
f) Choropleth vs. dot map

Exercise A-7: The N/A Cells

The chart selection matrix contains several N/A cells — combinations of data type and question type that do not produce meaningful charts.

a) Explain why "1 Categorical + Relationship" is N/A. What additional data would you need to make a relationship chart possible?
b) Explain why "1 Continuous + Change Over Time" is N/A. What additional data would you need?
c) A manager asks you to "show how the product categories relate to each other." You have only a categorical variable (product names) and a count of sales. Explain why a scatter plot is inappropriate and suggest what the manager might actually want (reframe the question into a valid question type).
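
One way to picture the two-input framework is as a lookup table keyed by (data type, question type), where an N/A cell is simply an entry with no chart. The sketch below is a hypothetical fragment, not the chapter's matrix: it fills in only the cells this exercise set itself mentions, and the key labels and the `recommend` helper are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical fragment of a selection matrix. Only cells mentioned in these
# exercises are filled in; the key labels are assumptions, not the chapter's
# exact wording.
SELECTION_MATRIX: dict[tuple[str, str], Optional[list[str]]] = {
    ("1 categorical", "relationship"): None,             # stated as N/A in A-7
    ("1 continuous", "change over time"): None,          # stated as N/A in A-7
    ("2 continuous", "relationship"): ["scatter plot"],  # stated in D-3
}


def recommend(data_type: str, question_type: str) -> str:
    """Look up a (data type, question type) pair and describe the result."""
    key = (data_type, question_type)
    if key not in SELECTION_MATRIX:
        return "not in this sketch; consult the full matrix"
    charts = SELECTION_MATRIX[key]
    if charts is None:
        return "N/A: reframe the question or add a variable"
    return ", ".join(charts)


if __name__ == "__main__":
    print(recommend("2 continuous", "relationship"))
    print(recommend("1 categorical", "relationship"))
```

Treating N/A as an explicit entry, rather than a missing key, keeps "this combination is meaningless" distinct from "this sketch does not cover it".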

Exercise A-8: The Seven-Category Limit Revisited

In Chapter 3, we learned that viewers can reliably distinguish about seven to eight categorical hues. In this chapter, we learned that spaghetti charts fail when too many series overlap, and that pie charts fail with too many slices.

a) Identify the common perceptual principle that underlies all three limitations.
b) For each situation (too many colors, too many line series, too many pie slices), state the recommended maximum and the alternative strategy when you exceed it.
c) A dataset has 25 categories. Propose three different design strategies for visualizing this data, each using a different approach to handle the category overload.

Part B: Applied (5 exercises)

These exercises require you to critique specific chart choices and propose improvements using the framework.

Exercise B-1: The Quarterly Business Review

A Meridian Corp analyst presents these three charts in a quarterly business review:

Chart 1: A 3D pie chart with 12 slices showing revenue share by customer industry segment. The smallest slice is 1.2%, labeled in 7-point font.
Chart 2: A dual-axis chart with quarterly revenue (bar chart, left axis, $0-$50M) and headcount (line chart, right axis, 0-500 employees). The line appears to track the bars closely, and the analyst states that "headcount growth is driving revenue growth."
Chart 3: A line chart with 9 overlapping lines showing monthly revenue for each of the company's 9 product lines. The legend lists colors, but three of the lines overlap so closely they appear as one thick line.

For each chart:
a) Identify the specific mistake(s) using the terminology from Section 5.5.
b) Identify what question the chart is trying to answer and its question type.
c) Propose a redesigned chart with a specific chart type and explain why it is superior.

Exercise B-2: Critique the Public Health Report

A World Health Organization report contains a visualization of vaccination rates across 50 countries. The chart is a scatter plot with country name on the x-axis (in alphabetical order) and vaccination rate (0-100%) on the y-axis. Each point is colored according to the country's WHO region.

a) What is wrong with the x-axis encoding?
b) What question type is a scatter plot best suited for, and does this visualization match that question type?
c) Propose two alternative visualizations (one for comparison across countries and one for spatial pattern analysis) and explain why each is a better fit.

Exercise B-3: The Social Media Dashboard

A social media analytics dashboard contains four charts, all using the same data (daily metrics over 90 days for five channels: organic, paid, email, referral, direct):

Chart 1: A stacked area chart showing total sessions by channel.
Chart 2: A grouped bar chart showing average conversion rate by channel (5 bars, one snapshot).
Chart 3: A pie chart showing the share of total conversions by channel.
Chart 4: A table of numbers showing all metrics for all channels for the most recent 7 days.

a) For each chart, identify the question type it answers and whether the chart type is appropriate.
b) Identify two charts that could be improved and propose specific changes.
c) The dashboard is missing a question type. Which one? Design a fifth chart to fill the gap (state the question, data types, and chart type).

Exercise B-4: The Conference Poster

A researcher is designing a conference poster that must contain four visualizations on a single 48x36 inch poster:

Temperature trend over 50 years (change over time)
Distribution of annual temperature anomalies (distribution)
Correlation between CO2 and temperature (relationship)
Regional temperature differences across 8 regions (comparison)

For each visualization:
a) Recommend a chart type using the decision tree.
b) Describe one specific consideration for the poster medium (as opposed to a slide or interactive dashboard).

Then, for the poster as a whole:
c) Suggest how the four charts should be arranged to create a coherent visual narrative.

Exercise B-5: The Wrong Tool for the Job

For each scenario below, the analyst has chosen a chart type. Determine whether the choice is correct or incorrect. If incorrect, name the mistake (from Section 5.5), explain why it fails, and recommend the correct chart type.

a) A histogram showing the distribution of quarterly revenue across 20 quarters.
b) A line chart showing customer satisfaction scores for five product lines at a single point in time.
c) A bar chart (sorted by value) showing the top 10 countries by GDP.
d) A scatter plot showing the relationship between advertising spend and sales revenue across 200 ad campaigns.
e) A stacked bar chart showing how three product lines' revenue shares have changed over eight quarters.

Part C: Real-World (5 exercises)

These exercises require you to find and analyze actual published visualizations. Document your sources.

Exercise C-1: Find the Mismatch

Find a published chart (news article, corporate report, academic paper, or blog post) where the chart type does not match the question being asked. Possible mismatches include: a pie chart used for comparison, a bar chart used for time trends, or a scatter plot used when the data is categorical.

a) Describe the chart and the question it appears to be answering.
b) Classify the data types and question type.
c) Look up the appropriate chart type in the selection matrix.
d) Explain specifically why the original chart type is a poor fit and how the recommended type would improve it.
e) Record the source URL or citation.

Exercise C-2: The Decision Framework in Action

Choose any dataset you have access to (a CSV from Kaggle, your own work data, a government open data portal). Apply the complete decision framework:

a) List the columns and classify each by data type.
b) Write three distinct analytical questions about the data.
c) For each question, walk through the decision tree (Step 1 through Step 3) and arrive at a recommended chart type.
d) For one of the three, sketch (on paper or in a drawing tool) what the chart would look like. Label axes, note color encoding, and annotate.

Exercise C-3: FiveThirtyEight Chart Audit

Go to FiveThirtyEight.com (or its archive) and find three different articles that contain data visualizations.

a) For each visualization, identify the question type and chart type used.
b) Determine whether the chart type aligns with the recommendation from the selection matrix.
c) Identify any instances where FiveThirtyEight deviated from standard chart types. Were these deviations justified? Explain.
d) Across the three articles, what is FiveThirtyEight's most commonly used chart type? Why does this chart type dominate in journalism for a general audience?

Exercise C-4: Dashboard Decomposition

Find an online dashboard (a government data portal, a public Tableau dashboard, or a company's public-facing metrics page) that contains at least four different chart types.

a) List each chart and its chart type.
b) For each chart, identify the question it answers and the data types involved.
c) Evaluate whether each chart type is the best choice for its question using the selection matrix.
d) Identify any redundancy (multiple charts answering the same question type) or gaps (question types not addressed).

Exercise C-5: Before and After in the Wild

Find a published "chart makeover" — a case where someone redesigned a chart to make it more effective. (Sources: PolicyViz, Storytelling with Data community, Reddit r/dataisbeautiful makeover threads, or any data journalism redesign.)

a) Describe the original chart and the redesigned version.
b) Classify the question type for both versions.
c) Did the makeover change the chart type, or keep the same type but improve the design? If the type changed, explain why the new type is better matched to the question.
d) Apply the decision framework to the data described. Does the framework arrive at the same chart type as the makeover?

Part D: Synthesis (4 exercises)

These exercises require integrating the decision framework with concepts from Chapters 1-4.

Exercise D-1: The Complete Climate Analysis Plan

The climate dataset contains: year (1880-2024), global mean temperature anomaly, Northern Hemisphere anomaly, Southern Hemisphere anomaly, CO2 concentration (1960-2024), and sea level (1880-2024).

You are preparing a report for a non-technical audience (a city council considering climate adaptation policy). Design a visualization plan:

a) Write five distinct questions the city council might have about this data.
b) For each question, classify data types and question type, and use the matrix to select a chart type.
c) For each chart, specify the color palette type (sequential, diverging, or categorical, from Chapter 3) and justify the choice.
d) Arrange the five charts in a narrative order that builds toward a conclusion (a preview of Chapter 9). Explain the logic of your sequencing.
e) For each chart, note any ethical considerations from Chapter 4 (axis choices, framing, context).

Exercise D-2: The Meridian Corp Executive Dashboard

Design a one-page executive dashboard for Meridian Corp's CEO. The dashboard must answer these five questions:

"Are we on track for our annual revenue target?" (comparison: actual vs. target)
"Which product line is growing fastest?" (change over time by category)
"What is our revenue mix by region?" (composition)
"Is deal size correlated with customer satisfaction?" (relationship)
"How do our sales vary geographically?" (spatial pattern)

For each question:
a) Specify the chart type and justify it using the framework.
b) State whether the chart type is appropriate for an executive audience (Section 5.6).
c) Describe the chart in enough detail that a designer could create it: what goes on each axis, what color encoding is used, what annotations are needed.

Then, for the dashboard as a whole:
d) Sketch a layout showing how all five charts fit on a single page.

Exercise D-3: Cross-Chapter Integration — Encoding Meets Chart Selection

In Chapter 2, we learned the visual encoding hierarchy: position > length > angle > area > color saturation. In this chapter, we learned which chart types to use for which questions.

a) For each of the six chart types below, identify the primary visual encoding channel used and its rank in the Cleveland-McGill hierarchy:
- Bar chart
- Pie chart
- Scatter plot
- Line chart
- Bubble chart (3rd variable as size)
- Choropleth map

b) Using your analysis, explain why bar charts and scatter plots are recommended more often than pie charts and bubble charts in the selection matrix.
c) The matrix recommends scatter plots for relationship questions. What pre-attentive feature (from Chapter 2) makes the correlation between two variables immediately visible in a scatter plot?

Exercise D-4: The Full Pipeline — From Dataset to Chart Recommendation

You receive a CSV file with the following columns for 500 hospital patients:

patient_id (integer)
age (integer)
gender (M/F/Other)
department (Emergency, Cardiology, Orthopedics, Neurology, General)
wait_time_minutes (float)
satisfaction_score (float, 1.0-5.0)
admission_date (date)
zip_code (5-digit code)
readmitted_30_days (boolean)

a) Classify each column by data type.
b) Generate eight distinct analytical questions (at least one of each question type: comparison, distribution, relationship, composition, change over time, spatial pattern).
c) For each question, use the decision tree to select a chart type.
d) Identify which two charts would be most informative for a hospital administrator trying to reduce wait times and improve satisfaction. Justify your selections.
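
Before classifying the columns, it can help to load the file with the storage types listed above. The following is a minimal sketch using only the Python standard library. The two sample rows are synthetic and invented for illustration, and the ISO date format and `true`/`false` boolean spelling are assumptions, since the exercise does not specify how the CSV encodes them.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date

# Two synthetic rows, invented purely for illustration.
# Assumed encodings: ISO dates (YYYY-MM-DD) and lowercase true/false.
SAMPLE_CSV = """patient_id,age,gender,department,wait_time_minutes,satisfaction_score,admission_date,zip_code,readmitted_30_days
1,54,F,Cardiology,37.5,4.2,2024-03-01,02139,false
2,31,M,Emergency,112.0,2.8,2024-03-02,60614,true
"""


@dataclass
class Patient:
    patient_id: int
    age: int
    gender: str
    department: str
    wait_time_minutes: float
    satisfaction_score: float
    admission_date: date
    zip_code: str  # kept as text so a leading zero survives
    readmitted_30_days: bool


def parse_row(row: dict) -> Patient:
    """Convert one CSV row (all strings) into typed fields."""
    return Patient(
        patient_id=int(row["patient_id"]),
        age=int(row["age"]),
        gender=row["gender"],
        department=row["department"],
        wait_time_minutes=float(row["wait_time_minutes"]),
        satisfaction_score=float(row["satisfaction_score"]),
        admission_date=date.fromisoformat(row["admission_date"]),
        zip_code=row["zip_code"],
        readmitted_30_days=row["readmitted_30_days"].strip().lower() == "true",
    )


def load_patients(text: str) -> list[Patient]:
    """Parse CSV text with a header row into a list of Patient records."""
    return [parse_row(r) for r in csv.DictReader(io.StringIO(text))]


PATIENTS = load_patients(SAMPLE_CSV)

if __name__ == "__main__":
    for p in PATIENTS:
        print(p)
```

Note that the storage type of a column (integer, float, string) is not the same thing as its data type in the chapter's classification; part a) asks for the latter.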

Part M: Mixed Review — Chapters 1-4 (3 exercises)

These exercises integrate concepts from earlier chapters with the chart selection framework.

Exercise M-1: Anscombe Meets the Framework

In Chapter 1, we learned that Anscombe's Quartet — four datasets with identical summary statistics — reveals radically different patterns when plotted.
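
As a refresher, the "identical summary statistics" claim can be checked directly. The sketch below uses the values Anscombe published in 1973 and computes the mean of x, the mean of y, and the Pearson correlation for each dataset, rounded to two decimals. The names `QUARTET`, `pearson_r`, and `summarize` are chosen here for illustration only.

```python
from math import sqrt
from statistics import mean

# Anscombe's published values. Datasets I-III share the same x column.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
X4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (X4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def summarize(xs, ys):
    """Return (mean of x, mean of y, correlation), each rounded to 2 places."""
    return (round(mean(xs), 2), round(mean(ys), 2), round(pearson_r(xs, ys), 2))


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, stats)
```

The four printed rows match at this precision, which is why the numbers alone cannot distinguish the datasets.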

a) What question type is "what does this data look like?", and why does that question type specifically reveal what summary statistics hide?
b) For each of the four Anscombe datasets, if you did not know the true pattern and were simply told "here is a dataset with two continuous variables and a correlation of 0.82," what chart type would the framework recommend? Would that chart type correctly reveal the hidden patterns?
c) What does this exercise tell you about the relationship between chart selection and exploratory analysis?

Exercise M-2: Ethical Chart Selection

In Chapter 4, we discussed how chart design choices can mislead viewers. In this chapter, we have focused on choosing the right chart type for a question. But chart type itself can be a tool of deception.

a) Give an example of how choosing a pie chart instead of a bar chart could mislead the viewer (not just reduce accuracy).
b) Give an example of how choosing a line chart for categorical data could imply a relationship that does not exist.
c) A politician presents a 3D bar chart showing economic growth across four years. The 3D perspective makes the most recent year's bar appear tallest. Based on Chapters 4 and 5, write a two-paragraph critique of this visualization.

Exercise M-3: Color and Chart Type Interaction

Using concepts from Chapter 3 (color palette types) and this chapter (chart selection), answer the following:

a) Why does a choropleth map require a sequential or diverging palette but never a categorical palette for the data variable? (The answer involves both perceptual science from Chapter 3 and the data type classification from this chapter.)
b) A grouped bar chart compares revenue across 6 product lines. The analyst uses a sequential palette (light blue to dark blue) for the 6 categories. What is wrong with this choice, and what palette type should be used?
c) A line chart shows temperature anomaly over time. The analyst colors the line using a diverging palette (blue for negative anomalies, red for positive anomalies), varying the color along the line. Is this a good or bad color choice? Justify using principles from both Chapter 3 and this chapter.

Solutions to selected exercises are available in the Appendix.
