Chapter 17 Exercises: Interactive Visualization — plotly, Dashboard Thinking

Contributors to Introduction to Data Science

How to use these exercises: Work through the sections in order. Parts A-D focus on Chapter 17 material, building from recall to original creation. Part E applies your skills to new datasets. Part M mixes in concepts from earlier chapters. You will need Python with plotly and pandas installed. Some exercises also require Dash.

Difficulty key: 1-star: Foundational | 2-star: Intermediate | 3-star: Advanced | 4-star: Extension

Part A: Conceptual Understanding (1-star)

Exercise 17.1 — Static vs. interactive

Name three scenarios where interactive visualization is clearly better than static, and three scenarios where static visualization is clearly better. For each, explain why.

Guidance

Interactive is better when: (1) you have dense data where tooltips identify individual points, (2) stakeholders need to explore and filter without code, (3) temporal data benefits from animation. Static is better when: (1) the output is a printed report or PDF, (2) you need exact control over the narrative, (3) the chart must be reproducible as an image for citation. The underlying principle is that interactive charts enable exploration while static charts control the message.

Exercise 17.2 — plotly.express vs. plotly.graph_objects

Explain the relationship between plotly.express and plotly.graph_objects. When would you use each? How does this parallel the seaborn-matplotlib relationship?

Guidance

`plotly.express` is a high-level wrapper that creates common chart types with minimal code, similar to seaborn. `plotly.graph_objects` is the low-level API giving full control over every trace, axis, and annotation, similar to matplotlib. Use `plotly.express` for most everyday tasks; drop to `graph_objects` when you need to combine different chart types, add custom annotations, or build non-standard layouts. The parallel is close: seaborn wraps matplotlib, and `plotly.express` wraps `graph_objects`. Every `plotly.express` function returns a `go.Figure`, so you can start with express and refine the result with `graph_objects` methods.

Exercise 17.3 — Choropleth requirements

What data does plotly need to create a choropleth map? Name the minimum required columns and explain why each is necessary. What happens if a country in your DataFrame does not have a matching ISO code?

Guidance

Minimum requirements: (1) a column of geographic identifiers — typically ISO 3166-1 alpha-3 codes, matched via the `locations` parameter; (2) a column of numeric values, matched via the `color` parameter. plotly matches your codes to its built-in country geometries. If a code is missing or invalid, that country appears as the default background color (usually light gray) with no data. If your data has country names but not ISO codes, you need to merge with a lookup table first.

Exercise 17.4 — Dashboard callback model

In your own words, explain how Dash callbacks work. What are Inputs, Outputs, and the decorated function? Why does Dash use this pattern instead of, say, letting you modify charts directly?

Guidance

A callback is a decorated Python function that Dash calls automatically when a specified Input changes (e.g., a dropdown selection). The function receives the current Input values as arguments, processes them, and returns new values for the specified Outputs (e.g., the `figure` property of a Graph component). Dash uses this reactive pattern because the dashboard runs as a web application — the Python code runs on a server, and the browser communicates via HTTP. Direct mutation would require the Python code to push updates to every connected browser, which is more complex.

Exercise 17.5 — Animation best practices

Your colleague creates an animated scatter plot with 100 time frames, and the axes rescale on every frame. The animation is confusing and disorienting. Identify two problems and explain how to fix each.

Guidance

Problem 1: Too many frames (100) makes the animation too long and tedious. Fix: aggregate to fewer time points (e.g., every 5 years) or use a larger step size. Problem 2: Rescaling axes makes it impossible to track dots across frames because the coordinate system changes. Fix: set `range_x` and `range_y` to fixed values that encompass the full data range across all frames.

Part B: Applied Skills (2-star)

Exercise 17.6 — Interactive scatter plot

Create an interactive scatter plot of GDP per capita vs. vaccination coverage using plotly.express. Include:

- Color by region
- Size by population
- Country name as hover title
- Custom tooltip showing coverage to one decimal place and GDP with commas

Guidance

Use `px.scatter()` with `color="region"`, `size="population"`, `hover_name="country"`, and `hover_data` with format strings. Set `size_max=40` to prevent huge bubbles and `opacity=0.7` for better visibility of overlapping points.

Exercise 17.7 — Interactive line chart

Create a line chart showing mean vaccination coverage over time for each region. Add markers at each data point. Customize the layout with a title, axis labels, and the "plotly_white" template.

Guidance

First aggregate: `yearly = df.groupby(["year", "region"], as_index=False)["coverage_pct"].mean()`. Then use `px.line(yearly, x="year", y="coverage_pct", color="region", markers=True)`. Apply `fig.update_layout(template="plotly_white", ...)`.

Exercise 17.8 — Choropleth map

Create a choropleth map showing vaccination coverage for the most recent year in your dataset. Use the "natural earth" projection and the "RdYlGn" color scale (red for low, green for high). Set the color range to [50, 100]. Export the result as an HTML file.

Guidance

Filter to the latest year, then use `px.choropleth()` with `locations="iso_alpha"`, `color="coverage_pct"`, `color_continuous_scale="RdYlGn"`, `range_color=[50, 100]`, `projection="natural earth"`. Export with `fig.write_html("map.html")`.

Exercise 17.9 — Animated scatter plot

Create an animated scatter plot showing GDP vs. coverage over time, with:

- Each country tracked across frames
- Fixed axis ranges
- Color by region, size by population
- A descriptive title

Play the animation and describe one pattern you observe.

Guidance

Use `px.scatter()` with `animation_frame="year"`, `animation_group="country"`. Set `range_x` and `range_y` to cover the full data range. The student might observe countries moving rightward (GDP growth) and upward (coverage improvement), or note that some countries regress.

Exercise 17.10 — Animated choropleth

Create an animated choropleth map with a year slider. Use the "YlGnBu" color scale and scope the map to Africa only. What visual story does the animation tell about vaccination progress on the continent?

Guidance

Use `px.choropleth()` with `animation_frame="year"`, `scope="africa"`. The animation should reveal improving coverage in many African countries over time, with some persistent laggards. The student should observe both general progress and geographic patterns (e.g., Southern Africa may improve faster than Central Africa).

Exercise 17.11 — Faceted interactive plot

Create a faceted scatter plot using `px.scatter()` with `facet_col="income_group"` and `facet_col_wrap=2`. Each panel should show GDP vs. coverage for one income group, colored by region. How does the GDP-coverage relationship differ across income groups?

Guidance

The relationship likely differs: in low-income countries, even small GDP increases correspond to coverage gains. In high-income countries, GDP variation is large but coverage is uniformly high. The faceting reveals these within-group patterns that are hidden in an aggregate plot.

Exercise 17.12 — Custom tooltips and formatting

Create a bar chart of mean vaccination coverage by region, with:

- Values displayed on each bar (using `text_auto`)
- Custom hover template showing the region name, mean coverage, and number of countries
- A horizontal layout (swap x and y)
- Sorted from highest to lowest

Guidance

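Compute means and counts: `grouped = df.groupby("region", as_index=False).agg(mean_cov=..., n_countries=("country", "nunique"))`. Passing `as_index=False` keeps `region` as a regular column that `px.bar` can reference by name. Use `px.bar(grouped, y="region", x="mean_cov", orientation="h", text_auto=".1f")`. Sort the DataFrame before plotting; plotly draws the first row at the bottom of a horizontal bar chart, so sort ascending to put the highest bar on top. For custom hover, pass `custom_data=["n_countries"]` to `px.bar` and then use `fig.update_traces(hovertemplate=...)`.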

Exercise 17.13 — HTML export comparison

Export the same chart as HTML in two ways: (1) with `include_plotlyjs=True` and (2) with `include_plotlyjs="cdn"`. Compare the file sizes. When would you use each approach?

Guidance

The self-contained version embeds the full plotly.js library (~3-5 MB). The CDN version is much smaller (~10-50 KB) but requires internet access to load the JavaScript library when opened. Use self-contained for sharing via email or USB; use CDN for hosting on a website where bandwidth matters.

Part C: Real-World Applications (2-3 star)

Exercise 17.14 — COVID-style time series dashboard (3-star)

Create a multi-chart interactive notebook (not a full Dash app) showing a health metric over time:

1. An animated choropleth showing the metric by country over years
2. A line chart with one line per region
3. A scatter plot of GDP vs. the metric for the latest year

Export all three as HTML files. Write a one-paragraph narrative connecting the three views.

Guidance

Use the vaccination dataset. The three charts provide different perspectives: geographic (where), temporal (when), and economic (why). The narrative should describe what patterns emerge from seeing the same data three ways — for example, regions that start low but improve rapidly (visible in map and line chart) are often middle-income countries experiencing economic growth (visible in scatter).

Exercise 17.15 — Interactive box plot exploration (2-star)

Create an interactive box plot of vaccination coverage by region, with additional hover_data showing each country's name. Compare this to a static seaborn box plot of the same data. What information is easier to extract from each version?

Guidance

The interactive version lets you hover over outlier points to identify which country they represent — something impossible in the static version. The static version (especially seaborn's) may have cleaner aesthetics and integrates better into a paper. The student should recognize that interactivity makes identification of specific observations (countries) much easier, while static plots are better for showing the overall distributional comparison at a glance.

Exercise 17.16 — Building a simple Dash dashboard (3-star)

Build a Dash app with:

- A dropdown to select a WHO region
- A line chart showing coverage over time for countries in that region
- A bar chart showing the latest-year coverage for each country in that region

When the dropdown changes, both charts should update. Include a title and clear labels.

Guidance

Follow the template from Section 17.8. Define `app.layout` with an `html.H1`, a `dcc.Dropdown`, and two `dcc.Graph` components. Write a `@app.callback` with one Input (dropdown) and two Outputs (figures). Inside the callback, filter `df` by the selected region and create both charts with `plotly.express`. Test with `app.run(debug=True)`.

Exercise 17.17 — Dashboard with slider and dropdown (3-star)

Extend the dashboard from Exercise 17.16 to include a year slider and a scatter plot. The scatter plot should show GDP vs. coverage for the selected region and year. The line chart should highlight the selected year with a vertical marker.

Guidance

Add a `dcc.Slider` to the layout. Update the callback to accept two Inputs. In the callback, filter by both region and year for the scatter. For the line chart, add `fig.add_vline(x=selected_year, line_dash="dash")` to highlight the selected year. This demonstrates multi-input callbacks.

Exercise 17.18 — Comparing plotly templates (2-star)

Create the same scatter plot using five different plotly templates ("plotly", "plotly_white", "plotly_dark", "ggplot2", "simple_white"). For each, note: (a) background color, (b) grid style, (c) font style. Which would you choose for a presentation on a projector? For a web dashboard? For a formal report?

Guidance

Create the same chart five times, changing only the `template` parameter. `"plotly_dark"` works well on projectors (good contrast). `"plotly_white"` or `"simple_white"` are clean for dashboards. `"simple_white"` or `"ggplot2"` work for formal reports. The student should observe that template choice, like seaborn theme choice, depends on the medium and audience.

Part D: Synthesis and Extension (3-4 star)

Exercise 17.19 — Multi-view coordinated exploration (3-star)

Create three plotly charts of the same dataset:

1. A choropleth map
2. A scatter plot
3. A bar chart

Export all three as HTML files. Then, write a paragraph describing how you would ideally coordinate these three views (e.g., clicking a country on the map highlights it in the scatter plot). Discuss why this coordination is valuable and what tools you would need.

Guidance

Standalone plotly HTML exports are independent of one another, so coordinating separate charts requires something extra, typically Dash callbacks. The student should describe cross-filtering: selecting a country on the map filters the scatter and bar to highlight that country. This is valuable because the user can connect geographic, statistical, and categorical views of the same observation. Tools needed: Dash with callbacks that share state, or a notebook with plotly `FigureWidget` and event handlers.

Exercise 17.20 — The complete interactive report (4-star)

Build a comprehensive interactive report for the vaccination dataset that includes:

1. An animated world choropleth map
2. Regional trend lines
3. A GDP vs. coverage scatter (latest year) with regression trendline
4. A histogram of coverage with region overlay
5. A summary bar chart of mean coverage by income group

Export all five as HTML files. Write an introduction paragraph that guides the reader through the five visualizations in a logical order.

Guidance

This is an integrative exercise. The narrative order should be: (1) map for geographic overview, (2) trends for temporal context, (3) scatter for the GDP-coverage relationship, (4) histogram for distributional detail, (5) bar chart for income-group summary. The introduction should tell the reader what question each chart answers and what to look for.

Exercise 17.21 — plotly.graph_objects deep dive (4-star)

Recreate one of your plotly.express charts using plotly.graph_objects directly. You will need to use `go.Figure()`, `go.Scatter()` (or similar trace types), and `fig.update_layout()`. Compare the code length and flexibility. When would the extra verbosity be worth it?

Guidance

The `graph_objects` version will be 3-5 times longer. For example, a scatter plot requires manually creating `go.Scatter(x=..., y=..., mode="markers", marker=dict(color=..., size=...))` and adding it to a figure. The extra flexibility is worth it when you need multiple trace types on the same chart (e.g., scatter + line + annotation), custom hover templates, or non-standard chart types not available in `plotly.express`.

Exercise 17.22 — Dash with multiple callbacks (4-star)

Build a Dash app with:

- A dropdown for region
- A slider for year
- A checkbox group for income groups
- Three charts that update based on all three controls

Document the callback structure (which Inputs affect which Outputs). Discuss the complexity of managing multiple interacting controls.

Guidance

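With three Inputs and three Outputs, a single callback receives all three input values and must filter the data accordingly. The complexity grows multiplicatively: if each control held a single choice, 6 regions x 20 years x 4 income groups would give 480 possible states. Because a checkbox group allows any subset of the 4 income groups (2^4 = 16 combinations, including none selected), the count rises to 6 x 20 x 16 = 1,920. The student should discover that thoughtful defaults and validation (e.g., handling cases where no data matches the selected combination) are critical for robust dashboards.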

Part M: Mixed Review (integrating earlier chapters)

Exercise 17.23 — Data preparation for plotly (2-star)

Your dataset has country names but no ISO alpha-3 codes. Using pandas skills from Part II, merge your DataFrame with a country-codes lookup table to add the iso_alpha column. Then create a choropleth. What do you do about countries that fail to merge?

Guidance

This integrates [Chapter 9](../../part-02-data-wrangling/chapter-09-reshaping-transforming/index.md) (merging) with Chapter 17 (choropleth). Use `pd.merge(df, codes, left_on="country_name", right_on="name", how="left")`. Countries that fail to merge get NaN for `iso_alpha` and appear as gray on the map. The student should check for failed merges with `merged[merged["iso_alpha"].isna()]` and manually fix common mismatches (e.g., "United States" vs. "United States of America").
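The merge-check-fix cycle can be sketched like this. The coverage values are invented, and the `name_fixes` dictionary and `lookup_name` column are hypothetical helpers; in practice you would build the mapping from whatever mismatches your own check reveals.

```python
import pandas as pd

# Toy coverage values for illustration only.
df = pd.DataFrame({
    "country_name": ["Kenya", "Brazil", "United States"],
    "coverage_pct": [81.0, 83.0, 92.0],
})
codes = pd.DataFrame({
    "name": ["Kenya", "Brazil", "United States of America"],
    "iso_alpha": ["KEN", "BRA", "USA"],
})

# First pass: a left merge keeps every row of df, matched or not.
merged = pd.merge(df, codes, left_on="country_name", right_on="name", how="left")

# Rows with no ISO code are the failed matches.
unmatched = merged.loc[merged["iso_alpha"].isna(), "country_name"].tolist()
print("Unmatched:", unmatched)

# Second pass: map known mismatches to the lookup table's spelling, then re-merge.
name_fixes = {"United States": "United States of America"}
df["lookup_name"] = df["country_name"].replace(name_fixes)
fixed = pd.merge(df, codes, left_on="lookup_name", right_on="name", how="left")
```

Keeping the original `country_name` column and adding a separate lookup column preserves the names as they appeared in the source data.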

Exercise 17.24 — seaborn vs. plotly side-by-side (2-star)

Create the same visualization, a box plot of coverage by region, in both seaborn and plotly. Compare:

1. Lines of code
2. Information density (what can you learn from each without additional code)
3. Output format (what can you do with each)
4. Aesthetic quality

When would you choose each for the same data?

Guidance

Both require similar code length (1-3 lines). The seaborn version has slightly cleaner aesthetics by default. The plotly version lets you hover over outliers to see which country they are, toggle regions on/off via the legend (when the boxes are colored by region), and zoom into specific ranges. Choose seaborn for papers and presentations with a fixed narrative. Choose plotly for team data reviews and stakeholder exploration.

Exercise 17.25 — From question to interactive visualization (3-star)

For each question below, choose between matplotlib, seaborn, and plotly as the primary tool, justify your choice, and create the visualization:

1. "Show me vaccination coverage on a world map, by year."
2. "What is the distribution shape of coverage in Africa? Is it bimodal?"
3. "I need a publication figure showing the GDP-coverage relationship with confidence bands."
4. "Let me explore which specific countries are outliers in each region."
5. "Create a one-page dashboard that a health minister can use to track their country's progress."

Guidance

1. plotly: choropleth with animation. Neither matplotlib nor seaborn has built-in choropleth support.
2. seaborn: `displot(kind="kde")` or `violinplot` for distributional shape; statistical estimation is seaborn's strength.
3. seaborn or matplotlib: `lmplot` with confidence bands, exported as a high-resolution static image. plotly's regression trendlines exist but are less statistically sophisticated.
4. plotly: interactive scatter or box with hover tooltips identifying each country.
5. plotly + Dash: interactive dashboard with dropdown for country selection, line chart for trends, and map for geographic context.