Chapter 17: Interactive Visualization — plotly, Dashboard Thinking

Contributors to Introduction to Data Science

Prerequisites

Chapter 15: matplotlib fundamentals (the Figure/Axes mental model)
Chapter 16: seaborn (statistical plot types and the hue/col/row encoding pattern)
Chapter 13: Getting data from the web (helpful for understanding HTML export)

Learning Objectives

Create interactive scatter plots, line charts, bar charts, and histograms using plotly.express
Add hover tooltips, zoom, pan, and click-to-filter interactions to any chart
Build choropleth maps with geographic data and animate them over time
Construct a basic multi-chart dashboard using Dash with callbacks for user interaction
Export interactive visualizations as standalone HTML files for sharing

In This Chapter

Chapter Overview
17.1 Why Interactive? When Static Is Not Enough
17.2 Getting Started with plotly.express
17.3 Common Chart Types in plotly.express
17.4 Choropleth Maps: Data on a Map
17.5 Animation: Data in Motion
17.6 Layout Customization with update_layout
17.7 Exporting Interactive Charts
17.8 Introduction to Dash: Dashboard Thinking
17.9 plotly.express vs. seaborn: When to Use Which
17.10 Putting It Together: Interactive Global Vaccination Dashboard
17.11 Common Mistakes and How to Fix Them
17.12 The Visualization Stack So Far
17.13 Performance Considerations
17.14 Chapter Summary

"The best way to get the right answer on the internet is not to ask a question; it is to post the wrong answer." — Cunningham's Law

The same principle applies to interactive charts: give people a tool to explore, and they will find the story you missed.

Chapter Overview

In Chapters 15 and 16, you built beautiful, informative, publication-quality charts. They were also static — frozen images that show exactly what you chose to show and nothing more. Your audience cannot hover over a dot to learn which country it represents, zoom into a crowded region, or slide a time control to watch patterns evolve.

Static charts are perfect for papers, reports, and slide decks where you control the narrative. But many analytical situations call for something different — a chart that invites exploration. When you share data with colleagues, stakeholders, or the public, they have questions you did not anticipate. Interactive visualization lets them ask those questions themselves.

Here is an example. In Chapter 16, you created a scatter plot of GDP per capita versus vaccination coverage with seaborn. It was informative, but with 180+ countries plotted as dots, your audience would need a separate table to identify which dot is which country. In plotly, the same chart gains hover tooltips that reveal the country name, exact GDP, exact coverage, and any other column you choose — simply by moving the mouse:

import plotly.express as px

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 hover_name="country",
                 color="region", size="population",
                 size_max=40, opacity=0.7)
fig.show()

That code produces an interactive scatter plot with tooltips, zoom, pan, box select, and a legend that toggles groups on and off. Try clicking a region name in the legend — those points disappear. Try dragging a box around a cluster — the chart zooms in. Try hovering over any point — the tooltip appears.

This chapter teaches you to build interactive charts with plotly, create geographic visualizations with choropleth maps, add animation with time sliders, and assemble multiple charts into a dashboard with Dash.

17.1 Why Interactive? When Static Is Not Enough

The Limitations of Static Charts

Static charts excel at delivering a specific message. You decide the framing, the scale, the annotations, and the focus. The reader sees exactly what you intend. This is a strength for presentations and publications where you have a thesis to support.

But static charts struggle in three scenarios:

Dense data. When you plot hundreds or thousands of points, individual observations become anonymous dots. Tooltips solve this by making every point queryable.
Exploration by non-experts. When you share data with stakeholders who are not data scientists, they often want to filter, zoom, and ask "What about X?" Interactive charts let them explore without writing code.
Temporal data. A line chart of 20 countries over 30 years produces 20 overlapping lines. An animation with a time slider shows the story unfolding year by year.

The Limitations of Interactive Charts

Interactive charts are not universally better. They have real downsides:

They require a browser or HTML-capable viewer (no PDF, no print).
They can distract from your main message — if the reader can explore freely, they might focus on an irrelevant detail.
They are harder to review for accuracy because you cannot see all states of the chart at once.
They can be slow with very large datasets (100,000+ points).

Rule of thumb: Use static charts when you are telling a story. Use interactive charts when you are enabling exploration. Many analyses benefit from both: static for the final report, interactive for the working session.

A Tale of Two Workflows

To make this concrete, consider how the same dataset serves different audiences:

For a quarterly report to the board of directors (static): You create three carefully designed charts — a trend line, a bar chart, and a map — with annotations highlighting the key findings. You export them as high-resolution PNGs and embed them in a slide deck. The narrative is controlled: the board sees exactly what you want them to see, in the order you present it. If a board member asks "What about Country X?", you answer verbally or refer to a data table in the appendix.

For a team working session with regional health officers (interactive): You build the same charts in plotly and share them as HTML files. Each officer hovers over their region's data points to see exact values. They zoom into the most recent years. They click on regions in the legend to isolate their area. They discover patterns relevant to their specific countries that you might not have anticipated. The narrative is collaborative: the officers drive the exploration based on their domain knowledge.

Both workflows are legitimate. Both use the same data. The difference is who controls the exploration — you (static) or the audience (interactive).

How plotly Works Under the Hood

Before we write code, it helps to understand what plotly actually does. When you call fig.show() in a Jupyter notebook, four things happen:

plotly converts your Python data and chart specification into a JSON object.
It passes that JSON to a JavaScript library (plotly.js) running in your browser.
plotly.js renders the chart as SVG and HTML in the notebook output cell.
plotly.js handles user interactions (hover, zoom, click) entirely in the browser, with no round-trip to Python.

This architecture means plotly charts are inherently web-native. They work in any modern browser, they can be embedded in websites, and they can be shared as standalone HTML files. It also means plotly charts are heavier than static images — a simple scatter plot might be a few kilobytes as a PNG but several megabytes as an interactive HTML file (because the plotly.js library is included).

The plotly Ecosystem

plotly is a Python library for interactive visualization. It has two main interfaces:

Interface	Purpose	Complexity
`plotly.express` (px)	High-level, one-liner charts	Low — similar to seaborn
`plotly.graph_objects` (go)	Low-level, full customization	Higher — similar to matplotlib

We will spend most of our time with plotly.express because it covers most common use cases with minimal code. When you need fine-grained control, you can drop down to plotly.graph_objects, much as you drop from seaborn to matplotlib.

17.2 Getting Started with plotly.express

Installation and Imports

import plotly.express as px
import pandas as pd

df = pd.read_csv("who_vaccination_data.csv")

plotly renders charts in the browser. In Jupyter notebooks, charts appear inline. In scripts, fig.show() opens a new browser tab. All charts are built on web technologies (JavaScript, HTML, SVG), which means they work on any modern device.

If you have not already installed plotly, run pip install plotly in your terminal. Rendering charts inline in Jupyter may also require pip install nbformat. Dash dashboards need pip install dash, and static image export needs pip install kaleido.

The plotly.express module (abbreviated px) is the high-level interface that we will use for most of this chapter. The naming convention parallels what you have seen throughout this course: pd for pandas, np for NumPy, plt for matplotlib.pyplot, sns for seaborn, and now px for plotly.express.

Your First Interactive Chart

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct")
fig.show()

This looks like a seaborn scatter plot, but try interacting with it:

Hover over any point to see its x and y values.
Drag to select a rectangular region and zoom in.
Double-click to reset the zoom.
Use the toolbar in the top-right corner to switch between zoom, pan, box select, and lasso select modes.

Adding Informative Tooltips

The default tooltip shows x and y values. You can add more information with hover_name and hover_data:

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 hover_name="country",
                 hover_data=["region", "population",
                             "year"])
fig.show()

Now hovering over a point shows the country name (in bold, from hover_name) plus the region, population, and year. This is the interactive equivalent of labeling every point — except it only shows information on demand, keeping the chart clean.

You can also customize tooltip format:

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 hover_name="country",
                 hover_data={
                     "gdp_per_capita": ":.0f",
                     "coverage_pct": ":.1f",
                     "population": ":,.0f"
                 })
fig.show()

The format strings follow d3-format syntax, which closely mirrors Python's format spec: :.0f for no decimal places, :.1f for one decimal place, and :,.0f for comma-separated thousands.
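For simple numeric specs like these three, Python's built-in format() gives the same result as the tooltip will, so you can preview a spec in the interpreter. The input numbers here are made up. Note that the leading colon is dropped: in plotly it only separates the variable from the spec.

```python
# Preview the three format specs with Python's built-in format().
# The sample values are made up for illustration.
gdp_text = format(48213.7, ".0f")            # no decimal places
coverage_text = format(87.456, ".1f")        # one decimal place
population_text = format(1234567.0, ",.0f")  # thousands separators

print(gdp_text)
print(coverage_text)
print(population_text)
```

This shortcut is only a preview aid. For less common specs, the browser-side formatting may differ from Python's, so check the rendered tooltip.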

Every plotly chart includes a toolbar in the top-right corner. Understanding these tools will save you time:

Tool	Icon	What It Does
Download plot as PNG	Camera	Exports a static screenshot
Zoom	Magnifying glass +	Drag to select a rectangular area to zoom into
Pan	Crosshair arrows	Drag to move the visible area
Box select	Dotted rectangle	Select data points within a rectangle
Lasso select	Lasso	Select data points within a free-drawn shape
Zoom in/out	+ / -	Incremental zoom
Reset axes	Home	Return to the original view

You can also double-click anywhere on the chart to reset the zoom. And here is a powerful feature many people miss: clicking a legend item toggles that trace on or off. Double-clicking a legend item isolates it — hiding all other traces. Double-click again to show all. This makes legends interactive filters, not just passive labels.

Customizing Hover Behavior

plotly offers fine-grained control over how tooltips appear:

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 hover_name="country")

fig.update_traces(
    hovertemplate="<b>%{hovertext}</b><br>"
                  "GDP: $%{x:,.0f}<br>"
                  "Coverage: %{y:.1f}%"
                  "<extra></extra>"
)
fig.show()

The hovertemplate parameter gives you complete control over the tooltip format. %{x}, %{y}, and %{hovertext} are template variables. <extra></extra> removes the secondary box that plotly adds by default (showing the trace name). This level of customization is useful when you want tooltips that match a specific reporting style.

17.3 Common Chart Types in plotly.express

Scatter Plots with Color, Size, and Faceting

plotly.express uses the same color, size, facet_col, and facet_row parameters you learned as hue, size, col, and row in seaborn:

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 color="region",
                 size="population",
                 size_max=40,
                 hover_name="country",
                 facet_col="income_group",
                 facet_col_wrap=2)
fig.show()

The mental model transfers directly from Chapter 16 — you are mapping data variables to visual encodings. The difference is that every element is now interactive. Notice that the parameter names differ slightly from seaborn: plotly uses color where seaborn uses hue, facet_col where seaborn uses col, and symbol where seaborn uses style. The concepts are identical; only the naming conventions differ.

Let us explore the scatter plot in more detail, since it is the most commonly used plotly chart type. Consider what happens when you interact with a basic scatter:

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 color="region",
                 size="population",
                 size_max=40,
                 hover_name="country",
                 hover_data=["income_group"],
                 opacity=0.7)
fig.show()

Try these interactions:

Hover over a small dot in a cluster. The tooltip shows the country name in bold, plus all encoded variables. In a static chart, this dot would be anonymous.
Click a region name in the legend to hide it. This is equivalent to filtering with pandas, but without writing code.
Drag a selection box around the low-GDP, high-coverage corner. The chart zooms in, revealing countries that achieve high coverage despite low GDP.
Double-click to reset the zoom.

Each interaction would require separate code in matplotlib. In plotly, they are automatic.

Line Charts

yearly = df.groupby(["year", "region"],
                     as_index=False).agg(
    mean_coverage=("coverage_pct", "mean")
)

fig = px.line(yearly, x="year",
              y="mean_coverage",
              color="region",
              markers=True)
fig.update_layout(
    title="Mean Vaccination Coverage Over Time",
    yaxis_title="Coverage (%)")
fig.show()

Hover over any point on any line to see the exact year, region, and coverage value. Click a region name in the legend to toggle it off. Double-click a region name to isolate it (hide all others).

Bar Charts

region_means = df.groupby("region",
                           as_index=False).agg(
    mean_coverage=("coverage_pct", "mean")
).sort_values("mean_coverage", ascending=False)

fig = px.bar(region_means, x="region",
             y="mean_coverage",
             color="region",
             text_auto=".1f")
fig.update_layout(
    title="Mean Vaccination Coverage by Region",
    yaxis_title="Coverage (%)",
    showlegend=False)
fig.show()

The text_auto=".1f" parameter displays the value on each bar, formatted to one decimal place. Hover for exact numbers.

Histograms

fig = px.histogram(df, x="coverage_pct",
                   nbins=30, color="region",
                   barmode="overlay", opacity=0.6)
fig.update_layout(
    title="Distribution of Coverage by Region")
fig.show()

The barmode="overlay" parameter draws the colored histograms in the same position, one in front of another, using transparency so that each stays visible. This is similar to seaborn's overlapping KDE with fill=True. Other options include "stack" (bars stacked on top of each other, so the total height of each bin is the count across all groups) and "group" (bars placed side by side within each bin, useful for direct comparison between groups).

You can also create cumulative histograms:

fig = px.histogram(df, x="coverage_pct",
                   nbins=30, cumulative=True,
                   title="Cumulative Distribution")
fig.show()

This shows the running total of observations up to each bin, a binned counterpart to seaborn's ECDF plot. The default is a cumulative count; pass histnorm="probability" to show a cumulative proportion instead. Hovering over a bin shows the cumulative value at that bin.

Box Plots and Violin Plots

fig = px.box(df, x="region", y="coverage_pct",
             color="income_group",
             hover_data=["country"])
fig.show()

Interactive box plots let you hover over the box to see the exact quartile values, and hover over outlier points to identify which observation they represent — something impossible in static box plots.

fig = px.violin(df, x="region",
                y="coverage_pct",
                color="income_group",
                box=True, points="all")
fig.show()

The box=True parameter adds a miniature box plot inside each violin. The points="all" parameter overlays individual data points, similar to the seaborn violin + strip combination.

Trendlines in Scatter Plots

plotly can add statistical trendlines directly to scatter plots:

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 color="region",
                 trendline="ols",
                 hover_name="country")
fig.show()

The trendline="ols" parameter fits an ordinary least squares regression line to each color group. Hovering over the trendline displays the regression equation and R-squared value. Other options include trendline="lowess" for a LOWESS smoother and trendline="expanding" for an expanding mean. Both "ols" and "lowess" require the statsmodels package (pip install statsmodels).
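For intuition about what an ordinary least squares line is, here is a hand-rolled fit in plain Python on four made-up points. It is a teaching sketch only, not how plotly performs the fit (plotly delegates the computation to statsmodels).

```python
# Four made-up (x, y) points, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 8.0, 9.0]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Slope = covariance of x and y divided by variance of x.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# R-squared = 1 - (unexplained variation / total variation).
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot

print(f"y = {slope:.2f} * x + {intercept:.2f}, R^2 = {r_squared:.3f}")
```

The slope, intercept, and R-squared computed here are the same three quantities that appear in the trendline tooltip.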

To fit a single trendline to all points regardless of color groups:

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 color="region",
                 trendline="ols",
                 trendline_scope="overall")
fig.show()

Sunburst and Treemap Charts

plotly includes hierarchical chart types that seaborn and matplotlib do not offer:

fig = px.sunburst(df, path=["region", "income_group"],
                  values="population",
                  color="coverage_pct",
                  color_continuous_scale="YlGnBu")
fig.show()

This creates a radial chart where the inner ring shows regions and the outer ring shows income groups within each region. The slice size represents population, and the color represents coverage. Click on a region to zoom into its income groups.

Treemaps show the same hierarchical data as nested rectangles:

fig = px.treemap(df, path=["region", "income_group",
                            "country"],
                 values="population",
                 color="coverage_pct",
                 color_continuous_scale="YlGnBu")
fig.show()

These hierarchical charts are particularly effective for showing part-to-whole relationships in nested categories. They work well when you want to show both the relative size of categories (area) and a metric about each category (color) simultaneously.

17.4 Choropleth Maps: Data on a Map

One of plotly's most impressive features is built-in geographic visualization. A choropleth map colors regions (countries, states, counties) according to a data value.

World Choropleth

latest = df[df["year"] == df["year"].max()]

fig = px.choropleth(latest,
                    locations="iso_alpha",
                    color="coverage_pct",
                    hover_name="country",
                    color_continuous_scale="YlGnBu",
                    range_color=[50, 100],
                    title="Global Vaccination Coverage")
fig.show()

Let us unpack this:

locations="iso_alpha" — the column containing ISO 3166-1 alpha-3 country codes (like "USA", "GBR", "BRA"). plotly uses these to match data to countries on the map.
color="coverage_pct" — the column that determines the fill color of each country.
color_continuous_scale="YlGnBu" — a sequential colormap from yellow (low) through green to blue (high).
range_color=[50, 100] — clamps the color range. Countries below 50% appear the same color as 50%.

The result is a world map where you can hover over any country to see its name and exact coverage value, zoom into specific regions by scrolling, and pan by dragging.

Customizing the Map Projection

fig = px.choropleth(latest,
                    locations="iso_alpha",
                    color="coverage_pct",
                    hover_name="country",
                    projection="natural earth",
                    color_continuous_scale="RdYlGn")
fig.show()

plotly supports many projections: "natural earth", "equirectangular", "orthographic" (globe), "mercator", and more. Each has trade-offs in area distortion — a preview of the cartographic considerations we touch on in Chapter 18.

Scope: Focusing on a Region

fig = px.choropleth(latest,
                    locations="iso_alpha",
                    color="coverage_pct",
                    hover_name="country",
                    scope="africa",
                    color_continuous_scale="YlGnBu")
fig.update_layout(
    title="Vaccination Coverage in Africa")
fig.show()

The scope parameter restricts the map to one region: "world" (the default), "usa", "europe", "asia", "africa", "north america", or "south america".

Scatter on a Map (scatter_geo)

For point-level geographic data (city-level, not country-level), use scatter_geo:

fig = px.scatter_geo(df, lat="latitude",
                     lon="longitude",
                     color="coverage_pct",
                     size="population",
                     hover_name="country",
                     projection="natural earth",
                     color_continuous_scale="YlGnBu")
fig.show()

This places dots at specific latitude/longitude coordinates on a world map. The dots can be colored and sized by data variables. scatter_geo is useful when your data has geographic coordinates but does not correspond to standard administrative boundaries (e.g., hospital locations, weather stations, city-level data).

Dealing with Missing Countries

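When you create a choropleth, countries without data are not filled; they show the base map's land color, a light neutral shade by default. This is usually appropriate: it signals "no data" rather than "low value." If that shade sits too close to the low end of your color scale, make the distinction explicit by styling the base map yourself with a landcolor and visible country borders: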

fig = px.choropleth(latest,
                    locations="iso_alpha",
                    color="coverage_pct",
                    hover_name="country",
                    color_continuous_scale="YlGnBu",
                    range_color=[40, 100])
fig.update_geos(showcountries=True,
                countrycolor="lightgray")
fig.update_layout(
    geo=dict(bgcolor="white",
             landcolor="whitesmoke"))
fig.show()

The update_geos call makes country borders visible even for countries without data, and the background styling ensures the map looks clean. Always think about what the gray countries mean — if they are systematically different from the colored countries (e.g., all the missing countries are small island nations), that missing-ness itself is worth noting.
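Before interpreting the uncolored countries, it helps to list them. The sketch below separates two kinds of gap: codes that are absent from the table, and codes that are present but have no value. The reference set and the data are hypothetical placeholders; in practice the expected set would come from whatever country list your analysis is supposed to cover.

```python
import pandas as pd

# Hypothetical reference list of countries the analysis should cover.
expected = {"BRA", "GBR", "KEN", "TUV", "USA"}

# Hypothetical data for the latest year; values are made up.
latest = pd.DataFrame({
    "iso_alpha": ["BRA", "GBR", "KEN", "USA"],
    "coverage_pct": [84.0, 92.0, 89.0, None],
})

# Case 1: codes that do not appear in the table at all.
absent = expected - set(latest["iso_alpha"])

# Case 2: codes that appear but have a missing value.
has_no_value = latest["coverage_pct"].isna()
null_valued = set(latest.loc[has_no_value, "iso_alpha"])

print("Absent from data:", sorted(absent))
print("Present but empty:", sorted(null_valued))
```

Both kinds of gap may look the same on the map, but they usually have different causes: an absent row often points to a join or filtering problem, while an empty value often reflects missing reporting.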

17.5 Animation: Data in Motion

Animation adds a time dimension to any plotly chart. The key parameter is animation_frame, which specifies the column that steps through time.

Animated Scatter Plot

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 animation_frame="year",
                 animation_group="country",
                 color="region",
                 size="population",
                 size_max=40,
                 hover_name="country",
                 range_x=[0, 80000],
                 range_y=[30, 100])
fig.update_layout(
    title="GDP vs. Vaccination Over Time")
fig.show()

This creates a scatter plot with a play button and a year slider at the bottom. Press play, and the dots move — each country traces its path through GDP-coverage space over the years. Fixed axis ranges (range_x, range_y) prevent the axes from rescaling with each frame, which would make the animation disorienting.

The animation_group="country" parameter tells plotly that the same country should be tracked across frames, enabling smooth transitions rather than dots blinking in and out.

Animated Choropleth

fig = px.choropleth(df,
                    locations="iso_alpha",
                    color="coverage_pct",
                    hover_name="country",
                    animation_frame="year",
                    color_continuous_scale="YlGnBu",
                    range_color=[50, 100],
                    title="Global Vaccination: "
                          "Year by Year")
fig.show()

This is the project milestone for this chapter: an animated world map showing vaccination coverage changing over time. Press play, and you see which countries improve, which stagnate, and which regress. The time slider lets you jump to any specific year.

Animation Tips

Fix your axis ranges. If you let plotly auto-scale, the axes jump around with each frame and the animation is unreadable.
Keep the frame count low. Animations with 50+ frames become tedious. If your data spans 30 years, consider using 5-year intervals instead of annual data.
Use animation for presentation, not exploration. Animations are great for storytelling ("watch how this changes over time") but poor for analytical work (you cannot compare two frames side by side).

The Gapminder-Style Bubble Animation

The most famous animated scatter plot is Hans Rosling's Gapminder visualization, which shows the relationship between income and life expectancy over 200 years. plotly makes this style accessible:

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 animation_frame="year",
                 animation_group="country",
                 color="region",
                 size="population",
                 size_max=40,
                 hover_name="country",
                 log_x=True,
                 range_x=[500, 100000],
                 range_y=[30, 100])

fig.update_layout(
    title="The Wealth-Health Nexus Over Time",
    xaxis_title="GDP per Capita (log scale, USD)",
    yaxis_title="Vaccination Coverage (%)")
fig.show()

Note the log_x=True parameter, which puts GDP on a logarithmic scale. This is appropriate because income differences are multiplicative, not additive — the difference between $500 and $5,000 (a 10x change) is more meaningful than the difference between $50,000 and $54,500 (also $4,500, but only a 9% change). A log scale treats proportional changes equally, spreading out the low-GDP countries that would otherwise cluster on the left edge.
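The arithmetic behind that claim is quick to check. On a log axis, the distance between two values is the difference of their base-10 logarithms, so equal ratios occupy equal distances. The sketch uses the two dollar gaps from the paragraph above.

```python
import math

# Distance on a log10 axis = difference of the logarithms.
low_gap = math.log10(5_000) - math.log10(500)        # $500 -> $5,000
high_gap = math.log10(54_500) - math.log10(50_000)   # $50,000 -> $54,500

print(f"low-income gap:  {low_gap:.3f} axis units")
print(f"high-income gap: {high_gap:.3f} axis units")
```

Both gaps are $4,500 wide on a linear axis, yet on the log axis the first spans a full decade while the second is barely visible, matching the tenfold versus 9% difference in relative terms.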

Animated Bar Chart Races

A "bar chart race" is an animation where bars reorder and resize over time, showing how rankings change. plotly can create a simple version:

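# Assumption: df_sorted is df ordered so that frames play chronologically
df_sorted = df.sort_values(["year", "coverage_pct"])

fig = px.bar(df_sorted, x="coverage_pct",
             y="country",
             animation_frame="year",
             orientation="h",
             color="region",
             range_x=[0, 100],
             title="Country Coverage Rankings")
fig.update_layout(
    yaxis={"categoryorder": "total ascending"})
fig.show()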

Bar chart races are engaging but should be used sparingly — they prioritize entertainment over analysis. The human eye struggles to track a specific bar as it moves up and down the ranking, and the reordering makes it hard to focus on any single country's trajectory. For analytical work, a line chart (where each country has a fixed horizontal position) is almost always superior.

17.6 Layout Customization with update_layout

plotly charts are customized by calling .update_layout() on the figure object:

fig = px.scatter(df, x="gdp_per_capita",
                 y="coverage_pct",
                 color="region",
                 hover_name="country")

fig.update_layout(
    title={
        "text": "GDP vs. Vaccination Coverage",
        "x": 0.5,
        "font": {"size": 18}
    },
    xaxis_title="GDP per Capita (USD)",
    yaxis_title="Vaccination Coverage (%)",
    legend_title="WHO Region",
    template="plotly_white",
    width=800,
    height=500,
    margin=dict(l=60, r=30, t=60, b=60)
)
fig.show()

Templates (Themes)

plotly includes several built-in templates, analogous to seaborn's themes:

Template	Description
`"plotly"`	Default plotly style
`"plotly_white"`	Clean white background
`"plotly_dark"`	Dark background (good for presentations)
`"ggplot2"`	R's ggplot2 style
`"seaborn"`	seaborn-inspired style
`"simple_white"`	Minimal, publication-ready

Set the template in update_layout() or globally:

import plotly.io as pio
pio.templates.default = "plotly_white"

Axis Formatting and Annotations

plotly provides extensive axis formatting options:

fig.update_xaxes(
    title_text="GDP per Capita (USD)",
    tickformat=",",       # Comma separators
    type="log",           # Logarithmic scale
    gridcolor="lightgray",
    gridwidth=0.5
)

fig.update_yaxes(
    title_text="Coverage (%)",
    range=[0, 100],       # Fixed range
    dtick=10              # Tick every 10 units
)

Adding annotations (text labels, arrows, reference lines) to plotly charts:

# Add a horizontal reference line
fig.add_hline(y=90, line_dash="dash",
              line_color="red",
              annotation_text="90% Target")

# Add a vertical reference line
fig.add_vline(x=10000, line_dash="dot",
              line_color="gray",
              annotation_text="$10K threshold")

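# Add a text annotation
# (on a log-scaled x-axis, annotation positions are given
#  in log10 units, so use x=math.log10(5000) instead)
fig.add_annotation(
    x=5000, y=95,
    text="Rwanda: High coverage,<br>low GDP",
    showarrow=True,
    arrowhead=2,
    font=dict(size=11))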

These annotation capabilities mirror what you learned in Chapter 15 with matplotlib's ax.annotate() and ax.axhline(), but plotly's annotations are themselves interactive — they move when you zoom and pan, staying anchored to their data coordinates.

Figure Size and Margins

Unlike matplotlib (which uses inches), plotly uses pixels:

fig.update_layout(
    width=800,           # Pixels wide
    height=500,          # Pixels tall
    margin=dict(
        l=60, r=30,      # Left, right margins
        t=60, b=60       # Top, bottom margins
    ),
    font=dict(size=12)   # Global font size
)

For Jupyter notebooks, the default size works well. For HTML exports, consider the expected screen size. For dashboards, leave width unset and size the chart's container with CSS percentages so charts resize with the browser window; the layout's own width and height accept only pixel values.

17.7 Exporting Interactive Charts

HTML Export

The most common way to share interactive plotly charts is as standalone HTML files:

fig.write_html("vaccination_scatter.html",
               include_plotlyjs=True)

The resulting file contains all the JavaScript needed to render the chart — no server required. Anyone with a web browser can open it and interact with the chart. File sizes are typically 3-5 MB because the plotly.js library is embedded.

To reduce file size when sharing multiple charts:

fig.write_html("chart.html",
               include_plotlyjs="cdn")

This loads plotly.js from a CDN instead of embedding it, which typically shrinks the file to kilobytes (the exact size depends on how much data the chart holds). The trade-off is that viewing the file requires an internet connection.

Static Image Export

You can also export static images for reports:

fig.write_image("chart.png", scale=2)  # 2x for retina
fig.write_image("chart.pdf")
fig.write_image("chart.svg")

Note: static image export requires the kaleido package (pip install kaleido).

Embedding in Notebooks vs. Web Pages

In Jupyter notebooks, plotly charts render inline by default. But there are cases where you might want to control the rendering:

import plotly.io as pio

# Force rendering in browser (opens a new tab)
pio.renderers.default = "browser"

# Force static image in notebook (no interactivity)
pio.renderers.default = "png"

# Use the default notebook renderer
pio.renderers.default = "notebook"

For embedding in web pages (not Jupyter), the full_html=False parameter in write_html generates an HTML fragment (just a div and script tag) rather than a complete HTML document:

fig.write_html("chart_fragment.html",
               full_html=False,
               include_plotlyjs="cdn")

This fragment can be embedded in a larger HTML page using server-side includes, template engines, or simply by pasting the HTML into your web page's body.

Combining Multiple Charts on One Page

While Dash is the full solution for multi-chart dashboards, you can create simple multi-chart HTML pages using plotly's make_subplots:

from plotly.subplots import make_subplots
import plotly.graph_objects as go

fig = make_subplots(rows=1, cols=2,
                    subplot_titles=["GDP vs Coverage",
                                    "Coverage by Region"])

fig.add_trace(
    go.Scatter(x=df["gdp_per_capita"],
               y=df["coverage_pct"],
               mode="markers",
               text=df["country"],
               name="Countries"),
    row=1, col=1)

region_means = df.groupby("region")["coverage_pct"].mean()
fig.add_trace(
    go.Bar(x=region_means.index,
           y=region_means.values,
           name="Mean Coverage"),
    row=1, col=2)

fig.update_layout(height=400, width=900,
                  showlegend=False)
fig.show()

This creates a side-by-side layout with a scatter plot and a bar chart. Each subplot is interactive independently. Note that make_subplots uses plotly.graph_objects syntax (more verbose than plotly.express), but it gives you full control over multi-chart layouts within a single figure.

17.8 Introduction to Dash: Dashboard Thinking

So far, each chart exists in isolation. A dashboard combines multiple charts on a single page, connected by shared controls — dropdowns, sliders, checkboxes — that filter or transform the underlying data. When you select a region in a dropdown, all charts update to show only that region's data.

What Is Dash?

Dash is a Python framework (by the same company that makes plotly) for building web-based dashboards. A Dash app is a Python script that:

Defines a layout — what charts and controls appear on the page.
Defines callbacks — functions that run when the user interacts with a control.
Runs a local web server that serves the dashboard in a browser.

A Minimal Dashboard

from dash import Dash, html, dcc, Input, Output
import plotly.express as px
import pandas as pd

df = pd.read_csv("who_vaccination_data.csv")

app = Dash(__name__)

app.layout = html.Div([
    html.H1("Vaccination Coverage Explorer"),

    dcc.Dropdown(
        id="region-dropdown",
        options=[{"label": r, "value": r}
                 for r in df["region"].unique()],
        value=df["region"].unique()[0],
        clearable=False
    ),

    dcc.Graph(id="scatter-plot"),
    dcc.Graph(id="histogram")
])

@app.callback(
    Output("scatter-plot", "figure"),
    Output("histogram", "figure"),
    Input("region-dropdown", "value")
)
def update_charts(selected_region):
    filtered = df[df["region"] == selected_region]

    scatter = px.scatter(
        filtered, x="gdp_per_capita",
        y="coverage_pct",
        hover_name="country",
        title=f"GDP vs Coverage: {selected_region}"
    )

    hist = px.histogram(
        filtered, x="coverage_pct", nbins=20,
        title=f"Coverage Distribution: "
              f"{selected_region}"
    )

    return scatter, hist

if __name__ == "__main__":
    app.run(debug=True)

Let us walk through this:

The layout defines an H1 heading, a dropdown menu populated with region names, and two empty graph components identified by id strings.

The callback is a decorated function. The @app.callback decorator specifies that the function's outputs are the figure property of "scatter-plot" and "histogram", and its input is the value property of "region-dropdown". Whenever the user selects a new region, this function runs: it filters the DataFrame, creates two new plotly figures, and returns them. Dash automatically updates the page.

Running the app starts a local web server (usually at http://127.0.0.1:8050). Open that URL in a browser, and you see your dashboard.

Adding a Slider

app.layout = html.Div([
    html.H1("Vaccination Coverage Explorer"),

    dcc.Dropdown(
        id="region-dropdown",
        options=[{"label": r, "value": r}
                 for r in df["region"].unique()],
        value=df["region"].unique()[0]
    ),

    dcc.Slider(
        id="year-slider",
        min=df["year"].min(),
        max=df["year"].max(),
        value=df["year"].max(),
        marks={str(y): str(y)
               for y in df["year"].unique()},
        step=None
    ),

    dcc.Graph(id="scatter-plot"),
    dcc.Graph(id="histogram")
])

The slider lets users select a year. Update the callback to accept both inputs:

@app.callback(
    Output("scatter-plot", "figure"),
    Output("histogram", "figure"),
    Input("region-dropdown", "value"),
    Input("year-slider", "value")
)
def update_charts(selected_region, selected_year):
    filtered = df[(df["region"] == selected_region) &
                  (df["year"] == selected_year)]
    # ... create and return figures

Understanding Callbacks: The Reactive Model

The callback pattern may feel unfamiliar if you have only written linear Python scripts. In a traditional script, code runs top to bottom, once. In Dash, callbacks are reactive — they sit dormant until an Input changes, then they execute. Think of it like Excel: when you change a cell that a formula depends on, the formula recalculates automatically. Dash callbacks work the same way — when the dropdown value changes, the function re-runs with the new value.
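
The spreadsheet analogy can be made concrete with a toy model in plain Python. This is not how Dash is implemented; ReactiveCell is a hypothetical class written only to show the idea of a function that sits idle until the value it depends on changes.

```python
class ReactiveCell:
    """Toy stand-in for a Dash Input: notifies listeners when its value changes."""

    def __init__(self, value):
        self._value = value
        self._listeners = []

    def subscribe(self, func):
        """Register a function to call whenever the value changes."""
        self._listeners.append(func)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        # Only react to real changes, like a formula that recalculates
        # when one of its source cells is edited.
        if new_value != self._value:
            self._value = new_value
            for func in self._listeners:
                func(new_value)


calls = []


def update_charts(selected_region):
    """Plays the role of a callback: records that a redraw was requested."""
    calls.append(f"redraw charts for {selected_region}")


dropdown = ReactiveCell("Africa")
dropdown.subscribe(update_charts)

dropdown.value = "Europe"   # triggers update_charts
dropdown.value = "Asia"     # triggers update_charts again
print(calls)
```

Nothing in the script calls update_charts directly; it runs only because the value it subscribed to changed. One difference from this toy: by default, Dash also runs each callback once when the page first loads.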

A few important rules about callbacks:

Callbacks must return the same number of values as Outputs. If you have two Outputs, the function must return two values (in order).
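Every Input triggers the callback. If your callback has three Inputs, changing any of them runs the function. To read a component's value without reacting to its changes, declare it as State instead of Input (State is imported from dash alongside Input and Output).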
Callbacks should be fast. Every callback runs in response to user interaction, so it should complete quickly (under a second). Expensive computations (loading data, heavy aggregation) should happen outside the callback.
Callbacks cannot modify global state reliably. Each callback invocation should be independent. Do not store results in global variables — use Dash's dcc.Store component for client-side state.

Styling Your Dashboard

Dash apps use CSS for styling. You can add a stylesheet to improve the visual appearance:

app = Dash(__name__,
           external_stylesheets=[
               "https://codepen.io/chriddyp/"
               "pen/bWLwgP.css"
           ])

This loads a simple CSS grid framework that helps with layout. You can also use Bootstrap for more sophisticated designs:

import dash_bootstrap_components as dbc

app = Dash(__name__,
           external_stylesheets=[dbc.themes.FLATLY])

The dash-bootstrap-components library provides pre-built components (cards, navigation bars, grids) that make dashboards look professional without writing custom CSS.

Dashboard Design Principles

Building a dashboard is as much a design challenge as a technical one:

Start with a question. What does the user need to learn? Every control and chart should serve that goal.
Limit controls. Every dropdown, slider, and checkbox adds cognitive load. Three to five controls is usually the maximum before the interface becomes overwhelming.
Show the big picture first. Place summary charts (totals, trends) at the top. Detail charts (scatter, tables) below.
Connect charts logically. If the dropdown filters one chart, it should filter all charts. Inconsistent filtering confuses users.
Provide defaults. Pre-select the most common or most interesting values. The dashboard should be informative without any interaction.

A More Complete Dashboard Example

Let us build a slightly more sophisticated dashboard that demonstrates multiple component types:

from dash import Dash, html, dcc, Input, Output
import plotly.express as px
import pandas as pd

df = pd.read_csv("who_vaccination_data.csv")

app = Dash(__name__)

app.layout = html.Div([
    html.H1("Global Vaccination Explorer",
            style={"textAlign": "center"}),

    html.Div([
        html.Div([
            html.Label("Select Region:"),
            dcc.Dropdown(
                id="region-dropdown",
                options=[{"label": "All Regions",
                          "value": "All"}] +
                        [{"label": r, "value": r}
                         for r in sorted(
                             df["region"].unique())],
                value="All",
                clearable=False
            )
        ], style={"width": "30%",
                  "display": "inline-block"}),

        html.Div([
            html.Label("Select Year:"),
            dcc.Slider(
                id="year-slider",
                min=df["year"].min(),
                max=df["year"].max(),
                value=df["year"].max(),
                marks={str(y): str(y)
                       for y in range(
                           df["year"].min(),
                           df["year"].max() + 1, 5)},
                step=1
            )
        ], style={"width": "60%",
                  "display": "inline-block",
                  "marginLeft": "5%"})
    ]),

    html.Div([
        dcc.Graph(id="map-chart",
                  style={"width": "50%",
                         "display": "inline-block"}),
        dcc.Graph(id="scatter-chart",
                  style={"width": "50%",
                         "display": "inline-block"})
    ]),

    dcc.Graph(id="trend-chart")
])


@app.callback(
    Output("map-chart", "figure"),
    Output("scatter-chart", "figure"),
    Output("trend-chart", "figure"),
    Input("region-dropdown", "value"),
    Input("year-slider", "value")
)
def update_all(region, year):
    # Filter by region (if not "All")
    filt = df.copy()
    if region != "All":
        filt = filt[filt["region"] == region]

    # Map: selected year
    year_data = filt[filt["year"] == year]
    map_fig = px.choropleth(
        year_data, locations="iso_alpha",
        color="coverage_pct",
        hover_name="country",
        color_continuous_scale="YlGnBu",
        range_color=[40, 100],
        title=f"Coverage Map ({year})")

    # Scatter: selected year
    scatter_fig = px.scatter(
        year_data, x="gdp_per_capita",
        y="coverage_pct",
        hover_name="country",
        color="region",
        size="population", size_max=30,
        title=f"GDP vs Coverage ({year})")

    # Trend: all years
    trend_data = filt.groupby(
        ["year", "region"],
        as_index=False)["coverage_pct"].mean()
    trend_fig = px.line(
        trend_data, x="year",
        y="coverage_pct",
        color="region", markers=True,
        title="Coverage Trend Over Time")
    trend_fig.add_vline(
        x=year, line_dash="dash",
        line_color="gray",
        annotation_text=f"Selected: {year}")

    return map_fig, scatter_fig, trend_fig


if __name__ == "__main__":
    app.run(debug=True)

This dashboard demonstrates several important patterns:

An "All" option in the dropdown — when no specific region is selected, all data is shown. This is the default that provides the big picture.
Side-by-side charts — the map and scatter share the same row, giving geographic and statistical views simultaneously.
A vertical reference line on the trend chart shows which year is currently selected via the slider. This connects the temporal control to the temporal chart visually.
One callback with three Outputs: all three charts update together from the same two inputs, maintaining consistency.

When to Use Dash vs. Standalone HTML

Situation	Approach
Share with one person via email	Standalone HTML files
Embed in a presentation or paper	Static export (PNG/PDF)
Team uses it repeatedly	Dash app deployed on a server
One-time exploration	Individual charts in a notebook
Public-facing data portal	Dash app with Heroku/Render deployment

The key question is frequency of use. If someone will look at the visualization once, a standalone HTML file is perfect. If a team will revisit the same dashboard weekly to check updated data, the investment in a Dash app pays off.

17.9 plotly.express vs. seaborn: When to Use Which

You now have two high-level visualization libraries. Here is a decision guide:

Criterion	Use seaborn	Use plotly
Output format	Static image (PNG, PDF, paper)	Interactive HTML, dashboard
Audience	Academic paper, printed report	Web, stakeholders, exploratory
Statistical summaries	Built-in (KDE, CI, regression)	Limited (histogram, trendline)
Tooltip/hover	Not available (static)	Built-in on every chart
Geographic maps	Not built-in	Built-in choropleth and scatter_geo
Animation	Not built-in	Built-in with animation_frame
Customization depth	Deep (via matplotlib)	Deep (via graph_objects)
Rendering speed	Fast (matplotlib backend)	Moderate (browser rendering)
Plot types	More statistical types (violin, swarm, pair, heatmap)	More interactive types (choropleth, 3D, animation)

In practice, most data scientists use both: seaborn for exploration and publication figures, and plotly for sharing with non-technical stakeholders and for geographic or animated visualizations.

A Practical Workflow: Both Tools in One Analysis

Here is how a professional data scientist might use both tools in a single project:

Exploration phase (seaborn in Jupyter): Use pairplot, heatmap, catplot, and lmplot to understand the data's structure, distributions, correlations, and group differences. seaborn's statistical intelligence (KDE, confidence intervals, regression fits) is essential here.
Communication phase (plotly for stakeholders): Take the 3-4 most important findings from exploration and rebuild them as interactive plotly charts. Add hover tooltips so stakeholders can identify specific observations. Add a choropleth map if geographic patterns are relevant. Export as HTML files.
Publication phase (seaborn/matplotlib for paper): For the final report or paper, recreate the key charts in seaborn or matplotlib with publication styling (context="paper", style="ticks"). Export at 300 DPI as PNG or PDF. These static versions have precise control over layout, typography, and annotation that interactive charts cannot match.
Dashboard phase (Dash for ongoing monitoring): If the analysis will be repeated regularly with updated data, build a Dash app. The exploration and communication phases informed which charts and controls to include.

17.10 Putting It Together: Interactive Global Vaccination Dashboard

Let us build the chapter's project milestone — an interactive choropleth map of global vaccination rates with a time slider and region filter.

Step 1: The Animated Choropleth

fig = px.choropleth(
    df, locations="iso_alpha",
    color="coverage_pct",
    hover_name="country",
    hover_data={"coverage_pct": ":.1f",
                "gdp_per_capita": ":,.0f",
                "region": True},
    animation_frame="year",
    color_continuous_scale="YlGnBu",
    range_color=[40, 100],
    projection="natural earth",
    title="Global Vaccination Coverage Over Time"
)

fig.update_layout(
    coloraxis_colorbar_title="Coverage (%)",
    width=900, height=500)
fig.show()

Step 2: Supporting Charts

The animated map provides the geographic story. Now let us add supporting charts that provide statistical context:

# Trend lines by region
yearly = df.groupby(["year", "region"],
                     as_index=False).agg(
    mean_coverage=("coverage_pct", "mean"))

trend = px.line(yearly, x="year",
                y="mean_coverage",
                color="region", markers=True,
                title="Regional Coverage Trends")
trend.update_layout(yaxis_title="Coverage (%)",
                    template="plotly_white")
trend.show()

The trend chart complements the animated map. While the map shows geographic distribution at each point in time, the trend chart shows temporal trajectories for each region simultaneously. Together, they answer both "where is coverage high/low?" and "is coverage improving or declining?"

# Latest year scatter
latest = df[df["year"] == df["year"].max()]
scatter = px.scatter(
    latest, x="gdp_per_capita",
    y="coverage_pct",
    color="region", size="population",
    size_max=35, hover_name="country",
    hover_data={"gdp_per_capita": ":,.0f",
                "coverage_pct": ":.1f",
                "income_group": True},
    opacity=0.7,
    trendline="lowess",          # requires statsmodels
    trendline_scope="overall",   # one curve across all regions
    title="GDP vs. Coverage (Latest Year)")
scatter.update_layout(
    xaxis_title="GDP per Capita (USD)",
    yaxis_title="Coverage (%)",
    template="plotly_white")
scatter.show()

The scatter plot adds the economic dimension. The LOWESS trendline reveals the diminishing-returns relationship: coverage rises steeply with GDP at low incomes and plateaus at high incomes. The interactive tooltips let stakeholders identify specific countries that interest them — a feature that would require a separate data lookup with static charts.

Step 3: A Bar Chart of Current Status

region_summary = (latest.groupby("region")
    .agg(
        mean_coverage=("coverage_pct", "mean"),
        n_countries=("country", "nunique"),
        below_90=("coverage_pct",
                  lambda x: (x < 90).sum())
    )
    .sort_values("mean_coverage", ascending=True)
    .reset_index())

bar = px.bar(region_summary, y="region",
             x="mean_coverage",
             orientation="h",
             text_auto=".1f",
             hover_data=["n_countries", "below_90"],
             title="Mean Coverage by Region "
                   "(Latest Year)")
bar.update_layout(
    xaxis_title="Mean Coverage (%)",
    xaxis_range=[0, 100],
    yaxis_title="",
    template="plotly_white")
bar.show()

The horizontal bar chart provides the simplest view: which region is doing best, which is doing worst. The hover data adds information not shown visually — the number of countries per region and the number below the 90% target. This is the "headline chart" that a busy executive would scan first.

Step 4: Export for Sharing

fig.write_html("global_vaccination_map.html")
trend.write_html("regional_trends.html")
scatter.write_html("gdp_coverage_scatter.html")
bar.write_html("regional_summary_bar.html")

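Four HTML files, each fully interactive, that anyone can open in a browser. No Python required on the viewer's end. As written, every file embeds its own copy of plotly.js, so the set totals roughly 12-20 MB. Passing include_plotlyjs="directory" to each write_html call instead writes one shared plotly.min.js next to the HTML files, which brings the set down to about 3-5 MB (the HTML files and the shared script must then be sent together).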

17.11 Common Mistakes and How to Fix Them

Mistake 1: Forgetting to Fix Axis Ranges in Animations

# Problem: axes jump around every frame
fig = px.scatter(df, x="gdp", y="coverage",
                 animation_frame="year")

# Fix: set explicit ranges
fig = px.scatter(df, x="gdp", y="coverage",
                 animation_frame="year",
                 range_x=[0, 80000],
                 range_y=[30, 100])

Mistake 2: Overloading Tooltips

# Problem: tooltip shows 15 columns of data
fig = px.scatter(df, x="x", y="y",
                 hover_data=df.columns.tolist())

# Fix: show only the most useful columns
fig = px.scatter(df, x="x", y="y",
                 hover_name="country",
                 hover_data=["region", "year"])

Mistake 3: Using plotly for Print

plotly charts look great on screen but may not export cleanly to PDF or print at the resolution you need. For publication figures, use seaborn or matplotlib and export at 300+ DPI.

Mistake 4: Building Dashboards Before Building Charts

Start by creating individual plotly charts in a notebook. Once each chart works, then assemble them into a Dash layout. Debugging a callback is much harder than debugging a standalone chart.

Mistake 5: Too Many Dashboard Controls

Every dropdown and slider is a decision the user must make. Three controls with 5 options each create 125 possible states. Start with one or two controls and add more only if users request them.
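
The arithmetic behind that number is a simple product, sketched below. The three controls and their option lists are hypothetical and chosen only to match the "three controls, five options each" case.

```python
from itertools import product

# Hypothetical dashboard controls: name -> available options.
controls = {
    "region": ["Africa", "Americas", "Asia", "Europe", "Oceania"],
    "year": [2000, 2005, 2010, 2015, 2020],
    "income_group": ["Low", "Lower-middle", "Upper-middle", "High", "All"],
}


def count_states(controls):
    """Number of distinct dashboard states: the product of the option counts."""
    total = 1
    for options in controls.values():
        total *= len(options)
    return total


# Enumerate every combination a user could select.
states = list(product(*controls.values()))

print(count_states(controls))  # 125
print(states[0])               # ('Africa', 2000, 'Low')
```

Adding a fourth control with five options would raise the count to 625, which is why each extra control deserves scrutiny.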

Mistake 6: Not Handling Empty Data States

When a callback filters the data and the result is an empty DataFrame, plotly may produce an error or an ugly empty chart:

# Problem: some region-year combinations have no data
filtered = df[(df["region"] == region) &
              (df["year"] == year)]
# If filtered is empty, the chart may come out blank or raise an error
fig = px.scatter(filtered, x="x", y="y")

Fix: Add a check in your callback:

if filtered.empty:
    fig = px.scatter(title="No data available "
                     "for this selection")
    fig.add_annotation(text="Try a different "
                       "region or year",
                       xref="paper", yref="paper",
                       x=0.5, y=0.5, showarrow=False,
                       font=dict(size=16))
    return fig

Mistake 7: Huge Datasets in the Browser

plotly sends all data to the browser as JSON. If your DataFrame has 500,000 rows, the browser has to parse and render 500,000 markers, which can be extremely slow or crash the tab entirely.

Fixes:

Aggregate before plotting. Instead of plotting every row, compute group means, medians, or counts.
Sample randomly. df.sample(5000) gives a representative subset.
Use px.density_heatmap() to show density instead of individual points.
For Dash apps, consider server-side filtering: only send the data that matches the current filter selections.

17.12 The Visualization Stack So Far

You now have a complete visualization toolkit:

Tool	Strength	Output	Chapters
matplotlib	Low-level control, any custom chart	Static PNG/PDF/SVG	15
seaborn	Statistical charts, elegant defaults	Static PNG/PDF/SVG	16
plotly	Interactive charts, maps, animation	HTML, dashboard	17

They are not competitors — they are layers. matplotlib is the engine. seaborn is the statistical expressway. plotly is the interactive experience. And Dash is the web application framework that puts plotly charts in front of non-technical users.

Choosing Your Tool: A Decision Flowchart

When you sit down to create a visualization, ask these questions in order:

1. Is the output for print (paper, PDF, poster)? If yes, use matplotlib or seaborn. Static output is their strength. Export at 300+ DPI.

2. Does the visualization need statistical computation (KDE, regression, confidence intervals)? If yes and the output is static, use seaborn. Its statistical intelligence is unmatched. If yes and the output is interactive, use plotly with trendline parameters (less sophisticated but often sufficient).

3. Does the audience need to explore, filter, or hover? If yes, use plotly. Export as HTML for one-time sharing, or build a Dash app for recurring use.

4. Does the data have a geographic dimension? If yes, plotly is almost certainly the right choice. Neither matplotlib nor seaborn has built-in choropleth support (there are third-party libraries like geopandas + matplotlib, but plotly's integration is far smoother).

5. Does the visualization need animation? If yes, plotly is the simplest option. matplotlib has FuncAnimation for programmatic animation, but plotly's animation_frame parameter is dramatically simpler.

6. Is this for a recurring dashboard used by a team? If yes, build a Dash app. The upfront investment pays off when multiple people use it repeatedly with updated data.

Most of the time, you will use seaborn for exploration and analysis in notebooks, and plotly for sharing results with others. This is not a rule — it is a pattern that emerges from the strengths of each tool.
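
The six questions can be read as an ordered series of checks. The function below is one possible encoding, not an official rule: the argument names and the returned strings are invented for this sketch, and the first question that matches decides the answer.

```python
def choose_tool(for_print=False, needs_statistics=False,
                needs_interaction=False, is_geographic=False,
                needs_animation=False, recurring_team_use=False):
    """Walk the questions in order and return the first recommendation that fits."""
    # 1. Print output: static libraries are the natural fit.
    if for_print:
        return "matplotlib or seaborn"
    # 2. Statistical computation: seaborn if static, plotly trendlines if interactive.
    if needs_statistics:
        return "plotly (trendlines)" if needs_interaction else "seaborn"
    # 3. Exploration, filtering, hover: plotly, or Dash for recurring use.
    if needs_interaction:
        return "Dash" if recurring_team_use else "plotly"
    # 4 and 5. Maps and animation both point to plotly.
    if is_geographic or needs_animation:
        return "plotly"
    # 6. A recurring team dashboard justifies a Dash app.
    if recurring_team_use:
        return "Dash"
    # Fallback: the general pattern described in the text.
    return "seaborn for exploration, plotly for sharing"


print(choose_tool(for_print=True, is_geographic=True))  # matplotlib or seaborn
print(choose_tool(needs_interaction=True))               # plotly
```

Because the checks run in order, a printed map still lands on matplotlib or seaborn even though maps alone would suggest plotly.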

17.13 Performance Considerations

Interactive visualizations live in a web browser, which introduces performance constraints that static charts do not face.

Data Volume Limits

As a rough guide:

Data Size	Performance
< 1,000 points	Instant rendering, smooth interaction
1,000 - 10,000 points	Fast rendering, smooth zoom/pan
10,000 - 50,000 points	Noticeable rendering delay, smooth once loaded
50,000 - 100,000 points	Slow rendering, choppy interaction
> 100,000 points	May crash browser tab or become unusable

These numbers are approximate and depend on the chart type (scatter plots are heavier per point than line charts), the browser, and the computer's hardware. Animations multiply the problem — 10,000 points across 20 frames means 200,000 total data points for the browser to manage.
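
A quick back-of-the-envelope check before plotting can save a frozen browser tab. The helper below multiplies points by frames and looks the result up against the rough tiers in the table; the function names are hypothetical and the thresholds are only the approximate ones quoted above.

```python
def total_points(points_per_frame, n_frames=1):
    """Points the browser must hold: every frame carries its own copy."""
    return points_per_frame * n_frames


def performance_tier(n_points):
    """Map a point count to the rough tiers from the table (approximate)."""
    if n_points < 1_000:
        return "instant rendering"
    if n_points <= 10_000:
        return "fast rendering"
    if n_points <= 50_000:
        return "noticeable rendering delay"
    if n_points <= 100_000:
        return "slow rendering, choppy interaction"
    return "may crash the tab or become unusable"


n = total_points(10_000, n_frames=20)
print(n)                    # 200000
print(performance_tier(n))  # may crash the tab or become unusable
```

A single frame of 10,000 points sits comfortably in the "fast rendering" tier, so the animation, not the data itself, is what pushes this example into trouble.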

Strategies for Large Datasets

When your data exceeds the comfortable range, you have several options:

Aggregation. Instead of plotting every row, compute summaries. Replace a 500,000-row scatter with a 2D histogram or hexbin that aggregates into a grid. Replace a million-row time series with daily or weekly averages.
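
As a small-scale sketch of the time-series case, the example below collapses minute-level readings into daily averages with pandas before anything is handed to plotly. The data is synthetic and the column names timestamp and value are placeholders.

```python
import numpy as np
import pandas as pd

# Synthetic minute-level series: 100,000 rows (about 70 days).
rng = np.random.default_rng(0)
ts = pd.DataFrame({
    "timestamp": pd.date_range("2023-01-01", periods=100_000, freq="min"),
    "value": rng.normal(size=100_000),
})

# Resample to one row per day; the mean summarizes each day's readings.
daily = (ts.set_index("timestamp")["value"]
           .resample("D")
           .mean()
           .reset_index())

print(len(ts), "rows before,", len(daily), "rows after")
```

Passing daily to px.line instead of ts means the browser draws 70 points rather than 100,000, usually with no visible loss at the zoom level a reader starts from.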

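Sampling. df.sample(5000) gives a random subset (it raises an error if the DataFrame has fewer than 5,000 rows, so guard with min(len(df), 5000)). For stratified sampling that preserves group proportions, draw the same fraction from every group:

# Keep 10% of each region's rows so regional shares stay the same
sampled = (df.groupby("region")
             .sample(frac=0.1, random_state=42)
             .reset_index(drop=True))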

WebGL rendering. Some plotly chart types support WebGL, which uses the GPU for rendering and can handle much larger datasets:

fig = px.scatter(large_df, x="x", y="y",
                 render_mode="webgl")

WebGL rendering trades some interactivity features (custom hover templates may be limited) for dramatically better performance with large point counts.

Server-side computation with Dash. In a Dash app, the Python server handles data filtering and aggregation. Only the filtered subset is sent to the browser. This means the browser never has to deal with the full dataset — it only renders the few hundred or thousand points that match the current filter selection.

In Chapter 18, you will step back from tools entirely and think about design — what makes a visualization honest, accessible, and effective, regardless of which library produced it.

17.14 Chapter Summary

You started this chapter making static charts. Now you can build interactive visualizations that your audience can explore:

plotly.express provides a high-level API similar to seaborn's, producing interactive charts with hover, zoom, pan, and legend toggle.
Choropleth maps visualize geographic data by coloring regions according to data values, with built-in country and state geometries.
Animation adds a time dimension via animation_frame, with play buttons and sliders for temporal exploration.
Dash turns individual charts into multi-chart dashboards with dropdowns, sliders, and callbacks that update all charts when the user interacts.
HTML export lets you share interactive charts with anyone who has a browser, no Python installation required.
Design principles for dashboards emphasize starting with a question, limiting controls, showing the big picture first, and providing sensible defaults.

The tools are powerful. But power without judgment produces confusing or misleading charts. Chapter 18 is about that judgment — the principles of visualization design, accessibility, ethics, and the most common mistakes that even experienced analysts make.

What to Practice Next

Before moving to Chapter 18, take 30 minutes to do this practice exercise. It will cement the plotly workflow:

Load the vaccination dataset in a new notebook.
Create an interactive scatter plot with px.scatter() — GDP vs. coverage, colored by region, with hover tooltips showing country names. Spend 2 minutes exploring: zoom into a cluster, filter by clicking legend items, identify specific countries via hover.
Create a choropleth map with px.choropleth() for the latest year. Hover over 5 countries and check their values against the scatter plot. Does the geographic view tell you something the scatter plot did not?
Create an animated version of the choropleth. Press play and watch the progression. Identify one country that improved dramatically and one that regressed.
Export all three as HTML files. Open them in your browser (not Jupyter). Send one to a friend or classmate and see if they can discover the same patterns you found.

This exercise takes you through the complete plotly workflow: create, explore, compare views, animate, and share. The comparison step (3) is especially important — seeing the same data through geographic and statistical lenses often reveals patterns that either lens alone would miss. Countries that are geographic neighbors sometimes have wildly different coverage, which is visible on the map but hidden in the scatter plot. Conversely, the scatter plot reveals the GDP-coverage relationship that the map cannot show.

By practicing this workflow, you develop the habit of creating multiple views of the same data — a core skill that will serve you throughout your career as a data scientist.
