Exercises: Geospatial Visualization

These exercises assume you have run pip install plotly geopandas folium altair vega_datasets matplotlib. Imports: import plotly.express as px, import geopandas as gpd, import folium, import pandas as pd.

Part A: Conceptual (6 problems)

A.1 ★☆☆ | Recall

Name four categories of map projection and describe what each optimizes.

Guidance

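**Cylindrical** (e.g., Mercator): wraps the globe onto a cylinder; Mercator in particular preserves angles at the cost of severe area distortion toward the poles. **Conic** (e.g., Lambert conformal conic): preserves shape within a mid-latitude band. **Azimuthal** (e.g., orthographic): projects from a point onto a plane, good for hemispheric or polar views. **Equal-area** (e.g., Mollweide, Eckert IV): preserves area at the cost of shape. Robinson, often grouped with these, is a compromise projection: it preserves neither area nor angles exactly but keeps both distortions moderate. The choice determines which visual comparisons are faithful and which are distorted.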
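To make the Mercator trade-off concrete, here is a minimal sketch of its area inflation. It assumes the standard spherical Mercator, where the linear scale at a given latitude is sec(latitude) in both directions, so areas are inflated by sec² of the latitude. The function name is illustrative, not part of any library.

```python
import math

def mercator_area_scale(lat_deg: float) -> float:
    """Areal inflation factor of the spherical Mercator projection at a latitude.

    Linear scale is sec(lat) in both directions, so area scales by sec(lat)**2.
    """
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

# Print the inflation factor at a few latitudes (72 degrees is roughly central Greenland).
for lat in (0, 30, 60, 72):
    print(f"{lat:>2} deg: areas drawn {mercator_area_scale(lat):.1f}x too large")
```

At 60 degrees the factor is exactly 4, which is why high-latitude regions look so large next to equatorial ones.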

A.2 ★☆☆ | Recall

What is the population proxy pitfall, and how is it fixed?

Guidance

The population proxy pitfall: choropleths of raw counts (cases, crimes, restaurants) just reflect where people live, not the variable of interest. The fix is **normalization** — divide by population to get rates per capita. After normalization, the map shows the actual pattern rather than the underlying population distribution.
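The normalization step can be sketched in a few lines of pandas. The regions and numbers below are made up purely for illustration, and the column names are assumptions rather than anything from the chapter's datasets.

```python
import pandas as pd

# Hypothetical regions with made-up numbers, purely for illustration.
regions = pd.DataFrame({
    "region": ["Metro", "Suburb", "Rural"],
    "cases": [5_000, 1_200, 300],
    "population": [2_500_000, 400_000, 50_000],
})

# Convert raw counts to a rate per 100,000 residents.
regions["cases_per_100k"] = regions["cases"] / regions["population"] * 100_000

# The ordering by rate differs from the ordering by raw count.
print(regions.sort_values("cases_per_100k", ascending=False))
```

In this toy table the region with the fewest cases has the highest rate, which is exactly the pattern a raw-count choropleth would hide.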

A.3 ★★☆ | Understand

Explain the difference between EPSG:4326 and EPSG:3857.

Guidance

**EPSG:4326** (WGS84) is the raw latitude/longitude coordinate system used by GPS and GeoJSON. Coordinates are in degrees. **EPSG:3857** (Web Mercator) is a projected CRS used by Google Maps and most web tile services. Coordinates are in meters. Most workflows start in 4326, reproject to 3857 (or another appropriate projected CRS) for visualization, and optionally reproject back for data interchange.
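The relationship between the two systems can be sketched with the forward Web Mercator formula. This is a minimal illustration that assumes the spherical form used by EPSG:3857, with the WGS84 semi-major axis as the sphere radius; the function name is hypothetical. In practice you would let a library do this, for example GeoDataFrame.to_crs("EPSG:3857") in geopandas.

```python
import math

# WGS84 semi-major axis in meters, used as the sphere radius by Web Mercator.
EARTH_RADIUS_M = 6_378_137.0

def lonlat_to_web_mercator(lon_deg: float, lat_deg: float) -> tuple[float, float]:
    """Convert EPSG:4326 degrees to EPSG:3857 meters (spherical Mercator)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

# Seattle, using the same coordinates as the Folium exercise.
x, y = lonlat_to_web_mercator(-122.3321, 47.6062)
print(f"Seattle: x = {x:,.0f} m, y = {y:,.0f} m")
```

Note the argument order: the function takes longitude first, matching the x, y order of projected coordinates, while many APIs (Folium included) take latitude first.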

A.4 ★★☆ | Understand

Describe the chapter's threshold concept ("maps are arguments about space") in your own words.

Guidance

A map is not a neutral window onto geographic reality. Every design choice (projection, color scale, administrative level, normalization) constructs an argument about what matters spatially. The Mercator projection makes Greenland look about as large as Africa, although Africa is roughly 14 times larger; a population-count choropleth just shows where people live; a rate choropleth shows something different. The reader will interpret the map based on these choices whether or not the designer intended them, so every choice is a rhetorical decision.

A.5 ★★☆ | Analyze

When would you choose a dot map over a choropleth?

Guidance

Dot maps are best when **individual locations** are the story — "where are the hospitals?", "where did earthquakes occur?", "where do our customers live?". Choropleths are best when **regional rates** are the story — "which states have high unemployment?", "which counties have low vaccination rates?". If the question is "where?", use a dot map. If the question is "what is the pattern across regions?", use a choropleth.

A.6 ★★★ | Evaluate

A news article shows a US county-level map colored by "total opioid deaths." List three reasons this map is misleading and propose fixes.

Guidance

(1) **Unnormalized counts**: counties with more people mechanically have more deaths. Fix: show deaths per 100,000 residents. (2) **Area distortion**: a county's visual weight depends on its land area, not on its population or its rate, so large counties such as LA County dominate while small counties with higher rates are easy to miss. Fix: consider a cartogram, or pair the map with a sortable table. (3) **Missing temporal context**: a single-year map does not show trends. Fix: show year-over-year change or include a time slider. Bonus: consider whether the county level is the right administrative unit for the question.

Part B: Applied (10 problems)

B.1 ★☆☆ | Apply

Create a world choropleth of life expectancy for 2007 using px.choropleth and ISO country codes.

Guidance

import plotly.express as px
gapminder = px.data.gapminder().query("year == 2007")
fig = px.choropleth(
    gapminder,
    locations="iso_alpha",
    color="lifeExp",
    hover_name="country",
    projection="natural earth",
    color_continuous_scale="viridis",
)
fig.show()

B.2 ★☆☆ | Apply

Create a dot map of major world cities using px.scatter_geo with population as marker size.

Guidance

cities = pd.DataFrame({
    "city": ["Tokyo", "Delhi", "Shanghai", "São Paulo", "Mexico City", "Cairo"],
    "lat": [35.68, 28.70, 31.23, -23.55, 19.43, 30.04],
    "lon": [139.69, 77.10, 121.47, -46.63, -99.13, 31.24],
    "population": [37_000_000, 30_000_000, 27_000_000, 22_000_000, 22_000_000, 20_000_000],
})
fig = px.scatter_geo(cities, lat="lat", lon="lon", size="population",
                     hover_name="city", projection="robinson")
fig.show()

B.3 ★★☆ | Apply

Load the Natural Earth low-resolution world dataset with geopandas and plot all countries colored by GDP estimate.

Guidance

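import geopandas as gpd
import matplotlib.pyplot as plt

# geopandas 1.0 removed gpd.datasets, and geodatasets' "naturalearth.land" holds land
# polygons only (no country attributes), so read the Natural Earth 110m countries file.
# The URL is an assumption; any copy of ne_110m_admin_0_countries works.
url = "https://naciscdn.org/naturalearth/110m/cultural/ne_110m_admin_0_countries.zip"
world = gpd.read_file(url)

# The GDP column is GDP_MD in recent Natural Earth releases, GDP_MD_EST in older ones.
# Rename it to the lowercase name used in these exercises.
gdp_col = "GDP_MD" if "GDP_MD" in world.columns else "GDP_MD_EST"
world = world.rename(columns={gdp_col: "gdp_md_est"})

fig, ax = plt.subplots(figsize=(14, 7))
world.plot(column="gdp_md_est", cmap="viridis", legend=True, ax=ax)
ax.set_axis_off()
ax.set_title("World GDP estimate (millions USD)")
plt.show()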

B.4 ★★☆ | Apply

Reproject the world dataset from B.3 to the Robinson projection (ESRI:54030) and re-plot.

Guidance

world_robinson = world.to_crs("ESRI:54030")
fig, ax = plt.subplots(figsize=(14, 7))
world_robinson.plot(column="gdp_md_est", cmap="viridis", legend=True, ax=ax)
ax.set_axis_off()
ax.set_title("World GDP — Robinson projection")

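After reprojection, coordinates are in meters rather than degrees. The shape of the map changes visibly compared with the unprojected latitude/longitude plot: high-latitude regions such as Greenland shrink relative to Africa, and polar stretching is much reduced. Robinson is a compromise projection, so some distortion remains.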

B.5 ★★☆ | Apply

Build a simple Folium map centered on your city with three custom markers.

Guidance

import folium
m = folium.Map(location=[47.6062, -122.3321], zoom_start=12)
folium.Marker([47.6062, -122.3321], popup="Downtown").add_to(m)
folium.Marker([47.6205, -122.3493], popup="Space Needle").add_to(m)
folium.Marker([47.5952, -122.3316], popup="Stadium").add_to(m)  # Lumen Field
m.save("seattle.html")

B.6 ★★☆ | Apply

Create a Plotly choropleth of US states colored by a synthetic metric, passing a column of two-letter state codes as locations with locationmode="USA-states".

Guidance

df = pd.DataFrame({
    "state": ["CA", "TX", "NY", "FL", "IL", "PA"],
    "metric": [100, 80, 90, 70, 85, 75],
})
fig = px.choropleth(df, locations="state", locationmode="USA-states",
                    color="metric", scope="usa", color_continuous_scale="Blues")
fig.show()

B.7 ★★★ | Apply

Take a simple dataset of (lat, lon, value) points and create a Plotly tile-based scatter map (px.scatter_map, the replacement for px.scatter_mapbox) with hover tooltips showing the value.

Guidance

df = pd.DataFrame({
    "lat": [47.60, 37.77, 34.05, 40.71],
    "lon": [-122.33, -122.42, -118.24, -74.00],
    "city": ["Seattle", "San Francisco", "Los Angeles", "New York"],
    "value": [100, 200, 300, 400],
})
# px.scatter_map replaces the deprecated px.scatter_mapbox (Plotly 5.24+)
fig = px.scatter_map(df, lat="lat", lon="lon", hover_name="city", size="value",
                     map_style="carto-positron", zoom=3,
                     center={"lat": 39.8, "lon": -98.5})
fig.show()

B.8 ★★★ | Apply

Demonstrate the population proxy pitfall: build two choropleths of US states — one of raw population and one of some other metric (e.g., GDP). Compare them.

Guidance

The two maps will look nearly identical because GDP is strongly correlated with population. This is the pitfall in action — the second "GDP" map is indistinguishable from the population map, so it conveys no new information. The fix: divide GDP by population to get GDP per capita, then re-plot. The per-capita map will look dramatically different, with some small-population states (e.g., Wyoming, Alaska) having high rankings that the count-based map hides.
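A minimal sketch of the data preparation behind this comparison follows. The values are synthetic and rounded, chosen only to illustrate the effect, and the column names are assumptions; substitute real population and GDP figures before drawing conclusions.

```python
import pandas as pd

# Synthetic, rounded values for illustration only; substitute real figures.
df = pd.DataFrame({
    "state": ["CA", "TX", "NY", "FL", "WY", "AK"],
    "population_m": [40.0, 30.0, 20.0, 22.0, 0.6, 0.7],  # millions of residents
    "gdp_bn": [3600, 2400, 2000, 1500, 48, 63],          # total, billions USD
})

# Billions divided by millions gives thousands of USD per resident.
df["gdp_per_capita_k"] = df["gdp_bn"] / df["population_m"]

# How strongly does each version of the metric track population?
corr_total = df["gdp_bn"].corr(df["population_m"])
corr_per_capita = df["gdp_per_capita_k"].corr(df["population_m"])
print(f"total vs population:      r = {corr_total:.2f}")
print(f"per capita vs population: r = {corr_per_capita:.2f}")
```

With these made-up values the total is almost perfectly correlated with population while the per-capita figure is nearly uncorrelated with it. To see the same contrast on a map, pass each column in turn as color to px.choropleth with locations="state", locationmode="USA-states", and scope="usa", as in B.6.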

B.9 ★★☆ | Apply

Create an Altair choropleth of US unemployment data using alt.topo_feature and transform_lookup.

Guidance

import altair as alt
from vega_datasets import data

# The unemployment table is keyed by county FIPS code, so join it to the county
# shapes; the state shapes use state FIPS ids and would match nothing.
counties = alt.topo_feature(data.us_10m.url, "counties")
unemp_url = data.unemployment.url

chart = alt.Chart(counties).mark_geoshape().encode(
    color=alt.Color("rate:Q", scale=alt.Scale(scheme="blues")),
    tooltip=["id:N", "rate:Q"],
).transform_lookup(
    lookup="id",
    from_=alt.LookupData(unemp_url, "id", ["rate"]),
).project("albersUsa").properties(width=700, height=400)
chart  # displays in a notebook; use chart.save("unemployment.html") in a script

B.10 ★★★ | Create

Build a Folium map with a choropleth layer over US states colored by a metric of your choice, plus markers for a few major cities on top.

Guidance

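import folium
import pandas as pd

# Same synthetic state metric as in B.6, redefined so this block runs on its own
df = pd.DataFrame({
    "state": ["CA", "TX", "NY", "FL", "IL", "PA"],
    "metric": [100, 80, 90, 70, 85, 75],
})

m = folium.Map(location=[39.8, -96], zoom_start=4)

folium.Choropleth(
    # State boundaries whose feature ids are two-letter codes. This URL points at the
    # folium-example-data repository (assumed current home of the file).
    geo_data="https://raw.githubusercontent.com/python-visualization/folium-example-data/main/us_states.json",
    data=df,
    columns=["state", "metric"],
    key_on="feature.id",
    fill_color="YlOrRd",
    nan_fill_color="lightgray",  # states missing from df
    legend_name="Metric",
).add_to(m)

for city, (lat, lon) in [("Seattle", (47.6, -122.3)), ("New York", (40.7, -74.0))]:
    folium.Marker([lat, lon], popup=city).add_to(m)

m.save("us_map.html")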

Part C: Synthesis (4 problems)

C.1 ★★★ | Analyze

Take a news-article map you have seen recently and evaluate it against the ethical checklist in Section 23.12. Which items pass? Which fail? What would you change?

Guidance

Checklist items: projection appropriate, color scale matches data, values normalized, administrative level matches question, limits disclosed, no better alternative. Most news maps pass the first three but fail on the fourth or fifth (they rarely disclose why they chose the administrative level they did, and they rarely acknowledge alternative framings). The critique exercise is subjective but useful for developing your own sense of cartographic rigor.

C.2 ★★★ | Evaluate

Suppose you need to make a world map of "estimated COVID deaths" to accompany a news article. Which projection do you use, and why? Which normalization (or none)? Which color scale?

Guidance

**Projection**: equal-area (Mollweide, Eckert IV, or Goode homolosine) or a low-distortion compromise such as Robinson. World maps of country-level metrics should not use Mercator, because area distortion inflates high-latitude countries. **Normalization**: deaths per 100,000 residents, because raw counts largely track population. Optionally also show total counts in a companion table or tooltip. **Color scale**: sequential (single hue), not diverging, because deaths have no meaningful midpoint. Reds are conventional for death/severity but may feel melodramatic; a neutral orange-to-dark-brown sequential palette is often better. Accompanying text should disclose the data source, date, and the normalization choice.

C.3 ★★★ | Create

Build a four-map climate dashboard: station locations (Plotly scatter_geo), regional temperature anomaly choropleth (Plotly or Altair), static publication version (geopandas + matplotlib), and interactive Folium version.

Guidance

The goal is to practice using multiple libraries for the same data, picking the right tool for each delivery context. The Plotly versions are fastest to build; the geopandas version has the best print polish; the Folium version is the easiest to embed in an HTML report. A real project would choose one or two of these based on the delivery format, but the exercise of building all four reinforces the complementary strengths.

C.4 ★★★ | Evaluate

The chapter argues that there is no "neutral" map. Is this overstated? Can you think of a map that is genuinely neutral — one where no design choice privileges any perspective?

Guidance

Probably not, though the chapter's claim is strong enough to debate. Even a map with "default" choices (Mercator, sequential palette, raw counts) is privileging specific perspectives — the Mercator projection privileges high-latitude areas, the sequential palette privileges the highest values, the raw counts privilege populous areas. What the chapter is really arguing is that every choice has consequences, and pretending otherwise is dishonest. A "neutral" map is one where the choices are deliberate and disclosed, not one where the choices are absent. This distinction matters: you cannot escape choices, but you can make them transparent.

These exercises cover the main geospatial libraries and design principles. Chapter 24 leaves geographic space behind and turns to network visualization: charts of relationships rather than places.