Case Study 1: Ice Cream and Drowning — The Classic Confounding Story (And Its Modern Equivalents)
Tier 1 — Verified Concepts: This case study discusses well-established statistical concepts (confounding, spurious correlation) using the classic ice cream/drowning example and extends to modern equivalents. All statistical phenomena described are based on widely taught and documented principles. The specific data examples are constructed for pedagogical purposes, though the patterns they illustrate are documented in statistical literature.
The Oldest Trick in the Statistical Book
Every statistics instructor has a favorite example of spurious correlation, and the ice cream-and-drowning story is probably the most famous one of all. It goes like this:
There is a strong positive correlation between ice cream sales and drowning deaths. When ice cream sales go up, drowning deaths go up. When ice cream sales go down, drowning deaths go down.
The naive interpretation: ice cream causes drowning! Or maybe drowning causes people to buy ice cream! (Neither makes any sense, which is precisely the point.)
The correct interpretation: a confounding variable — temperature, or more broadly, summer weather — causes both. Hot weather makes people buy more ice cream. Hot weather also makes people swim more, which increases the opportunity for drowning. Ice cream and drowning are correlated not because one causes the other, but because they share a common cause.
This example is so familiar it might seem trivial. But the pattern it illustrates — two variables appearing related because of a hidden third variable — is not trivial at all. It's one of the most important patterns in all of data science, and it shows up constantly in contexts far more consequential than frozen desserts.
Seeing It in Data
Let's build this from scratch to really understand the mechanics:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
np.random.seed(42)
# Generate 3 years of daily data
n_days = 365 * 3
days = np.arange(n_days)
# Temperature follows a seasonal pattern
temperature = 55 + 25 * np.sin(2 * np.pi * days / 365 - np.pi/2) + \
              np.random.normal(0, 8, n_days)
# Ice cream sales: driven by temperature
ice_cream = 200 + 8 * temperature + np.random.normal(0, 50, n_days)
ice_cream = np.clip(ice_cream, 50, None)
# Drowning risk: driven by temperature (via swimming activity)
drowning = 0.05 * np.maximum(temperature - 60, 0)**1.3 + \
           np.random.poisson(0.5, n_days)
drowning = drowning.astype(int)
# The spurious correlation
r_spurious, p_spurious = stats.pearsonr(ice_cream, drowning)
# The real correlations
r_temp_ice, _ = stats.pearsonr(temperature, ice_cream)
r_temp_drown, _ = stats.pearsonr(temperature, drowning)
print("=== Correlation Analysis ===")
print(f"Ice cream vs Drowning: r = {r_spurious:.3f} (SPURIOUS)")
print(f"Temperature vs Ice cream: r = {r_temp_ice:.3f} (CAUSAL)")
print(f"Temperature vs Drowning: r = {r_temp_drown:.3f} (CAUSAL)")
The spurious correlation is real — it's there in the data. If you didn't know about the temperature connection, the relationship between ice cream and drowning would look convincing:
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
# Plot 1: The spurious correlation
axes[0, 0].scatter(ice_cream, drowning, alpha=0.15, s=10, color='steelblue')
axes[0, 0].set_xlabel('Daily Ice Cream Sales ($)')
axes[0, 0].set_ylabel('Daily Drowning Incidents')
axes[0, 0].set_title(f'The Spurious Correlation\nr = {r_spurious:.3f}',
                     fontsize=12)
# Plot 2: Temperature → Ice cream (real cause)
axes[0, 1].scatter(temperature, ice_cream, alpha=0.15, s=10, color='#e74c3c')
axes[0, 1].set_xlabel('Temperature (°F)')
axes[0, 1].set_ylabel('Daily Ice Cream Sales ($)')
axes[0, 1].set_title(f'Real Cause: Temperature → Ice Cream\nr = {r_temp_ice:.3f}',
                     fontsize=12)
# Plot 3: Temperature → Drowning (real cause)
axes[1, 0].scatter(temperature, drowning, alpha=0.15, s=10, color='#2ecc71')
axes[1, 0].set_xlabel('Temperature (°F)')
axes[1, 0].set_ylabel('Daily Drowning Incidents')
axes[1, 0].set_title(f'Real Cause: Temperature → Drowning\nr = {r_temp_drown:.3f}',
                     fontsize=12)
# Plot 4: Causal diagram
axes[1, 1].text(0.5, 0.8, 'Temperature', fontsize=18, ha='center',
                bbox=dict(boxstyle='round', facecolor='lightyellow'))
axes[1, 1].annotate('', xy=(0.2, 0.45), xytext=(0.4, 0.7),
                    arrowprops=dict(arrowstyle='->', lw=2))
axes[1, 1].annotate('', xy=(0.8, 0.45), xytext=(0.6, 0.7),
                    arrowprops=dict(arrowstyle='->', lw=2))
axes[1, 1].text(0.15, 0.3, 'Ice Cream\nSales', fontsize=14, ha='center',
                bbox=dict(boxstyle='round', facecolor='lightblue'))
axes[1, 1].text(0.85, 0.3, 'Drowning\nDeaths', fontsize=14, ha='center',
                bbox=dict(boxstyle='round', facecolor='lightsalmon'))
axes[1, 1].annotate('', xy=(0.65, 0.33), xytext=(0.35, 0.33),
                    arrowprops=dict(arrowstyle='-', lw=2,
                                    linestyle='dashed', color='gray'))
axes[1, 1].text(0.5, 0.38, 'spurious', fontsize=10, ha='center',
                color='gray', style='italic')
axes[1, 1].text(0.5, 0.1, 'Causal Structure', fontsize=14, ha='center',
                fontweight='bold')
axes[1, 1].set_xlim(0, 1)
axes[1, 1].set_ylim(0, 1)
axes[1, 1].axis('off')
plt.suptitle('Ice Cream and Drowning: Anatomy of a Spurious Correlation',
             fontsize=16, fontweight='bold', y=1.02)
plt.tight_layout()
plt.savefig('ice_cream_drowning_anatomy.png', dpi=150, bbox_inches='tight')
plt.show()
Controlling for the Confounder
Watch what happens when we control for temperature — the spurious relationship vanishes:
from sklearn.linear_model import LinearRegression
# Partial correlation: ice cream vs drowning, controlling for temperature
temp_reshaped = temperature.reshape(-1, 1)
# Remove the effect of temperature from ice cream sales
model_ice = LinearRegression().fit(temp_reshaped, ice_cream)
ice_residuals = ice_cream - model_ice.predict(temp_reshaped)
# Remove the effect of temperature from drowning
model_drown = LinearRegression().fit(temp_reshaped, drowning)
drown_residuals = drowning - model_drown.predict(temp_reshaped)
# Correlation of residuals = partial correlation
r_partial, p_partial = stats.pearsonr(ice_residuals, drown_residuals)
print(f"Raw correlation (ice cream vs drowning): r = {r_spurious:.3f}")
print(f"Partial correlation (controlling for temp): r = {r_partial:.3f}")
print(f"\nOnce temperature is accounted for, the relationship nearly vanishes!")
The partial correlation drops from a strong positive value to near zero. This confirms that the ice cream-drowning relationship is almost entirely explained by temperature.
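The residual approach above is equivalent to the textbook closed-form expression for a first-order partial correlation. A quick self-contained check (with freshly simulated placeholder variables, not the chapter's dataset) shows the two routes agree:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)           # confounder (plays the role of temperature)
x = 2 * z + rng.normal(size=n)   # plays the role of ice cream sales
y = 3 * z + rng.normal(size=n)   # plays the role of drowning incidents

# Route 1: correlate the residuals after regressing each variable on z
bx = np.polyfit(z, x, 1)
by = np.polyfit(z, y, 1)
x_res = x - np.polyval(bx, z)
y_res = y - np.polyval(by, z)
r_resid, _ = stats.pearsonr(x_res, y_res)

# Route 2: closed-form partial correlation from the three pairwise r's
r_xy, _ = stats.pearsonr(x, y)
r_xz, _ = stats.pearsonr(x, z)
r_yz, _ = stats.pearsonr(y, z)
r_formula = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(f"residual method: {r_resid:.4f}")
print(f"closed form:     {r_formula:.4f}")
```

Both print the same number, and because x and y are independent once z is accounted for, that number sits near zero.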
Beyond Ice Cream: Modern Confounding Stories
The ice cream example is charming but low-stakes. Let's look at cases where confounding has real consequences.
Modern Example 1: The "Lucky Hospital" Problem
Imagine comparing hospitals by their surgical mortality rates. Hospital A has a 2% mortality rate. Hospital B has a 5% mortality rate. Should you choose Hospital A?
Not necessarily. Hospital B might be a major trauma center that takes the sickest patients — patients whom Hospital A wouldn't even attempt to treat. The confounding variable is patient severity. If Hospital B's patients are much sicker to begin with, a 5% mortality rate might actually reflect better care than Hospital A's 2% rate on easier cases.
This is called case-mix bias, and it has real implications:
np.random.seed(42)
# Simulate the hospital paradox
hospital_a_severity = np.random.normal(3, 1, 500) # Lower severity patients
hospital_b_severity = np.random.normal(7, 1, 500) # Higher severity patients
# True quality: Hospital B is actually BETTER (lower base mortality)
hospital_a_mortality_prob = 0.005 * np.exp(0.3 * hospital_a_severity)
hospital_b_mortality_prob = 0.003 * np.exp(0.3 * hospital_b_severity)
hospital_a_deaths = np.random.binomial(1, np.clip(hospital_a_mortality_prob, 0, 1))
hospital_b_deaths = np.random.binomial(1, np.clip(hospital_b_mortality_prob, 0, 1))
print("=== Unadjusted Mortality Rates ===")
print(f"Hospital A: {hospital_a_deaths.mean()*100:.1f}% mortality")
print(f"Hospital B: {hospital_b_deaths.mean()*100:.1f}% mortality")
print(f"Naive conclusion: Hospital A is better!")
# But controlling for patient severity...
print("\n=== Adjusted for Patient Severity ===")
# Compare patients with similar severity
for severity_range in [(2, 4), (4, 6), (6, 8)]:
    low, high = severity_range
    mask_a = (hospital_a_severity >= low) & (hospital_a_severity < high)
    mask_b = (hospital_b_severity >= low) & (hospital_b_severity < high)
    if mask_a.sum() > 10 and mask_b.sum() > 10:
        rate_a = hospital_a_deaths[mask_a].mean() * 100
        rate_b = hospital_b_deaths[mask_b].mean() * 100
        print(f"  Severity {low}-{high}: Hospital A = {rate_a:.1f}%, "
              f"Hospital B = {rate_b:.1f}% "
              f"→ {'B is better' if rate_b < rate_a else 'A is better'}")
print("\nAdjusted conclusion: Hospital B is actually better for similar patients!")
This pattern has led to real-world harm. Hospital ranking systems that don't properly adjust for patient severity can penalize the hospitals that take the hardest cases — creating perverse incentives to turn away sick patients.
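One standard fix for case-mix bias is indirect standardization: compare each hospital's observed deaths to the deaths you would expect given its own patient mix, using mortality rates pooled across hospitals. The sketch below uses made-up rates, case mixes, and death counts purely for illustration:

```python
# Pooled severity-specific mortality rates (illustrative assumptions)
band_rates = {"low": 0.01, "mid": 0.03, "high": 0.10}

# Hypothetical case mixes: number of patients per severity band
case_mix = {
    "Hospital A": {"low": 400, "mid": 80, "high": 20},    # mostly easy cases
    "Hospital B": {"low": 50, "mid": 150, "high": 300},   # mostly hard cases
}

# Hypothetical observed death counts
observed = {"Hospital A": 9, "Hospital B": 28}

smr = {}
for hosp, mix in case_mix.items():
    # Deaths expected if this hospital performed at the pooled average rates
    expected = sum(count * band_rates[band] for band, count in mix.items())
    smr[hosp] = observed[hosp] / expected  # standardized mortality ratio (O/E)
    print(f"{hosp}: observed={observed[hosp]}, expected={expected:.1f}, "
          f"O/E={smr[hosp]:.2f}")
```

With these numbers, Hospital A's raw mortality (9/500 = 1.8%) looks far better than Hospital B's (28/500 = 5.6%), yet B's observed deaths come in below expectation (O/E < 1) while A's come in above it: the severity-adjusted ranking flips.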
Modern Example 2: "Organic Food Makes You Healthier"
Studies consistently show that people who eat organic food have better health outcomes: lower rates of obesity, heart disease, and cancer. Does this mean organic food is healthier?
Almost certainly not — or at least, most of the correlation is confounding. People who buy organic food differ from non-organic consumers in dozens of ways:
- They tend to have higher incomes
- They're more likely to exercise regularly
- They're less likely to smoke
- They're more likely to have health insurance
- They tend to live in areas with better healthcare access
- They're more health-conscious overall (which affects diet in many ways beyond organic vs. conventional)
Each of these factors independently improves health outcomes. The organic-food-health correlation is mostly a lifestyle confounder: organic food consumption is a marker for a health-conscious, affluent lifestyle, and it's the lifestyle — not the organic label on the food — that's driving the health benefits.
# Simulating the organic food confounder
np.random.seed(42)
n = 1000
# Health consciousness is the hidden confounder
health_consciousness = np.random.normal(50, 15, n)
# People with high health consciousness:
# - Buy organic food
organic_probability = 1 / (1 + np.exp(-(health_consciousness - 55) / 10))
buys_organic = np.random.binomial(1, organic_probability)
# - AND have better health (regardless of food choices)
exercise_hours = 0.1 * health_consciousness + np.random.normal(0, 2, n)
diet_quality = 0.08 * health_consciousness + np.random.normal(0, 2, n)
health_score = (20 + 0.5 * exercise_hours + 0.5 * diet_quality +
                np.random.normal(0, 3, n))
# The spurious correlation
organic_health = health_score[buys_organic == 1].mean()
conventional_health = health_score[buys_organic == 0].mean()
print(f"Average health score (organic buyers): {organic_health:.1f}")
print(f"Average health score (conventional buyers): {conventional_health:.1f}")
print(f"Difference: {organic_health - conventional_health:.1f} points")
print(f"\nThis difference is largely due to health consciousness,")
print(f"not organic food itself.")
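Stratifying on the confounder tells the same story as the hospital example: among people with similar health consciousness, the organic-versus-conventional gap should shrink toward zero. A self-contained re-simulation of the same structure (with a larger sample so each stratum is well populated; variable names here are not the chapter's):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000
hc = rng.normal(50, 15, n)                      # health consciousness (confounder)
p_organic = 1 / (1 + np.exp(-(hc - 55) / 10))   # more conscious → more organic
organic = rng.binomial(1, p_organic)
# Health depends on health consciousness, NOT on buying organic
health = 20 + 0.09 * hc + rng.normal(0, 3, n)

raw_gap = health[organic == 1].mean() - health[organic == 0].mean()
print(f"Raw organic-vs-conventional gap: {raw_gap:.2f} points")

gaps = []
for low, high in [(30, 45), (45, 60), (60, 75)]:
    band = (hc >= low) & (hc < high)
    gap = (health[band & (organic == 1)].mean()
           - health[band & (organic == 0)].mean())
    gaps.append(gap)
    print(f"  Health consciousness {low}-{high}: gap = {gap:.2f}")
```

The raw gap is substantial; within each stratum it collapses, because people being compared are now similar on the variable that was doing all the work.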
Modern Example 3: Screen Time and Mental Health
Headlines regularly claim that increased screen time causes depression, anxiety, or other mental health problems in teens. But consider the confounders:
- Teens with pre-existing mental health issues might turn to screens for comfort (reverse causation)
- Family instability might cause both more screen time (less supervision) and worse mental health
- Poverty might lead to both more screen time (cheaper entertainment) and more stress
- Physical inactivity might cause both more screen time and worse mental health
- Social isolation might cause both more screen time and more depression
The correlation between screen time and mental health problems is real. But the causal story is much more complex than "screens cause depression." And some research has found that the correlation, while statistically significant, is extremely small (comparable to the correlation between eating potatoes and depression).
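The "statistically significant but tiny" pattern is worth seeing numerically. With a large enough sample, an effect that explains a fraction of a percent of the variance still produces a minuscule p-value. The simulation below is illustrative only; the slope is an arbitrary choice, not an estimate from any real study:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000  # huge samples make tiny effects "significant"

screen_time = rng.normal(4, 2, n)
# Illustrative: a true but tiny negative effect on a wellbeing score
wellbeing = 50 - 0.06 * screen_time + rng.normal(0, 3, n)

r, p = stats.pearsonr(screen_time, wellbeing)
print(f"r = {r:.3f}, p = {p:.2g}")            # "significant" at any usual threshold
print(f"variance explained: {r**2 * 100:.2f}%")
```

The p-value clears any conventional significance bar, yet r² stays well under 1%: statistical significance measures detectability, not importance.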
The Lesson: Train Your Confounder Reflex
The ice cream-drowning example teaches a reflex that you should apply to every correlation you encounter:
Step 1: See a correlation → Immediately ask: "Could a third variable explain this?"
Step 2: For every confounding candidate, ask: "Is this variable plausibly related to BOTH X and Y?"
Step 3: If you can identify plausible confounders, downgrade the correlation from "evidence of causation" to "evidence of association."
Step 4: Consider what evidence you would need to establish actual causation (an RCT, a natural experiment, controlling for confounders in a regression model).
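The regression option in Step 4 can be sketched directly: fit Y on X alone, then Y on X plus the confounder, and watch X's coefficient collapse. This is a minimal simulation in the spirit of the ice cream example, not the chapter's dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 5000
temp = rng.normal(70, 12, n)                  # confounder
ice = 200 + 8 * temp + rng.normal(0, 50, n)   # X: driven by temperature
drown = 0.1 * temp + rng.normal(0, 1, n)      # Y: driven by temperature, not by X

# Naive model: drowning ~ ice cream alone
naive = LinearRegression().fit(ice[:, None], drown)
# Adjusted model: drowning ~ ice cream + temperature
adjusted = LinearRegression().fit(np.column_stack([ice, temp]), drown)

print(f"Ice-cream coefficient, naive:    {naive.coef_[0]:.4f}")
print(f"Ice-cream coefficient, adjusted: {adjusted.coef_[0]:.4f}")
```

The naive coefficient is positive because ice cream proxies for temperature; once temperature enters the model, the ice-cream coefficient drops to roughly zero, matching the true causal structure we built in.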
This reflex will serve you in every data science project you ever work on. It's the difference between being someone who finds patterns in data and being someone who understands what those patterns actually mean.
Connecting to the Progressive Project
In the vaccination project, you've found correlations between GDP, healthcare spending, and vaccination rates. Before concluding that GDP causes higher vaccination rates, apply the confounder reflex:
- Climate and geography: Countries in temperate zones tend to have both higher GDPs and different disease burdens that affect vaccination priorities
- Colonial history: Some countries' current economic status and health systems reflect historical patterns that affect both GDP and healthcare capacity
- Governance quality: Strong institutions promote both economic growth and effective public health programs
- Education: Higher education levels are associated with both economic productivity and health-seeking behavior
Your analysis should acknowledge these confounders explicitly. The correlation is real, the policy implications are real (wealthier countries DO have better vaccination coverage), but the causal mechanism is more complex than a simple arrow from GDP to vaccination.
Discussion Questions
- Training the reflex: Over the next 24 hours, notice three correlations mentioned in news, social media, or conversation. For each, identify at least one plausible confounder.
- When confounders aren't a problem: Can you think of a correlation where confounding is unlikely to be an issue? What makes some correlations more trustworthy than others?
- The policy question: Even when a correlation is partly or entirely confounded, can it still be useful for policy? If ice cream and drowning are correlated (via temperature), should a city increase lifeguard staffing during weeks with high ice cream sales?
- Your project: List three confounders that might inflate the correlation between GDP and vaccination rates in your progressive project data. For each, explain how it could drive both variables.
Key Takeaway: Every correlation has a story behind it. The ice cream-drowning example is charming because the confounding is obvious. But in most real-world data analysis, the confounding is subtle, plausible, and easy to miss. The skill of spotting confounders — of always asking "what else could explain this?" — is one of the most valuable things you can learn from this book.