Case Study 1: Climate Uncertainty Decomposition for Policymakers — Separating What We Don't Know from What We Can't Know

Context

The Global Climate Risk Assessment Consortium (GCRAC) is preparing its 2026 regional impact report for the North Atlantic coastal zone. The report will inform infrastructure investment decisions worth $4.2 billion over the next decade — seawalls, drainage systems, building codes, and insurance pricing. The central question: How much will sea levels rise along the North Atlantic coast by 2050 and 2100?

This is not primarily a prediction problem. It is an uncertainty communication problem. A point prediction — "sea levels will rise 0.45 meters by 2050" — is useless to a policymaker who needs to decide whether to build a seawall designed for 0.3 meters, 0.5 meters, or 1.0 meters of rise. What the policymaker needs is a prediction with a full uncertainty decomposition: How much of the uncertainty is due to our limited understanding of climate physics (epistemic — reducible with better science)? How much is due to inherent variability in the climate system (aleatoric — irreducible)? And how much is due to future human choices about emissions (scenario uncertainty — not a scientific question at all)?

The GCRAC data science team uses a deep learning ensemble for regional sea-level projection, building on the climate DL framework developed throughout the textbook (Chapters 1, 4, 8, 9, 10, 23, 26). This case study applies Chapter 34's uncertainty quantification tools to decompose the projection uncertainty into components that map to different policy responses.

The Model Architecture

The GCRAC regional sea-level model is a Transformer-based temporal model (Chapter 10) that takes gridded climate fields as input and produces regional sea-level projections. The model was trained on output from 12 CMIP6 global climate models (GCMs), downscaled to the North Atlantic region.

from dataclasses import dataclass, field
from typing import List, Dict, Tuple
import numpy as np


@dataclass
class ClimateProjectionConfig:
    """Configuration for climate sea-level projection ensemble."""
    n_gcm_emulators: int = 5  # Deep ensemble members per GCM
    n_gcms: int = 12  # CMIP6 models emulated
    n_scenarios: int = 3  # SSP1-2.6, SSP2-4.5, SSP5-8.5
    n_internal_variability_samples: int = 20  # Per ensemble member
    projection_years: List[int] = field(
        default_factory=lambda: [2030, 2040, 2050, 2075, 2100]
    )
    baseline_period: Tuple[int, int] = (1995, 2014)
    region: str = "North Atlantic Coastal Zone (35N-45N, 80W-60W)"


@dataclass
class UncertaintyDecomposition:
    """Three-way uncertainty decomposition for climate projections.

    Following the framework of Hawkins and Sutton (2009):
    - Scenario uncertainty: spread across SSP pathways
    - Model uncertainty (epistemic): spread across GCM emulators
    - Internal variability (aleatoric): spread within one model
      from perturbed initial conditions
    """
    year: int
    scenario: str
    # Point estimate
    median_projection_m: float
    # Scenario uncertainty (across SSPs for this year)
    scenario_range_m: Tuple[float, float]
    scenario_variance: float
    # Model uncertainty (across GCM emulators for this scenario)
    model_range_m: Tuple[float, float]
    model_variance: float
    # Internal variability (within one model, perturbed initial conditions)
    internal_range_m: Tuple[float, float]
    internal_variance: float
    # Total
    total_variance: float
    prediction_interval_90: Tuple[float, float]

    @property
    def fraction_scenario(self) -> float:
        return self.scenario_variance / self.total_variance if self.total_variance > 0 else 0

    @property
    def fraction_model(self) -> float:
        return self.model_variance / self.total_variance if self.total_variance > 0 else 0

    @property
    def fraction_internal(self) -> float:
        return self.internal_variance / self.total_variance if self.total_variance > 0 else 0

The Uncertainty Decomposition

The GCRAC team produces sea-level projections under three Shared Socioeconomic Pathways (SSPs): SSP1-2.6 (strong mitigation), SSP2-4.5 (moderate), and SSP5-8.5 (high emissions). For each scenario, a 5-member deep ensemble of Transformer emulators produces projections, and each ensemble member runs 20 initial-condition perturbations to sample internal variability.

The total predictive variance decomposes:

$$\text{Var}[\Delta SL] = \underbrace{\text{Var}_{\text{SSP}}[\bar{\mu}_{\text{SSP}}]}_{\text{scenario}} + \underbrace{\mathbb{E}_{\text{SSP}}[\text{Var}_{\text{model}}[\bar{\mu}_m]]}_{\text{model (epistemic)}} + \underbrace{\mathbb{E}_{\text{SSP}}[\mathbb{E}_{\text{model}}[\text{Var}_{\text{IC}}[\hat{y}_{m,i}]]]}_{\text{internal variability (aleatoric)}}$$
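Because the ensemble is balanced (every scenario has the same number of models and initial-condition samples) and the variances are population variances (ddof=0), the three components sum exactly to the pooled variance of all projections — the law of total variance applied twice. A quick self-contained check on a synthetic array (the array here is random stand-in data, not GCRAC output):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic projections: (n_scenarios, n_models, n_ic_samples)
x = rng.normal(size=(3, 5, 20))

scenario_means = x.mean(axis=(1, 2))  # (3,)
model_means = x.mean(axis=2)          # (3, 5)

scenario_var = np.var(scenario_means, ddof=0)                  # Var_SSP
model_var = np.var(model_means, axis=1, ddof=0).mean()         # E_SSP[Var_model]
internal_var = np.var(x, axis=2, ddof=0).mean()                # E_SSP[E_model[Var_IC]]

# Nested law of total variance: the components sum to the pooled variance
assert np.isclose(scenario_var + model_var + internal_var, np.var(x, ddof=0))
```

With unbalanced ensembles or sample variances (ddof=1) the identity holds only approximately, which is why the decomposition code uses ddof=0 throughout.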

@dataclass
class ClimateEnsembleResults:
    """Full ensemble results for one projection year."""
    year: int
    # Shape: (n_scenarios, n_ensemble, n_ic_samples)
    projections: np.ndarray

    def decompose(self, scenario_names: List[str]) -> List[UncertaintyDecomposition]:
        """Decompose uncertainty into scenario, model, and internal components."""
        results = []
        n_scenarios, n_ensemble, n_ic = self.projections.shape

        # Scenario means: average over models and IC samples
        scenario_means = self.projections.mean(axis=(1, 2))  # (n_scenarios,)

        # Model means per scenario: average over IC samples
        model_means = self.projections.mean(axis=2)  # (n_scenarios, n_ensemble)

        # Scenario variance: variance of scenario means
        scenario_var = np.var(scenario_means, ddof=0)

        # Model variance: mean across scenarios of the variance of model means
        model_var = np.mean([
            np.var(model_means[s], ddof=0)
            for s in range(n_scenarios)
        ])

        # Internal variability: mean IC variance within each (scenario, model)
        internal_var = np.mean([
            np.var(self.projections[s, m], ddof=0)
            for s in range(n_scenarios)
            for m in range(n_ensemble)
        ])

        total_var = scenario_var + model_var + internal_var

        # Cross-scenario 90% prediction interval (identical for every scenario)
        total_p05, total_p95 = np.percentile(self.projections.flatten(), [5, 95])

        for s in range(n_scenarios):
            # Internal-variability range: IC deviations around each model's mean,
            # so the model spread does not leak into the aleatoric range
            ic_residuals = self.projections[s] - model_means[s][:, None]
            p05, p95 = np.percentile(ic_residuals, [5, 95])

            results.append(UncertaintyDecomposition(
                year=self.year,
                scenario=scenario_names[s],
                median_projection_m=float(np.median(self.projections[s])),
                scenario_range_m=(
                    float(scenario_means.min()),
                    float(scenario_means.max()),
                ),
                scenario_variance=float(scenario_var),
                model_range_m=(
                    float(model_means[s].min()),
                    float(model_means[s].max()),
                ),
                model_variance=float(model_var),
                internal_range_m=(float(p05), float(p95)),
                internal_variance=float(internal_var),
                total_variance=float(total_var),
                prediction_interval_90=(float(total_p05), float(total_p95)),
            ))

        return results

Results

The GCRAC team generates projections and applies the decomposition. The results for the North Atlantic coastal zone:

# Simulated ensemble results (meters of sea-level rise above baseline)
rng = np.random.RandomState(42)

def generate_climate_projections(year: int) -> np.ndarray:
    """Generate simulated ensemble projections for one year.

    Returns array of shape (3 scenarios, 5 ensemble members, 20 IC samples).
    """
    # Base projection increases with year
    t = (year - 2020) / 80  # normalized time

    # Scenario means diverge over time
    scenario_base = {
        2050: np.array([0.22, 0.31, 0.45]),
        2100: np.array([0.38, 0.67, 1.10]),
    }

    base = scenario_base.get(year, scenario_base[2050] * t)
    projections = np.zeros((3, 5, 20))

    for s in range(3):
        for m in range(5):
            # Model spread (epistemic)
            model_offset = rng.normal(0, 0.04 + 0.03 * t)
            for i in range(20):
                # Internal variability (aleatoric)
                iv_noise = rng.normal(0, 0.025)
                projections[s, m, i] = base[s] + model_offset + iv_noise

    return projections


scenario_names = ["SSP1-2.6", "SSP2-4.5", "SSP5-8.5"]

# 2050 projections
proj_2050 = ClimateEnsembleResults(
    year=2050,
    projections=generate_climate_projections(2050),
)
decomp_2050 = proj_2050.decompose(scenario_names)

# 2100 projections
proj_2100 = ClimateEnsembleResults(
    year=2100,
    projections=generate_climate_projections(2100),
)
decomp_2100 = proj_2100.decompose(scenario_names)

# Print decomposition table
print(f"\n{'='*70}")
print(f"North Atlantic Coastal Zone — Sea-Level Rise Projection")
print(f"{'='*70}")

for year, decomps in [(2050, decomp_2050), (2100, decomp_2100)]:
    print(f"\n--- {year} ---")
    d = decomps[1]  # SSP2-4.5 (middle scenario) for the main result
    print(f"Median projection (SSP2-4.5): {d.median_projection_m:.2f} m")
    print(f"90% prediction interval: [{d.prediction_interval_90[0]:.2f}, "
          f"{d.prediction_interval_90[1]:.2f}] m")
    print(f"\nUncertainty decomposition:")
    print(f"  Scenario uncertainty:  {d.fraction_scenario:.1%}")
    print(f"  Model uncertainty:     {d.fraction_model:.1%}")
    print(f"  Internal variability:  {d.fraction_internal:.1%}")

The decomposition reveals a pattern well-documented in climate science (Hawkins and Sutton, 2009): the dominant source of uncertainty changes over the projection horizon.

Source                              2050    2100    Policy Response
Scenario (human choices)            ~45%    ~75%    Emissions policy, not science investment
Model (epistemic)                   ~35%    ~18%    Invest in climate model development; reducible
Internal variability (aleatoric)    ~20%     ~7%    Cannot be reduced; plan for the range

For 2050: Model uncertainty accounts for roughly a third of the total variance. This means the scientific community's incomplete understanding of ice-sheet dynamics, ocean circulation, and regional feedback loops contributes meaningfully to the projection spread. Investment in climate science research (better observations, improved model physics) would narrow the uncertainty range.

For 2100: Scenario uncertainty dominates. The difference between SSP1-2.6 (strong mitigation, ~0.38 m rise) and SSP5-8.5 (high emissions, ~1.10 m rise) is far larger than the model spread within any single scenario. No amount of improved climate science will resolve this uncertainty — it depends on human decisions about emissions over the coming decades.

Conformal Calibration for the Policymaker Report

The GCRAC team wraps the ensemble projections with conformal prediction to provide a formal coverage guarantee. For each scenario, they calibrate conformal intervals on the 2020-2025 period (where projections can be compared to observed sea-level data from satellite altimetry).
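The SplitConformalRegressor used below is Chapter 34's split conformal wrapper and is not defined in this case study. A minimal sketch, assuming only the `calibrate`/`threshold` interface the calibration code relies on (the class body here is illustrative, not the textbook's exact implementation):

```python
import numpy as np


class SplitConformalRegressor:
    """Minimal split conformal regressor with absolute-residual scores.

    Hypothetical sketch of the Chapter 34 interface: `calibrate` stores
    the finite-sample conformal quantile of |y - y_hat| as `threshold`.
    """

    def __init__(self, alpha: float = 0.10):
        self.alpha = alpha
        self.threshold: float = float("nan")

    def calibrate(self, predictions: np.ndarray, observations: np.ndarray) -> None:
        residuals = np.sort(np.abs(observations - predictions))
        n = len(residuals)
        # Conformal quantile: k-th smallest residual, k = ceil((n+1)(1-alpha)).
        # If k > n the requested coverage is unattainable with this calibration
        # set; fall back to the maximum residual (no finite-sample guarantee).
        k = int(np.ceil((n + 1) * (1 - self.alpha)))
        self.threshold = float(residuals[min(k, n) - 1])

    def interval(self, prediction: float) -> tuple:
        """Symmetric 1-alpha prediction interval around a point prediction."""
        return (prediction - self.threshold, prediction + self.threshold)
```

With 19 calibration residuals and alpha = 0.10, for instance, k = 18 and the threshold is the 18th-smallest residual.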

# Calibration on recent observations (satellite altimetry)
observed_slr = np.array([0.020, 0.024, 0.028, 0.033, 0.037, 0.042])  # 2020-2025, meters
predicted_slr = np.array([0.018, 0.022, 0.030, 0.029, 0.035, 0.039])

conformal_climate = SplitConformalRegressor(alpha=0.10)  # Chapter 34
conformal_climate.calibrate(predicted_slr, observed_slr)

print(f"Conformal threshold: {conformal_climate.threshold:.4f} m")
print(f"This means: widen each projection by ±{conformal_climate.threshold:.3f} m")
print(f"for a 90% coverage guarantee (under exchangeability)")

The conformal correction is small relative to the ensemble spread, suggesting that the ensemble is reasonably well calibrated on the recent observational period. Two caveats go into the report. First, the calibration set is tiny: with n = 6 annual residuals, the finite-sample 90% quantile index ⌈(n+1)(1−α)⌉ = 7 exceeds n, so the threshold degenerates to the maximum residual and the nominal coverage guarantee does not strictly hold; a longer calibration record (e.g., monthly altimetry residuals) would make it attainable. Second, the exchangeability assumption is tenuous for long-range projections: the data-generating process in 2100 may differ fundamentally from 2020-2025 (e.g., ice-sheet tipping points). The team presents the conformal intervals with these caveats about the projection horizon.

The Policy Brief

The GCRAC team distills the uncertainty analysis into a three-paragraph summary for the infrastructure investment committee:

Near-term (2050): Sea levels along the North Atlantic coast are projected to rise 0.22-0.45 meters above the 1995-2014 baseline, depending on the emissions pathway. Within any single pathway, the range narrows to approximately ±0.08 meters (90% interval), reflecting our current scientific understanding. Roughly two-thirds of this within-scenario variance (the model component) is reducible through improved climate models and observations. We recommend designing coastal infrastructure for a minimum of 0.45 meters of rise (the central projection of the high-emissions scenario), with the option to upgrade if observations track the high-emissions pathway.

Long-term (2100): The dominant uncertainty is not scientific but societal. Under strong mitigation (SSP1-2.6), rise is projected at 0.38 meters (90% interval: 0.28-0.48 m). Under high emissions (SSP5-8.5), rise is projected at 1.10 meters (90% interval: 0.85-1.35 m). The threefold difference between these scenarios overwhelms all scientific uncertainty. We recommend adaptive infrastructure design: build for the moderate scenario (0.67 m) with expansion capacity for the high scenario (1.10 m).

What will not change with better science: The scenario spread (human choices) will remain the dominant source of uncertainty for projections beyond 2060. No amount of improved climate modeling will narrow this gap. What better science can improve is the within-scenario range — currently ±0.08 m for 2050 and ±0.25 m for 2100. Continued investment in ice-sheet observations and cloud feedback research would narrow these ranges by an estimated 20-30% over the next decade.

Lessons

This case study illustrates three principles of uncertainty communication:

  1. Decompose before communicating. A raw prediction interval of "0.28-1.35 meters by 2100" is paralyzing — the range is too wide for infrastructure design. Decomposing into scenario, model, and internal variability reveals that most of the range is not scientific uncertainty but human choice. This reframes the decision from "we don't know enough to plan" to "we know enough to plan adaptively."

  2. Match uncertainty type to action. Epistemic uncertainty calls for research investment. Aleatoric uncertainty calls for robust design that handles the range. Scenario uncertainty calls for policy decisions, not scientific ones. Collapsing all three into a single confidence interval obscures these distinct policy levers.

  3. Calibrate before publishing. Even well-constructed ensembles can be miscalibrated. The conformal correction — small in this case, but potentially large for poorly calibrated models — ensures that the published intervals have a formal coverage guarantee, not just a heuristic one. Policymakers who use these intervals for billion-dollar infrastructure decisions deserve intervals with stated statistical properties.