Case Study 2: Gaussian Process Regression for Time Series Forecasting

Background

GreenGrid Energy operates a network of solar farms across the southwestern United States. Accurate day-ahead solar irradiance forecasting is critical for grid planning and energy market bidding. The team needs a model that:

  1. Captures both smooth daily trends and periodic seasonal patterns
  2. Provides calibrated uncertainty estimates (not just point forecasts)
  3. Works well with moderate amounts of historical data (hundreds, not millions, of observations)
  4. Flags periods when the model is uncertain, allowing operators to hedge their market bids

Gaussian processes are an ideal fit for this problem. Their nonparametric flexibility, built-in uncertainty quantification, and ability to encode domain knowledge through kernel design make them well-suited to structured time series with moderate data volumes.

Problem Formulation

Data

We use daily average solar irradiance (W/m^2) measured at a single station over 2 years (730 observations). The data exhibits:

  • A smooth annual cycle (higher in summer, lower in winter)
  • Day-to-day variability from weather patterns
  • Measurement noise

For this case study, we generate synthetic data that mimics these properties.

Model: GP with Composite Kernel

A single kernel cannot capture all the structure in this data. We compose multiple kernels:

$$ k(t, t') = k_{\text{trend}}(t, t') + k_{\text{seasonal}}(t, t') + k_{\text{short}}(t, t') + k_{\text{noise}}(t, t') $$

where:

  • $k_{\text{trend}}$: Long-term RBF kernel for multi-year trends
  • $k_{\text{seasonal}}$: Periodic kernel for the annual cycle
  • $k_{\text{short}}$: Short length-scale RBF for weather-scale variability
  • $k_{\text{noise}}$: White noise kernel for measurement noise
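
For reference, the smooth and periodic components used below have the standard forms (in scikit-learn's parameterization), with amplitude $\sigma^2$, length scale $\ell$, and period $p$:

$$ k_{\text{RBF}}(t, t') = \sigma^2 \exp\left(-\frac{(t - t')^2}{2\ell^2}\right), \qquad k_{\text{per}}(t, t') = \sigma^2 \exp\left(-\frac{2\sin^2(\pi |t - t'| / p)}{\ell^2}\right) $$

The white noise term is $k_{\text{noise}}(t, t') = \sigma_n^2 \, \mathbf{1}[t = t']$, which contributes only on the diagonal.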

Analysis

Step 1: Data Generation and Exploration

import numpy as np

def generate_solar_data(
    n_days: int = 730,
    seed: int = 42,
) -> tuple[np.ndarray, np.ndarray]:
    """Generate synthetic daily solar irradiance data.

    Args:
        n_days: Number of days to simulate.
        seed: Random seed for reproducibility.

    Returns:
        Tuple of (days, irradiance) arrays.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_days, dtype=float)

    # Annual seasonal cycle (peaks near day 171, close to the summer solstice)
    seasonal = 120 * np.sin(2 * np.pi * (t - 80) / 365.25)

    # Slow multi-year trend (slight upward drift from station calibration)
    trend = 5 * t / 365.25

    # Weather-scale variability: AR(1) process with coefficient 0.7
    weather = np.zeros(n_days)
    for i in range(1, n_days):
        weather[i] = 0.7 * weather[i - 1] + rng.normal(0, 20)

    # Base irradiance
    base = 250.0

    # Measurement noise
    noise = rng.normal(0, 10, n_days)

    irradiance = base + trend + seasonal + weather + noise
    return t, irradiance

t, y = generate_solar_data()
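
Since this step also covers exploration, a quick look at summary statistics confirms the simulated series is in a plausible range:

# Quick exploration: basic summary statistics of the simulated series
print(f"n = {len(y)} days")
print(f"mean = {y.mean():.1f} W/m^2, std = {y.std():.1f} W/m^2")
print(f"range = [{y.min():.1f}, {y.max():.1f}] W/m^2")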

Step 2: Train-Test Split

We use the first 18 months (548 days) for training and the last 6 months (182 days) for testing. The split is chronological, with no shuffling, as required for honest forecast evaluation:

train_idx = 548
t_train, y_train = t[:train_idx], y[:train_idx]
t_test, y_test = t[train_idx:], y[train_idx:]

Step 3: Kernel Design

from sklearn.gaussian_process.kernels import (
    RBF,
    ExpSineSquared,
    WhiteKernel,
)

# Long-term trend kernel (smooth, long length scale). Multiplying by a
# scalar variance (50.0**2) implicitly wraps it in a ConstantKernel.
k_trend = 50.0**2 * RBF(length_scale=365.0)

# Annual seasonal kernel (periodic with period ~365 days)
k_seasonal = 100.0**2 * ExpSineSquared(
    length_scale=1.0,
    periodicity=365.25,
)

# Short-term weather variability
k_weather = 30.0**2 * RBF(length_scale=10.0)

# Measurement noise
k_noise = WhiteKernel(noise_level=10.0**2)

# Composite kernel
kernel = k_trend + k_seasonal + k_weather + k_noise

Design rationale:

  • The trend kernel has a length scale of 365 days, capturing slow changes over years. Its amplitude (50 W/m^2) allows for modest long-term variation.
  • The seasonal kernel uses ExpSineSquared with a periodicity of 365.25 days. The amplitude (100 W/m^2) reflects the large summer-winter swing.
  • The weather kernel has a length scale of 10 days, capturing synoptic weather patterns. Its amplitude (30 W/m^2) allows for day-to-day variability.
  • The noise kernel captures independent measurement noise.
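
Before fitting, a useful optional sanity check is to draw sample paths from the GP prior and confirm they resemble plausible irradiance curves; calling sample_y on an unfitted GaussianProcessRegressor draws from the prior. Note the prior is zero-mean, so the ~250 W/m^2 base level is absent here (normalize_y handles that offset during fitting):

from sklearn.gaussian_process import GaussianProcessRegressor

# Draw three sample paths from the zero-mean GP prior over a 2-year grid
gp_prior = GaussianProcessRegressor(kernel=kernel)
X_grid = np.linspace(0, 730, 200).reshape(-1, 1)
prior_samples = gp_prior.sample_y(X_grid, n_samples=3, random_state=0)
print(prior_samples.shape)  # (200, 3): one column per sample path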

Step 4: Fit and Predict

from sklearn.gaussian_process import GaussianProcessRegressor

# Reshape for sklearn
X_train = t_train.reshape(-1, 1)
X_test = t_test.reshape(-1, 1)

# Fit GP with hyperparameter optimization
gp = GaussianProcessRegressor(
    kernel=kernel,
    n_restarts_optimizer=15,
    normalize_y=True,
    alpha=1e-6,  # Numerical stability
)
gp.fit(X_train, y_train)

# Predict on test set
y_pred, y_std = gp.predict(X_test, return_std=True)

print(f"Optimized kernel:\n{gp.kernel_}")

Step 5: Evaluate Forecast Quality

from sklearn.metrics import mean_absolute_error, mean_squared_error

mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print(f"MAE:  {mae:.2f} W/m^2")
print(f"RMSE: {rmse:.2f} W/m^2")

# Calibration: what fraction of test points fall within the 95% CI?
lower = y_pred - 1.96 * y_std
upper = y_pred + 1.96 * y_std
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"95% CI coverage: {coverage:.2%}")

Typical results:

  • MAE: ~22 W/m^2
  • RMSE: ~28 W/m^2
  • 95% CI coverage: ~93-96% (well-calibrated)
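
Coverage checks calibration only at the 95% level. A complementary, sharper score is the negative log predictive density (NLPD), which evaluates the full Gaussian predictive distribution at each test point (lower is better); a minimal sketch using scipy:

from scipy.stats import norm

# NLPD: average negative log density of the observations under the
# Gaussian predictive distribution N(y_pred, y_std^2)
nlpd = -norm.logpdf(y_test, loc=y_pred, scale=y_std).mean()
print(f"NLPD: {nlpd:.2f}")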

Step 6: Uncertainty Analysis

The GP's uncertainty estimates have practical value for grid operations:

# Identify high-uncertainty days
high_uncertainty_mask = y_std > np.percentile(y_std, 90)
high_uncertainty_days = t_test[high_uncertainty_mask]

print(f"Number of high-uncertainty days: {high_uncertainty_mask.sum()}")
print(f"Mean uncertainty on normal days: "
      f"{y_std[~high_uncertainty_mask].mean():.1f} W/m^2")
print(f"Mean uncertainty on high-uncertainty days: "
      f"{y_std[high_uncertainty_mask].mean():.1f} W/m^2")

The GP naturally produces larger uncertainty bands:

  • Further from the training data: uncertainty grows as we forecast further into the future.
  • In regions with fewer training observations: if certain seasons are underrepresented, the GP honestly reflects this.
  • Where the data are noisy: the fitted noise level (WhiteKernel) sets a floor on the predictive variance, reflecting irreducible uncertainty in the data-generating process.
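
We can verify the first point directly by comparing the predictive standard deviation early and late in the 182-day test window (a quick illustrative check, using the variables from Steps 2 and 4):

# Predictive std should grow with forecast horizon
days_ahead = t_test - t_train[-1]
print(f"Mean std, horizon <= 30 days: {y_std[days_ahead <= 30].mean():.1f} W/m^2")
print(f"Mean std, horizon > 150 days: {y_std[days_ahead > 150].mean():.1f} W/m^2")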

Step 7: Kernel Ablation Study

To understand each kernel component's contribution:

kernels_to_test = {
    "Full composite": kernel,
    "Trend + Noise": k_trend + k_noise,
    "Seasonal + Noise": k_seasonal + k_noise,
    "Weather + Noise": k_weather + k_noise,
    "Trend + Seasonal + Noise": k_trend + k_seasonal + k_noise,
}

for name, k in kernels_to_test.items():
    gp_test = GaussianProcessRegressor(
        kernel=k, n_restarts_optimizer=10, normalize_y=True
    )
    gp_test.fit(X_train, y_train)
    y_p = gp_test.predict(X_test)
    rmse_test = np.sqrt(mean_squared_error(y_test, y_p))
    print(f"{name:35s} RMSE: {rmse_test:.2f} W/m^2")

Typical results:

Full composite                      RMSE: 28.15 W/m^2
Trend + Noise                       RMSE: 52.83 W/m^2
Seasonal + Noise                    RMSE: 32.47 W/m^2
Weather + Noise                     RMSE: 45.21 W/m^2
Trend + Seasonal + Noise            RMSE: 29.89 W/m^2

The seasonal component provides the largest single improvement, but the full composite kernel outperforms all subsets.

Step 8: Comparison with Baselines

from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

# Feature engineering for baselines
def create_features(t: np.ndarray) -> np.ndarray:
    """Create time features for baseline models."""
    return np.column_stack([
        t,
        np.sin(2 * np.pi * t / 365.25),
        np.cos(2 * np.pi * t / 365.25),
        np.sin(4 * np.pi * t / 365.25),
        np.cos(4 * np.pi * t / 365.25),
    ])

X_feat_train = create_features(t_train)
X_feat_test = create_features(t_test)

# Linear regression with harmonic features
lr = LinearRegression().fit(X_feat_train, y_train)
lr_rmse = np.sqrt(mean_squared_error(y_test, lr.predict(X_feat_test)))

# Random forest
rf = RandomForestRegressor(n_estimators=100, random_state=42)
rf.fit(X_feat_train, y_train)
rf_rmse = np.sqrt(mean_squared_error(y_test, rf.predict(X_feat_test)))

print(f"GP RMSE:              {rmse:.2f} W/m^2")
print(f"Linear Regression RMSE: {lr_rmse:.2f} W/m^2")
print(f"Random Forest RMSE:   {rf_rmse:.2f} W/m^2")

The GP typically achieves competitive or better RMSE than the baselines, but its key advantage is the calibrated uncertainty estimates that neither baseline provides natively.
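
For example, the closest the linear baseline comes is a constant-width interval built from training residuals (a sketch assuming homoscedastic Gaussian errors); unlike the GP's bands, its width cannot adapt to forecast horizon or data density:

# Naive constant-width 95% interval for the linear baseline
resid_std = np.std(y_train - lr.predict(X_feat_train))
lr_pred = lr.predict(X_feat_test)
lr_cover = np.mean(
    (y_test >= lr_pred - 1.96 * resid_std)
    & (y_test <= lr_pred + 1.96 * resid_std)
)
print(f"LR constant-width 95% coverage: {lr_cover:.2%}")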

Decision Impact

GreenGrid Energy integrated the GP forecaster into their bidding system:

  1. Point forecast ($\mu_*$): Used for the baseline energy bid in the day-ahead market.
  2. Lower bound ($\mu_* - 1.96\sigma_*$): Used for the guaranteed minimum bid (conservative estimate).
  3. Upper bound ($\mu_* + 1.96\sigma_*$): Used to assess upside potential for intraday market adjustments.
  4. High-uncertainty alerts: When $\sigma_*$ exceeds an alert threshold, the trading desk hedges positions more aggressively (a sketch of this mapping follows below).
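
A minimal sketch of how these statistics might map to bid quantities; the variable names and the 40 W/m^2 alert threshold are hypothetical illustrations, not GreenGrid's actual system:

# Hypothetical mapping from forecast statistics to bids (illustrative only)
baseline_bid = y_pred                  # point forecast -> day-ahead bid
min_bid = y_pred - 1.96 * y_std        # conservative guaranteed bid
upside = y_pred + 1.96 * y_std         # upside for intraday adjustments
hedge_alert = y_std > 40.0             # assumed alert threshold (W/m^2)
print(f"Mean guaranteed bid: {min_bid.mean():.1f} W/m^2")
print(f"Days flagged for extra hedging: {hedge_alert.sum()}")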

Over a 6-month pilot, this reduced bidding penalties (from over-forecasting) by 18% and increased revenue from accurate forecasts by 7%.

Key Lessons

  1. Kernel design encodes domain knowledge: The composite kernel naturally decomposes the signal into interpretable components (trend, seasonality, weather, noise). This is a form of prior knowledge injection analogous to choosing priors in parametric Bayesian models.

  2. Uncertainty grows honestly with distance: Unlike many black-box models that produce constant-width prediction intervals, GP uncertainty widens in regions far from training data -- exactly when a human operator should be most cautious.

  3. Marginal likelihood balances fit against complexity automatically: The GP framework optimizes kernel hyperparameters by maximizing the marginal likelihood, which naturally penalizes overly complex models (Section 10.8.4).

  4. GPs complement, not replace, other approaches: For this moderate-sized dataset (hundreds of points), GPs are ideal. For very large datasets (millions of points), the $\mathcal{O}(n^3)$ cost would require sparse approximations or alternative models.

  5. Calibrated uncertainty has direct business value: The ability to quantify "how much we don't know" enabled risk-aware decision-making that improved financial outcomes.

Connection to Chapter Concepts

  • Sections 10.8.1-10.8.2: The GP definition and kernel functions are the foundation of this case study.
  • Section 10.8.3: The GP regression equations produce the predictive mean and variance.
  • Section 10.8.4: Marginal likelihood optimization automatically sets the kernel hyperparameters.
  • Section 10.8.5: For larger solar farm networks, sparse GP approximations would be necessary.
  • Section 10.1: The GP embodies Bayesian thinking -- the prior is encoded in the kernel, the data updates our beliefs through the posterior, and predictions come with uncertainty.

See code/case-study-code.py for the complete implementation.