Win Probability Added (WPA)
What is Win Probability Added (WPA)?
Win Probability Added (WPA) is a context-dependent baseball statistic that measures the change in a team's probability of winning as a result of a specific play or plate appearance. Unlike traditional counting statistics or rate metrics, WPA captures the game situation, quantifying how much each event moves the needle toward victory or defeat. A home run in a tie game in the ninth inning adds far more win probability than a solo shot in a 10-0 blowout, and WPA reflects this reality.
Building on the pioneering work of the Mills brothers (Eldon and Harlan Mills), who published Player Win Averages in 1970, and popularized by websites like FanGraphs in the 2000s, WPA has become an essential tool for understanding clutch performance and game-changing moments. A single event can in principle be worth anywhere from -1.0 to +1.0 (a complete swing between certain loss and certain win), although almost all real plays move win probability by far less. Players accumulate positive and negative WPA throughout a season based on their performance in various game situations.
WPA is fundamentally different from Wins Above Replacement (WAR) because it is context-dependent rather than context-neutral. While WAR attempts to isolate a player's skill independent of situation, WPA embraces context, asking "What actually happened in this specific game situation?" This makes WPA particularly valuable for analyzing individual games, postseason heroics, and the ebb and flow of pennant races, though less suitable for comparing player value across different team contexts.
How Win Probability is Calculated
Win probability estimation relies on historical data from thousands of games to determine the likelihood of victory given the current game state. The fundamental inputs to any win probability model include the inning, score differential, number of outs, runners on base, and sometimes additional factors like the quality of remaining pitchers or home field advantage.
The most common approach uses empirical win expectancy tables constructed from play-by-play data spanning decades of Major League Baseball games. For each possible game state, analysts calculate the percentage of times that the team at bat (or the home team) ultimately won the game. Modern implementations typically use over 100,000 games to generate these probability tables, ensuring statistical robustness across even rare situations.
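As a minimal sketch of the empirical approach, the snippet below estimates win expectancy as the share of occurrences of each state in which the batting team went on to win. The toy DataFrame, the `state` labels, and the column names are invented for illustration; a real table would be built from a full play-by-play archive.

```python
import pandas as pd

# Toy sample (hypothetical): each row is one occurrence of a game state and
# whether the batting team eventually won that game.
occurrences = pd.DataFrame({
    "state": ["B9_-1_100_1", "B9_-1_100_1", "B9_-1_100_1", "B9_-1_100_1",
              "B9_0_000_0", "B9_0_000_0"],
    "batting_team_won": [1, 0, 0, 0, 1, 0],
})

# Win expectancy = wins / occurrences for each state; n is the sample size.
win_expectancy = (
    occurrences.groupby("state")["batting_team_won"]
    .agg(win_prob="mean", n="count")
)
print(win_expectancy)
```

Keeping the occurrence count `n` next to each probability makes it easy to spot states whose estimate rests on too few games.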
For example, historical data might show that the home team wins 30% of games when batting in the bottom of the ninth inning, down by one run, with a runner on first base and one out. If the next batter hits a single that advances the runner to third, the new state (runners on first and third, one out, still down one) might have a 51% win probability. The difference, 21 percentage points or +0.21 WPA, is credited to the batter who hit the single. These figures are illustrative rather than taken from a published table.
More sophisticated models incorporate additional variables such as the specific batters and pitchers involved, park factors, weather conditions, and even pitch-level data from Statcast. Machine learning approaches like gradient boosting or neural networks can capture complex non-linear relationships between game state variables and win probability, though traditional table-based methods remain popular due to their transparency and computational simplicity.
Base-Out-Inning-Score States
The foundation of win probability calculations is the enumeration of all possible game states. The core variables create a multi-dimensional state space that captures the strategic complexity of baseball:
Core State Variables:
- Inning: 1 through 9 (and extra innings), with separate probabilities for top and bottom halves
- Outs: 0, 1, or 2 outs in the current half-inning
- Base State: 8 possible configurations (000, 100, 010, 001, 110, 101, 011, 111)
- Score Differential: Runs ahead or behind, typically binned (e.g., tied, +1, +2, +3, +4, +5 or more)
This creates approximately 3 × 8 × 9 × 2 × 20 = 8,640 unique states for a typical nine-inning game (3 outs, 8 base states, 9 innings, 2 half-innings, ~20 score bins). Each state has an associated win probability derived from historical outcomes, forming what's known as a win expectancy table or matrix.
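The count above can be checked by enumerating the state space directly. This is only a sketch: the 20 score bins are represented here as the integers -10 through 9, which is an assumed binning chosen to match the "~20" figure, not a standard one.

```python
from itertools import product

outs = range(3)                      # 0, 1, or 2 outs
base_states = ["000", "100", "010", "001", "110", "101", "011", "111"]
innings = range(1, 10)               # regulation innings only
halves = ("top", "bottom")
score_bins = range(-10, 10)          # 20 bins (assumed for illustration)

# Every combination of the core state variables.
states = list(product(innings, halves, outs, base_states, score_bins))
print(len(states))
```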
The transition from one state to another drives WPA calculations. When a batter steps to the plate with runners on first and second, nobody out, in the bottom of the seventh inning of a tie game, the initial state has a certain win probability. If they hit into a double play, the new state (runner on third, two outs) has a significantly lower win probability. The WPA for that play is the difference between these values, which would be substantially negative for the batter and positive for the pitcher.
Some advanced implementations further subdivide states by additional factors. For instance, score differentials might be treated granularly rather than binned (distinguishing between down 1 and down 2 more precisely), or the model might account for the specific inning more granularly in extra innings. The trade-off is always between specificity and sample size—too many subdivisions create states with insufficient historical data for reliable probability estimates.
Historical Win Expectancy Tables
Win expectancy tables are the lookup tables that power WPA calculations. These tables are constructed by analyzing decades of play-by-play data, recording the outcome of every game from every possible state. The win expectancy for a state is simply the proportion of games won by the offensive team when that state occurred.
For example, a classic win expectancy table might show:
| Inning | Score Diff | Base State | Outs | Win Expectancy |
|---|---|---|---|---|
| Bottom 9 | -1 | 000 | 0 | 0.281 |
| Bottom 9 | -1 | 100 | 0 | 0.389 |
| Bottom 9 | -1 | 101 | 1 | 0.512 |
| Bottom 9 | Tied | 000 | 0 | 0.542 |
These values show that in the bottom of the ninth inning, trailing by one run with the bases empty and no outs, teams historically win about 28.1% of the time. Adding a runner to first base increases that to 38.9%, while having runners on first and third with one out pushes the probability above 50%.
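To make the lookup concrete, the sketch below stores the four illustrative rows from the table above in a dictionary and computes the WPA of a leadoff single in the bottom of the ninth with the home team down one. The key layout and the helper name `wpa` are choices made for this example.

```python
# Rows copied from the illustrative table above.
# Key: (half_inning, score_diff, base_state, outs)
WIN_EXPECTANCY = {
    ("B9", -1, "000", 0): 0.281,
    ("B9", -1, "100", 0): 0.389,
    ("B9", -1, "101", 1): 0.512,
    ("B9", 0, "000", 0): 0.542,
}

def wpa(before, after):
    """WPA of a play: win expectancy after minus win expectancy before."""
    return WIN_EXPECTANCY[after] - WIN_EXPECTANCY[before]

# Bases empty, nobody out -> runner on first, nobody out.
leadoff_single = wpa(("B9", -1, "000", 0), ("B9", -1, "100", 0))
print(round(leadoff_single, 3))
```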
The construction of these tables requires careful data cleaning and normalization. Games from different eras must be weighted appropriately to account for changing run-scoring environments. Modern tables typically focus on data from the post-2000 era to reflect contemporary offensive levels, though some applications use adaptive weighting that gives more influence to recent seasons while still incorporating the statistical power of historical data.
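One possible form of adaptive weighting is exponential decay by season age. The sketch below applies it to a single game state; the per-season tallies and the `half_life` parameter are hypothetical, and real implementations may weight seasons quite differently.

```python
# Hypothetical tallies for one game state: (season, wins, occurrences).
seasons = [(1995, 30, 100), (2005, 33, 100), (2015, 36, 100), (2023, 38, 100)]

def weighted_win_expectancy(seasons, current_year, half_life=10.0):
    """Weight each season by 0.5 ** (age / half_life) before pooling.

    half_life (in years) is an assumed tuning knob, not a published value.
    """
    weighted_wins = weighted_n = 0.0
    for year, wins, n in seasons:
        weight = 0.5 ** ((current_year - year) / half_life)
        weighted_wins += weight * wins
        weighted_n += weight * n
    return weighted_wins / weighted_n

unweighted = sum(w for _, w, _ in seasons) / sum(n for _, _, n in seasons)
weighted = weighted_win_expectancy(seasons, current_year=2023)
print(round(unweighted, 4), round(weighted, 4))
```

Because the most recent seasons in this toy data have higher win rates, the weighted estimate lands above the simple pooled one while still drawing on the older samples.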
Publicly available win expectancy tables can be found in several places. Tom Tango's work in the early 2000s produced widely used tables, and websites like FanGraphs and Baseball Prospectus maintain updated versions. Researchers can also construct custom tables from Retrosheet play-by-play data, allowing for era- or league-specific adjustments or experimental state definitions.
WPA Formula and Interpretation
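The fundamental WPA formula is elegantly simple:

$$\text{WPA} = \text{WE}_{\text{after}} - \text{WE}_{\text{before}}$$

where $\text{WE}_{\text{before}}$ and $\text{WE}_{\text{after}}$ are the batting team's win expectancy immediately before and after the play.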
For a batting event, positive WPA indicates the batter increased their team's chance of winning, while negative WPA indicates a decrease. For pitchers, the signs are reversed—a strikeout that reduces the opponent's win probability by 0.03 gives the pitcher +0.03 WPA.
Consider a concrete example: Bottom of the ninth, tie game, bases loaded, one out. The win expectancy in this state might be 0.71 (71% chance of winning). The batter hits a sacrifice fly, scoring the winning run. The new state is a walk-off win, with win probability 1.0. The WPA for this play is 1.0 - 0.71 = +0.29 WPA for the batter. This large positive value reflects the game-ending significance of the play.
WPA accumulates linearly over a season. A player's seasonal WPA is the sum of their WPA from every plate appearance. A typical everyday player might accumulate anywhere from -2 to +5 WPA over a full season, though this varies widely based on both performance and playing time in high-leverage situations. The league average WPA is zero by construction—every increase in one team's win probability is offset by an equal decrease for their opponent.
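The zero-sum bookkeeping can be sketched in a few lines. The plays and player labels below are hypothetical, and each WPA value is taken from the batting team's perspective, so the batter is credited the change and the pitcher is charged the opposite amount.

```python
from collections import defaultdict

# Hypothetical plays; WPA is from the batting team's perspective.
plays = [
    {"batter": "A", "pitcher": "X", "wpa": 0.29},
    {"batter": "B", "pitcher": "X", "wpa": -0.03},
    {"batter": "A", "pitcher": "Y", "wpa": -0.05},
]

totals = defaultdict(float)
for play in plays:
    totals[play["batter"]] += play["wpa"]    # batter gets the change
    totals[play["pitcher"]] -= play["wpa"]   # pitcher gets the negative

print(dict(totals))
print(abs(sum(totals.values())) < 1e-12)  # credits and debits cancel
```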
Interpreting WPA requires understanding its context-dependent nature. A player with high WPA had many impactful moments in critical situations, but this could result from luck (being at bat in high-leverage spots) rather than skill. Similarly, a low WPA might reflect poor clutch performance or simply few opportunities in important situations. This is why WPA is best used for describing what happened rather than predicting future performance.
Clutch Performance Measurement
WPA is intrinsically linked to clutch performance because it weights events by their situational importance. A player who excels in high-pressure situations will accumulate more WPA than their raw statistics might suggest, while a player who pads stats in low-leverage blowouts will have WPA that lags their traditional metrics.
The concept of leverage is formalized through the Leverage Index (LI), which measures how much a typical event in a given game state changes win probability. High-leverage situations are those where events swing win probability dramatically, such as close games in late innings. Low-leverage situations are blowouts where even significant events barely move the needle. The average LI is defined as 1.0, with values ranging from near 0 in garbage time to 8 or more in the tensest late-inning situations.
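A simplified way to picture LI is as the expected size of the win probability swing in a state, divided by the average swing across all states. The sketch below follows that idea; the outcome probabilities, the win probability changes, and the `AVERAGE_SWING` constant are all invented for illustration, and published LI values are computed from full win expectancy tables.

```python
def expected_swing(outcomes):
    """Probability-weighted average absolute change in win probability."""
    return sum(prob * abs(delta) for prob, delta in outcomes)

# (probability, change in win probability) for each possible result of a
# plate appearance. All numbers are hypothetical.
late_and_close = [(0.70, -0.08), (0.20, 0.15), (0.10, 0.40)]
blowout = [(0.70, -0.002), (0.20, 0.003), (0.10, 0.010)]

AVERAGE_SWING = 0.035  # assumed league-average swing for this example

for name, outcomes in [("late and close", late_and_close), ("blowout", blowout)]:
    li = expected_swing(outcomes) / AVERAGE_SWING
    print(f"{name}: LI = {li:.2f}")
```

Dividing by the average swing is what pins the typical situation at an LI of 1.0.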
Clutch performance can be assessed by comparing a player's WPA to their context-neutral value metrics. If a player's WPA significantly exceeds their WAR contribution, they likely performed well in high-leverage spots. Conversely, underperformance in clutch situations manifests as WPA below expectation. However, statistical research has consistently shown that "clutch ability" (the capacity to systematically outperform in high-pressure situations) is minimal to non-existent for most players, with year-to-year clutch performance showing little correlation.
Notable exceptions exist in baseball lore—players like David Ortiz, Derek Jeter, and Mariano Rivera are celebrated for postseason heroics that are statistically unlikely to be pure chance. However, even for these legends, separating skill from luck requires careful analysis and large sample sizes that often span entire careers. WPA serves as the descriptive tool for identifying these moments, even if it cannot definitively attribute them to repeatable clutch skill.
WPA Leaders and Laggards
Seasonal WPA leaderboards reveal which players had the most impactful contributions to their team's wins in a given year. Top WPA performers are typically some combination of excellent players (who perform well in all situations) and players who happened to bat in many high-leverage spots. The record for single-season WPA in the modern era is around +10, achieved by MVP-caliber seasons where a player both excelled statistically and came through in crucial moments.
Examples of high single-season WPA totals (exact values differ between sources and win expectancy models):
| Player | Season | WPA | Notable Context |
|---|---|---|---|
| Barry Bonds | 2002 | +9.7 | MVP season, .370/.582/.799 |
| Chase Utley | 2006 | +8.9 | All-Star season, clutch hits |
| Alex Rodriguez | 2007 | +8.5 | MVP season, 54 HR, 156 RBI |
On the flip side, WPA laggards are players who accumulated significantly negative WPA, typically through poor performance in critical situations or simply being victims of bad timing. Pitchers can accumulate extreme negative WPA by allowing catastrophic innings at crucial junctures, while position players might rack up negative totals through double plays or strikeouts in high-leverage spots.
Team-level WPA analysis can identify patterns in roster construction and bullpen usage. Teams that spread high-leverage opportunities among multiple capable players tend to have more balanced WPA distributions, while teams overly reliant on a single closer or cleanup hitter may show concentrated WPA accumulation. This can inform strategic decisions about lineup construction, pinch-hitting, and late-inning defensive substitutions.
Limitations of WPA (Context-Dependent)
Despite its utility for game narrative and situation analysis, WPA has significant limitations as a player evaluation metric. The fundamental issue is that WPA conflates performance with opportunity. A player who happens to bat in many high-leverage situations has more opportunity to accumulate WPA (positive or negative) than an equally skilled player who bats primarily in low-leverage spots. This makes WPA unsuitable for direct player comparisons across teams or seasons.
Consider two hypothetical players with identical batting lines (.280/.350/.480) but different lineup positions. The cleanup hitter batting fourth accumulates +4.5 WPA because they frequently bat with runners on base in close games. The ninth-place hitter accumulates +1.2 WPA because they rarely face high-leverage situations. Neither player is more skilled, but WPA dramatically favors the player with more opportunities. This context-dependence means WPA cannot answer "Who is the better player?" only "Who had more impactful plate appearances?"
WPA also struggles with small sample sizes and random variation. A player who goes 1-for-4 in four blowout games might have near-zero WPA, while a player who goes 1-for-4 in four one-run games could have substantial positive or negative WPA depending on the timing of their lone hit. Over a full season, these variations tend to regress toward performance expectation, but in small samples (playoffs, hot streaks), WPA can be highly volatile and driven by chance.
Additionally, WPA inherently treats all outs as equal when they occur in the same game state, despite differences in quality of contact or process. A hard-hit lineout and a weak popup both produce the same WPA change if the game state is identical, yet the former indicates better underlying skill. This is why advanced analysts supplement WPA with expected statistics (xwOBA, xBA) that capture batted ball quality independent of outcome.
WPA/LI (Context-Neutral WPA)
To address the context-dependence problem, sabermetricians developed WPA/LI (Win Probability Added per Leverage Index), also known as Context Neutral WPA or simply "Context Neutral Wins." This metric adjusts WPA for the leverage of situations faced, creating a measure that isolates performance from opportunity.
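The formula for WPA/LI divides each event's WPA by the Leverage Index of the situation in which it occurred, then sums over all of a player's events:

$$\text{WPA/LI} = \sum_{i} \frac{\text{WPA}_i}{\text{LI}_i}$$

Alternatively, it can be expressed as a weighted sum in which each event's weight is the reciprocal of its leverage:

$$\text{WPA/LI} = \sum_{i} w_i \, \text{WPA}_i, \qquad w_i = \frac{1}{\text{LI}_i}$$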
This normalization removes the advantage enjoyed by players who simply faced more high-leverage situations. A player who performed well in low-leverage spots and poorly in high-leverage spots might have positive WPA/LI (because every plate appearance counts about equally once leverage is divided out) but negative raw WPA (because the failures came when they mattered most). Conversely, a clutch performer's raw WPA would exceed what their WPA/LI alone suggests.
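A small worked example may help. It assumes WPA/LI is computed as the sum of each event's WPA divided by that event's LI. The two hitters and their (WPA, LI) pairs are hypothetical: both produce the same context-neutral results, but one bats in higher-leverage spots.

```python
def wpa_li(events):
    """Sum of per-event WPA divided by that event's leverage index."""
    return sum(wpa / li for wpa, li in events if li > 0)

# (WPA, LI) pairs, invented for illustration.
cleanup = [(0.06, 2.0), (-0.04, 2.0), (0.06, 2.0)]    # high-leverage spots
ninth = [(0.015, 0.5), (-0.01, 0.5), (0.015, 0.5)]    # low-leverage spots

for name, events in [("cleanup", cleanup), ("ninth", ninth)]:
    raw = sum(wpa for wpa, _ in events)
    print(f"{name}: WPA = {raw:+.3f}, WPA/LI = {wpa_li(events):+.3f}")
```

The cleanup hitter ends up with four times the raw WPA, yet both hitters have the same WPA/LI, which is the sense in which the metric separates performance from opportunity.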
WPA/LI correlates more strongly with context-neutral metrics like WAR because it attempts to measure "How well did the player perform?" rather than "How much did the player's performance change game outcomes?" This makes it more appropriate for player evaluation and comparison, though it sacrifices the narrative appeal of raw WPA. Most analysts use both metrics in tandem—WPA to tell the story of a season's key moments, WPA/LI to assess player skill independent of situation.
Research has shown that the gap between a player's raw WPA and their context-neutral performance has minimal year-to-year correlation, suggesting that "clutch skill" (the ability to systematically outperform in high-leverage situations) is largely illusory. Players who excel under pressure one year typically regress to their overall performance level the next, supporting the null hypothesis that clutch performance is predominantly luck rather than a repeatable skill.
Championship WPA (cWPA)
Championship WPA (cWPA) extends the win probability framework beyond individual games to measure a player's contribution to their team's playoff and championship aspirations. Rather than tracking the probability of winning a single game, cWPA tracks the changing probability of winning the division, making the playoffs, or winning the World Series as the season progresses.
The fundamental concept is similar to game-level WPA: for each game, there is a "playoff probability" or "championship probability" before and after the game. A win in late September when teams are tied for the division lead might increase playoff probability by 8%, while a win in April might increase it by only 0.2%. The cWPA for players in that September game is allocated based on their contributions to the win, weighted by the leverage of the game for championship aspirations.
Calculating cWPA requires:
- A playoff probability model that estimates championship odds based on current standings, remaining schedule, and team quality
- Game-by-game win probabilities (standard WPA)
- A methodology for translating within-game WPA into championship probability changes
The implementation typically involves running Monte Carlo simulations of the remainder of the season before and after each game, computing the change in playoff probability, and allocating that change to players based on their in-game WPA contributions. If a crucial September win increased the team's playoff odds by 5 percentage points, the +0.05 is split among the players in proportion to their in-game WPA, so a player who hit the walk-off homer would receive a large share of it.
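The steps above can be sketched with a deliberately crude simulation. Everything here is an assumption made for illustration: the two-team race, the coin-flip model for remaining games, the tiebreak rule, the `playoff_prob` helper, and the in-game WPA values. A real playoff odds model would account for team quality, schedules, and the full standings.

```python
import random

def playoff_prob(wins, games_left, rival_wins, rival_games_left,
                 n_sims=10_000, seed=0):
    """Estimate P(team finishes ahead of one rival), treating every remaining
    game as an independent coin flip (a strong simplification)."""
    rng = random.Random(seed)
    finishes_ahead = 0
    for _ in range(n_sims):
        team = wins + sum(rng.random() < 0.5 for _ in range(games_left))
        rival = rival_wins + sum(rng.random() < 0.5
                                 for _ in range(rival_games_left))
        # Ties are broken by a coin flip.
        if team > rival or (team == rival and rng.random() < 0.5):
            finishes_ahead += 1
    return finishes_ahead / n_sims

# Tied race with 10 games left; the team wins today while the rival is idle.
before = playoff_prob(85, 10, 85, 10)
after = playoff_prob(86, 9, 85, 10)
game_swing = after - before

# Split the swing in proportion to in-game WPA (hypothetical values that sum
# to +0.5, as the winning team's WPA does in any game).
in_game_wpa = {"walkoff_hitter": 0.40, "starter": 0.15, "others": -0.05}
total = sum(in_game_wpa.values())
cwpa = {player: game_swing * w / total for player, w in in_game_wpa.items()}
print(round(game_swing, 3), cwpa)
```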
Championship WPA is particularly valuable for analyzing playoff races and historical pennant chases. It can identify individual performances that were truly franchise-altering—the home run that clinched a division, the pitching performance that saved a season. However, cWPA is even more context-dependent than regular WPA, as it depends entirely on team quality, division competitiveness, and schedule luck. A .400 hitter on a 100-loss team would accumulate minimal cWPA, while a mediocre player on a team in a tight race could have substantial cWPA through timely contributions.
Python Code Examples
Calculating WPA from Play-by-Play Data
import pandas as pd


class WinProbabilityCalculator:
    """
    Calculate Win Probability Added from play-by-play data
    using historical win expectancy tables.
    """

    def __init__(self, expectancy_table_path):
        """
        Initialize with a win expectancy lookup table.

        Parameters:
        -----------
        expectancy_table_path : str
            Path to CSV containing win expectancy data with columns:
            inning, is_bottom, outs, base_state, score_diff, win_prob
        """
        self.expectancy_table = pd.read_csv(expectancy_table_path)

    def get_base_state(self, runner_first, runner_second, runner_third):
        """Convert runner presence to base state code (0-7)."""
        return (runner_first * 1 +
                runner_second * 2 +
                runner_third * 4)

    def lookup_win_probability(self, inning, is_bottom, outs,
                               base_state, score_diff):
        """
        Look up win probability for a given game state.

        Parameters:
        -----------
        inning : int
            Current inning (1-9+)
        is_bottom : bool
            True if bottom of inning
        outs : int
            Number of outs (0-2)
        base_state : int
            Base runners state (0-7)
        score_diff : int
            Runs ahead (positive) or behind (negative)

        Returns:
        --------
        float : Win probability for batting team
        """
        # Bin score differential for lookup: margins beyond +/-5 share one bin
        if score_diff > 5:
            score_diff = 6
        elif score_diff < -5:
            score_diff = -6

        # Query the expectancy table
        mask = (
            (self.expectancy_table['inning'] == inning) &
            (self.expectancy_table['is_bottom'] == is_bottom) &
            (self.expectancy_table['outs'] == outs) &
            (self.expectancy_table['base_state'] == base_state) &
            (self.expectancy_table['score_diff'] == score_diff)
        )
        result = self.expectancy_table[mask]

        if len(result) == 0:
            # No matching row: fall back to a neutral 0.5
            return 0.5
        return result.iloc[0]['win_prob']

    def calculate_wpa_for_game(self, play_by_play_df):
        """
        Calculate WPA for all plays in a game.

        Parameters:
        -----------
        play_by_play_df : DataFrame
            Play-by-play data with columns:
            inning, is_bottom, outs_before, outs_after,
            runner_1b_before, runner_2b_before, runner_3b_before,
            runner_1b_after, runner_2b_after, runner_3b_after,
            score_diff_before, score_diff_after,
            game_over, batting_team_won, batter_id, pitcher_id

        Returns:
        --------
        DataFrame : Original data with a 'wpa' column added
            (from the batting team's perspective)
        """
        wpa_list = []

        for idx, play in play_by_play_df.iterrows():
            # Get base states
            base_before = self.get_base_state(
                play['runner_1b_before'],
                play['runner_2b_before'],
                play['runner_3b_before']
            )
            base_after = self.get_base_state(
                play['runner_1b_after'],
                play['runner_2b_after'],
                play['runner_3b_after']
            )

            # Look up probabilities
            wp_before = self.lookup_win_probability(
                play['inning'],
                play['is_bottom'],
                play['outs_before'],
                base_before,
                play['score_diff_before']
            )

            if play['game_over']:
                # Handle end-of-game states
                wp_after = 1.0 if play['batting_team_won'] else 0.0
            elif play['outs_after'] >= 3:
                # Half-inning over (assumes outs_after is 3 on such plays):
                # the other team bats next with bases empty and no outs, so
                # look up its state and flip the perspective.
                next_inning = play['inning'] + (1 if play['is_bottom'] else 0)
                wp_after = 1.0 - self.lookup_win_probability(
                    next_inning,
                    not play['is_bottom'],
                    0,
                    0,
                    -play['score_diff_after']
                )
            else:
                wp_after = self.lookup_win_probability(
                    play['inning'],
                    play['is_bottom'],
                    play['outs_after'],
                    base_after,
                    play['score_diff_after']
                )

            # Calculate WPA
            wpa = wp_after - wp_before
            wpa_list.append(wpa)

        play_by_play_df['wpa'] = wpa_list
        return play_by_play_df


# Example usage
if __name__ == "__main__":
    # Load win expectancy table
    wpa_calc = WinProbabilityCalculator('win_expectancy_table.csv')

    # Load play-by-play data for a game
    game_data = pd.read_csv('game_20230915_nyy_vs_bos.csv')

    # Calculate WPA
    game_with_wpa = wpa_calc.calculate_wpa_for_game(game_data)

    # Aggregate by player
    batter_wpa = game_with_wpa.groupby('batter_id')['wpa'].sum()
    # Negated because WPA was calculated from the offense's perspective
    pitcher_wpa = -game_with_wpa.groupby('pitcher_id')['wpa'].sum()

    print("Top batters by WPA:")
    print(batter_wpa.sort_values(ascending=False).head(5))
    print("\nTop pitchers by WPA:")
    print(pitcher_wpa.sort_values(ascending=False).head(5))
Creating Win Probability Graphs
import matplotlib.pyplot as plt


def plot_win_probability_graph(play_by_play_df, home_team, away_team,
                               game_date, save_path=None):
    """
    Create a win probability graph for a baseball game.

    Parameters:
    -----------
    play_by_play_df : DataFrame
        Play-by-play data with WPA calculated
    home_team : str
        Home team abbreviation
    away_team : str
        Away team abbreviation
    game_date : str
        Game date for title
    save_path : str, optional
        Path to save figure
    """
    play_by_play_df['play_number'] = range(len(play_by_play_df))

    # Initialize win probability (typically 0.5 for beginning of game)
    initial_wp = 0.5

    # Calculate running win probability for the home team
    wp_timeline = [initial_wp]
    for idx, play in play_by_play_df.iterrows():
        if play['is_bottom']:
            # Home team batting - add WPA directly
            wp_timeline.append(wp_timeline[-1] + play['wpa'])
        else:
            # Away team batting - subtract WPA
            wp_timeline.append(wp_timeline[-1] - play['wpa'])

    # Create figure
    fig, ax = plt.subplots(figsize=(14, 7))

    # Plot win probability
    x_values = range(len(wp_timeline))
    ax.plot(x_values, wp_timeline, linewidth=2, color='#2E86AB')

    # Fill areas above and below the 50% line
    ax.fill_between(x_values, wp_timeline, 0.5,
                    where=[wp >= 0.5 for wp in wp_timeline],
                    alpha=0.3, color='blue', label=home_team)
    ax.fill_between(x_values, wp_timeline, 0.5,
                    where=[wp < 0.5 for wp in wp_timeline],
                    alpha=0.3, color='red', label=away_team)

    # Styling
    ax.axhline(y=0.5, color='gray', linestyle='--', alpha=0.5)
    ax.set_ylim(0, 1)
    ax.set_xlim(0, len(wp_timeline))

    # Labels
    ax.set_xlabel('Play Number', fontsize=12)
    ax.set_ylabel('Win Probability', fontsize=12)
    ax.set_title(f'Win Probability: {away_team} @ {home_team} ({game_date})',
                 fontsize=14, fontweight='bold')

    # Format y-axis as percentage
    ax.yaxis.set_major_formatter(plt.FuncFormatter(
        lambda y, _: f'{y:.0%}'
    ))

    # Add inning markers at the position of the first play of each inning
    innings = play_by_play_df['inning'].tolist()
    inning_changes = [i for i in range(len(innings))
                      if i == 0 or innings[i] != innings[i - 1]]
    for inning_start in inning_changes:
        ax.axvline(x=inning_start, color='gray',
                   linestyle=':', alpha=0.3)

    # Legend
    ax.legend(loc='best', fontsize=10)

    # Grid
    ax.grid(axis='y', alpha=0.3)

    plt.tight_layout()

    if save_path:
        plt.savefig(save_path, dpi=300, bbox_inches='tight')

    return fig, ax


# Example: Identify key momentum swings
def find_key_plays(play_by_play_df, threshold=0.10):
    """
    Identify plays with WPA magnitude exceeding threshold.

    Returns DataFrame of high-impact plays sorted by |WPA|.
    """
    key_plays = play_by_play_df[
        play_by_play_df['wpa'].abs() >= threshold
    ].copy()
    key_plays['abs_wpa'] = key_plays['wpa'].abs()
    key_plays = key_plays.sort_values('abs_wpa', ascending=False)
    return key_plays[['inning', 'is_bottom', 'outs_before',
                      'batter_name', 'play_description', 'wpa']]
Identifying Clutch Performers
import pandas as pd


def calculate_clutch_metrics(season_data):
    """
    Calculate various clutch performance metrics for a season.

    Parameters:
    -----------
    season_data : DataFrame
        Season play-by-play with WPA and LI (Leverage Index)

    Returns:
    --------
    DataFrame : Player clutch statistics
    """
    player_stats = []

    for player_id in season_data['batter_id'].unique():
        player_data = season_data[season_data['batter_id'] == player_id]

        # Basic WPA
        total_wpa = player_data['wpa'].sum()

        # Leverage-adjusted WPA: divide each event's WPA by its LI, then sum
        valid = player_data[player_data['leverage_index'] > 0]
        wpa_li = (valid['wpa'] / valid['leverage_index']).sum()

        # High vs low leverage performance
        high_lev = player_data[player_data['leverage_index'] >= 1.5]
        low_lev = player_data[player_data['leverage_index'] < 1.0]

        high_lev_woba = (
            high_lev['woba_value'].sum() / high_lev['pa'].sum()
            if high_lev['pa'].sum() > 0 else 0
        )
        low_lev_woba = (
            low_lev['woba_value'].sum() / low_lev['pa'].sum()
            if low_lev['pa'].sum() > 0 else 0
        )

        # Clutch differential
        clutch_diff = high_lev_woba - low_lev_woba

        # Late & Close situations (7th inning+, margin <= 3)
        late_close = player_data[
            (player_data['inning'] >= 7) &
            (player_data['score_diff_before'].abs() <= 3)
        ]
        late_close_wpa = late_close['wpa'].sum()
        late_close_pa = len(late_close)

        player_stats.append({
            'player_id': player_id,
            'total_wpa': total_wpa,
            'wpa_li': wpa_li,
            'high_lev_woba': high_lev_woba,
            'low_lev_woba': low_lev_woba,
            'clutch_differential': clutch_diff,
            'late_close_wpa': late_close_wpa,
            'late_close_pa': late_close_pa,
            'total_pa': len(player_data)
        })

    clutch_df = pd.DataFrame(player_stats)

    # Sort by WPA/LI (context-neutral contribution)
    clutch_df = clutch_df.sort_values('wpa_li', ascending=False)
    return clutch_df


# Example: Generate clutch leaderboard
if __name__ == "__main__":
    # Load season data
    season_pbp = pd.read_csv('mlb_2023_play_by_play.csv')

    # Calculate clutch metrics
    clutch_leaders = calculate_clutch_metrics(season_pbp)

    # Filter to qualified batters (e.g., 400+ PA)
    qualified = clutch_leaders[clutch_leaders['total_pa'] >= 400]

    print("Top 10 Batters by WPA/LI (context-neutral):")
    print(qualified.head(10))

    # Identify over-performers in clutch situations
    print("\nBiggest High-Leverage Over-Performers:")
    print(qualified.nlargest(10, 'clutch_differential'))
Comparing WPA with RE24
import numpy as np
import matplotlib.pyplot as plt


def compare_wpa_re24(season_data):
    """
    Compare WPA with RE24 (Run Expectancy 24 base-out states).

    RE24 measures runs above average based on base-out state changes,
    while WPA measures win probability changes. This comparison shows
    how context affects evaluation.
    """
    # Calculate player totals
    player_comparison = season_data.groupby('batter_id').agg(
        wpa=('wpa', 'sum'),
        re24=('re24', 'sum'),
        avg_leverage_index=('leverage_index', 'mean'),
        pa=('pa', 'count'),
    ).reset_index()

    # Filter to qualified batters (copy so a column can be added later)
    qualified = player_comparison[player_comparison['pa'] >= 400].copy()

    # Calculate correlation
    correlation = qualified[['wpa', 're24']].corr().iloc[0, 1]

    # Create scatter plot
    fig, ax = plt.subplots(figsize=(10, 8))
    scatter = ax.scatter(qualified['re24'], qualified['wpa'],
                         c=qualified['avg_leverage_index'],
                         cmap='viridis', s=80, alpha=0.6)

    # Add regression line
    z = np.polyfit(qualified['re24'], qualified['wpa'], 1)
    p = np.poly1d(z)
    ax.plot(qualified['re24'], p(qualified['re24']),
            "r--", alpha=0.8, linewidth=2)

    # Styling
    ax.set_xlabel('RE24 (Runs Above Average)', fontsize=12)
    ax.set_ylabel('WPA (Win Probability Added)', fontsize=12)
    ax.set_title(f'WPA vs RE24 Comparison (r = {correlation:.3f})',
                 fontsize=14, fontweight='bold')

    # Color bar
    cbar = plt.colorbar(scatter, ax=ax)
    cbar.set_label('Average Leverage Index', fontsize=10)

    ax.grid(alpha=0.3)

    # Identify outliers: WPA above or below what RE24 would predict
    qualified['residual'] = qualified['wpa'] - p(qualified['re24'])

    # Players who over-performed in clutch (WPA > RE24 expectation)
    over_performers = qualified.nlargest(5, 'residual')
    # Players who under-performed in clutch
    under_performers = qualified.nsmallest(5, 'residual')

    print("Players who exceeded WPA expectation (clutch heroes):")
    print(over_performers[['batter_id', 're24', 'wpa', 'residual']])
    print("\nPlayers who fell short of WPA expectation:")
    print(under_performers[['batter_id', 're24', 'wpa', 'residual']])

    plt.tight_layout()
    return fig, qualified


# Year-to-year clutch consistency test
def test_clutch_repeatability(year1_data, year2_data):
    """
    Test whether WPA/LI and the clutch differential (high- minus
    low-leverage wOBA) repeat from one season to the next.

    Relies on calculate_clutch_metrics defined in the previous example.
    """
    # Compute metrics for each season, then merge on player
    y1_clutch = calculate_clutch_metrics(year1_data)
    y2_clutch = calculate_clutch_metrics(year2_data)
    merged = y1_clutch.merge(y2_clutch, on='player_id',
                             suffixes=('_y1', '_y2'))

    # Filter to players with sufficient PAs both years
    qualified = merged[
        (merged['total_pa_y1'] >= 300) &
        (merged['total_pa_y2'] >= 300)
    ]

    # Correlation between years
    wpa_li_corr = qualified[['wpa_li_y1', 'wpa_li_y2']].corr().iloc[0, 1]
    clutch_diff_corr = qualified[
        ['clutch_differential_y1', 'clutch_differential_y2']
    ].corr().iloc[0, 1]

    print(f"WPA/LI year-to-year correlation: {wpa_li_corr:.3f}")
    print(f"Clutch differential correlation: {clutch_diff_corr:.3f}")
    print("(A clutch differential correlation near 0 suggests clutch "
          "performance is not repeatable)")

    return qualified
R Code Examples
WPA Calculation and Visualization in R
library(tidyverse)
library(ggplot2)
library(scales)
# Load and prepare data
load_win_expectancy_table <- function(filepath) {
we_table <- read_csv(filepath)
return(we_table)
}
# WPA Calculator class
WPACalculator <- R6::R6Class("WPACalculator",
public = list(
we_table = NULL,
initialize = function(we_table_path) {
self$we_table <- read_csv(we_table_path)
},
get_base_state = function(on_1b, on_2b, on_3b) {
# Convert binary runner positions to 0-7 state
return(on_1b * 1 + on_2b * 2 + on_3b * 4)
},
lookup_wp = function(inning, is_bottom, outs, base_state, score_diff) {
# Bin extreme score differentials
score_diff <- pmax(pmin(score_diff, 6), -6)
# Query expectancy table
wp <- self$we_table %>%
filter(
inning == !!inning,
is_bottom == !!is_bottom,
outs == !!outs,
base_state == !!base_state,
score_diff == !!score_diff
) %>%
pull(win_prob)
if(length(wp) == 0) return(0.5)
return(wp[1])
},
calculate_game_wpa = function(pbp_df) {
# Calculate WPA for each play
pbp_df <- pbp_df %>%
mutate(
base_before = self$get_base_state(
runner_1b_before, runner_2b_before, runner_3b_before
),
base_after = self$get_base_state(
runner_1b_after, runner_2b_after, runner_3b_after
)
)
# Vectorized WPA calculation
pbp_df$wp_before <- mapply(
self$lookup_wp,
pbp_df$inning,
pbp_df$is_bottom,
pbp_df$outs_before,
pbp_df$base_before,
pbp_df$score_diff_before
)
pbp_df$wp_after <- mapply(
self$lookup_wp,
pbp_df$inning,
pbp_df$is_bottom,
pbp_df$outs_after,
pbp_df$base_after,
pbp_df$score_diff_after
)
# Handle game-ending plays
pbp_df <- pbp_df %>%
mutate(
wp_after = ifelse(game_over,
ifelse(batting_team_won, 1.0, 0.0),
wp_after),
wpa = wp_after - wp_before
)
return(pbp_df)
}
)
)
# Win Probability Graph
plot_wp_graph <- function(pbp_df, home_team, away_team, game_date) {
  # Convert batting-team WPA to the home team's side and accumulate it,
  # starting from an even 50% before the first play
  pbp_df <- pbp_df %>%
    mutate(
      play_num = row_number(),
      wpa_home = ifelse(is_bottom, wpa, -wpa),
      wp_home = 0.5 + cumsum(wpa_home)
    )
  # Shade above 50% in blue and below in red. Clamping with pmax/pmin on
  # the full data avoids the stray polygons that filtered subsets produce
  # when the line crosses 50% more than once.
  ggplot(pbp_df, aes(x = play_num, y = wp_home)) +
    # linewidth replaces the deprecated size argument (ggplot2 >= 3.4.0)
    geom_line(color = "#2E86AB", linewidth = 1.2) +
    geom_ribbon(aes(ymin = 0.5, ymax = pmax(wp_home, 0.5)),
                fill = "blue", alpha = 0.3) +
    geom_ribbon(aes(ymin = pmin(wp_home, 0.5), ymax = 0.5),
                fill = "red", alpha = 0.3) +
    geom_hline(yintercept = 0.5, linetype = "dashed",
               color = "gray50", alpha = 0.6) +
    scale_y_continuous(labels = scales::percent_format(),
                       limits = c(0, 1)) +
    labs(
      title = sprintf("Win Probability: %s @ %s (%s)",
                      away_team, home_team, game_date),
      x = "Play Number",
      y = "Home Team Win Probability"
    ) +
    theme_minimal() +
    theme(
      plot.title = element_text(size = 14, face = "bold"),
      axis.title = element_text(size = 11),
      panel.grid.minor = element_blank()
    )
}
# Clutch Analysis
analyze_clutch_performance <- function(season_data) {
  clutch_stats <- season_data %>%
    group_by(batter_id, batter_name) %>%
    summarize(
      pa = n(),
      total_wpa = sum(wpa, na.rm = TRUE),
      avg_li = mean(leverage_index, na.rm = TRUE),
      # WPA/LI: scale each play's WPA by its own leverage, then sum
      # (a 0/0 play yields NaN, which na.rm drops)
      wpa_li = sum(wpa / leverage_index, na.rm = TRUE),
      # High leverage situations (LI >= 1.5)
      high_lev_pa = sum(leverage_index >= 1.5, na.rm = TRUE),
      high_lev_woba = mean(woba_value[leverage_index >= 1.5],
                           na.rm = TRUE),
      # Low leverage situations (LI < 1.0)
      low_lev_woba = mean(woba_value[leverage_index < 1.0],
                          na.rm = TRUE),
      # Late & close: 7th inning or later, margin of 3 runs or fewer
      late_close_wpa = sum(wpa[inning >= 7 &
                                 abs(score_diff_before) <= 3],
                           na.rm = TRUE),
      .groups = "drop"
    ) %>%
    mutate(
      # Positive values mean better hitting in high-leverage spots
      clutch_diff = high_lev_woba - low_lev_woba
    ) %>%
    filter(pa >= 400) %>%
    arrange(desc(wpa_li))
  return(clutch_stats)
}
# WPA vs RE24 comparison
compare_metrics <- function(season_data) {
  # Season totals per batter, qualified hitters only
  comparison <- season_data %>%
    group_by(batter_id) %>%
    summarize(
      wpa = sum(wpa, na.rm = TRUE),
      re24 = sum(re24, na.rm = TRUE),
      avg_li = mean(leverage_index, na.rm = TRUE),
      pa = n(),
      .groups = "drop"
    ) %>%
    filter(pa >= 400)
  # Scatter plot. RE24 accounts for the base-out state but not for inning
  # or score, so the gap between the two metrics reflects game context.
  ggplot(comparison, aes(x = re24, y = wpa, color = avg_li)) +
    geom_point(size = 3, alpha = 0.6) +
    geom_smooth(method = "lm", se = TRUE, color = "red",
                linetype = "dashed") +
    scale_color_viridis_c(name = "Avg LI") +
    labs(
      title = "WPA vs RE24: Game Context vs Base-Out Context",
      x = "RE24 (Runs Above Average)",
      y = "WPA (Win Probability Added)",
      subtitle = sprintf("Correlation: %.3f",
                         cor(comparison$wpa, comparison$re24))
    ) +
    theme_minimal() +
    theme(
      plot.title = element_text(size = 13, face = "bold"),
      legend.position = "right"
    )
}
# Example usage
wpa_calc <- WPACalculator$new("win_expectancy_table.csv")
game_data <- read_csv("game_20230915.csv")
game_with_wpa <- wpa_calc$calculate_game_wpa(game_data)
# Generate visualizations
plot_wp_graph(game_with_wpa, "NYY", "BOS", "2023-09-15")
# Season analysis
season_pbp <- read_csv("mlb_2023_pbp.csv")
clutch_leaders <- analyze_clutch_performance(season_pbp)
print(head(clutch_leaders, 10))
Championship WPA Simulation
# Simulate playoff probabilities over the remaining schedule
simulate_championship_wpa <- function(standings, remaining_games,
                                      num_simulations = 10000) {
  n_teams <- nrow(standings)
  # Map team IDs to row positions once, so lookups also work for
  # character IDs (indexing an unnamed vector by ID returns NA)
  home_idx <- match(remaining_games$home_team, standings$team_id)
  away_idx <- match(remaining_games$away_team, standings$team_id)
  # Simple win probability from relative team strength, with a 54% home
  # edge between equal teams. Assumes `talent` is a positive strength
  # rating (1.0 = average); the odds form keeps the result in (0, 1).
  home_odds <- (0.54 / 0.46) *
    standings$talent[home_idx] / standings$talent[away_idx]
  home_win_prob <- home_odds / (1 + home_odds)
  # Monte Carlo simulation: one row per run, one column per team
  results <- matrix(0, nrow = num_simulations, ncol = n_teams)
  for (sim in seq_len(num_simulations)) {
    # Simulate every remaining game in one draw
    home_won <- runif(length(home_idx)) < home_win_prob
    winners <- c(home_idx[home_won], away_idx[!home_won])
    sim_wins <- standings$wins + tabulate(winners, nbins = n_teams)
    # Top five teams by wins reach the playoffs (no tiebreaker rules)
    playoff_rows <- head(order(sim_wins, decreasing = TRUE), 5)
    results[sim, playoff_rows] <- 1
  }
  # Share of simulations in which each team qualified
  return(data.frame(
    team_id = standings$team_id,
    playoff_probability = colMeans(results)
  ))
}
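The simulation above returns playoff probabilities, but a championship-style WPA is the change in that probability caused by one game's result. The sketch below shows the idea on a deliberately simple toy case, where qualification is assumed to mean reaching a fixed win total and every remaining game is an independent coin flip. The win totals, the target, and the helper `p_reach` are illustrative assumptions, not part of the simulation code.

```r
# Toy setup (assumed numbers): a team with 85 wins needs 90 to qualify,
# with 10 games left and a 50% chance in each
wins_now <- 85
games_left <- 10
target <- 90
p_win <- 0.5

# Probability of winning at least `needed` of `n` independent games
p_reach <- function(needed, n, p) {
  if (needed <= 0) return(1)
  if (needed > n) return(0)
  pbinom(needed - 1, size = n, prob = p, lower.tail = FALSE)
}

needed <- target - wins_now
before  <- p_reach(needed, games_left, p_win)          # before today's game
if_win  <- p_reach(needed - 1, games_left - 1, p_win)  # after a win
if_loss <- p_reach(needed, games_left - 1, p_win)      # after a loss

# Probability added (or lost) by today's result
cwpa_win  <- if_win - before
cwpa_loss <- if_loss - before
print(c(before = before, cwpa_win = cwpa_win, cwpa_loss = cwpa_loss))
```

The same subtraction applies to simulated probabilities: run the simulation once with the game recorded as a win and once as a loss, then compare each result with the pre-game estimate.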
Practical Applications and Best Practices
Win Probability Added serves multiple practical purposes in baseball analysis. For broadcasters and media, WPA provides narrative structure to games, identifying the most crucial moments and player contributions. Graphics showing win probability swings help viewers understand game flow and momentum shifts in real time.
For front offices, WPA offers insight into roster construction and in-game management. Managers can evaluate bullpen usage by examining WPA accumulated by relievers in various leverage situations, informing decisions about when to deploy a closer. Similarly, WPA can validate or question strategic choices like pinch-hitting, defensive substitutions, or intentional walks.
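As a rough illustration of the bullpen review described above, the sketch below totals reliever WPA by the leverage at which each pitcher entered. The appearance data are made up for the example, the column names `entry_li` and `wpa` are assumptions, and the leverage cutoffs (below 1.0 as low, 1.5 and above as high) follow the thresholds used in the clutch analysis code.

```r
# Made-up reliever appearances, for illustration only
apps <- data.frame(
  pitcher  = c("A", "A", "A", "B", "B", "B"),
  entry_li = c(2.1, 1.8, 0.6, 0.4, 0.7, 1.6),
  wpa      = c(0.15, -0.20, 0.02, 0.01, 0.03, 0.10)
)

# Bucket each appearance by leverage: [0, 1) low, [1, 1.5) medium, 1.5+ high
apps$bucket <- cut(
  apps$entry_li,
  breaks = c(0, 1, 1.5, Inf),
  labels = c("low", "medium", "high"),
  right = FALSE
)

# Total WPA per pitcher within each leverage bucket
summary_tbl <- aggregate(wpa ~ pitcher + bucket, data = apps, FUN = sum)
print(summary_tbl)
```

A reliever who is used mostly in high-leverage spots but posts negative WPA there is a candidate for a closer look, keeping in mind that a handful of appearances is a very small sample.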
Fantasy baseball and betting markets incorporate WPA-adjacent concepts through live win probability models that update with each pitch. Understanding WPA helps bettors identify value in live betting markets and assess the true impact of injuries or lineup changes on game outcomes.
Best practices for using WPA include:
- Use WPA descriptively, not prescriptively: WPA tells you what happened, not necessarily what will happen or who is more skilled
- Combine with context-neutral metrics: Pair WPA with WAR, wOBA, or WPA/LI for comprehensive player evaluation
- Account for sample size: Single-game or single-season WPA can be highly variable; career trends are more meaningful
- Recognize park and era effects: Win expectancy tables should be updated periodically to reflect changing run-scoring environments
- Validate with multiple models: Different win expectancy tables can produce different WPA values; cross-reference results
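To make the pairing of WPA with WPA/LI concrete, here is a small worked example on made-up numbers. It assumes WPA/LI is computed by dividing each play's WPA by that play's leverage index and then summing, which removes the extra weight that high-leverage situations carry in raw WPA.

```r
# Toy plate appearances for one hitter (made-up numbers)
plays <- data.frame(
  wpa            = c(0.20, -0.05, 0.02, -0.10),
  leverage_index = c(2.0, 0.5, 0.4, 2.5)
)

# Raw WPA: the sum of win probability changes
total_wpa <- sum(plays$wpa)

# WPA/LI: divide each play's WPA by its leverage before summing, so a
# high-leverage play counts no more than an average one
wpa_li <- sum(plays$wpa / plays$leverage_index)

print(c(total_wpa = total_wpa, wpa_li = wpa_li))
```

Here the hitter's raw WPA is driven by one big high-leverage hit, while the leverage-adjusted total is close to zero, which is the kind of gap that signals context rather than skill.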
The future of WPA lies in more granular, real-time models that incorporate pitch-level data, defensive positioning, and player-specific matchup effects. Machine learning models trained on Statcast data can generate batter-pitcher-situation specific win probabilities that go far beyond traditional base-out-score state tables, though at the cost of transparency and computational complexity.
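A model-based win probability can be sketched in a few lines, although this is only a minimal stand-in for the kind of system described above. The example fits a logistic regression to synthetic game states rather than real Statcast data, and the assumed relationship (a lead matters more in later innings) is a toy rule chosen for illustration.

```r
set.seed(42)
n <- 5000

# Synthetic game states (not real data)
states <- data.frame(
  inning     = sample(1:9, n, replace = TRUE),
  score_diff = sample(-4:4, n, replace = TRUE)
)

# Toy assumption: the value of a lead grows as the game gets later
true_logit <- 0.15 * states$score_diff * states$inning
states$won <- rbinom(n, 1, plogis(true_logit))

# Logistic regression on the score-by-inning interaction
model <- glm(won ~ score_diff:inning, data = states, family = binomial())

# Predicted win probability for a one-run lead in the 2nd vs the 8th inning
new_state <- data.frame(inning = c(2, 8), score_diff = c(1, 1))
wp <- predict(model, newdata = new_state, type = "response")
print(round(wp, 3))
```

Unlike a lookup table, a fitted model returns a probability for any combination of inputs, including ones never observed, which is both its appeal and the source of the transparency concerns noted above.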
Conclusion
Win Probability Added represents a powerful framework for understanding baseball's inherent drama and contextual nature. By quantifying how individual plays and performances shift the balance between victory and defeat, WPA bridges the gap between traditional statistics and the lived experience of watching a game unfold. While its context-dependence limits its use for player evaluation and projection, WPA excels at storytelling, identifying crucial moments, and measuring what actually happened in specific games and seasons.
Analysts should view WPA as one tool in a comprehensive analytical toolkit, complementing context-neutral metrics like WAR and batted ball data from Statcast. Used appropriately, WPA provides unique insights into clutch performance, game-changing plays, and the ebb and flow of championship races. As baseball analytics continue to evolve, the fundamental concept of measuring win probability changes will remain relevant, even as the underlying models grow more sophisticated and data-rich.