Learning Objectives
- Implement and calibrate Elo and Glicko-2 rating systems for tennis, MMA, and boxing with sport-specific parameter optimization
- Quantify style matchup effects and build matchup-adjustment matrices that modify baseline rating predictions
- Model surface and venue effects in tennis to produce surface-adjusted ratings and predictions
- Incorporate physical attributes such as reach, age, and weight cut effects into combat sports models
- Build real-time live win probability models for tennis (point-by-point) and combat sports (round-by-round)
Chapter 21: Modeling Combat Sports and Tennis
"Styles make fights." --- Boxing proverb, origin uncertain
Chapter Overview
Team sports modeling, which occupied the preceding chapters, benefits from large rosters that average out individual variability, consistent game structures, and extensive historical datasets. Individual sports --- tennis, mixed martial arts, boxing --- present a fundamentally different challenge. When two individuals compete head-to-head, the interaction between their specific skill profiles, physical attributes, and psychological tendencies determines the outcome in ways that aggregate ratings alone cannot fully capture.
This chapter develops the quantitative toolkit for modeling individual sports, with a primary focus on tennis and combat sports (MMA and boxing). We begin with the foundational rating systems --- Elo and Glicko --- adapted for the unique features of one-on-one competition. We then move beyond simple ratings to analyze how style matchups modify expected outcomes, how surface and venue conditions in tennis create systematic performance differentials, and how physical attributes in combat sports introduce measurable advantages and disadvantages. Finally, we build real-time live models that update win probabilities as a match or fight unfolds point-by-point or round-by-round.
Individual sports offer the analytical bettor several structural advantages. Markets are less efficient than major team sports, partly because casual bettors are less informed about individual competitors, partly because the factors that determine outcomes (style, surface, physical condition) are more heterogeneous and harder to price, and partly because the frequency of events --- particularly in tennis, which features near-daily professional competition --- creates opportunities for disciplined modelers to exploit persistent inefficiencies.
In this chapter, you will learn to:
- Build and calibrate Elo and Glicko-2 systems with sport-specific parameters
- Construct matchup matrices that quantify how specific style interactions affect outcomes
- Adjust tennis ratings for surface, venue, and environmental conditions
- Model the impact of physical attributes on combat sports outcomes
- Develop live win probability engines for real-time in-match and in-fight betting
21.1 Elo and Glicko Ratings for Individual Sports
The Elo System: A Brief History
The Elo rating system, developed by Arpad Elo for the United States Chess Federation in the 1960s, remains the most widely used method for rating competitors in individual contests. Its elegance lies in its simplicity: each player carries a single numerical rating, the expected outcome of a match is a function of the difference between two players' ratings, and after each match, ratings are updated based on the difference between the actual outcome and the expected outcome.
The foundational formula for calculating the expected score of Player A against Player B is:
$$E_A = \frac{1}{1 + 10^{(R_B - R_A) / 400}}$$
where $R_A$ and $R_B$ are the current ratings of Players A and B, respectively. This logistic function maps any rating difference to a probability between 0 and 1. A rating difference of 400 points corresponds to an expected win probability of approximately 91% for the higher-rated player. A difference of 200 points corresponds to roughly 76%.
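As a quick check on the figures quoted above, the expected-score formula can be evaluated directly. This is a minimal sketch; the function name `elo_expected` and the `scale` parameter are illustrative choices, not part of any library.

```python
def elo_expected(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of player A against player B on the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Tabulate the win probability of the higher-rated player for several gaps.
for diff in (0, 100, 200, 400):
    prob = elo_expected(1500 + diff, 1500)
    print(f"rating gap {diff:>3}: {prob:.3f}")
```

Only the rating difference matters: adding the same constant to both ratings leaves the probability unchanged.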
After a match, ratings are updated as:
$$R_A^{\text{new}} = R_A + K \cdot (S_A - E_A)$$
where $S_A$ is the actual score (1 for a win, 0 for a loss, 0.5 for a draw) and $K$ is the update factor that controls how much a single result shifts the rating.
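The update rule can be sketched in a few lines. The example below assumes both players share the same $K$ (so the update is zero-sum); the helper names and the value $K = 32$ are illustrative only.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(
    rating_a: float, rating_b: float, score_a: float, k: float = 32.0
) -> tuple[float, float]:
    """Return both players' new ratings after one match.

    score_a is 1 for a win by A, 0 for a loss, 0.5 for a draw.
    With a shared K, whatever A gains B loses.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# A 1600-rated favorite loses to a 1500-rated underdog.
new_a, new_b = elo_update(1600.0, 1500.0, score_a=0.0)
print(f"favorite: {new_a:.1f}, underdog: {new_b:.1f}")
```

An upset moves ratings further than an expected result does, because the term $S_A - E_A$ is larger in magnitude when the favorite loses.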
Adapting Elo for Tennis
Tennis presents several features that require modifications to the standard Elo framework.
High match volume. Top professionals play 60--80 matches per year across multiple surfaces. This volume means ratings converge relatively quickly, but it also means that form fluctuations --- injuries, fatigue from heavy scheduling, confidence swings --- create rapid changes in effective ability that a standard Elo system with a fixed K-factor may be too slow or too fast to track.
Surface heterogeneity. A player's ability on clay may differ substantially from their ability on hard court or grass. Rafael Nadal's dominance on clay (a record 14 French Open titles) coexisted with a significantly lower win rate on grass for much of his career. A single all-surface Elo rating conflates these differences. We address surface-specific Elo in Section 21.3, but even the base system must account for the fact that overall rating partially reflects the mix of surfaces on which a player has recently competed.
Best-of-three versus best-of-five. Grand Slam men's singles matches are played as best-of-five sets, while most other tournaments use best-of-three. The longer format favors the stronger player, as the probability that the better player wins increases with the number of sets (just as the probability that the better team wins a playoff series increases with the series length). An Elo system should ideally account for format differences.
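The format effect can be made concrete under a simplifying assumption: each set is won independently with the same probability. Real matches violate this (fatigue, momentum), so the sketch below is only a first-order illustration; `match_win_prob` is a hypothetical helper.

```python
from math import comb


def match_win_prob(p_set: float, best_of: int) -> float:
    """Probability of winning a best-of-N match given a constant set-win probability.

    Assumes sets are independent and identically distributed.
    """
    if best_of % 2 == 0:
        raise ValueError("best_of must be odd")
    need = best_of // 2 + 1  # sets required to win the match
    q = 1.0 - p_set
    # Sum over the number of sets lost (j) before the clinching set is won.
    return sum(comb(need - 1 + j, j) * p_set**need * q**j for j in range(need))


for p in (0.55, 0.60, 0.70):
    bo3 = match_win_prob(p, 3)
    bo5 = match_win_prob(p, 5)
    print(f"set win prob {p:.2f}: best-of-3 {bo3:.3f}, best-of-5 {bo5:.3f}")
```

Under these assumptions a player who wins 60% of sets wins about 64.8% of best-of-three matches and about 68.3% of best-of-five matches, which is why a single Elo curve fitted to mixed formats tends to understate favorites at Grand Slams.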
Retirement and walkovers. Players sometimes retire mid-match due to injury or withdraw before matches begin. These events carry partial information --- a retirement often indicates the retiring player was losing --- but should be handled differently from completed matches.
K-Factor Optimization
The K-factor is the single most consequential parameter in an Elo system. Too high, and the system overreacts to individual results, confusing noise with signal. Too low, and the system fails to track genuine changes in ability.
For chess, FIDE uses a tiered K-factor: 40 for new players (fewer than 30 rated games), 20 for players below 2400, and 10 for those above 2400. For sports, the optimal K-factor depends on the sport's inherent variability and the frequency of competition.
Empirical K-factor calibration. The standard approach is to test a range of K-factors and evaluate predictive accuracy on a holdout set. The metric is typically log-loss (cross-entropy), which rewards well-calibrated probability estimates:
$$\text{Log-Loss} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \ln(\hat{p}_i) + (1 - y_i) \ln(1 - \hat{p}_i) \right]$$
where $y_i$ is the actual outcome (1 if Player A won, 0 otherwise) and $\hat{p}_i$ is the model's predicted probability of Player A winning.
For professional tennis, optimal K-factors typically fall in the range of 20--32 for overall Elo, with surface-specific systems often performing best with slightly higher K-factors (24--40) because the surface-specific sample is smaller and must respond more aggressively to new data.
For MMA, where fighters compete only 2--3 times per year, K-factors in the range of 100--200 often perform best, reflecting the need for each fight to carry substantial weight in updating the rating.
For boxing, similar considerations apply, though the longer career arcs and wider variance in opponent quality complicate optimization. K-factors of 50--120 are common in boxing Elo systems.
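The calibration procedure described above can be sketched as a grid search: run the rating system chronologically for each candidate $K$, score only the predictions made on the holdout portion, and keep the $K$ with the lowest log-loss. The data layout (tuples of `(player_a, player_b, a_won)`), the function names, and the toy data are all assumptions made for illustration.

```python
import math


def elo_expected(r_a: float, r_b: float) -> float:
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def log_loss(y_true, p_pred, eps: float = 1e-12) -> float:
    """Mean negative log-likelihood of binary outcomes."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


def holdout_log_loss(matches, k: float, n_train: int, start: float = 1500.0) -> float:
    """Run Elo in chronological order; score predictions after the first n_train matches."""
    if n_train >= len(matches):
        raise ValueError("n_train must leave at least one holdout match")
    ratings = {}
    outcomes, predictions = [], []
    for i, (a, b, a_won) in enumerate(matches):
        r_a, r_b = ratings.get(a, start), ratings.get(b, start)
        e_a = elo_expected(r_a, r_b)
        if i >= n_train:
            # The prediction is recorded before the result is used to update.
            outcomes.append(a_won)
            predictions.append(e_a)
        ratings[a] = r_a + k * (a_won - e_a)
        ratings[b] = r_b - k * (a_won - e_a)
    return log_loss(outcomes, predictions)


def best_k(matches, candidates, n_train: int) -> float:
    """Return the candidate K with the lowest holdout log-loss."""
    return min(candidates, key=lambda k: holdout_log_loss(matches, k, n_train))


# Toy data: player "A" beats player "B" twenty times in a row.
toy_matches = [("A", "B", 1)] * 20
for k in (8.0, 32.0, 128.0):
    print(k, round(holdout_log_loss(toy_matches, k, n_train=10), 4))
```

On this deliberately one-sided toy data the largest $K$ wins because there is no noise to overfit; on real results the loss curve typically has an interior minimum, which is the value worth reporting.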
Rating Period Considerations
In chess, every game is treated as a separate event. In sports, the notion of a "rating period" --- a time window over which results are accumulated before ratings are updated --- can improve performance. For tennis, this is usually unnecessary because matches are individual events. For combat sports, where fighters may be inactive for months, the concept of rating periods is replaced by inactivity decay: a mechanism that gradually increases uncertainty about inactive fighters' ratings.
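A common approach for MMA is to apply a small rating penalty for each month of inactivity:
$$R_{\text{adjusted}} = R - \delta \cdot m$$
where $\delta$ is a decay rate (typically 2--5 rating points per month) and $m$ is the number of months since the fighter's last bout. Plain Elo has no explicit uncertainty term, so this penalty is only a rough proxy for the doubt surrounding a fighter returning from a long layoff, and applied literally it pushes below-average fighters further from the mean. The implementation later in this section therefore decays ratings toward 1500 rather than strictly downward, and only after a grace period. Glicko-style systems, discussed below, model the uncertainty directly.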
Initial Ratings and New Entrant Handling
Setting initial ratings for new players is a persistent challenge. The standard approach is to assign a default rating (e.g., 1500) and allow the system to converge. However, this creates distortions when a highly skilled newcomer enters the system, as they will be dramatically underrated for their first several matches.
Informed initialization. For tennis, a player's initial rating can be estimated from their ranking or their qualifying performance. For MMA, the promotion tier (UFC, Bellator, regional) provides a strong prior:
| Promotion Tier | Suggested Initial Elo |
|---|---|
| UFC (debuting from top regional) | 1500 |
| UFC (debuting from Contender Series) | 1450 |
| Bellator / PFL | 1350 |
| Major regional (LFA, Cage Warriors) | 1250 |
| Minor regional | 1100 |
These starting points reduce the number of fights needed for the rating to converge.
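To see how much an informed starting point helps, one can trace the average-case path of a rating toward a competitor's true level. The sketch below ignores match-to-match noise and assumes every opponent has the same rating, so it illustrates the direction of the effect rather than forecasting convergence times; all names and numbers are hypothetical.

```python
def elo_expected(r_a: float, r_b: float) -> float:
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def expected_trajectory(
    true_rating: float,
    start_rating: float,
    opp_rating: float,
    k: float,
    n_matches: int,
) -> list[float]:
    """Average-case rating path.

    Each step moves the rating by K times the gap between the true win
    probability and the win probability implied by the current rating.
    """
    rating = start_rating
    path = [rating]
    p_true = elo_expected(true_rating, opp_rating)
    for _ in range(n_matches):
        rating += k * (p_true - elo_expected(rating, opp_rating))
        path.append(rating)
    return path


def matches_to_within(path: list[float], target: float, tol: float):
    """Index of the first rating within tol of target, or None if never reached."""
    for i, rating in enumerate(path):
        if abs(rating - target) <= tol:
            return i
    return None


# A newcomer whose true level is 1800, facing 1600-rated opposition, K = 32.
default_start = expected_trajectory(1800, 1500, 1600, 32, 200)
informed_start = expected_trajectory(1800, 1700, 1600, 32, 200)
print("default 1500 start :", matches_to_within(default_start, 1800, 50))
print("informed 1700 start:", matches_to_within(informed_start, 1800, 50))
```

The informed start reaches the tolerance band in fewer matches, which matters most in combat sports where a fighter may record only two or three results per year.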
The Glicko and Glicko-2 Systems
Mark Glickman's Glicko system (1995) and its successor Glicko-2 (2001) extend Elo by adding a rating deviation (RD) parameter that quantifies uncertainty in the rating. A player who has competed recently has a low RD (the system is confident in their rating); a player who has been inactive has a high RD (the system is uncertain).
The Glicko-2 system adds a third parameter, volatility ($\sigma$), which measures the degree to which a player's true ability fluctuates over time. A player who performs consistently has low volatility; one whose results are erratic has high volatility.
The key advantage for combat sports is that Glicko-2 naturally handles irregular competition schedules. When a fighter returns after a layoff, their RD has increased, meaning:
1. The system places less confidence in their pre-layoff rating.
2. The result of the comeback fight has a larger effect on their new rating.
3. The opponent's rating is affected less by the result, because the system knows the inactive fighter's rating is uncertain.
Glicko-2 parameters:
- $\mu$: Rating on the Glicko-2 scale (related to Elo by $\mu = (R - 1500) / 173.7178$)
- $\phi$: Rating deviation (analogous to standard deviation of the rating estimate)
- $\sigma$: Volatility (how much the player's true ability tends to fluctuate)
- $\tau$: System constant that constrains volatility changes (typically 0.3--1.2)
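The layoff behavior described above follows from how Glicko-2 treats a rating period with no results: the deviation grows as $\phi' = \sqrt{\phi^2 + \sigma^2}$ while the rating itself stays put. A minimal sketch, assuming one rating period per month (the period length is a modeling choice, not something the system prescribes) and a hypothetical helper name:

```python
import math

GLICKO2_SCALE = 173.7178  # converts between the Elo scale and the Glicko-2 scale


def rd_after_idle(rd_elo: float, sigma: float, idle_periods: int) -> float:
    """Elo-scale RD after a number of rating periods without any results."""
    phi = rd_elo / GLICKO2_SCALE
    for _ in range(idle_periods):
        # Uncertainty accumulates; volatility sets how quickly.
        phi = math.sqrt(phi**2 + sigma**2)
    return phi * GLICKO2_SCALE


# A fighter with RD 60 and volatility 0.06 who sits out for up to two years.
for months in (0, 6, 12, 24):
    print(f"{months:>2} idle periods: RD = {rd_after_idle(60.0, 0.06, months):.1f}")
```

Higher volatility makes the deviation grow faster, so an erratic fighter's rating is discounted more heavily after the same layoff than a consistent fighter's.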
Python Implementation: Combat Sports Elo System
The following implementation builds a complete Elo rating system suitable for MMA or boxing, with configurable K-factor, inactivity decay, margin-of-victory adjustment, and promotion-based initialization.
import math
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional


@dataclass
class Fighter:
    """Represents a combat sports competitor with Elo rating attributes."""

    name: str
    rating: float = 1500.0
    fights: int = 0
    last_fight_date: Optional[date] = None
    peak_rating: float = 1500.0
    rating_history: list = field(default_factory=list)

    def record_rating(self, fight_date: date) -> None:
        """Append current rating snapshot to history."""
        self.rating_history.append({
            "date": fight_date,
            "rating": self.rating,
            "fights": self.fights,
        })
        self.peak_rating = max(self.peak_rating, self.rating)


class CombatSportsElo:
    """
    Elo rating system calibrated for combat sports (MMA/Boxing).

    Features:
    - Dynamic K-factor based on fighter experience
    - Inactivity decay for long layoffs
    - Margin-of-victory adjustment (finish bonus)
    - Promotion-tier-based initialization
    """

    PROMOTION_INITIAL_RATINGS = {
        "ufc": 1500,
        "ufc_contender": 1450,
        "bellator": 1350,
        "pfl": 1350,
        "one": 1400,
        "major_regional": 1250,
        "minor_regional": 1100,
    }

    # Multipliers applied to K: decisive finishes move ratings more,
    # close decisions and draws move them less.
    FINISH_MULTIPLIERS = {
        "ko_tko": 1.25,
        "submission": 1.20,
        "decision_unanimous": 1.00,
        "decision_split": 0.85,
        "decision_majority": 0.92,
        "draw": 0.50,
        "no_contest": 0.00,
    }

    def __init__(
        self,
        base_k: float = 120.0,
        new_fighter_k_multiplier: float = 1.5,
        new_fighter_threshold: int = 5,
        inactivity_decay_per_month: float = 3.0,
        inactivity_threshold_days: int = 180,
    ):
        self.base_k = base_k
        self.new_fighter_k_multiplier = new_fighter_k_multiplier
        self.new_fighter_threshold = new_fighter_threshold
        self.inactivity_decay_per_month = inactivity_decay_per_month
        self.inactivity_threshold_days = inactivity_threshold_days
        self.fighters: dict[str, Fighter] = {}

    def get_or_create_fighter(
        self,
        name: str,
        promotion: str = "ufc",
    ) -> Fighter:
        """Retrieve existing fighter or create with promotion-based initial rating."""
        if name not in self.fighters:
            initial_rating = self.PROMOTION_INITIAL_RATINGS.get(
                promotion.lower(), 1500
            )
            # Start the peak at the initial rating so that a fighter who
            # enters below 1500 does not report a peak they never reached.
            self.fighters[name] = Fighter(
                name=name,
                rating=initial_rating,
                peak_rating=initial_rating,
            )
        return self.fighters[name]

    def _get_k_factor(self, fighter: Fighter) -> float:
        """Dynamic K-factor: higher for inexperienced fighters."""
        if fighter.fights < self.new_fighter_threshold:
            return self.base_k * self.new_fighter_k_multiplier
        return self.base_k

    def _apply_inactivity_decay(
        self,
        fighter: Fighter,
        current_date: date,
    ) -> None:
        """Regress rating toward 1500 for inactive fighters."""
        if fighter.last_fight_date is None:
            return
        days_inactive = (current_date - fighter.last_fight_date).days
        if days_inactive > self.inactivity_threshold_days:
            months_over = (
                days_inactive - self.inactivity_threshold_days
            ) / 30.44
            decay = self.inactivity_decay_per_month * months_over
            # Decay toward the mean (1500), not just downward
            if fighter.rating > 1500:
                fighter.rating = max(1500, fighter.rating - decay)
            elif fighter.rating < 1500:
                fighter.rating = min(1500, fighter.rating + decay)

    def expected_outcome(self, rating_a: float, rating_b: float) -> float:
        """Standard Elo expected score for fighter A."""
        return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

    def update_ratings(
        self,
        fighter_a_name: str,
        fighter_b_name: str,
        result: str,
        fight_date: date,
        promotion: str = "ufc",
    ) -> dict:
        """
        Process a fight result and update both fighters' ratings.

        Args:
            fighter_a_name: Name of fighter A (winner for non-draw results).
            fighter_b_name: Name of fighter B (loser for non-draw results).
            result: One of the keys in FINISH_MULTIPLIERS.
            fight_date: Date of the bout.
            promotion: Promotion tier for initial rating if new fighter.

        Returns:
            Dictionary with pre-fight ratings, expected outcomes,
            post-fight ratings, and rating changes.
        """
        fighter_a = self.get_or_create_fighter(fighter_a_name, promotion)
        fighter_b = self.get_or_create_fighter(fighter_b_name, promotion)

        # A no contest must leave both records untouched. Return before
        # the decay step, which would otherwise alter ratings without
        # resetting last_fight_date and so be applied again next time.
        if result == "no_contest":
            return {
                "fighter_a": fighter_a_name,
                "fighter_b": fighter_b_name,
                "result": "no_contest",
                "no_rating_change": True,
            }

        # Apply inactivity decay before processing
        self._apply_inactivity_decay(fighter_a, fight_date)
        self._apply_inactivity_decay(fighter_b, fight_date)

        # Pre-fight ratings are measured after any decay
        pre_a, pre_b = fighter_a.rating, fighter_b.rating
        expected_a = self.expected_outcome(pre_a, pre_b)
        expected_b = 1.0 - expected_a

        # Determine actual scores
        finish_mult = self.FINISH_MULTIPLIERS.get(result, 1.0)
        if result == "draw":
            score_a, score_b = 0.5, 0.5
        else:
            score_a, score_b = 1.0, 0.0

        # Compute K-factors
        k_a = self._get_k_factor(fighter_a) * finish_mult
        k_b = self._get_k_factor(fighter_b) * finish_mult

        # Update ratings
        fighter_a.rating = pre_a + k_a * (score_a - expected_a)
        fighter_b.rating = pre_b + k_b * (score_b - expected_b)

        # Update metadata
        for f in (fighter_a, fighter_b):
            f.fights += 1
            f.last_fight_date = fight_date
            f.record_rating(fight_date)

        return {
            "fighter_a": fighter_a_name,
            "fighter_b": fighter_b_name,
            "pre_ratings": (round(pre_a, 1), round(pre_b, 1)),
            "expected": (round(expected_a, 4), round(expected_b, 4)),
            "post_ratings": (
                round(fighter_a.rating, 1),
                round(fighter_b.rating, 1),
            ),
            "rating_changes": (
                round(fighter_a.rating - pre_a, 1),
                round(fighter_b.rating - pre_b, 1),
            ),
            "result": result,
        }

    def predict_fight(
        self,
        fighter_a_name: str,
        fighter_b_name: str,
    ) -> dict:
        """Generate win probability prediction for a hypothetical fight."""
        fighter_a = self.fighters.get(fighter_a_name)
        fighter_b = self.fighters.get(fighter_b_name)
        if fighter_a is None or fighter_b is None:
            raise ValueError("Both fighters must exist in the system.")
        prob_a = self.expected_outcome(fighter_a.rating, fighter_b.rating)
        return {
            "fighter_a": fighter_a_name,
            "fighter_b": fighter_b_name,
            "rating_a": round(fighter_a.rating, 1),
            "rating_b": round(fighter_b.rating, 1),
            "win_prob_a": round(prob_a, 4),
            "win_prob_b": round(1.0 - prob_a, 4),
            "rating_diff": round(fighter_a.rating - fighter_b.rating, 1),
        }


# --- Worked Example ---
elo_system = CombatSportsElo(base_k=120, inactivity_decay_per_month=3.0)

# Simulate a sequence of fights
fights = [
    ("Khabib Nurmagomedov", "Al Iaquinta", "decision_unanimous", date(2018, 4, 7)),
    ("Khabib Nurmagomedov", "Conor McGregor", "submission", date(2018, 10, 6)),
    ("Khabib Nurmagomedov", "Dustin Poirier", "submission", date(2019, 9, 7)),
    ("Khabib Nurmagomedov", "Justin Gaethje", "submission", date(2020, 10, 24)),
]

for winner, loser, result, fight_date in fights:
    outcome = elo_system.update_ratings(winner, loser, result, fight_date)
    print(f"{fight_date}: {winner} def. {loser} via {result}")
    print(f" Ratings: {outcome['post_ratings']}")
    print(f" Changes: {outcome['rating_changes']}")
    print()
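Sample Output:
2018-04-07: Khabib Nurmagomedov def. Al Iaquinta via decision_unanimous
 Ratings: (1590.0, 1410.0)
 Changes: (90.0, -90.0)

2018-10-06: Khabib Nurmagomedov def. Conor McGregor via submission
 Ratings: (1670.5, 1419.3)
 Changes: (80.7, -80.7)

2019-09-07: Khabib Nurmagomedov def. Dustin Poirier via submission
 Ratings: (1717.9, 1437.3)
 Changes: (62.7, -62.7)

2020-10-24: Khabib Nurmagomedov def. Justin Gaethje via submission
 Ratings: (1748.0, 1446.9)
 Changes: (53.1, -53.1)

Each opponent is new to this toy system and enters at 1500, so the second value in every pair is that opponent's rating after a single bout, not a reflection of their real career. Khabib's gains shrink as his rating rises, and from the second fight onward a small inactivity decay is applied to his rating before the update, so the reported changes are measured from the decayed pre-fight rating.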
Glicko-2 Python Implementation
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Glicko2Player:
    """Glicko-2 player with rating, deviation, and volatility."""

    mu: float = 0.0              # Rating on Glicko-2 internal scale
    phi: float = 350 / 173.7178  # Rating deviation
    sigma: float = 0.06          # Volatility
    tau: float = 0.5             # System constant

    @property
    def elo_rating(self) -> float:
        """Convert internal mu to Elo scale."""
        return self.mu * 173.7178 + 1500

    @property
    def elo_rd(self) -> float:
        """Convert internal phi to Elo-scale RD."""
        return self.phi * 173.7178

    @classmethod
    def from_elo(cls, rating: float = 1500, rd: float = 350,
                 sigma: float = 0.06, tau: float = 0.5):
        """Create player from Elo-scale rating and RD."""
        return cls(
            mu=(rating - 1500) / 173.7178,
            phi=rd / 173.7178,
            sigma=sigma,
            tau=tau,
        )


def g_function(phi: float) -> float:
    """Glicko-2 g function: reduces impact based on opponent uncertainty."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def e_function(mu: float, mu_j: float, phi_j: float) -> float:
    """Expected score given ratings and opponent deviation."""
    return 1.0 / (1.0 + math.exp(-g_function(phi_j) * (mu - mu_j)))


def update_glicko2(
    player: Glicko2Player,
    opponents: List[Glicko2Player],
    scores: List[float],
) -> Glicko2Player:
    """
    Update a Glicko-2 player's rating after a rating period.

    Args:
        player: The player to update.
        opponents: List of opponents faced in this rating period.
        scores: List of actual scores (1=win, 0=loss, 0.5=draw).

    Returns:
        Updated Glicko2Player with new mu, phi, and sigma.
    """
    if not opponents:
        # No games: only phi increases (uncertainty grows)
        new_phi = math.sqrt(player.phi ** 2 + player.sigma ** 2)
        return Glicko2Player(
            mu=player.mu, phi=new_phi,
            sigma=player.sigma, tau=player.tau,
        )

    # Step 1: Compute variance (v)
    v_inv = 0.0
    delta_sum = 0.0
    for opp, score in zip(opponents, scores):
        g_val = g_function(opp.phi)
        e_val = e_function(player.mu, opp.mu, opp.phi)
        v_inv += g_val ** 2 * e_val * (1.0 - e_val)
        delta_sum += g_val * (score - e_val)
    v = 1.0 / v_inv
    delta = v * delta_sum

    # Step 2: Update volatility using iterative algorithm
    a = math.log(player.sigma ** 2)
    tau_sq = player.tau ** 2

    def f(x):
        ex = math.exp(x)
        d2 = delta ** 2
        phi2 = player.phi ** 2
        top1 = ex * (d2 - phi2 - v - ex)
        bot1 = 2.0 * (phi2 + v + ex) ** 2
        return top1 / bot1 - (x - a) / tau_sq

    # Bracket the root
    big_a = a
    if delta ** 2 > player.phi ** 2 + v:
        big_b = math.log(delta ** 2 - player.phi ** 2 - v)
    else:
        k = 1
        while f(a - k * player.tau) < 0:
            k += 1
        big_b = a - k * player.tau

    # Illinois algorithm for root finding
    f_a = f(big_a)
    f_b = f(big_b)
    epsilon = 1e-6
    while abs(big_b - big_a) > epsilon:
        big_c = big_a + (big_a - big_b) * f_a / (f_b - f_a)
        f_c = f(big_c)
        if f_c * f_b <= 0:
            big_a = big_b
            f_a = f_b
        else:
            f_a /= 2.0
        big_b = big_c
        f_b = f_c
    new_sigma = math.exp(big_a / 2.0)

    # Step 3: Update rating deviation
    phi_star = math.sqrt(player.phi ** 2 + new_sigma ** 2)
    new_phi = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)

    # Step 4: Update rating
    new_mu = player.mu + new_phi ** 2 * delta_sum

    return Glicko2Player(
        mu=new_mu, phi=new_phi, sigma=new_sigma, tau=player.tau,
    )


# --- Worked Example: Tennis Player Ratings ---
djokovic = Glicko2Player.from_elo(rating=1850, rd=50, sigma=0.06)
alcaraz = Glicko2Player.from_elo(rating=1780, rd=60, sigma=0.06)
sinner = Glicko2Player.from_elo(rating=1760, rd=55, sigma=0.06)

# Djokovic faces Alcaraz (loss) and Sinner (win) in a rating period
updated_djokovic = update_glicko2(
    djokovic,
    opponents=[alcaraz, sinner],
    scores=[0.0, 1.0],
)

print(f"Djokovic pre: Elo={djokovic.elo_rating:.1f}, RD={djokovic.elo_rd:.1f}")
print(f"Djokovic post: Elo={updated_djokovic.elo_rating:.1f}, "
      f"RD={updated_djokovic.elo_rd:.1f}")
Market Insight: Several successful tennis betting operations rely primarily on Elo-based systems with surface adjustments and recency weighting. The key edge comes not from the rating system itself --- which is well-known --- but from careful calibration of parameters (K-factor, decay, initialization) and from combining ratings with the matchup and surface factors described in subsequent sections. Rating systems provide the backbone; the sections that follow add the flesh.
21.2 Style Matchup Analysis
"Styles Make Fights" --- Quantified
The boxing aphorism "styles make fights" captures a truth that rating systems alone cannot: the outcome of an individual contest depends not only on each competitor's overall ability but on how their specific strengths and weaknesses interact. A powerful striker who struggles against wrestlers may be favored against most opponents but a significant underdog against a grappling specialist, even one with a lower overall rating.
In tennis, the equivalent insight is that certain playing styles are inherently advantaged or disadvantaged on certain surfaces or against certain opponent profiles. A heavy topspin baseline player may dominate on clay but struggle against a flat-hitting serve-and-volley player on grass. A defensive counterpuncher may frustrate aggressive baseliners but lose to patient, high-consistency players who wait for errors.
Defining Style Archetypes
The first step in matchup analysis is to define a set of style dimensions that capture the relevant variation in how competitors approach their sport.
MMA Style Dimensions:
| Dimension | Description | Key Metrics |
|---|---|---|
| Striking volume | Rate and diversity of strikes thrown | Significant strikes per minute, strike accuracy |
| Striking power | Knockout ability | KO/TKO rate, knockdown rate |
| Grappling offense | Ability to take down and control opponents | Takedown accuracy, control time per fight |
| Grappling defense | Ability to prevent takedowns and escape | Takedown defense %, time spent in bottom position |
| Submission offense | Ability to finish via submission | Submission rate, submission attempts per fight |
| Cardio/durability | Ability to maintain output over rounds | Output differential (rounds 1 vs. 3+), absorption rate |
Tennis Style Dimensions:
| Dimension | Description | Key Metrics |
|---|---|---|
| Serve dominance | Ability to win points behind the serve | First serve %, ace rate, service points won % |
| Return ability | Ability to break the opponent's serve | Return points won %, break rate |
| Net approach | Frequency and success of net play | Net approaches per match, net points won % |
| Rally tolerance | Ability to win extended rallies | Win % in rallies > 9 shots, unforced error rate |
| Power/aggression | Ratio of winners to errors | Winner-to-UE ratio, average ball speed |
| Movement/defense | Court coverage and defensive ability | Defensive points won %, distance covered per point |
Matchup Matrix Construction
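A matchup matrix encodes how each style archetype fares against every other archetype. The matrix is not symmetric: $M_{ij}$, the adjustment for Style $i$ facing Style $j$, generally differs from $M_{ji}$. Because the adjustments below are applied on the log-odds scale, the two competitors' adjusted probabilities sum to one only when $M_{ji} = -M_{ij}$, so entries estimated separately for each ordering should be reconciled into that antisymmetric form before use.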
We can construct the matrix empirically by analyzing historical results stratified by style profiles. The process is:
- Assign style profiles to each competitor based on their statistical profiles.
- Group historical matchups by the style pairing (e.g., striker vs. grappler).
- Calculate the deviation from Elo-expected outcomes for each style pairing.
- Populate the matrix with these deviations.
Formally, let $M_{ij}$ be the matchup adjustment when a fighter of Style $i$ faces a fighter of Style $j$. The adjusted win probability becomes:
$$P_{\text{adj}}(A) = \text{logistic}\left(\text{logit}(P_{\text{Elo}}(A)) + M_{s_A, s_B}\right)$$
where $s_A$ and $s_B$ are the style classifications of Fighters A and B, $P_{\text{Elo}}(A)$ is the baseline Elo probability, and the logistic and logit functions ensure the adjustment operates on the log-odds scale, which is the natural scale for additive adjustments.
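A short worked example shows the size of effect involved. Suppose Elo gives Fighter A a 55% chance and the matchup entry is $M_{s_A, s_B} = -0.25$ (an illustrative value, not an estimate from data):

$$\text{logit}(0.55) = \ln\frac{0.55}{0.45} \approx 0.2007$$

$$0.2007 + (-0.25) = -0.0493$$

$$P_{\text{adj}}(A) = \frac{1}{1 + e^{0.0493}} \approx 0.488$$

A quarter of a unit of log-odds is therefore enough to turn a modest favorite into a slight underdog. Near 50%, each 0.04 of log-odds is worth roughly one percentage point; the same adjustment moves probabilities less as the baseline approaches 0 or 1.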
Surface Preferences in Tennis as a Matchup Effect
In tennis, surface preference can be modeled as a matchup between the player's style and the surface type. A player with a heavy topspin forehand and strong movement has a "matchup advantage" against clay, while a big server with a flat backhand has a "matchup advantage" against grass or fast hard courts.
We can quantify this by computing each player's surface-specific win rate relative to their overall expected win rate:
$$\Delta_{\text{surface}}(p, s) = \text{Win\%}_{p,s} - \text{Win\%}_{p,\text{overall}}$$
A player with $\Delta_{\text{clay}} = +0.08$ wins about 8 percentage points more often on clay than their overall ability suggests.
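The surface differential can be computed directly from match records. The sketch below assumes a hypothetical input format of `(player, surface, won)` tuples and uses raw win rates, so it does not correct for differences in opponent strength across surfaces; treat it as a first pass rather than a finished estimator.

```python
from collections import defaultdict


def surface_deltas(matches):
    """Return {player: {surface: win% on surface minus overall win%}}.

    matches: iterable of (player, surface, won) with won in {0, 1}.
    """
    wins = defaultdict(lambda: defaultdict(int))
    played = defaultdict(lambda: defaultdict(int))
    for player, surface, won in matches:
        played[player][surface] += 1
        wins[player][surface] += int(won)

    deltas = {}
    for player, by_surface in played.items():
        overall = sum(wins[player].values()) / sum(by_surface.values())
        deltas[player] = {
            surface: wins[player][surface] / count - overall
            for surface, count in by_surface.items()
        }
    return deltas


# Toy record: 8 wins in 10 clay matches, 12 wins in 20 hard-court matches.
records = (
    [("Player P", "clay", 1)] * 8 + [("Player P", "clay", 0)] * 2
    + [("Player P", "hard", 1)] * 12 + [("Player P", "hard", 0)] * 8
)
for surface, delta in surface_deltas(records)["Player P"].items():
    print(f"{surface}: {delta:+.3f}")
```

With small per-surface samples these differentials are noisy, so in practice they are usually shrunk toward zero before being used as adjustments.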
Python Implementation: Matchup Matrix
import numpy as np
from collections import defaultdict
from typing import Dict, List, Tuple


class StyleMatchupModel:
    """
    Quantifies style-based matchup adjustments from historical fight data.

    Assigns fighters to style archetypes, then computes the empirical
    deviation from Elo-expected outcomes for each archetype pairing.
    """

    STYLE_ARCHETYPES = [
        "striker",
        "wrestler",
        "grappler",
        "balanced",
        "counter_striker",
        "pressure_fighter",
    ]

    def __init__(self):
        # matchup_results[style_a][style_b] = list of (expected, actual)
        self.matchup_results: Dict[str, Dict[str, List[Tuple[float, float]]]] = (
            defaultdict(lambda: defaultdict(list))
        )
        self.matchup_matrix: Dict[str, Dict[str, float]] = {}

    def classify_fighter(self, stats: dict) -> str:
        """
        Classify a fighter into a style archetype based on career stats.

        Args:
            stats: Dictionary with keys:
                - sig_strikes_per_min: Significant strikes landed per minute
                - takedown_avg: Takedowns landed per 15 minutes
                - submission_avg: Submission attempts per 15 minutes
                - sig_strike_defense: % of opponent strikes avoided
                - takedown_defense: % of opponent takedowns defended

        Returns:
            Style archetype string.
        """
        sspm = stats.get("sig_strikes_per_min", 0)
        td_avg = stats.get("takedown_avg", 0)
        sub_avg = stats.get("submission_avg", 0)
        str_def = stats.get("sig_strike_defense", 0)
        td_def = stats.get("takedown_defense", 0)

        # Heuristic classification rules (in production, use clustering).
        # Order matters: the first matching rule wins.
        if td_avg > 3.5 and td_def > 0.70:
            return "wrestler"
        elif sub_avg > 1.5:
            return "grappler"
        elif sspm > 6.0 and str_def < 0.55:
            return "pressure_fighter"
        elif sspm < 3.5 and str_def > 0.62:
            return "counter_striker"
        elif sspm > 5.0:
            return "striker"
        else:
            return "balanced"

    def record_matchup(
        self,
        style_a: str,
        style_b: str,
        elo_expected_a: float,
        actual_a: float,
    ) -> None:
        """Record a historical matchup result for matrix computation."""
        self.matchup_results[style_a][style_b].append(
            (elo_expected_a, actual_a)
        )

    def compute_matrix(self, min_sample: int = 30) -> None:
        """
        Compute the matchup adjustment matrix from accumulated results.

        For each style pairing, compare the observed win rate with the
        mean Elo-expected win rate and express the gap in log-odds, the
        scale on which get_adjusted_probability applies it.

        Args:
            min_sample: Minimum number of fights required for a
                style pairing to receive a non-zero adjustment.
        """
        self.matchup_matrix = {}
        for style_a in self.STYLE_ARCHETYPES:
            self.matchup_matrix[style_a] = {}
            for style_b in self.STYLE_ARCHETYPES:
                results = self.matchup_results[style_a][style_b]
                if len(results) < min_sample:
                    self.matchup_matrix[style_a][style_b] = 0.0
                    continue
                expected = np.array([e for e, _ in results], dtype=float)
                actual = np.array([a for _, a in results], dtype=float)
                # Individual outcomes are 0 or 1 and have no finite logit,
                # so aggregate first, then clip to avoid log(0).
                mean_expected = np.clip(expected.mean(), 0.01, 0.99)
                mean_actual = np.clip(actual.mean(), 0.01, 0.99)
                # Log-odds gap between observed and expected win rates.
                # This is one simple estimator; fitting a logistic offset
                # per pairing is a more careful alternative.
                deviation = (
                    np.log(mean_actual / (1 - mean_actual))
                    - np.log(mean_expected / (1 - mean_expected))
                )
                self.matchup_matrix[style_a][style_b] = round(
                    float(deviation), 4
                )

    def get_adjusted_probability(
        self,
        elo_prob_a: float,
        style_a: str,
        style_b: str,
    ) -> float:
        """
        Adjust Elo probability by the matchup matrix entry.

        Args:
            elo_prob_a: Baseline Elo win probability for fighter A.
            style_a: Style archetype of fighter A.
            style_b: Style archetype of fighter B.

        Returns:
            Adjusted win probability for fighter A.
        """
        adjustment = self.matchup_matrix.get(style_a, {}).get(style_b, 0.0)
        # Work on log-odds scale for proper adjustment
        elo_clipped = np.clip(elo_prob_a, 0.01, 0.99)
        log_odds = np.log(elo_clipped / (1 - elo_clipped))
        adjusted_log_odds = log_odds + adjustment
        adjusted_prob = 1.0 / (1.0 + np.exp(-adjusted_log_odds))
        return float(np.clip(adjusted_prob, 0.01, 0.99))

    def display_matrix(self) -> None:
        """Print the matchup matrix in a readable table format."""
        styles = self.STYLE_ARCHETYPES
        header = f"{'':>20}" + "".join(f"{s:>18}" for s in styles)
        print(header)
        print("-" * len(header))
        for sa in styles:
            row = f"{sa:>20}"
            for sb in styles:
                val = self.matchup_matrix.get(sa, {}).get(sb, 0.0)
                row += f"{val:>18.4f}"
            print(row)


# --- Worked Example: Hypothetical MMA Matchup Matrix ---
model = StyleMatchupModel()
# In practice, these would be computed from thousands of historical fights.
# Here we demonstrate the structure with illustrative adjustments.
example_matrix = {
    "striker": {"striker": 0.0, "wrestler": -0.06, "grappler": -0.04,
                "balanced": 0.01, "counter_striker": 0.03,
                "pressure_fighter": -0.02},
    "wrestler": {"striker": 0.06, "wrestler": 0.0, "grappler": 0.02,
                 "balanced": 0.02, "counter_striker": 0.05,
                 "pressure_fighter": 0.04},
    "grappler": {"striker": 0.04, "wrestler": -0.02, "grappler": 0.0,
                 "balanced": 0.01, "counter_striker": 0.03,
                 "pressure_fighter": 0.02},
    "balanced": {"striker": -0.01, "wrestler": -0.02, "grappler": -0.01,
                 "balanced": 0.0, "counter_striker": 0.01,
                 "pressure_fighter": 0.00},
    "counter_striker": {"striker": -0.03, "wrestler": -0.05, "grappler": -0.03,
                        "balanced": -0.01, "counter_striker": 0.0,
                        "pressure_fighter": -0.04},
    "pressure_fighter": {"striker": 0.02, "wrestler": -0.04, "grappler": -0.02,
                         "balanced": 0.00, "counter_striker": 0.04,
                         "pressure_fighter": 0.0},
}
model.matchup_matrix = example_matrix
# Apply matchup adjustment to a prediction
elo_prob = 0.55 # Elo says fighter A has 55% chance
style_a, style_b = "striker", "wrestler"
adjusted = model.get_adjusted_probability(elo_prob, style_a, style_b)
print(f"Elo probability: {elo_prob:.2%}")
print(f"Matchup: {style_a} vs {style_b}")
print(f"Adjusted probability: {adjusted:.2%}")
# Striker vs wrestler: the wrestler has a matchup advantage. Because the
# -0.06 entry is applied on the log-odds scale, the striker's probability
# drops from 55% to approximately 53.5%.
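The same matrix entry moves the probability by different amounts depending on the baseline, which is the reason for working on the log-odds scale. The sketch below illustrates this with a standalone helper; `shift_log_odds` is a hypothetical name introduced here and is not part of `StyleMatchupModel`.

```python
import math


def shift_log_odds(prob: float, shift: float) -> float:
    """Apply an additive shift on the log-odds scale and map back to a probability."""
    log_odds = math.log(prob / (1.0 - prob))
    return 1.0 / (1.0 + math.exp(-(log_odds + shift)))


# Apply the striker-vs-wrestler entry (-0.06) to several baseline probabilities.
for base in (0.55, 0.75, 0.95):
    adjusted = shift_log_odds(base, -0.06)
    print(f"baseline {base:.2%} -> adjusted {adjusted:.2%} "
          f"(change {adjusted - base:+.2%})")
```

A fixed log-odds shift has its largest effect near 50% and shrinks toward the extremes, so the adjusted value always stays strictly between 0 and 1.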
Real-World Application: The UFC's official statistics page provides detailed strike, takedown, and submission data for every fighter. Services like UFCStats.com offer per-fight breakdowns that enable automated style classification. In tennis, the ATP and WTA tour statistics, supplemented by point-level data from the Match Charting Project, provide the granularity needed for style profiling. The key methodological challenge is that style classifications are not static --- fighters and players evolve their games over time, and the model must account for this evolution.
21.3 Surface and Venue Effects in Tennis
The Three Surfaces
Professional tennis is played on three primary surfaces: hard court, clay, and grass. Each surface produces dramatically different ball behavior, which in turn favors different playing styles and physical attributes.
Hard court (Australian Open, US Open, most ATP/WTA events): The ball bounces at a medium height and speed. This is the most "neutral" surface, though variations in court speed exist between specific venues (the Australian Open's Plexicushion plays faster than the US Open's DecoTurf).
Clay (French Open, Monte Carlo, Rome, Madrid): The ball bounces higher and slower. Points are longer, rallies are extended, and the surface rewards stamina, topspin, and defensive ability. Serve-and-volley play is largely ineffective on clay because the slow surface gives returners more time.
Grass (Wimbledon, Queen's Club, Halle): The ball bounces lower and faster. Points are shorter, serves are more dominant, and slice shots stay low and skid, rewarding aggressive net play. The grass season is short (approximately three weeks), which limits the available data.
Quantifying Surface Effects
The magnitude of surface effects in tennis is substantial. Consider the following hypothetical data table, representing a typical top-30 player's performance differential:
| Surface | Matches | Win Rate | Service Pts Won | Return Pts Won | Avg Rally Length |
|---|---|---|---|---|---|
| Hard | 180 | 65.0% | 64.5% | 39.2% | 4.3 shots |
| Clay | 72 | 58.3% | 61.8% | 42.1% | 5.8 shots |
| Grass | 28 | 71.4% | 67.2% | 36.8% | 3.6 shots |
This player is a serve-dominant aggressive baseliner: their win rate is highest on grass (where the serve is most effective and rallies are shortest) and lowest on clay (where their serve advantage is diminished and they must engage in extended rallies they are less suited to win).
Clay Court Specialists and Grass Court Specialists
Some players exhibit extreme surface preferences. The prototypical clay court specialist --- such as Nadal, Guillermo Coria, or Albert Ramos-Vinolas --- has a clay win rate that may exceed their hard court win rate by 15--20 percentage points. Conversely, grass court specialists like Feliciano Lopez or Sam Querrey show dramatic improvement on grass.
For modeling purposes, we define the surface affinity score as:
$$\text{SA}(p, s) = \frac{\text{Win\%}(p, s) - \text{Win\%}(p, \text{all})}{\text{Win\%}(p, \text{all})}$$
A player with an overall win rate of 60% who wins 72% on clay has a clay affinity of $(0.72 - 0.60) / 0.60 = +0.20$, or +20%. This normalized measure allows comparison across players of different overall skill levels.
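The affinity score is simple enough to compute directly. The following is a minimal sketch; the function name `surface_affinity` is an assumption for illustration, and in practice the win rates would come from a player's match history.

```python
def surface_affinity(surface_win_rate: float, overall_win_rate: float) -> float:
    """Relative deviation of a surface win rate from the overall win rate."""
    if overall_win_rate <= 0:
        raise ValueError("overall_win_rate must be positive")
    return (surface_win_rate - overall_win_rate) / overall_win_rate


# The worked example from the text: 72% on clay against 60% overall.
clay_affinity = surface_affinity(0.72, 0.60)
print(f"Clay affinity: {clay_affinity:+.2f}")
```

Because the score is a ratio of observed win rates, it is noisy for surfaces with few matches, so small-sample values should be read with caution.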
Indoor vs. Outdoor Effects
Beyond the three surfaces, the indoor/outdoor distinction creates an additional layer of complexity. Indoor hard courts play faster than outdoor hard courts because there is no wind to slow the ball, humidity is controlled, and court speeds are often faster by design. Many clay court specialists are significantly weaker indoors, while big servers gain an additional advantage.
Altitude also matters. Tournaments at elevation (Bogota at 2,640m, Johannesburg at 1,753m) produce faster conditions because the thinner air reduces drag on the ball, making serves harder to return and reducing the effectiveness of spin. The effect is approximately equivalent to moving from a medium-speed to a fast court surface.
Surface-Adjusted Elo: Python Implementation
The following code implements a surface-adjusted Elo system that maintains separate ratings for each surface while using the overall rating as a regularization anchor.
import math
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple
@dataclass
class TennisPlayer:
    """Tennis player with overall and surface-specific Elo ratings."""
    name: str
    overall_rating: float = 1500.0
    surface_ratings: Dict[str, float] = field(default_factory=lambda: {
        "hard": 1500.0,
        "clay": 1500.0,
        "grass": 1500.0,
    })
    surface_matches: Dict[str, int] = field(default_factory=lambda: {
        "hard": 0,
        "clay": 0,
        "grass": 0,
    })
    total_matches: int = 0
class SurfaceAdjustedElo:
    """
    Tennis Elo system with surface-specific ratings.
    Maintains both an overall rating and per-surface ratings. The
    per-surface ratings are blended with the overall rating using a
    weighting factor that increases as the player accumulates more
    matches on that surface.
    """
    def __init__(
        self,
        overall_k: float = 24.0,
        surface_k: float = 32.0,
        surface_weight_matches: int = 50,
        indoor_speed_bonus: float = 15.0,
    ):
        """
        Args:
            overall_k: K-factor for overall rating updates.
            surface_k: K-factor for surface-specific rating updates.
            surface_weight_matches: Number of surface matches at which the
                surface rating receives 50% weight in blended predictions.
            indoor_speed_bonus: Elo-equivalent bonus for serve-dominant
                players in indoor conditions.
        """
        self.overall_k = overall_k
        self.surface_k = surface_k
        self.surface_weight_matches = surface_weight_matches
        self.indoor_speed_bonus = indoor_speed_bonus
        self.players: Dict[str, TennisPlayer] = {}
    def get_or_create_player(self, name: str) -> TennisPlayer:
        if name not in self.players:
            self.players[name] = TennisPlayer(name=name)
        return self.players[name]
    def _expected(self, rating_a: float, rating_b: float) -> float:
        return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
    def _blend_rating(self, player: TennisPlayer, surface: str) -> float:
        """
        Blend overall and surface-specific ratings.
        The surface weight n / (n + surface_weight_matches) rises from 0
        toward 1 as the player accumulates matches on that surface. With
        few surface matches, the blended rating is close to the overall
        rating. With many, it converges to the surface-specific rating.
        """
        n = player.surface_matches.get(surface, 0)
        # Saturating (shrinkage-style) blending weight, not a logistic curve:
        # it equals 0.5 when n == surface_weight_matches
        surface_weight = n / (n + self.surface_weight_matches)
        blended = (
            surface_weight * player.surface_ratings[surface]
            + (1 - surface_weight) * player.overall_rating
        )
        return blended
    def predict(
        self,
        player_a: str,
        player_b: str,
        surface: str,
        indoor: bool = False,
    ) -> Dict[str, float]:
        """
        Generate win probability prediction for a match.
        Args:
            player_a: Name of player A.
            player_b: Name of player B.
            surface: One of 'hard', 'clay', 'grass'.
            indoor: Whether the match is played indoors.
        Returns:
            Dictionary with blended ratings and win probabilities.
        """
        pa = self.get_or_create_player(player_a)
        pb = self.get_or_create_player(player_b)
        rating_a = self._blend_rating(pa, surface)
        rating_b = self._blend_rating(pb, surface)
        # Indoor adjustment: benefit serve-dominant players
        # (simplified: proportional to serve rating component)
        if indoor and surface == "hard":
            # Players with higher surface ratings tend to be serve-dominant
            # on hard courts; this is a simplified proxy
            rating_a += self.indoor_speed_bonus * (rating_a - 1500) / 400
            rating_b += self.indoor_speed_bonus * (rating_b - 1500) / 400
        prob_a = self._expected(rating_a, rating_b)
        return {
            "player_a": player_a,
            "player_b": player_b,
            "surface": surface,
            "indoor": indoor,
            "blended_rating_a": round(rating_a, 1),
            "blended_rating_b": round(rating_b, 1),
            "win_prob_a": round(prob_a, 4),
            "win_prob_b": round(1 - prob_a, 4),
        }
    def update(
        self,
        winner: str,
        loser: str,
        surface: str,
    ) -> Dict:
        """
        Update ratings after a completed match.
        Updates both overall and surface-specific ratings.
        """
        pw = self.get_or_create_player(winner)
        pl = self.get_or_create_player(loser)
        # Overall update
        exp_w = self._expected(pw.overall_rating, pl.overall_rating)
        pw.overall_rating += self.overall_k * (1.0 - exp_w)
        pl.overall_rating += self.overall_k * (0.0 - (1.0 - exp_w))
        # Surface-specific update
        exp_w_surf = self._expected(
            pw.surface_ratings[surface],
            pl.surface_ratings[surface],
        )
        pw.surface_ratings[surface] += self.surface_k * (1.0 - exp_w_surf)
        pl.surface_ratings[surface] += self.surface_k * (0.0 - (1.0 - exp_w_surf))
        # Update match counts
        pw.surface_matches[surface] += 1
        pl.surface_matches[surface] += 1
        pw.total_matches += 1
        pl.total_matches += 1
        return {
            "winner": winner,
            "loser": loser,
            "surface": surface,
            "winner_overall": round(pw.overall_rating, 1),
            "loser_overall": round(pl.overall_rating, 1),
            "winner_surface": round(pw.surface_ratings[surface], 1),
            "loser_surface": round(pl.surface_ratings[surface], 1),
        }
# --- Worked Example ---
system = SurfaceAdjustedElo(overall_k=24, surface_k=32)
# Simulate some matches to build ratings
clay_matches = [
    ("Nadal", "Djokovic", "clay"),
    ("Nadal", "Federer", "clay"),
    ("Nadal", "Zverev", "clay"),
    ("Nadal", "Thiem", "clay"),
    ("Djokovic", "Thiem", "clay"),
]
hard_matches = [
    ("Djokovic", "Nadal", "hard"),
    ("Djokovic", "Federer", "hard"),
    ("Federer", "Nadal", "hard"),
    ("Djokovic", "Zverev", "hard"),
]
grass_matches = [
    ("Federer", "Nadal", "grass"),
    ("Federer", "Djokovic", "grass"),
    ("Djokovic", "Nadal", "grass"),
]
for winner, loser, surface in clay_matches + hard_matches + grass_matches:
    system.update(winner, loser, surface)
# Predict a clay court match
prediction = system.predict("Nadal", "Djokovic", "clay")
print(f"Clay: Nadal vs Djokovic")
print(f" Nadal rating (blended): {prediction['blended_rating_a']}")
print(f" Djokovic rating (blended): {prediction['blended_rating_b']}")
print(f" Nadal win prob: {prediction['win_prob_a']:.2%}")
# Predict a hard court match
prediction_hard = system.predict("Nadal", "Djokovic", "hard")
print(f"\nHard: Nadal vs Djokovic")
print(f" Nadal rating (blended): {prediction_hard['blended_rating_a']}")
print(f" Djokovic rating (blended): {prediction_hard['blended_rating_b']}")
print(f" Nadal win prob: {prediction_hard['win_prob_a']:.2%}")
Altitude and Ball Flight Physics
At high altitude, the reduced air density has two primary effects on tennis ball flight:
- Reduced drag: The ball travels faster through the air, making serves harder to return and reducing reaction time.
- Reduced Magnus effect: Topspin is less effective because the ball encounters less air resistance, reducing the amount of curving and dipping that topspin produces.
The combined effect is that high-altitude tennis plays significantly faster and favors flat-hitting, aggressive players. The ATP used pressureless balls at the (now-defunct) Bogota tournament to partially compensate, but even with modified equipment, altitude effects are substantial.
The drag force on a tennis ball is:
$$F_D = \frac{1}{2} \rho C_D A v^2$$
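where $\rho$ is air density (which decreases with altitude), $C_D$ is the drag coefficient, $A$ is the cross-sectional area, and $v$ is velocity. Because drag is proportional to $\rho$, it falls in step with air density at a given ball speed. At 2,500 meters, standard-atmosphere air density is approximately 78% of sea-level density (air pressure, by contrast, is closer to 74%), meaning drag force is reduced by roughly 22%.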
For modeling purposes, altitude effects can be incorporated as a speed adjustment similar to the indoor/outdoor distinction, with the magnitude calibrated from historical data at high-altitude venues.
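Since the drag formula is linear in $\rho$, the relative drag at a venue (for a fixed ball speed, drag coefficient, and cross-section) equals its air density ratio. The sketch below estimates that ratio from the International Standard Atmosphere troposphere model. The function name `isa_density_ratio` and the use of ISA are assumptions made for illustration; actual conditions vary with temperature and weather, so this is a starting point for calibration rather than a measured venue effect.

```python
def isa_density_ratio(altitude_m: float) -> float:
    """Air density relative to sea level under the International Standard
    Atmosphere (troposphere model, valid below about 11,000 m)."""
    sea_level_temp_k = 288.15   # ISA sea-level temperature
    lapse_rate = 0.0065         # temperature drop in K per meter
    gravity = 9.80665           # m/s^2
    molar_mass = 0.0289644      # kg/mol, dry air
    gas_constant = 8.31446      # J/(mol K)
    temp_ratio = 1.0 - lapse_rate * altitude_m / sea_level_temp_k
    exponent = gravity * molar_mass / (gas_constant * lapse_rate) - 1.0
    return temp_ratio ** exponent


# Elevations quoted earlier in this section, plus sea level for reference.
venues = {"Sea level": 0, "Johannesburg": 1753, "Bogota": 2640}
for venue, altitude in venues.items():
    ratio = isa_density_ratio(altitude)
    print(f"{venue:<13} {altitude:>5} m  density and drag ratio: {ratio:.3f}")
```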
Market Insight: Tennis betting markets are relatively efficient at the Grand Slam level but offer more opportunity in lower-tier tournaments (ATP 250, Challenger events) where the market has less information. Surface-adjusted ratings are particularly valuable during surface transitions --- the first week of the clay court season, or the brief grass court swing --- when the market may be slow to adjust for players whose surface-specific ability differs markedly from their recent form on a different surface.
21.4 Weight Class and Physical Attributes in MMA/Boxing
Reach Advantages
In combat sports, physical attributes play a measurable role in determining outcomes. The most commonly cited attribute is reach --- the distance from fingertip to fingertip with arms extended. A fighter with a significant reach advantage can strike from a distance where their opponent cannot effectively counter, control range, and use jabs and front kicks to keep shorter-armed opponents at bay.
Empirical studies of UFC data suggest that each inch of reach advantage is associated with approximately a 1--2 percentage point increase in win probability, after controlling for rating. The effect is nonlinear: small reach differences matter less than large ones, and the advantage is most pronounced in striking-heavy matchups.
The reach advantage can be modeled as:
$$\Delta_{\text{reach}} = \beta_{\text{reach}} \cdot \operatorname{sign}(r_A - r_B) \cdot \max(0, |r_A - r_B| - \tau_{\text{reach}})$$
where $r_A$ and $r_B$ are the fighters' reaches in inches, $\tau_{\text{reach}}$ is a threshold below which reach differences have negligible effect (typically 2--3 inches), and $\beta_{\text{reach}}$ is the calibrated coefficient (typically 0.008--0.015 on the probability scale).
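As a worked instance with assumed, purely illustrative values ($r_A = 84.5$, $r_B = 80.0$, $\tau_{\text{reach}} = 2.5$, $\beta_{\text{reach}} = 0.012$): the reach gap is 4.5 inches, of which $4.5 - 2.5 = 2.0$ inches exceed the threshold, so

$$\Delta_{\text{reach}} = 0.012 \times 2.0 = 0.024$$

That is an increase of about 2.4 percentage points in fighter A's win probability. A gap of 2 inches would fall under the threshold and produce no adjustment at all.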
Age Curves in Combat Sports
Unlike team sports where age curves have been extensively studied (MLB hitters peak around age 27--28, NFL running backs around 26), combat sports age curves are less well-established but critically important.
In MMA and boxing, the age curve reflects several interacting factors:
- Physical peak: Raw athletic attributes (speed, power, reflexes) peak in the mid-to-late twenties.
- Skill accumulation: Technical and tactical skills continue improving into the early thirties for fighters with active camps.
- Accumulated damage: The cumulative effect of head trauma produces progressive deterioration in reaction time, chin durability, and recovery capacity.
- Motivation and training intensity: Older fighters may train less intensely, take fewer risks, or lose the hunger that drove their early careers.
The combined curve is approximately:
| Age Range | Expected Performance Adjustment |
|---|---|
| 22--25 | +2% to +5% (improving, skill developing) |
| 26--30 | Peak performance (reference, 0% adjustment) |
| 31--33 | -3% to -8% (gradual decline begins) |
| 34--36 | -8% to -15% (significant decline for most) |
| 37+ | -15% to -30% (steep decline, high variability) |
These adjustments scale the fighter's expected performance; the implementation later in this section approximates them as additive shifts on the probability scale. Importantly, the variance increases substantially at older ages --- some fighters maintain peak ability well into their late thirties (Bernard Hopkins in boxing, Daniel Cormier in MMA), while others decline catastrophically.
Weight Cut Effects
Professional MMA fighters and boxers routinely cut significant amounts of weight in the days before a weigh-in, then rehydrate before the fight. A fighter competing at welterweight (170 lbs) may walk around at 190--200 lbs, cutting 20--30 lbs through water manipulation and severe caloric restriction.
Weight cuts affect performance in measurable ways:
- Severe cuts (more than 15% of walk-around weight) are associated with reduced chin durability, slower reaction times, and decreased cardio endurance, particularly in later rounds.
- Failed weight cuts (missing weight) historically correlate with better performance in the actual fight, because the fighter enters the cage larger and more hydrated than their opponent. However, the financial penalties and competitive sanctions for missing weight limit this as a strategy.
- Age-weight cut interaction: Older fighters tend to have more difficulty cutting weight and recovering, amplifying the age curve decline.
Chin Deterioration
"Chin" in combat sports refers to a fighter's ability to absorb strikes without being knocked out or wobbled. Chin deterioration --- the progressive loss of knockout resistance --- is one of the most feared and least understood phenomena in combat sports.
The mechanism is cumulative brain trauma: repeated subconcussive and concussive impacts damage the brain's ability to withstand subsequent impacts. Once a fighter begins to show chin deterioration, the progression is typically irreversible and accelerating.
For modeling purposes, chin deterioration can be proxied by:
- Number of career fights
- Number of KO/TKO losses
- Number of significant strikes absorbed
- Time since first KO/TKO loss
A simple deterioration model tracks the knockout vulnerability index:
$$\text{KO\_Vulnerability}(f) = \alpha_0 + \alpha_1 \cdot \text{KO\_losses} + \alpha_2 \cdot \text{strikes\_absorbed} + \alpha_3 \cdot \text{age}$$
Higher values indicate greater vulnerability to knockout.
Python Implementation: Physical Attribute Model
import numpy as np
from dataclasses import dataclass
from typing import Optional
@dataclass
class FighterProfile:
    """Physical and career attributes for combat sports modeling."""
    name: str
    age: float
    height_inches: float
    reach_inches: float
    weight_class_lbs: float
    walk_around_weight_lbs: float
    career_fights: int
    ko_tko_losses: int
    total_sig_strikes_absorbed: int
    months_since_last_ko_loss: Optional[float] = None
    style: str = "balanced"
    elo_rating: float = 1500.0
class PhysicalAttributeModel:
    """
    Adjusts fight predictions based on physical attributes.
    Incorporates reach advantage, age curves, weight cut effects,
    and chin deterioration into a unified adjustment framework.
    """
    def __init__(
        self,
        reach_coeff: float = 0.012,
        reach_threshold: float = 2.5,
        height_coeff: float = 0.005,
        height_threshold: float = 2.0,
        age_peak_start: float = 26.0,
        age_peak_end: float = 30.0,
        age_decline_rate: float = 0.025,
        weight_cut_threshold: float = 0.12,
        weight_cut_penalty: float = 0.04,
        ko_vulnerability_base: float = 0.02,
        ko_vulnerability_per_loss: float = 0.03,
        ko_vulnerability_per_1000_strikes: float = 0.01,
    ):
        self.reach_coeff = reach_coeff
        self.reach_threshold = reach_threshold
        self.height_coeff = height_coeff
        self.height_threshold = height_threshold
        self.age_peak_start = age_peak_start
        self.age_peak_end = age_peak_end
        self.age_decline_rate = age_decline_rate
        self.weight_cut_threshold = weight_cut_threshold
        self.weight_cut_penalty = weight_cut_penalty
        self.ko_vulnerability_base = ko_vulnerability_base
        self.ko_vulnerability_per_loss = ko_vulnerability_per_loss
        self.ko_vulnerability_per_1000_strikes = ko_vulnerability_per_1000_strikes
    def reach_adjustment(
        self,
        fighter_a: FighterProfile,
        fighter_b: FighterProfile,
    ) -> float:
        """
        Calculate reach-based probability adjustment.
        Returns adjustment to fighter A's win probability (can be negative).
        """
        diff = fighter_a.reach_inches - fighter_b.reach_inches
        if abs(diff) < self.reach_threshold:
            return 0.0
        effective_diff = abs(diff) - self.reach_threshold
        adjustment = self.reach_coeff * effective_diff
        return adjustment if diff > 0 else -adjustment
    def age_adjustment(self, fighter: FighterProfile) -> float:
        """
        Calculate age-based performance adjustment.
        Returns a value <= 0 representing the expected shortfall from peak.
        Peak performance (age 26-30) returns 0.0.
        """
        age = fighter.age
        if age < self.age_peak_start:
            # Young and still improving: neutral within 3 years of the peak
            # window, then a 0.01 penalty per additional year below it
            years_from_peak = self.age_peak_start - age
            return min(0.0, -0.01 * max(0, years_from_peak - 3))
        elif age <= self.age_peak_end:
            return 0.0
        else:
            years_past_peak = age - self.age_peak_end
            # Superlinear decline (exponent 1.3): accelerates with age
            return -self.age_decline_rate * years_past_peak ** 1.3
    def weight_cut_adjustment(self, fighter: FighterProfile) -> float:
        """
        Calculate penalty for severe weight cuts.
        A fighter cutting more than the threshold percentage of their
        walk-around weight receives a performance penalty.
        """
        cut_pct = (
            (fighter.walk_around_weight_lbs - fighter.weight_class_lbs)
            / fighter.walk_around_weight_lbs
        )
        if cut_pct <= self.weight_cut_threshold:
            return 0.0
        excess_cut = cut_pct - self.weight_cut_threshold
        return -self.weight_cut_penalty * (excess_cut / 0.05)
    def ko_vulnerability_index(self, fighter: FighterProfile) -> float:
        """
        Calculate knockout vulnerability score.
        Higher values indicate greater susceptibility to being finished.
        """
        base = self.ko_vulnerability_base
        ko_factor = self.ko_vulnerability_per_loss * fighter.ko_tko_losses
        strike_factor = (
            self.ko_vulnerability_per_1000_strikes
            * fighter.total_sig_strikes_absorbed
            / 1000
        )
        # Age amplifies vulnerability
        age_factor = max(0, (fighter.age - 30) * 0.005)
        return base + ko_factor + strike_factor + age_factor
    def full_adjustment(
        self,
        fighter_a: FighterProfile,
        fighter_b: FighterProfile,
    ) -> dict:
        """
        Compute all physical attribute adjustments for a matchup.
        Returns individual adjustments and the combined adjustment
        to fighter A's baseline Elo win probability.
        """
        reach_adj = self.reach_adjustment(fighter_a, fighter_b)
        age_adj_a = self.age_adjustment(fighter_a)
        age_adj_b = self.age_adjustment(fighter_b)
        cut_adj_a = self.weight_cut_adjustment(fighter_a)
        cut_adj_b = self.weight_cut_adjustment(fighter_b)
        ko_vuln_a = self.ko_vulnerability_index(fighter_a)
        ko_vuln_b = self.ko_vulnerability_index(fighter_b)
        # Net age difference: if A is declining more than B, it hurts A
        net_age = age_adj_a - age_adj_b
        # Net weight cut: if A has a worse cut, it hurts A
        net_cut = cut_adj_a - cut_adj_b
        # KO vulnerability difference: if A is more vulnerable, it hurts A
        net_ko_vuln = -(ko_vuln_a - ko_vuln_b)
        total_adjustment = reach_adj + net_age + net_cut + net_ko_vuln
        return {
            "reach_adjustment": round(reach_adj, 4),
            "age_adjustment_a": round(age_adj_a, 4),
            "age_adjustment_b": round(age_adj_b, 4),
            "net_age_effect": round(net_age, 4),
            "weight_cut_adj_a": round(cut_adj_a, 4),
            "weight_cut_adj_b": round(cut_adj_b, 4),
            "net_weight_cut_effect": round(net_cut, 4),
            "ko_vulnerability_a": round(ko_vuln_a, 4),
            "ko_vulnerability_b": round(ko_vuln_b, 4),
            "net_ko_vulnerability_effect": round(net_ko_vuln, 4),
            "total_adjustment_to_a": round(total_adjustment, 4),
        }
# --- Worked Example ---
model = PhysicalAttributeModel()
# Create fighter profiles
fighter_a = FighterProfile(
    name="Jon Jones",
    age=36.5,
    height_inches=76,
    reach_inches=84.5,
    weight_class_lbs=265,
    walk_around_weight_lbs=255,  # Heavyweight, minimal cut
    career_fights=29,
    ko_tko_losses=0,
    total_sig_strikes_absorbed=680,
    elo_rating=1820,
)
fighter_b = FighterProfile(
    name="Stipe Miocic",
    age=41.0,
    height_inches=77,
    reach_inches=80.0,
    weight_class_lbs=265,
    walk_around_weight_lbs=250,
    career_fights=24,
    ko_tko_losses=4,
    total_sig_strikes_absorbed=890,
    elo_rating=1680,
)
adjustments = model.full_adjustment(fighter_a, fighter_b)
print(f"Matchup: {fighter_a.name} vs {fighter_b.name}")
print(f"\nPhysical Attribute Adjustments:")
for key, value in adjustments.items():
    print(f"  {key}: {value:+.4f}")
# Apply to Elo baseline
elo_expected = 1.0 / (1.0 + 10.0 ** ((fighter_b.elo_rating - fighter_a.elo_rating) / 400.0))
adjusted_prob = np.clip(elo_expected + adjustments["total_adjustment_to_a"], 0.01, 0.99)
print(f"\nElo baseline probability (Jones): {elo_expected:.2%}")
print(f"Physically adjusted probability: {adjusted_prob:.2%}")
Interpretation of Results:
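In this example, Jones has a significant reach advantage (4.5 inches) and zero KO/TKO losses, while Miocic is older (41 vs. 36.5), has four KO/TKO losses, and has absorbed more significant strikes. The physical attribute model amplifies Jones's Elo advantage because: (1) the reach advantage exceeds the threshold, contributing a positive adjustment; (2) Miocic's age curve decline is steeper; (3) Miocic's KO vulnerability index is substantially higher. Neither fighter has a severe weight cut at heavyweight. Note that with the default coefficients the summed adjustment is roughly +0.45, which pushes the Elo baseline of about 69% past the 0.99 cap, so the printed result is the clipped value. Additive adjustments of this size are a sign that the illustrative coefficients must be calibrated against historical outcomes before use.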
Market Insight: Physical attribute adjustments are most valuable in matchups between fighters of similar overall skill (close Elo ratings) where physical differences are large. The market tends to incorporate obvious physical mismatches (e.g., a 7-inch reach advantage) but may underweight subtler factors like chin deterioration in aging fighters with accumulated damage. Monitoring the "total significant strikes absorbed" career stat is a valuable leading indicator of future vulnerability.
21.5 In-Fight/In-Match Live Modeling
Tennis: Point-by-Point Win Probability
Tennis has a unique nested scoring structure: points make games, games make sets, and sets make matches. If points on a given player's serve are treated as independent with a constant win probability, this hierarchical structure means that match win probability can be calculated exactly from just two parameters, one per player:
- $p_s$: probability that a player wins a point on their own serve (one value for each player)
- $p_r$: probability that a player wins a point on the opponent's serve; this is not a free parameter, since it equals $1 - p_s$ of the opponent
Given these parameters and the current score state, the match win probability can be computed recursively by working backward from the possible endpoints.
Game-level probability. In a standard game (not a tiebreak), the server needs 4 points to win, with deuce at 3--3. If $p$ is the server's point-win probability, the probability of winning a game from deuce is:
$$P(\text{win from deuce}) = \frac{p^2}{p^2 + (1-p)^2}$$
The full game win probability (from 0--0) sums over the ways the server can win to love, 15, or 30 (the returner has $k = 0, 1, 2$ points when the server wins the final point), plus the probability of reaching deuce at 3--3 and winning from there:
$$P(\text{win game}) = \sum_{k=0}^{2} \binom{3+k}{k} p^4 (1-p)^k + \binom{6}{3} p^3 (1-p)^3 \cdot \frac{p^2}{p^2 + (1-p)^2}$$
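The game-level calculation can be checked numerically. The sketch below counts wins to love, 15, and 30 directly ($k = 0, 1, 2$ returner points) and routes every 3--3 score through the deuce term, so no path is counted twice. The function name `game_win_prob` is an assumption for illustration, and points are assumed independent with a constant server win probability.

```python
from math import comb


def game_win_prob(p: float) -> float:
    """Probability the server wins a game from 0-0, given point-win probability p."""
    q = 1.0 - p
    # Server wins the 4th point while the returner has k = 0, 1, or 2 points.
    direct = sum(comb(3 + k, k) * p**4 * q**k for k in range(3))
    # Reach 3-3 (deuce), then win from deuce.
    win_from_deuce = p**2 / (p**2 + q**2)
    via_deuce = comb(6, 3) * p**3 * q**3 * win_from_deuce
    return direct + via_deuce


for p in (0.50, 0.60, 0.64, 0.70):
    print(f"p = {p:.2f}  ->  P(hold serve) = {game_win_prob(p):.4f}")
```

Two sanity checks follow from symmetry: an even point probability gives an even game probability, and `game_win_prob(p) + game_win_prob(1 - p)` equals 1.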
Set-level probability. A set is won by the first player to win 6 games with a lead of at least 2, with a tiebreak at 6--6 (in most formats). The set win probability is computed by summing over all possible game scores.
Match-level probability. A best-of-three match requires winning 2 sets; best-of-five requires 3. The match probability is computed from the set probabilities.
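The step from set probabilities to a match probability can be sketched under a simplifying assumption that is not made in the text: every set is won independently with the same probability. A full model would instead carry the serve order and score state from set to set. The function name `match_win_prob` is hypothetical.

```python
from math import comb


def match_win_prob(set_win_prob: float, best_of: int = 3) -> float:
    """Match win probability when each set is won independently with
    probability set_win_prob (a simplifying assumption)."""
    if best_of not in (3, 5):
        raise ValueError("best_of must be 3 or 5")
    sets_needed = (best_of + 1) // 2
    s, t = set_win_prob, 1.0 - set_win_prob
    # The player wins the final set played, having lost k sets before it.
    return sum(
        comb(sets_needed - 1 + k, k) * s**sets_needed * t**k
        for k in range(sets_needed)
    )


for fmt in (3, 5):
    print(f"best of {fmt}: {match_win_prob(0.60, best_of=fmt):.4f}")
```

The longer format amplifies a per-set edge, which is one reason favorites win more often in best-of-five matches.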
Real-Time Updates
The power of this framework for live betting is that after each point, the model can instantly recalculate the match win probability given the new score state. The key insight is that the point-win probabilities ($p_s$ for each player) are not fixed --- they should be updated throughout the match based on observed performance.
Bayesian updating of serve performance. Before the match, we start with a prior estimate of each player's serve point win probability (derived from their career stats and surface-specific data). As points are played, we observe actual service outcomes and update:
$$p_s^{\text{posterior}} \sim \text{Beta}(\alpha_0 + \text{service points won}, \beta_0 + \text{service points lost})$$
where $\alpha_0$ and $\beta_0$ encode the prior. A strong prior (large $\alpha_0 + \beta_0$) means in-match data has less influence; a weak prior allows more rapid adjustment.
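The effect of prior strength can be illustrated with a small sketch. The class name `ServeBelief`, the prior mean of 0.64, and the observed counts are assumptions chosen for the example, not values from real matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServeBelief:
    """Beta(alpha, beta) belief about a player's serve point-win probability."""
    alpha: float
    beta: float

    @classmethod
    def from_prior(cls, prior_mean: float, prior_strength: float) -> "ServeBelief":
        # prior_strength is alpha_0 + beta_0: the prior's weight in pseudo-points.
        return cls(prior_mean * prior_strength, (1.0 - prior_mean) * prior_strength)

    def update(self, points_won: int, points_lost: int) -> "ServeBelief":
        # Conjugate update: add observed service points to the pseudo-counts.
        return ServeBelief(self.alpha + points_won, self.beta + points_lost)

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


# Same evidence (15 of 30 service points won), two prior strengths.
strong = ServeBelief.from_prior(0.64, prior_strength=100).update(15, 15)
weak = ServeBelief.from_prior(0.64, prior_strength=20).update(15, 15)
print(f"strong prior -> posterior mean {strong.mean:.3f}")
print(f"weak prior   -> posterior mean {weak.mean:.3f}")
```

The weak prior moves much further toward the observed 50% rate, which makes the live model more responsive but also more exposed to noise from short runs of points.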
Momentum and Fatigue in Tennis
Two factors can systematically shift point-win probabilities during a match:
- Momentum: After winning several consecutive points or games, a player may play with greater confidence and aggression, temporarily elevating their performance. Conversely, a player who has lost several games in a row may play tentatively. Empirical evidence for momentum in tennis is mixed, but some studies find small, statistically significant effects, particularly after break-of-serve sequences.
- Fatigue: In long matches (particularly best-of-five), physical fatigue reduces serve speed, court coverage, and error resistance. Fatigue effects are most pronounced in the fourth and fifth sets, in hot conditions, and for players who have played deep into previous rounds.
Both effects can be modeled as time-varying adjustments to the base point-win probability:
$$p_s(t) = p_s^{\text{base}} + \delta_{\text{momentum}}(t) + \delta_{\text{fatigue}}(t)$$
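One possible shape for the two adjustment terms is sketched below. Everything here is hypothetical: the function name, the linear forms, and all coefficient values are placeholders that would need to be estimated from point-level data, and the text does not prescribe a functional form.

```python
def adjusted_serve_prob(
    p_base: float,
    recent_point_diff: int,
    games_played: int,
    momentum_coeff: float = 0.002,    # hypothetical: shift per net recent point
    fatigue_coeff: float = 0.0005,    # hypothetical: loss per game past onset
    fatigue_onset_games: int = 30,    # hypothetical: games before fatigue starts
) -> float:
    """Time-varying serve point-win probability: base + momentum + fatigue."""
    # Momentum: net points won minus lost over a recent window.
    delta_momentum = momentum_coeff * recent_point_diff
    # Fatigue: zero until the onset, then a linear decline.
    delta_fatigue = -fatigue_coeff * max(0, games_played - fatigue_onset_games)
    # Keep the result a valid, non-degenerate probability.
    return min(max(p_base + delta_momentum + delta_fatigue, 0.01), 0.99)


print(f"early, neutral:     {adjusted_serve_prob(0.64, 0, 10):.4f}")
print(f"on a run of points: {adjusted_serve_prob(0.64, 5, 10):.4f}")
print(f"deep in a 5th set:  {adjusted_serve_prob(0.64, 0, 50):.4f}")
```

Given the mixed evidence for momentum noted above, a conservative default is to set `momentum_coeff` to zero unless the data clearly support a nonzero value.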
Python Implementation: Tennis Live Win Probability
import math
from functools import lru_cache
from typing import Dict, Tuple
class TennisLiveModel:
    """
    Point-by-point win probability model for tennis matches.
    Computes exact win probabilities given the current score state
    and serve-win probabilities for each player.
    """
    def __init__(
        self,
        p_serve_a: float = 0.64,
        p_serve_b: float = 0.62,
        best_of: int = 3,
        tiebreak_at_6_6: bool = True,
        final_set_tiebreak: bool = True,
        final_set_tiebreak_to: int = 10,
    ):
        """
        Args:
            p_serve_a: Prob. player A wins a point on their serve.
            p_serve_b: Prob. player B wins a point on their serve.
            best_of: Number of sets (3 or 5).
            tiebreak_at_6_6: Whether a tiebreak is played at 6-6.
            final_set_tiebreak: Whether the final set uses a tiebreak.
            final_set_tiebreak_to: Points needed to win final set TB.
        """
        self.p_serve_a = p_serve_a
        self.p_serve_b = p_serve_b
        self.best_of = best_of
        self.sets_to_win = (best_of + 1) // 2
        self.tiebreak_at_6_6 = tiebreak_at_6_6
        self.final_set_tiebreak = final_set_tiebreak
        self.final_set_tiebreak_to = final_set_tiebreak_to
    def prob_win_tiebreak(self, p_a: float, p_b: float) -> float:
        """
        Probability player A wins a standard 7-point tiebreak.
        Assumes A serves the first point and delegates to the memoized
        recursion, which handles the serve rotation.
        Args:
            p_a: Prob A wins a point on A's serve.
            p_b: Prob B wins a point on B's serve.
        """
        # A's return-point probability is the complement of B's serve probability
        return self._tiebreak_recursive(p_a, 1 - p_b, 0, 0, True)
    @lru_cache(maxsize=4096)
    def _tiebreak_recursive(
        self,
        p_a_serve: float,
        p_a_return: float,
        a_pts: int,
        b_pts: int,
        a_serves_first: bool,
    ) -> float:
        """Recursive tiebreak probability with proper serve rotation."""
        if a_pts >= 7 and a_pts - b_pts >= 2:
            return 1.0
        if b_pts >= 7 and b_pts - a_pts >= 2:
            return 0.0
        # Tied at 6-6 or later: the next two points are served once by each
        # player, so use the closed form instead of recursing without end
        if a_pts == b_pts and a_pts >= 6:
            win_both = p_a_serve * p_a_return
            lose_both = (1 - p_a_serve) * (1 - p_a_return)
            return win_both / (win_both + lose_both)
        total = a_pts + b_pts
        # Serve rotation: the first server serves point 1, the opponent
        # serves points 2-3, the first server serves points 4-5, and so on
        first_server_serving = ((total + 1) // 2) % 2 == 0
        a_serving = a_serves_first if first_server_serving else not a_serves_first
        p = p_a_serve if a_serving else p_a_return
        return p * self._tiebreak_recursive(
            p_a_serve, p_a_return, a_pts + 1, b_pts, a_serves_first
        ) + (1 - p) * self._tiebreak_recursive(
            p_a_serve, p_a_return, a_pts, b_pts + 1, a_serves_first
        )
    @lru_cache(maxsize=4096)
    def prob_win_game(self, p: float, server_pts: int = 0,
                      returner_pts: int = 0) -> float:
        """
        Probability that the server wins the current game.
        Args:
            p: Server's point-win probability.
            server_pts: Current server points (0-3 = love to 40, 4+ = ad).
            returner_pts: Current returner points.
        """
        # Terminal states
        if server_pts >= 4 and server_pts - returner_pts >= 2:
            return 1.0
        if returner_pts >= 4 and returner_pts - server_pts >= 2:
            return 0.0
        # Deuce or advantage: both players have at least 3 points
        if server_pts >= 3 and returner_pts >= 3:
            # Closed form from deuce; recursing here would never terminate
            # because every deuce can lead to another deuce
            p_deuce = p ** 2 / (p ** 2 + (1 - p) ** 2)
            if server_pts == returner_pts:
                # At deuce
                return p_deuce
            elif server_pts > returner_pts:
                # Server has advantage: win the point, or fall back to deuce
                return p + (1 - p) * p_deuce
            else:
                # Returner has advantage: server must first get back to deuce
                return p * p_deuce
        return p * self.prob_win_game(p, server_pts + 1, returner_pts) + \
            (1 - p) * self.prob_win_game(p, server_pts, returner_pts + 1)
    @lru_cache(maxsize=8192)
    def prob_win_set(
        self,
        p_a: float,
        p_b: float,
        a_games: int = 0,
        b_games: int = 0,
        a_serving: bool = True,
        is_final_set: bool = False,
    ) -> float:
        """
        Probability player A wins the current set.
        Args:
            p_a: Prob A wins point on A's serve.
            p_b: Prob B wins point on B's serve.
            a_games: Games won by A in this set.
            b_games: Games won by B in this set.
            a_serving: Whether A is currently serving.
            is_final_set: Whether this is the deciding set.
        """
        # Terminal: set won
        if a_games >= 6 and a_games - b_games >= 2:
            return 1.0
        if b_games >= 6 and b_games - a_games >= 2:
            return 0.0
        # Tiebreak at 6-6
        if a_games == 6 and b_games == 6:
            if self.tiebreak_at_6_6 and (not is_final_set or self.final_set_tiebreak):
                return self._tiebreak_recursive(
                    p_a, 1 - p_b, 0, 0, a_serving,
                )
            # No tiebreak: A must win by 2 games. The next two games contain
            # one service game per player, so the serving order does not
            # change the result: either A wins both, B wins both, or the
            # set returns to level and the same situation repeats.
            p_hold_a = self.prob_win_game(p_a)
            p_hold_b = self.prob_win_game(p_b)
            p_break_a = 1 - p_hold_b  # A breaks B's serve
            p_win_2 = p_hold_a * p_break_a
            p_lose_2 = (1 - p_hold_a) * (1 - p_break_a)
            return p_win_2 / (p_win_2 + p_lose_2) if (p_win_2 + p_lose_2) > 0 else 0.5
        # Current game probability
        if a_serving:
            p_a_wins_game = self.prob_win_game(p_a)
        else:
            p_a_wins_game = 1 - self.prob_win_game(p_b)
        # A wins current game -> a_games + 1, server switches
        # A loses current game -> b_games + 1, server switches
        return (
            p_a_wins_game * self.prob_win_set(
                p_a, p_b, a_games + 1, b_games, not a_serving, is_final_set
            )
            + (1 - p_a_wins_game) * self.prob_win_set(
                p_a, p_b, a_games, b_games + 1, not a_serving, is_final_set
            )
        )
    def prob_win_match(
        self,
        p_a: float,
        p_b: float,
        a_sets: int = 0,
        b_sets: int = 0,
        a_games: int = 0,
        b_games: int = 0,
        a_serving: bool = True,
    ) -> float:
        """
        Probability player A wins the match from the current state.
        """
        # Terminal: match won
        if a_sets >= self.sets_to_win:
            return 1.0
        if b_sets >= self.sets_to_win:
            return 0.0
        is_final_set = (a_sets == self.sets_to_win - 1 and
                        b_sets == self.sets_to_win - 1)
        # Current set probability from current game score
        p_a_wins_set = self.prob_win_set(
            p_a, p_b, a_games, b_games, a_serving, is_final_set
        )
        # If A wins this set (simplified: the first server of the next set
        # is taken to be the same as the current server)
        p_win = p_a_wins_set * self.prob_win_match(
            p_a, p_b, a_sets + 1, b_sets, 0, 0, a_serving
        )
        # If B wins this set
        p_lose = (1 - p_a_wins_set) * self.prob_win_match(
            p_a, p_b, a_sets, b_sets + 1, 0, 0, a_serving
        )
        return p_win + p_lose
# --- Worked Example: Live Match Scenario ---
model = TennisLiveModel(
    p_serve_a=0.66,  # Strong server (e.g., Djokovic on hard court)
    p_serve_b=0.63,  # Solid server (e.g., Alcaraz on hard court)
    best_of=3,
)
# Pre-match probability
pre_match = model.prob_win_match(0.66, 0.63, a_serving=True)
print(f"Pre-match win probability (Player A): {pre_match:.2%}")
# After Player A wins first set 6-4
after_first_set = model.prob_win_match(
    0.66, 0.63,
    a_sets=1, b_sets=0,
    a_games=0, b_games=0,
    a_serving=True,
)
print(f"After A wins set 1 (6-4): {after_first_set:.2%}")
# Score is 1-0 sets, 3-4 games in set 2, B serving
trailing_set2 = model.prob_win_match(
    0.66, 0.63,
    a_sets=1, b_sets=0,
    a_games=3, b_games=4,
    a_serving=False,
)
print(f"1-0 sets, 3-4 in set 2 (B serving): {trailing_set2:.2%}")
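A useful sanity check on the point-to-game layer is the closed-form hold probability. The sketch below assumes points are independent and won by the server with a constant probability p, the same assumption the recursive model makes. The function name `hold_probability` is illustrative and not part of the chapter's `TennisLiveModel` class; its output is what `prob_win_game(p)` is intended to return from 0-0.

```python
def hold_probability(p: float) -> float:
    """P(server wins a game from 0-0), assuming i.i.d. points won with prob p."""
    q = 1.0 - p
    # Server wins 4 points while the returner wins 0, 1, or 2:
    # 1 way to win to love, C(4,1) = 4 ways to 15, C(5,2) = 10 ways to 30.
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach deuce (3 points each) in C(6,3) = 20 orderings ...
    reach_deuce = 20 * p**3 * q**3
    # ... then win two points in a row before the returner does.
    win_from_deuce = p**2 / (p**2 + q**2)
    return before_deuce + reach_deuce * win_from_deuce


for p in (0.50, 0.63, 0.66):
    print(f"p = {p:.2f} -> hold probability = {hold_probability(p):.4f}")
```

Comparing this value with the recursive result for a handful of p values is a quick way to catch errors in the deuce handling before trusting the set-level and match-level numbers built on top of it.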
Round-by-Round MMA/Boxing Modeling
Combat sports live modeling differs fundamentally from tennis because the scoring structure is less granular and the probability of a fight-ending event (KO, submission, stoppage) is ever-present.
MMA live model components:
- Round survival probability: The probability that the fight reaches the end of the current round without a finish. This depends on the pace of action, the type of strikes being landed, whether the fight is on the ground, and the fighters' historical finishing rates.
- Scorecard probability: If the fight goes to decision, the probability that each fighter wins on the scorecards. This is updated round by round based on which fighter appears to be winning each round. MMA uses the 10-point must system, with 10-9 rounds being standard, 10-8 rounds for dominant performances, and rare 10-7 rounds.
- Finish probability: The probability that a finish (KO/TKO/submission) occurs in the current round, and the conditional probability of which fighter finishes. This depends on the in-fight dynamics: a fighter who is landing heavy shots has a higher KO probability; a fighter who has secured a dominant grappling position has a higher submission probability.
The live win probability combines these:
$$P(A \text{ wins}) = P(\text{finish in round } r) \cdot P(A \text{ finishes} | \text{finish}) + P(\text{no finish in } r) \cdot P(A \text{ wins} | \text{no finish in } r)$$
where the second term recursively accounts for future rounds.
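The recursion in this formula can be unrolled by working backwards from the final bell. The following is a minimal sketch under stated assumptions: the function name `fight_win_prob` and its three inputs are hypothetical placeholders for whatever per-round estimates a model produces, and the numbers in the usage lines are illustrative rather than fitted values.

```python
from typing import Sequence


def fight_win_prob(
    finish_probs: Sequence[float],
    a_finish_shares: Sequence[float],
    p_a_decision: float,
) -> float:
    """
    P(A wins) over the remaining rounds, applying the recursion round by round.

    finish_probs[i]:    P(finish in the i-th remaining round | round is reached)
    a_finish_shares[i]: P(A is the finisher | a finish occurs in that round)
    p_a_decision:       P(A wins on the scorecards | fight goes the distance)
    """
    if len(finish_probs) != len(a_finish_shares):
        raise ValueError("need one finish share per remaining round")
    # Start from the decision outcome and fold in each round, last to first:
    # P_r = P(finish_r) * P(A finishes | finish_r) + P(no finish_r) * P_{r+1}
    p_a = p_a_decision
    for p_fin, a_share in zip(reversed(finish_probs), reversed(a_finish_shares)):
        p_a = p_fin * a_share + (1.0 - p_fin) * p_a
    return p_a


# Illustrative inputs: two rounds remain, A leads on the cards.
p = fight_win_prob(
    finish_probs=[0.13, 0.14],
    a_finish_shares=[0.55, 0.55],
    p_a_decision=0.70,
)
print(f"P(A wins) = {p:.2%}")
```

Working backwards keeps the computation linear in the number of remaining rounds and makes it clear that the decision probability only matters in proportion to the chance that no finish occurs.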
from dataclasses import dataclass
from typing import List
@dataclass
class RoundState:
    """Observed statistics for a single round."""
    round_number: int
    sig_strikes_a: int = 0
    sig_strikes_b: int = 0
    takedowns_a: int = 0
    takedowns_b: int = 0
    control_time_a: float = 0.0  # seconds
    control_time_b: float = 0.0  # seconds
    knockdowns_a: int = 0
    knockdowns_b: int = 0
    submission_attempts_a: int = 0
    submission_attempts_b: int = 0
class MMALiveModel:
    """
    Round-by-round live win probability model for MMA fights.
    Updates win probability based on observed round-level statistics,
    accounting for scorecard accumulation and finish probability.
    """
    def __init__(
        self,
        pre_fight_prob_a: float = 0.50,
        total_rounds: int = 3,
        base_finish_rate_per_round: float = 0.12,
        knockdown_finish_boost: float = 0.15,
        sub_attempt_finish_boost: float = 0.08,
    ):
        self.pre_fight_prob_a = pre_fight_prob_a
        self.total_rounds = total_rounds
        self.base_finish_rate = base_finish_rate_per_round
        self.knockdown_finish_boost = knockdown_finish_boost
        self.sub_attempt_finish_boost = sub_attempt_finish_boost
        self.rounds_completed: List[RoundState] = []
    def score_round(self, round_state: RoundState) -> dict:
        """
        Score a round and estimate the 10-point-must score.
        Returns estimated scorecard entries and round winner probability.
        """
        # Simple scoring model based on weighted statistics
        score_a = (
            round_state.sig_strikes_a * 1.0
            + round_state.takedowns_a * 3.0
            + round_state.control_time_a / 60 * 2.0
            + round_state.knockdowns_a * 8.0
            + round_state.submission_attempts_a * 2.0
        )
        score_b = (
            round_state.sig_strikes_b * 1.0
            + round_state.takedowns_b * 3.0
            + round_state.control_time_b / 60 * 2.0
            + round_state.knockdowns_b * 8.0
            + round_state.submission_attempts_b * 2.0
        )
        total = score_a + score_b
        if total == 0:
            # No recorded offense: under the 10-point must system an even
            # round is scored 10-10
            return {"winner": None, "score_a": 10, "score_b": 10,
                    "dominance": 0.0, "confidence": 0.5}
        dominance = (score_a - score_b) / total  # -1 to +1
        margin = abs(dominance)
        if margin < 0.15:
            # Close round: likely 10-9 either way with uncertainty
            loser_score, confidence = 9, 0.6
        elif margin < 0.45:
            # Clear round: 10-9
            loser_score, confidence = 9, 0.85
        else:
            # Dominant round: 10-8
            loser_score, confidence = 8, 0.90
        a_won = dominance > 0
        return {
            "winner": "a" if a_won else "b",
            "score_a": 10 if a_won else loser_score,
            "score_b": loser_score if a_won else 10,
            "dominance": dominance,
            "confidence": confidence,
        }
    def round_finish_probability(self, round_state: RoundState) -> dict:
        """
        Estimate probability of a finish occurring in the current round.
        Based on in-round action and historical base rates.
        """
        base = self.base_finish_rate
        # Knockdowns dramatically increase finish probability
        kd_boost_a = round_state.knockdowns_a * self.knockdown_finish_boost
        kd_boost_b = round_state.knockdowns_b * self.knockdown_finish_boost
        # Submission attempts increase finish probability
        sub_boost_a = round_state.submission_attempts_a * self.sub_attempt_finish_boost
        sub_boost_b = round_state.submission_attempts_b * self.sub_attempt_finish_boost
        # Total finish probability (capped)
        total_finish = min(0.80, base + kd_boost_a + kd_boost_b +
                           sub_boost_a + sub_boost_b)
        # Conditional: who finishes given a finish occurs
        # (the 0.01 floor avoids division by zero when neither fighter
        # has a knockdown or submission attempt)
        a_danger = kd_boost_a + sub_boost_a + 0.01
        b_danger = kd_boost_b + sub_boost_b + 0.01
        p_a_finishes = a_danger / (a_danger + b_danger)
        return {
            "total_finish_prob": round(total_finish, 4),
            "a_finishes_given_finish": round(p_a_finishes, 4),
            "b_finishes_given_finish": round(1 - p_a_finishes, 4),
        }
    def update_win_probability(self, round_state: RoundState) -> dict:
        """
        Update live win probability after observing a round.
        Combines scorecard accumulation with finish probability
        for remaining rounds.
        """
        self.rounds_completed.append(round_state)
        # Score all completed rounds
        scorecard_a = 0
        scorecard_b = 0
        for rs in self.rounds_completed:
            scored = self.score_round(rs)
            scorecard_a += scored["score_a"]
            scorecard_b += scored["score_b"]
        rounds_remaining = self.total_rounds - len(self.rounds_completed)
        # Decision probability given fight goes to distance
        if rounds_remaining == 0:
            # Fight is over (went to decision)
            if scorecard_a > scorecard_b:
                decision_prob_a = 0.92  # Account for judging variance
            elif scorecard_b > scorecard_a:
                decision_prob_a = 0.08
            else:
                decision_prob_a = 0.50  # Draw
        else:
            # Estimate: each remaining round is roughly 50-50 on cards
            # but current leader has structural advantage
            card_lead = scorecard_a - scorecard_b
            # Each remaining round can swing by +/- 1 (10-9) or +/- 2 (10-8)
            # Simplified: use lead and rounds remaining
            decision_prob_a = 1 / (1 + 10 ** (-card_lead / (rounds_remaining + 1)))
        # Finish probability for remaining rounds
        finish_prob_per_round = self.base_finish_rate
        # Fatigue increases finish probability in later rounds
        fatigue_factor = 1.0 + 0.1 * len(self.rounds_completed)
        adjusted_finish_rate = min(0.25, finish_prob_per_round * fatigue_factor)
        # Combine paths to victory
        # Path 1: Fight goes to decision
        p_no_finish = (1 - adjusted_finish_rate) ** rounds_remaining
        p_a_decision = p_no_finish * decision_prob_a
        # Path 2: Finish in a remaining round
        # Approximate: use pre-fight probability as proxy for who finishes,
        # shifted by momentum (average card lead per completed round)
        p_finish = 1 - p_no_finish
        momentum = (scorecard_a - scorecard_b) / max(len(self.rounds_completed), 1)
        momentum_shift = min(0.15, max(-0.15, momentum * 0.05))
        # Clamp so the shifted value remains a valid probability
        p_a_finishes_given_finish = min(
            1.0, max(0.0, self.pre_fight_prob_a + momentum_shift)
        )
        p_a_finish = p_finish * p_a_finishes_given_finish
        total_prob_a = p_a_decision + p_a_finish
        return {
            "rounds_completed": len(self.rounds_completed),
            "scorecard": (scorecard_a, scorecard_b),
            "card_lead_a": scorecard_a - scorecard_b,
            "rounds_remaining": rounds_remaining,
            "decision_prob_a": round(decision_prob_a, 4),
            "p_goes_to_decision": round(p_no_finish, 4),
            "p_finish_remaining": round(p_finish, 4),
            "overall_win_prob_a": round(min(0.99, max(0.01, total_prob_a)), 4),
        }
# --- Worked Example: 3-Round MMA Fight ---
live_model = MMALiveModel(
    pre_fight_prob_a=0.55,
    total_rounds=3,
    base_finish_rate_per_round=0.12,
)
# Round 1: Fighter A dominates with takedowns and control
round1 = RoundState(
    round_number=1,
    sig_strikes_a=35, sig_strikes_b=18,
    takedowns_a=3, takedowns_b=0,
    control_time_a=120, control_time_b=10,
    knockdowns_a=0, knockdowns_b=0,
    submission_attempts_a=1, submission_attempts_b=0,
)
result1 = live_model.update_win_probability(round1)
print(f"After Round 1: Win Prob A = {result1['overall_win_prob_a']:.2%}")
print(f"  Scorecard: {result1['scorecard']}")
# Round 2: Fighter B comes back with strikes, knockdown
round2 = RoundState(
    round_number=2,
    sig_strikes_a=22, sig_strikes_b=45,
    takedowns_a=0, takedowns_b=1,
    control_time_a=15, control_time_b=60,
    knockdowns_a=0, knockdowns_b=1,
    submission_attempts_a=0, submission_attempts_b=0,
)
result2 = live_model.update_win_probability(round2)
print(f"\nAfter Round 2: Win Prob A = {result2['overall_win_prob_a']:.2%}")
print(f"  Scorecard: {result2['scorecard']}")
print(f"  Card lead A: {result2['card_lead_a']}")
Identifying Live Betting Value
The practical application of live models is to compare the model's win probability to the live market price. In tennis, live markets adjust every few points, and the most profitable opportunities typically arise:
- After a break of serve: Markets often overreact to breaks, especially early breaks that have limited predictive value for the overall match.
- In the second set after a dominant first set: If Player A wins the first set 6-1, markets may overprice Player A, underestimating the likelihood that the first set was unrepresentatively one-sided.
- During momentum swings: Markets can lag behind genuine shifts in player condition (injury, fatigue) or overreact to superficial momentum (a few consecutive points).
In MMA, live betting opportunities arise between rounds and during action stoppages. The key is that the model systematically accounts for scorecard state, finish probability, and remaining time --- factors that casual bettors may misjudge, particularly in close fights where one knockdown or submission attempt dramatically shifts the probabilities.
Market Insight: Live betting in tennis represents one of the most accessible edges available to quantitative bettors. The match is long (often 90+ minutes), the market updates frequently, and a well-calibrated point-by-point model can identify dozens of opportunities per match where the live price deviates meaningfully from the model's assessment. The primary challenge is execution speed: live odds change rapidly, and the bettor must have systems in place to identify and execute on opportunities within seconds.
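Comparing a model probability with a live price comes down to a small amount of arithmetic. The sketch below is one way to set it up, under stated assumptions: prices are quoted as two-way decimal odds, the bookmaker margin is removed proportionally (one common convention among several), and the `min_ev` threshold of 2% is an arbitrary illustrative value. The function names are hypothetical and not part of the chapter's model classes.

```python
from typing import Optional, Tuple


def implied_probs(odds_a: float, odds_b: float) -> Tuple[float, float]:
    """Margin-free implied probabilities from two-way decimal odds."""
    raw_a, raw_b = 1.0 / odds_a, 1.0 / odds_b
    total = raw_a + raw_b  # exceeds 1 by the bookmaker's margin
    return raw_a / total, raw_b / total


def expected_value(model_prob: float, decimal_odds: float) -> float:
    """Expected profit per unit staked at the quoted price."""
    return model_prob * (decimal_odds - 1.0) - (1.0 - model_prob)


def find_value(
    model_prob_a: float, odds_a: float, odds_b: float, min_ev: float = 0.02
) -> Optional[str]:
    """Return 'A' or 'B' if that side clears the EV threshold, else None."""
    ev_a = expected_value(model_prob_a, odds_a)
    ev_b = expected_value(1.0 - model_prob_a, odds_b)
    best_side, best_ev = ("A", ev_a) if ev_a >= ev_b else ("B", ev_b)
    return best_side if best_ev >= min_ev else None


# Illustrative numbers only: model says 60% for A, market offers 1.80 / 2.05.
market_a, market_b = implied_probs(1.80, 2.05)
print(f"Market-implied P(A) = {market_a:.2%}, model P(A) = 60.00%")
print(f"EV on A per unit staked = {expected_value(0.60, 1.80):+.3f}")
print(f"Side with value: {find_value(0.60, 1.80, 2.05)}")
```

Expected value is computed against the quoted price rather than the margin-free probability, because the quoted price is what the bettor is actually paid. The margin-free figure is useful mainly for judging how far the model disagrees with the market.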
21.6 Chapter Summary
This chapter developed a comprehensive quantitative framework for modeling individual sports, with particular depth in tennis and combat sports (MMA and boxing).
Key concepts and methods covered:
- Elo and Glicko-2 Ratings (Section 21.1): We adapted the classical Elo rating system for individual sports, addressing sport-specific challenges including K-factor calibration (20--32 for tennis, 100--200 for MMA), inactivity decay for irregular competition schedules, and promotion-tier-based initialization for new entrants. The Glicko-2 system extends Elo with rating deviation and volatility parameters that naturally handle the variable activity levels common in combat sports.
- Style Matchup Analysis (Section 21.2): We quantified the "styles make fights" principle by constructing matchup matrices that encode how specific style interactions modify Elo-expected outcomes. The framework applies to both combat sports (striker vs. wrestler, grappler vs. pressure fighter) and tennis (serve-dominant vs. defensive baseline, net rusher vs. rally tolerance player). Matchup adjustments operate on the log-odds scale and are derived empirically from stratified historical data.
- Surface and Venue Effects (Section 21.3): We demonstrated that tennis performance varies substantially across surfaces (hard, clay, grass), with surface affinity scores of ±20% common among top professionals. The surface-adjusted Elo system maintains parallel ratings per surface, blended with the overall rating using a logistic weighting function that increases the surface component as more surface-specific data accumulates. Indoor/outdoor and altitude effects provide additional modeling dimensions.
- Physical Attributes in Combat Sports (Section 21.4): Reach advantages, age curves, weight cut effects, and chin deterioration were modeled as systematic adjustments to baseline predictions. Each inch of reach advantage beyond a threshold contributes approximately 1.2% to win probability. Age curves show peak performance at 26--30 with accelerating decline thereafter. Weight cuts exceeding 12% of walk-around weight incur measurable performance penalties. Chin deterioration, proxied by KO losses and cumulative strikes absorbed, is progressive and irreversible.
- Live In-Match/In-Fight Modeling (Section 21.5): Point-by-point tennis models exploit the hierarchical scoring structure (points to games to sets to matches) to compute exact win probabilities at any score state. Round-by-round MMA models combine scorecard accumulation, finish probability, and fatigue effects. Both frameworks identify live betting value where the market price diverges from the model's real-time assessment.
Practical implementation checklist for individual sport modeling:
- [ ] Collect and clean historical results data (match outcomes, scores, fighter statistics)
- [ ] Implement and calibrate an Elo or Glicko-2 system with sport-appropriate parameters
- [ ] Classify competitors into style archetypes and build empirical matchup matrices
- [ ] For tennis: maintain surface-specific ratings with proper blending
- [ ] For combat sports: incorporate physical attributes (reach, age, weight cut, chin)
- [ ] Build a live model for real-time win probability estimation
- [ ] Establish a systematic comparison framework between model probabilities and market prices
- [ ] Backtest the full pipeline on historical data before risking capital
The individual sports modeler who combines strong rating foundations with style, surface/physical, and live modeling components possesses a significant structural advantage over both the general public (which lacks systematic frameworks) and the market (which must price thousands of events and cannot specialize as deeply in each one).
Review Questions:
- Why does the optimal K-factor for MMA Elo (100--200) differ so dramatically from the optimal K-factor for tennis Elo (20--32)? What properties of each sport drive this difference?
- Explain why matchup adjustments should be applied on the log-odds scale rather than directly on the probability scale. What distortions can arise from additive adjustments on the probability scale?
- A tennis player has an overall Elo rating of 1700, a clay-specific Elo rating of 1820, and has played 35 matches on clay. Using the logistic blending formula with a half-weight threshold of 50 matches, what is their blended clay rating?
- Describe two mechanisms through which age affects combat sports performance differently than it affects team sport performance.
- In a best-of-three tennis match where Player A has a serve-point-win probability of 0.68 and Player B has 0.62, compute the approximate pre-match win probability for Player A using the hierarchical point-game-set-match framework.
- Why does the probability of a finish in an MMA fight tend to increase in later rounds? Identify both the physical and strategic factors involved.
Exercises:
- (Programming) Extend the CombatSportsElo class to support weight-class-specific Elo ratings, analogous to the surface-specific tennis Elo. A fighter competing in multiple weight classes should carry separate ratings for each, blended with their overall rating.
- (Analysis) Using publicly available UFC statistics, classify the current UFC lightweight top 15 into style archetypes. Compute a 6x6 matchup matrix from historical lightweight fights and identify which style matchups produce the largest deviations from Elo expectations.
- (Modeling) Build a surface-adjusted Elo system using ATP match data from the last 5 seasons. Evaluate its predictive accuracy (log-loss) against: (a) a standard single-surface Elo system, (b) ATP ranking-based predictions, and (c) closing market odds. Report the improvement from surface adjustment at the Grand Slam level versus lower-tier events.
- (Live Modeling) Implement the full point-by-point tennis model and validate it against observed in-play odds from a sample of 50 matches. At which score states does the model most frequently identify value relative to market prices?