Case Study 2: MMA Fight Prediction System with Style Matchup and Physical Attributes
Overview
In this case study, we build a comprehensive MMA fight prediction system that combines three layers of analysis: Elo-based ratings for baseline skill estimation, style matchup adjustments that capture the "styles make fights" phenomenon, and physical attribute modeling that accounts for reach, age, weight cuts, and chin deterioration. We build profiles for five UFC lightweights, generate predictions for plausible matchups, update ratings after a short sequence of fight results, and measure how much each component shifts the predicted probability.
The Problem
Predicting MMA outcomes is uniquely challenging among sports. Fighters compete infrequently (2-3 bouts per year), meaning rating systems have few data points to work with. The outcome of any fight depends on the interaction between two specific skill profiles: a wrestler's dominance against strikers does not predict their performance against submission specialists. Physical attributes play a larger role than in most sports, with reach advantages, age-related decline, and the accumulated damage reflected in chin deterioration all creating measurable effects.
A model that uses only Elo ratings will miss the style-dependent variance in outcomes. A model that adds style matchups but ignores physical attributes will miss the slow degradation of aging fighters or the significance of a five-inch reach advantage. Our goal is to build a system that integrates all three layers into a single prediction, quantifying how much each contributes.
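The integration can be written compactly. The following is a sketch of the combination rule as the implementation in this case study applies it; the symbols ($p_{\text{Elo}}$, $w_s$, $M$, and the $\Delta$ terms) are notation introduced here for illustration, with $R_A$, $R_B$ the Elo ratings and $s_A$, $s_B$ the style archetypes of the two fighters:

```latex
\begin{aligned}
p_{\text{Elo}} &= \frac{1}{1 + 10^{(R_B - R_A)/400}}, \qquad
\operatorname{logit}(p) = \ln\frac{p}{1-p} \\
\operatorname{logit}(p_{\text{final}}) &= \operatorname{logit}(p_{\text{Elo}})
  + w_s \, M[s_A, s_B]
  + \Delta_{\text{reach}} + \Delta_{\text{age}} + \Delta_{\text{cut}} + \Delta_{\text{KO}}
\end{aligned}
```

Because every layer enters as an additive term on the log-odds scale, the contribution of each layer can be read off by adding the terms one at a time and converting back to a probability.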
Data Requirements
Our system requires three categories of data for each fighter. The rating data consists of fight history including opponents, results, and method of victory. The style data consists of career statistics: significant strikes per minute, takedown average per 15 minutes, submission attempts per 15 minutes, strike defense percentage, and takedown defense percentage. The physical data consists of age, height, reach, weight class, walk-around weight, career fights, KO/TKO losses, and total significant strikes absorbed.
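Data sources often report raw career totals rather than the rates listed above. A minimal sketch of the normalization follows, assuming the inputs are total counts and total fight time in seconds; the function names and the sample totals are hypothetical and not part of the system in this case study.

```python
def per_minute(count: int, total_fight_seconds: float) -> float:
    """Rate per minute of fight time (e.g. significant strikes landed)."""
    if total_fight_seconds <= 0:
        return 0.0
    return count / (total_fight_seconds / 60.0)


def per_15_minutes(count: int, total_fight_seconds: float) -> float:
    """Rate per 15 minutes of fight time (e.g. takedowns, submission attempts)."""
    return per_minute(count, total_fight_seconds) * 15.0


def defense_rate(attempts_faced: int, attempts_succeeded: int) -> float:
    """Fraction of opponents' attempts that did NOT succeed."""
    if attempts_faced <= 0:
        return 0.0
    return 1.0 - attempts_succeeded / attempts_faced


# Hypothetical career totals, invented for illustration only.
total_seconds = 10 * 15 * 60          # ten full three-round fights
sig_strikes_per_min = per_minute(600, total_seconds)
takedown_avg = per_15_minutes(30, total_seconds)
takedown_defense = defense_rate(attempts_faced=200, attempts_succeeded=70)

print(f"{sig_strikes_per_min:.2f} strikes/min, "
      f"{takedown_avg:.2f} takedowns/15 min, "
      f"{takedown_defense:.0%} takedown defense")
```

Guarding against zero fight time or zero attempts matters in practice, since debuting fighters have no career totals yet.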
Implementation
"""
MMA Fight Prediction System
Integrates Elo ratings, style matchup adjustments, and physical attributes.
"""

import math
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class MMAFighter:
    """Complete fighter profile for multi-layer prediction."""

    name: str
    elo: float = 1500.0
    fights: int = 0
    last_fight: Optional[date] = None

    # Style statistics
    sig_strikes_per_min: float = 4.0
    takedown_avg: float = 1.5
    submission_avg: float = 0.5
    sig_strike_defense: float = 0.55
    takedown_defense: float = 0.65

    # Physical attributes
    age: float = 28.0
    height_inches: float = 72.0
    reach_inches: float = 74.0
    weight_class_lbs: float = 170.0
    walk_around_weight_lbs: float = 190.0
    ko_tko_losses: int = 0
    total_sig_strikes_absorbed: int = 0

    # Derived
    style: str = ""
    prediction_history: List[Dict] = field(default_factory=list)


class MMAFightPredictor:
    """
    Multi-layer MMA fight prediction system.

    Combines Elo rating, style matchup matrix, and physical attribute
    adjustments into a single win probability. All adjustments are
    added to the Elo baseline on the log-odds scale.

    Args:
        base_k: Base K-factor for Elo updates.
        new_fighter_k_mult: K-factor multiplier for fighters with < 5 fights.
        reach_coeff: Log-odds adjustment per inch of reach beyond threshold.
        reach_threshold: Minimum reach difference for adjustment to apply.
        age_decline_rate: Decline rate per year past peak age window.
        weight_cut_threshold: Fraction of walk-around weight defining severe cut.
        weight_cut_penalty: Penalty per 5 percentage points of excess cut.
        ko_vuln_per_loss: Vulnerability increase per KO/TKO loss.
        style_adjustment_weight: Scaling factor for matchup matrix adjustments.
    """

    STYLE_ARCHETYPES = [
        "striker", "wrestler", "grappler",
        "balanced", "counter_striker", "pressure_fighter",
    ]

    # Scales the K-factor by how decisive the result was.
    FINISH_MULTIPLIERS = {
        "ko_tko": 1.25, "submission": 1.20,
        "decision_unanimous": 1.00, "decision_split": 0.85,
        "decision_majority": 0.92, "draw": 0.50,
    }

    # Row style vs. column style; positive values favor the row style.
    DEFAULT_MATCHUP_MATRIX = {
        "striker": {"striker": 0.0, "wrestler": -0.06, "grappler": -0.04,
                    "balanced": 0.01, "counter_striker": 0.03,
                    "pressure_fighter": -0.02},
        "wrestler": {"striker": 0.06, "wrestler": 0.0, "grappler": 0.02,
                     "balanced": 0.02, "counter_striker": 0.05,
                     "pressure_fighter": 0.04},
        "grappler": {"striker": 0.04, "wrestler": -0.02, "grappler": 0.0,
                     "balanced": 0.01, "counter_striker": 0.03,
                     "pressure_fighter": 0.02},
        "balanced": {"striker": -0.01, "wrestler": -0.02, "grappler": -0.01,
                     "balanced": 0.0, "counter_striker": 0.01,
                     "pressure_fighter": 0.0},
        "counter_striker": {"striker": -0.03, "wrestler": -0.05, "grappler": -0.03,
                            "balanced": -0.01, "counter_striker": 0.0,
                            "pressure_fighter": -0.04},
        "pressure_fighter": {"striker": 0.02, "wrestler": -0.04, "grappler": -0.02,
                             "balanced": 0.0, "counter_striker": 0.04,
                             "pressure_fighter": 0.0},
    }

    def __init__(
        self,
        base_k: float = 120.0,
        new_fighter_k_mult: float = 1.5,
        reach_coeff: float = 0.012,
        reach_threshold: float = 2.5,
        age_decline_rate: float = 0.025,
        weight_cut_threshold: float = 0.12,
        weight_cut_penalty: float = 0.04,
        ko_vuln_per_loss: float = 0.03,
        style_adjustment_weight: float = 1.0,
    ):
        self.base_k = base_k
        self.new_fighter_k_mult = new_fighter_k_mult
        self.reach_coeff = reach_coeff
        self.reach_threshold = reach_threshold
        self.age_decline_rate = age_decline_rate
        self.weight_cut_threshold = weight_cut_threshold
        self.weight_cut_penalty = weight_cut_penalty
        self.ko_vuln_per_loss = ko_vuln_per_loss
        self.style_adjustment_weight = style_adjustment_weight
        self.matchup_matrix = self.DEFAULT_MATCHUP_MATRIX
        self.fighters: Dict[str, MMAFighter] = {}

    def add_fighter(self, fighter: MMAFighter) -> None:
        """Register a fighter with the system."""
        fighter.style = self._classify_style(fighter)
        self.fighters[fighter.name] = fighter

    def _classify_style(self, fighter: MMAFighter) -> str:
        """Classify fighter into a style archetype."""
        sspm = fighter.sig_strikes_per_min
        td = fighter.takedown_avg
        sub = fighter.submission_avg
        str_def = fighter.sig_strike_defense
        td_def = fighter.takedown_defense

        # Rules are checked in priority order; the first match wins.
        if td > 3.5 and td_def > 0.70:
            return "wrestler"
        elif sub > 1.5:
            return "grappler"
        elif sspm > 6.0 and str_def < 0.55:
            return "pressure_fighter"
        elif sspm < 3.5 and str_def > 0.62:
            return "counter_striker"
        elif sspm > 5.0:
            return "striker"
        else:
            return "balanced"

    def _elo_probability(self, ra: float, rb: float) -> float:
        """Standard Elo expected score."""
        return 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))

    def _style_adjustment(self, style_a: str, style_b: str) -> float:
        """Get matchup matrix adjustment value."""
        adj = self.matchup_matrix.get(style_a, {}).get(style_b, 0.0)
        return adj * self.style_adjustment_weight

    def _reach_adj(self, fa: MMAFighter, fb: MMAFighter) -> float:
        """Reach-based adjustment (positive favors fighter A)."""
        diff = fa.reach_inches - fb.reach_inches
        if abs(diff) < self.reach_threshold:
            return 0.0
        effective = abs(diff) - self.reach_threshold
        adj = self.reach_coeff * effective
        return adj if diff > 0 else -adj

    def _age_adj(self, fighter: MMAFighter) -> float:
        """Age-based performance adjustment (0 at peak, negative past peak)."""
        if fighter.age <= 30.0:
            return 0.0
        years_past = fighter.age - 30.0
        return -self.age_decline_rate * years_past ** 1.3

    def _weight_cut_adj(self, fighter: MMAFighter) -> float:
        """Weight cut penalty for severe cuts."""
        cut_pct = (
            (fighter.walk_around_weight_lbs - fighter.weight_class_lbs)
            / fighter.walk_around_weight_lbs
        )
        if cut_pct <= self.weight_cut_threshold:
            return 0.0
        excess = cut_pct - self.weight_cut_threshold
        return -self.weight_cut_penalty * (excess / 0.05)

    def _ko_vulnerability(self, fighter: MMAFighter) -> float:
        """Knockout vulnerability index."""
        base = 0.02
        ko_factor = self.ko_vuln_per_loss * fighter.ko_tko_losses
        strike_factor = 0.01 * fighter.total_sig_strikes_absorbed / 1000
        age_factor = max(0, (fighter.age - 30) * 0.005)
        return base + ko_factor + strike_factor + age_factor

    def predict(self, name_a: str, name_b: str) -> Dict:
        """
        Generate comprehensive fight prediction.

        Combines Elo baseline, style matchup adjustment, and physical
        attribute adjustments into a single probability estimate.
        """
        fa = self.fighters[name_a]
        fb = self.fighters[name_b]

        # Layer 1: Elo baseline
        elo_prob = self._elo_probability(fa.elo, fb.elo)

        # Layer 2: Style matchup (on log-odds scale)
        style_adj = self._style_adjustment(fa.style, fb.style)

        # Layer 3: Physical attributes (each net term is A minus B)
        reach_adj = self._reach_adj(fa, fb)
        age_a = self._age_adj(fa)
        age_b = self._age_adj(fb)
        net_age = age_a - age_b
        cut_a = self._weight_cut_adj(fa)
        cut_b = self._weight_cut_adj(fb)
        net_cut = cut_a - cut_b
        ko_a = self._ko_vulnerability(fa)
        ko_b = self._ko_vulnerability(fb)
        net_ko = -(ko_a - ko_b)  # higher vulnerability hurts, so flip the sign
        total_physical = reach_adj + net_age + net_cut + net_ko

        # Combine on log-odds scale
        elo_clipped = max(0.01, min(0.99, elo_prob))
        log_odds = math.log(elo_clipped / (1 - elo_clipped))
        adjusted_log_odds = log_odds + style_adj + total_physical
        final_prob = 1.0 / (1.0 + math.exp(-adjusted_log_odds))
        final_prob = max(0.01, min(0.99, final_prob))

        return {
            "fighter_a": name_a,
            "fighter_b": name_b,
            "style_a": fa.style,
            "style_b": fb.style,
            "elo_a": round(fa.elo, 1),
            "elo_b": round(fb.elo, 1),
            "elo_prob_a": round(elo_prob, 4),
            "style_adjustment": round(style_adj, 4),
            "physical_adjustment": round(total_physical, 4),
            "components": {
                "reach": round(reach_adj, 4),
                "net_age": round(net_age, 4),
                "net_weight_cut": round(net_cut, 4),
                "net_ko_vulnerability": round(net_ko, 4),
            },
            "final_prob_a": round(final_prob, 4),
            "final_prob_b": round(1 - final_prob, 4),
        }

    def update_after_fight(
        self,
        winner: str,
        loser: str,
        method: str,
        fight_date: date,
    ) -> Dict:
        """Update Elo ratings and fighter stats after a fight."""
        fw = self.fighters[winner]
        fl = self.fighters[loser]
        pre_w, pre_l = fw.elo, fl.elo
        exp_w = self._elo_probability(fw.elo, fl.elo)

        # Each fighter gets their own K-factor: larger for newcomers.
        k_w = self.base_k
        k_l = self.base_k
        if fw.fights < 5:
            k_w *= self.new_fighter_k_mult
        if fl.fights < 5:
            k_l *= self.new_fighter_k_mult

        finish_mult = self.FINISH_MULTIPLIERS.get(method, 1.0)
        k_w *= finish_mult
        k_l *= finish_mult

        # Winner scored 1 against expectation exp_w; loser scored 0
        # against expectation (1 - exp_w).
        fw.elo += k_w * (1.0 - exp_w)
        fl.elo += k_l * (0.0 - (1.0 - exp_w))

        fw.fights += 1
        fl.fights += 1
        fw.last_fight = fight_date
        fl.last_fight = fight_date
        if method in ("ko_tko",):
            fl.ko_tko_losses += 1

        return {
            "winner": winner, "loser": loser, "method": method,
            "pre_elo": (round(pre_w, 1), round(pre_l, 1)),
            "post_elo": (round(fw.elo, 1), round(fl.elo, 1)),
        }

    def component_contribution_analysis(
        self, name_a: str, name_b: str
    ) -> Dict:
        """
        Analyze how much each model layer contributes to the final prediction.

        Computes the prediction cumulatively: Elo only, Elo plus style,
        and the full model. Contributions are differences in probability.
        """
        fa = self.fighters[name_a]
        fb = self.fighters[name_b]

        # Elo only
        elo_only = self._elo_probability(fa.elo, fb.elo)

        # Elo + Style
        style_adj = self._style_adjustment(fa.style, fb.style)
        elo_clipped = max(0.01, min(0.99, elo_only))
        log_odds_elo = math.log(elo_clipped / (1 - elo_clipped))
        elo_style = 1.0 / (1.0 + math.exp(-(log_odds_elo + style_adj)))

        # Full model
        full = self.predict(name_a, name_b)

        return {
            "matchup": f"{name_a} vs {name_b}",
            "elo_only_prob_a": round(elo_only, 4),
            "elo_plus_style_prob_a": round(elo_style, 4),
            "full_model_prob_a": full["final_prob_a"],
            "style_contribution": round(elo_style - elo_only, 4),
            "physical_contribution": round(full["final_prob_a"] - elo_style, 4),
            "total_adjustment": round(full["final_prob_a"] - elo_only, 4),
        }


def main() -> None:
    """Run the MMA fight prediction case study."""
    print("=" * 70)
    print("Case Study: MMA Multi-Layer Fight Prediction System")
    print("=" * 70)

    system = MMAFightPredictor()

    # Create fighter profiles for a realistic lightweight division
    fighters_data = [
        MMAFighter("Islam Makhachev", elo=1780, fights=25,
                   sig_strikes_per_min=4.2, takedown_avg=4.1,
                   submission_avg=0.9, sig_strike_defense=0.63,
                   takedown_defense=0.88,
                   age=32.5, height_inches=70, reach_inches=70.5,
                   weight_class_lbs=155, walk_around_weight_lbs=180,
                   ko_tko_losses=0, total_sig_strikes_absorbed=420),
        MMAFighter("Charles Oliveira", elo=1720, fights=42,
                   sig_strikes_per_min=3.5, takedown_avg=2.2,
                   submission_avg=1.8, sig_strike_defense=0.52,
                   takedown_defense=0.58,
                   age=34.5, height_inches=70, reach_inches=74.0,
                   weight_class_lbs=155, walk_around_weight_lbs=178,
                   ko_tko_losses=3, total_sig_strikes_absorbed=650),
        MMAFighter("Justin Gaethje", elo=1650, fights=28,
                   sig_strikes_per_min=7.6, takedown_avg=0.5,
                   submission_avg=0.0, sig_strike_defense=0.54,
                   takedown_defense=0.72,
                   age=35.5, height_inches=69, reach_inches=70.0,
                   weight_class_lbs=155, walk_around_weight_lbs=180,
                   ko_tko_losses=4, total_sig_strikes_absorbed=780),
        MMAFighter("Dustin Poirier", elo=1680, fights=38,
                   sig_strikes_per_min=5.8, takedown_avg=0.8,
                   submission_avg=0.8, sig_strike_defense=0.52,
                   takedown_defense=0.62,
                   age=35.0, height_inches=69, reach_inches=72.0,
                   weight_class_lbs=155, walk_around_weight_lbs=182,
                   ko_tko_losses=3, total_sig_strikes_absorbed=700),
        MMAFighter("Arman Tsarukyan", elo=1700, fights=22,
                   sig_strikes_per_min=5.2, takedown_avg=3.8,
                   submission_avg=0.4, sig_strike_defense=0.60,
                   takedown_defense=0.85,
                   age=27.5, height_inches=69, reach_inches=72.0,
                   weight_class_lbs=155, walk_around_weight_lbs=177,
                   ko_tko_losses=0, total_sig_strikes_absorbed=290),
    ]
    for f in fighters_data:
        system.add_fighter(f)

    # Display fighter profiles
    print("\nFighter Profiles:")
    print(f"  {'Name':<22} {'Elo':>6} {'Style':<18} {'Age':>5} {'Reach':>6}")
    print(f"  {'-'*22} {'-'*6} {'-'*18} {'-'*5} {'-'*6}")
    for f in fighters_data:
        print(
            f"  {f.name:<22} {f.elo:>6.0f} {f.style:<18} "
            f"{f.age:>5.1f} {f.reach_inches:>5.1f}\""
        )

    # Generate predictions for key matchups
    matchups = [
        ("Islam Makhachev", "Charles Oliveira"),
        ("Islam Makhachev", "Justin Gaethje"),
        ("Islam Makhachev", "Arman Tsarukyan"),
        ("Charles Oliveira", "Dustin Poirier"),
        ("Justin Gaethje", "Dustin Poirier"),
        ("Arman Tsarukyan", "Charles Oliveira"),
    ]
    print("\n" + "=" * 70)
    print("Fight Predictions (Multi-Layer)")
    print("=" * 70)
    for fa_name, fb_name in matchups:
        pred = system.predict(fa_name, fb_name)
        analysis = system.component_contribution_analysis(fa_name, fb_name)
        print(f"\n  {fa_name} vs {fb_name}")
        print(f"    Styles: {pred['style_a']} vs {pred['style_b']}")
        print(f"    Elo: {pred['elo_a']} vs {pred['elo_b']}")
        print(f"    Elo-only probability:  {analysis['elo_only_prob_a']:.1%}")
        print(f"    + Style matchup:       {analysis['style_contribution']:+.1%}")
        print(f"    + Physical attributes: {analysis['physical_contribution']:+.1%}")
        print(f"    = Final probability:   {pred['final_prob_a']:.1%} / {pred['final_prob_b']:.1%}")
        # Physical components are reported on the log-odds scale.
        print(f"    Physical breakdown: reach={pred['components']['reach']:+.3f}, "
              f"age={pred['components']['net_age']:+.3f}, "
              f"cut={pred['components']['net_weight_cut']:+.3f}, "
              f"KO_vuln={pred['components']['net_ko_vulnerability']:+.3f}")

    # Process some fight results
    print("\n" + "=" * 70)
    print("Processing Fight Results")
    print("=" * 70)
    fight_results = [
        ("Islam Makhachev", "Dustin Poirier", "submission", date(2024, 6, 1)),
        ("Arman Tsarukyan", "Charles Oliveira", "decision_unanimous", date(2024, 6, 1)),
    ]
    for winner, loser, method, d in fight_results:
        result = system.update_after_fight(winner, loser, method, d)
        print(f"\n  {winner} def. {loser} via {method}")
        print(f"    Pre-fight Elo:  {result['pre_elo']}")
        print(f"    Post-fight Elo: {result['post_elo']}")

    # Re-predict with updated ratings
    print("\n  Updated prediction: Makhachev vs Tsarukyan")
    updated = system.predict("Islam Makhachev", "Arman Tsarukyan")
    print(f"    Final: {updated['final_prob_a']:.1%} / {updated['final_prob_b']:.1%}")
    print("\n" + "=" * 70)


if __name__ == "__main__":
    main()

Results and Analysis
The multi-layer system reveals dynamics that a pure Elo system would miss. Consider the Makhachev versus Gaethje matchup. Makhachev's Elo advantage (1780 vs 1650) gives him a baseline probability of roughly 68%. The style matchup layer further favors Makhachev: as a wrestler facing a pressure_fighter, the matchup matrix adds a positive adjustment (+0.04 in log-odds). The physical attribute layer compounds this: Gaethje at 35.5 with four KO/TKO losses and 780 significant strikes absorbed has an elevated knockout vulnerability index and a larger age penalty, while Makhachev at 32.5 with zero KO losses is only 2.5 years past the model's peak age of 30. With the default coefficients, the total adjustment from Elo-only to the full model adds roughly 6 to 7 percentage points to Makhachev's probability.
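The arithmetic behind this matchup can be reproduced outside the class. The following is a standalone sketch that hard-codes the default coefficients and the two fighter profiles from the listing; the helper names (`age_adj`, `ko_vuln`) are illustrative re-implementations of the corresponding private methods, not additional API.

```python
import math

# Makhachev (A) vs Gaethje (B), values copied from the fighter profiles.
elo_a, elo_b = 1780.0, 1650.0
style_adj = 0.04  # wrestler vs pressure_fighter in the default matrix


def age_adj(age: float, rate: float = 0.025, peak: float = 30.0) -> float:
    """Zero up to the peak age, then a negative, accelerating penalty."""
    return 0.0 if age <= peak else -rate * (age - peak) ** 1.3


def ko_vuln(age: float, ko_losses: int, strikes_absorbed: int) -> float:
    """Base + KO/TKO losses + absorbed strikes + age component."""
    return (0.02 + 0.03 * ko_losses
            + 0.01 * strikes_absorbed / 1000
            + max(0.0, (age - 30.0) * 0.005))


elo_prob = 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))
net_age = age_adj(32.5) - age_adj(35.5)
net_ko = -(ko_vuln(32.5, 0, 420) - ko_vuln(35.5, 4, 780))
# Reach differs by 0.5 in (under the 2.5 in threshold) and both fighters
# cut from 180 lb to 155 lb, so the reach and weight-cut terms are zero.
log_odds = math.log(elo_prob / (1.0 - elo_prob)) + style_adj + net_age + net_ko
final_prob = 1.0 / (1.0 + math.exp(-log_odds))

print(f"Elo only: {elo_prob:.1%}")                    # Elo only: 67.9%
print(f"net age: {net_age:+.3f}, net KO: {net_ko:+.3f}")
print(f"Full model: {final_prob:.1%}")                # Full model: 74.5%
```

The age and knockout-vulnerability terms contribute about equally here (roughly +0.15 and +0.14 in log-odds), and together they outweigh the style term by a wide margin.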
In contrast, the Makhachev versus Tsarukyan matchup shows how physical attributes can narrow a gap. Both fighters classify as wrestlers, so the style layer contributes nothing. Tsarukyan is younger (27.5 versus 32.5), with zero KO losses and fewer absorbed strikes. The age and durability advantages partially offset Makhachev's Elo edge, making the full model closer (about 59% for Makhachev) than the pure rating difference suggests (about 61%).
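As a worked check, the terms for this matchup can be written out using the default coefficients and the profile values from the listing. The $\Delta$ symbols are notation introduced here for the net adjustments (Makhachev minus Tsarukyan), and all values are rounded to three decimals:

```latex
\begin{aligned}
p_{\text{Elo}} &= \frac{1}{1 + 10^{(1700-1780)/400}} \approx 0.613,
  \qquad \operatorname{logit}(p_{\text{Elo}}) \approx 0.461 \\
\Delta_{\text{style}} &= 0, \qquad \Delta_{\text{reach}} = 0
  \quad (\lvert 70.5 - 72.0 \rvert < 2.5) \\
\Delta_{\text{age}} &= -0.025 \cdot 2.5^{1.3} - 0 \approx -0.082 \\
\Delta_{\text{cut}} &\approx -0.015 - (-0.003) = -0.012 \\
\Delta_{\text{KO}} &= -(0.0367 - 0.0229) \approx -0.014 \\
\operatorname{logit}(p_{\text{final}}) &\approx 0.461 - 0.082 - 0.012 - 0.014 = 0.353
  \;\Rightarrow\; p_{\text{final}} \approx 0.587
\end{aligned}
```

The age term does most of the work: Tsarukyan is below the peak age of 30 and takes no penalty, while Makhachev is 2.5 years past it.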
Practical Betting Application
The component contribution analysis is directly useful for betting. When the market price aligns with the Elo-only prediction but the full model differs significantly, the style and physical adjustments represent potentially exploitable information. The most common pattern is that aging former champions retain market respect beyond what their current physical condition warrants. A fighter with three KO losses trading at implied probabilities that reflect their peak-era Elo is the kind of candidate value opportunity this framework is designed to flag.
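Comparing the model to the market requires converting a price into a probability first. A minimal sketch follows, assuming American odds and proportional normalization to strip the bookmaker's margin (one common convention, not the only one); the odds shown are invented for illustration and the function names are hypothetical.

```python
from typing import Tuple


def american_to_implied(odds: int) -> float:
    """Implied probability (margin included) from American odds."""
    if odds < 0:
        return -odds / (-odds + 100.0)
    return 100.0 / (odds + 100.0)


def remove_vig(p_a: float, p_b: float) -> Tuple[float, float]:
    """Rescale both sides so they sum to 1 (proportional method)."""
    total = p_a + p_b
    return p_a / total, p_b / total


# Hypothetical two-way market; these prices are NOT real quotes.
odds_a, odds_b = -210, 175
raw_a = american_to_implied(odds_a)
raw_b = american_to_implied(odds_b)
fair_a, fair_b = remove_vig(raw_a, raw_b)

# Model probabilities, e.g. taken from component_contribution_analysis().
elo_only_a = 0.6788
full_model_a = 0.7454

value_edge = full_model_a - fair_a       # model vs. margin-free market
print(f"Market (no vig): {fair_a:.1%}")
print(f"Elo only:        {elo_only_a:.1%}")
print(f"Full model:      {full_model_a:.1%}")
print(f"Edge on A:       {value_edge:+.1%}")
```

In this invented example the market sits close to the Elo-only number while the full model is several points higher, which is the pattern described above. Whether such an edge is real depends on how well the adjustment coefficients are calibrated.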
Limitations
The matchup matrix values in this case study are illustrative. In production, they should be estimated from a large database of classified fights (ideally 50+ per style pairing). The physical attribute coefficients are drawn from published research estimates but should be calibrated against a specific dataset. The style classification is heuristic and would benefit from clustering algorithms applied to high-dimensional fight statistics. Despite these limitations, the multi-layer architecture provides a rigorous framework for combining fundamentally different types of information into a single prediction.
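One way the matchup matrix could be estimated from classified fights is sketched below. It is an assumption-laden illustration, not the method used to produce the values in this case study: for each style pairing it solves for the log-odds offset that makes the Elo predictions match the observed win count, then shrinks that offset toward zero in proportion to sample size. The record layout, the function names, and the `prior_n=50` shrinkage constant (chosen to echo the 50-fight guideline) are all hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

# (style_a, style_b, elo_prob_a, a_won) for one historical fight
Record = Tuple[str, str, float, int]


def _logit(p: float) -> float:
    p = min(0.99, max(0.01, p))
    return math.log(p / (1.0 - p))


def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fit_offset(probs: List[float], outcomes: List[int]) -> float:
    """Bisection for d such that sum(sigmoid(logit(p_i) + d)) = wins."""
    wins = sum(outcomes)
    logits = [_logit(p) for p in probs]
    lo, hi = -3.0, 3.0  # offsets are bounded to this range
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if sum(_sigmoid(z + mid) for z in logits) < wins:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def estimate_matchup_matrix(
    records: List[Record], prior_n: float = 50.0
) -> Dict[str, Dict[str, float]]:
    """Shrunken log-odds offset per (style_a, style_b) pairing."""
    grouped = defaultdict(lambda: ([], []))
    for style_a, style_b, p, won in records:
        grouped[(style_a, style_b)][0].append(p)
        grouped[(style_a, style_b)][1].append(won)

    matrix: Dict[str, Dict[str, float]] = {}
    for (style_a, style_b), (probs, outcomes) in grouped.items():
        n = len(outcomes)
        shrink = n / (n + prior_n)  # small samples are pulled toward 0
        matrix.setdefault(style_a, {})[style_b] = fit_offset(probs, outcomes) * shrink
    return matrix


# Synthetic data: 100 even-Elo wrestler-vs-striker fights, wrestler wins 60.
records = [("wrestler", "striker", 0.5, 1)] * 60 + [("wrestler", "striker", 0.5, 0)] * 40
matrix = estimate_matchup_matrix(records)
print(round(matrix["wrestler"]["striker"], 3))
```

With the synthetic data the raw offset is ln(1.5), about 0.405, and shrinkage with 100 fights reduces it by a third. The mirrored entry (striker versus wrestler) would be the negative of this value if each fight is recorded in one direction only.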