Case Study: Quarterback Consistency Analysis
"The most important thing for a quarterback is consistency. You don't want a guy who throws for 400 yards one week and 100 the next." — Bill Parcells
Executive Summary
In this case study, you'll analyze quarterback performance using descriptive statistics to answer a key question: Is a consistent quarterback more valuable than a boom-or-bust performer? You'll build a comprehensive comparison framework using mean, standard deviation, and distribution analysis.
Skills Applied:
- Central tendency comparison
- Variability measurement
- Distribution analysis
- Z-score standardization
- Composite scoring
Background
The Debate
Fantasy football players and NFL scouts often face this decision:
- Quarterback A: Reliable 250-yard passer with occasional 300-yard games
- Quarterback B: Alternates between 180-yard duds and 350-yard explosions
Both might average 265 yards per game, but they're very different players. This case study quantifies that difference.
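The point is easy to demonstrate in a few lines. A minimal sketch with hypothetical 4-game samples (the yardage numbers below are invented for illustration, not taken from the case study data):

```python
import numpy as np

# Hypothetical 4-game yardage samples (illustrative only)
qb_a = np.array([255, 270, 260, 275])   # steady
qb_b = np.array([180, 350, 175, 355])   # boom-or-bust

# Identical means...
print(qb_a.mean(), qb_b.mean())  # both 265.0
# ...but wildly different sample standard deviations
print(qb_a.std(ddof=1).round(1), qb_b.std(ddof=1).round(1))
```

Same average, yet one spread is an order of magnitude larger. The rest of the case study makes this comparison rigorous.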
The Data
We'll analyze two quarterbacks over a 12-game sample:
import pandas as pd
import numpy as np
from scipy import stats

# Create quarterback comparison data (12 games per quarterback)
qb_data = pd.DataFrame({
    "game": list(range(1, 13)) * 2,
    "quarterback": ["Steady Steve"] * 12 + ["Volatile Vic"] * 12,
    "passing_yards": [
        # Steady Steve: consistent performer
        255, 268, 242, 275, 251, 263, 248, 272, 259, 245, 267, 255,
        # Volatile Vic: boom-or-bust
        185, 342, 210, 318, 175, 355, 195, 328, 165, 360, 190, 335
    ],
    "touchdowns": [
        2, 2, 1, 3, 2, 2, 2, 3, 2, 1, 2, 2,  # Steve
        0, 4, 1, 3, 0, 4, 1, 4, 0, 5, 1, 3   # Vic
    ],
    "interceptions": [
        1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0,  # Steve
        2, 0, 1, 1, 2, 0, 2, 0, 3, 0, 1, 0   # Vic
    ],
    "completion_pct": [
        65, 68, 63, 70, 64, 67, 66, 69, 65, 64, 68, 66,  # Steve
        52, 72, 55, 71, 50, 75, 53, 73, 48, 76, 54, 70   # Vic
    ],
    "passer_rating": [
        88.5, 95.2, 82.4, 102.3, 86.1, 93.8, 91.2, 98.5, 89.3, 80.5, 94.1, 90.8,      # Steve
        62.5, 118.5, 70.2, 105.3, 55.8, 125.2, 65.4, 115.8, 50.2, 132.5, 68.3, 108.2  # Vic
    ]
})

print(qb_data.head(15))
Phase 1: Central Tendency Comparison
Calculate Basic Averages
def calculate_qb_averages(df: pd.DataFrame) -> pd.DataFrame:
    """
    Calculate average statistics for each quarterback.

    Parameters
    ----------
    df : pd.DataFrame
        Quarterback game data

    Returns
    -------
    pd.DataFrame
        Average statistics per quarterback
    """
    averages = df.groupby("quarterback").agg({
        "passing_yards": "mean",
        "touchdowns": "mean",
        "interceptions": "mean",
        "completion_pct": "mean",
        "passer_rating": "mean"
    }).round(1)
    return averages

averages = calculate_qb_averages(qb_data)
print("\nQUARTERBACK AVERAGES")
print("=" * 60)
print(averages)
Expected Output:
              passing_yards  touchdowns  interceptions  completion_pct  passer_rating
quarterback
Steady Steve          258.3         2.0            0.4            66.2           91.1
Volatile Vic          263.2         2.2            1.0            62.4           89.8
Initial Observations
At first glance:
- Volatile Vic averages slightly more yards (263 vs 258)
- Volatile Vic has slightly more TDs (2.2 vs 2.0)
- But Vic also has more than double the interceptions (1.0 vs 0.4)
- Steve has higher completion percentage and passer rating
The averages are close, but they hide important differences.
Phase 2: Variability Analysis
Standard Deviation Comparison
def calculate_qb_variability(df: pd.DataFrame) -> pd.DataFrame:
    """
    Calculate variability metrics for each quarterback.

    Parameters
    ----------
    df : pd.DataFrame
        Quarterback game data

    Returns
    -------
    pd.DataFrame
        Variability statistics per quarterback
    """
    variability = df.groupby("quarterback").agg({
        "passing_yards": ["mean", "std", "min", "max"],
        "touchdowns": ["mean", "std"],
        "passer_rating": ["mean", "std"]
    }).round(1)

    # Flatten the MultiIndex column names
    variability.columns = [f"{col[0]}_{col[1]}" for col in variability.columns]

    # Add coefficient of variation (std as a percentage of the mean)
    variability["yards_cv"] = (
        variability["passing_yards_std"] / variability["passing_yards_mean"] * 100
    ).round(1)
    variability["rating_cv"] = (
        variability["passer_rating_std"] / variability["passer_rating_mean"] * 100
    ).round(1)
    return variability

variability = calculate_qb_variability(qb_data)
print("\nVARIABILITY ANALYSIS")
print("=" * 60)
print(variability)
Expected Output:
              passing_yards_mean  passing_yards_std  passing_yards_min  passing_yards_max  ...  yards_cv  rating_cv
quarterback                                                                                ...
Steady Steve               258.3               10.8                242                275  ...       4.2        6.9
Volatile Vic               263.2               81.3                165                360  ...      30.9       33.7
Key Findings
- Yards variability: Steve's std is 10.8 yards; Vic's is 81.3 yards (about 7.5x higher!)
- Yards CV: Steve 4.2% vs Vic 30.9%
- Range: Steve's range is 33 yards; Vic's is 195 yards
- Rating CV: Steve 6.9% vs Vic 33.7%
Volatile Vic is truly volatile—his performance swings are massive.
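The size of that gap is worth sanity-checking by hand. This standalone snippet recomputes the sample standard deviations directly from the raw yardage lists (ddof=1, matching pandas' default `.std()`):

```python
import numpy as np

steve = np.array([255, 268, 242, 275, 251, 263, 248, 272, 259, 245, 267, 255])
vic = np.array([185, 342, 210, 318, 175, 355, 195, 328, 165, 360, 190, 335])

# Sample standard deviation (ddof=1) matches pandas' Series.std()
steve_std = steve.std(ddof=1)
vic_std = vic.std(ddof=1)
print(f"Steve: {steve_std:.1f} yards, Vic: {vic_std:.1f} yards")
print(f"Vic's std is {vic_std / steve_std:.1f}x Steve's")
```

With this data the ratio comes out to roughly 7.5x, confirming the headline finding.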
Phase 3: Distribution Analysis
Visualizing the Distributions
def analyze_distributions(df: pd.DataFrame, stat: str) -> pd.DataFrame:
    """
    Analyze distribution shape for a statistic.

    Parameters
    ----------
    df : pd.DataFrame
        Quarterback data
    stat : str
        Statistic column to analyze

    Returns
    -------
    pd.DataFrame
        Distribution metrics
    """
    results = []
    for qb in df["quarterback"].unique():
        qb_data_subset = df[df["quarterback"] == qb][stat]
        results.append({
            "quarterback": qb,
            "statistic": stat,
            "mean": qb_data_subset.mean(),
            "median": qb_data_subset.median(),
            "skewness": stats.skew(qb_data_subset),
            "kurtosis": stats.kurtosis(qb_data_subset),
            "q1": qb_data_subset.quantile(0.25),
            "q3": qb_data_subset.quantile(0.75),
            "iqr": qb_data_subset.quantile(0.75) - qb_data_subset.quantile(0.25)
        })
    return pd.DataFrame(results).round(2)

yards_dist = analyze_distributions(qb_data, "passing_yards")
print("\nYARDS DISTRIBUTION ANALYSIS")
print("=" * 60)
print(yards_dist)

rating_dist = analyze_distributions(qb_data, "passer_rating")
print("\nPASSER RATING DISTRIBUTION")
print("=" * 60)
print(rating_dist)
Interpreting the Distributions
Steady Steve:
- Mean ≈ Median (symmetric distribution)
- Low skewness and kurtosis
- Narrow IQR (consistent performance band)

Volatile Vic:
- More variable measures
- Potentially bimodal (good games vs bad games)
- Wide IQR spanning nearly the entire range
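Vic's suspected bimodality can be eyeballed without a plot by binning his yardage into 50-yard buckets; an empty middle bucket suggests two separate clusters. This is a rough check, not a formal test (a formal alternative would be something like Hartigan's dip test):

```python
import pandas as pd

vic_yards = pd.Series([185, 342, 210, 318, 175, 355, 195, 328, 165, 360, 190, 335])

# Count games per 50-yard bucket; a gap in the middle suggests two modes
buckets = pd.cut(vic_yards, bins=[150, 200, 250, 300, 350, 400])
print(buckets.value_counts().sort_index())
```

With this data, the dud games pile up in the 150-200 bucket, the explosions above 300, and the 250-300 bucket is empty.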
Phase 4: Game-by-Game Floor Analysis
Calculating "Floor" Performance
In many contexts, the worst-case scenario matters more than the average.
def analyze_floor_ceiling(df: pd.DataFrame) -> pd.DataFrame:
    """
    Analyze floor (worst games) and ceiling (best games) performance.

    Parameters
    ----------
    df : pd.DataFrame
        Quarterback data

    Returns
    -------
    pd.DataFrame
        Floor and ceiling analysis
    """
    results = []
    for qb in df["quarterback"].unique():
        qb_subset = df[df["quarterback"] == qb]

        # Floor: worst 3 games by passer rating
        floor_games = qb_subset.nsmallest(3, "passer_rating")
        # Ceiling: best 3 games by passer rating
        ceiling_games = qb_subset.nlargest(3, "passer_rating")

        results.append({
            "quarterback": qb,
            "floor_avg_yards": floor_games["passing_yards"].mean(),
            "floor_avg_rating": floor_games["passer_rating"].mean(),
            "floor_avg_td": floor_games["touchdowns"].mean(),
            "floor_avg_int": floor_games["interceptions"].mean(),
            "ceiling_avg_yards": ceiling_games["passing_yards"].mean(),
            "ceiling_avg_rating": ceiling_games["passer_rating"].mean(),
            "ceiling_avg_td": ceiling_games["touchdowns"].mean(),
            "ceiling_avg_int": ceiling_games["interceptions"].mean()
        })
    return pd.DataFrame(results).round(1)

floor_ceiling = analyze_floor_ceiling(qb_data)
print("\nFLOOR AND CEILING ANALYSIS")
print("=" * 60)
print(floor_ceiling.T)
Floor vs Ceiling Interpretation
Steady Steve:
- Floor: ~83 rating, 246 yards, 1.3 TDs, 1.0 INTs
- Ceiling: ~99 rating, 272 yards, 2.7 TDs, 0.3 INTs
- Narrow gap between floor and ceiling

Volatile Vic:
- Floor: ~56 rating, 175 yards, 0.0 TDs, 2.3 INTs
- Ceiling: ~125 rating, 352 yards, 4.3 TDs, 0.0 INTs
- Massive gap between floor and ceiling
Phase 5: Standardized Comparison
Z-Score Analysis
def calculate_zscore_profile(df: pd.DataFrame) -> pd.DataFrame:
    """
    Calculate z-scores for each performance relative to that QB's average.

    Parameters
    ----------
    df : pd.DataFrame
        Quarterback data

    Returns
    -------
    pd.DataFrame
        Data with z-scores added
    """
    result = df.copy()
    # Calculate z-scores within each QB
    for qb in df["quarterback"].unique():
        mask = result["quarterback"] == qb
        for stat in ["passing_yards", "touchdowns", "passer_rating"]:
            qb_data_subset = result.loc[mask, stat]
            result.loc[mask, f"{stat}_z"] = (
                (qb_data_subset - qb_data_subset.mean()) / qb_data_subset.std()
            )
    return result

qb_with_z = calculate_zscore_profile(qb_data)

# Show extreme performances
print("\nEXTREME PERFORMANCES (|z| > 1.5)")
print("=" * 60)
extreme = qb_with_z[
    (abs(qb_with_z["passing_yards_z"]) > 1.5) |
    (abs(qb_with_z["passer_rating_z"]) > 1.5)
][["quarterback", "game", "passing_yards", "passer_rating",
   "passing_yards_z", "passer_rating_z"]]
print(extreme.round(2))
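One subtlety worth flagging: because these z-scores are computed within each quarterback, Vic's swings are divided by his own huge standard deviation, so even a 360-yard game may not register as "extreme" for him. This standalone check, recomputing from the raw yardage lists, shows the effect:

```python
import numpy as np

steve = np.array([255, 268, 242, 275, 251, 263, 248, 272, 259, 245, 267, 255])
vic = np.array([185, 342, 210, 318, 175, 355, 195, 328, 165, 360, 190, 335])

max_abs_z = {}
for name, yards in [("Steady Steve", steve), ("Volatile Vic", vic)]:
    # z-score each game against that QB's own mean and sample std
    z = (yards - yards.mean()) / yards.std(ddof=1)
    max_abs_z[name] = np.abs(z).max()
    print(name, "max |z| for yards:", round(max_abs_z[name], 2))
```

With this data, Steve's best and worst yardage games clear the 1.5 threshold while none of Vic's do: within-group standardization measures surprise relative to each player's own baseline, not in absolute terms.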
Cross-QB Comparison
def compare_across_qbs(df: pd.DataFrame) -> pd.DataFrame:
    """
    Create league-wide z-scores for cross-QB comparison.

    Parameters
    ----------
    df : pd.DataFrame
        All quarterback data

    Returns
    -------
    pd.DataFrame
        Comparison metrics
    """
    # Calculate league-wide stats
    league_stats = {
        "passing_yards": {"mean": df["passing_yards"].mean(),
                          "std": df["passing_yards"].std()},
        "touchdowns": {"mean": df["touchdowns"].mean(),
                       "std": df["touchdowns"].std()},
        "interceptions": {"mean": df["interceptions"].mean(),
                          "std": df["interceptions"].std()},
        "passer_rating": {"mean": df["passer_rating"].mean(),
                          "std": df["passer_rating"].std()}
    }

    results = []
    for qb in df["quarterback"].unique():
        qb_subset = df[df["quarterback"] == qb]
        profile = {"quarterback": qb}
        for stat, params in league_stats.items():
            qb_mean = qb_subset[stat].mean()
            z = (qb_mean - params["mean"]) / params["std"]
            profile[f"{stat}_z"] = round(z, 2)
        results.append(profile)
    return pd.DataFrame(results)

cross_comparison = compare_across_qbs(qb_data)
print("\nCROSS-QB COMPARISON (League Z-Scores)")
print("=" * 60)
print(cross_comparison)
Phase 6: Decision Framework
Building a Composite Score
def calculate_consistency_score(df: pd.DataFrame) -> pd.DataFrame:
    """
    Calculate a composite consistency score.

    Weights:
    - Average performance: 40%
    - Consistency (inverse of CV): 30%
    - Floor performance: 20%
    - Ceiling upside: 10%

    Parameters
    ----------
    df : pd.DataFrame
        Quarterback data

    Returns
    -------
    pd.DataFrame
        Consistency scores
    """
    results = []
    for qb in df["quarterback"].unique():
        qb_subset = df[df["quarterback"] == qb]

        # Average performance (maps the 50-150 rating range onto 0-100)
        avg_rating = qb_subset["passer_rating"].mean()
        avg_score = avg_rating - 50

        # Consistency (inverse of CV, capped at 0; lower CV = higher score)
        cv = qb_subset["passer_rating"].std() / qb_subset["passer_rating"].mean()
        consistency_score = max(0, 100 - cv * 200)

        # Floor (worst 3 games average)
        floor_rating = qb_subset.nsmallest(3, "passer_rating")["passer_rating"].mean()
        floor_score = floor_rating - 50

        # Ceiling (best 3 games average)
        ceiling_rating = qb_subset.nlargest(3, "passer_rating")["passer_rating"].mean()
        ceiling_score = ceiling_rating - 50

        # Composite
        composite = (
            avg_score * 0.40 +
            consistency_score * 0.30 +
            floor_score * 0.20 +
            ceiling_score * 0.10
        )

        results.append({
            "quarterback": qb,
            "avg_score": round(avg_score, 1),
            "consistency_score": round(consistency_score, 1),
            "floor_score": round(floor_score, 1),
            "ceiling_score": round(ceiling_score, 1),
            "composite": round(composite, 1)
        })
    return pd.DataFrame(results).sort_values("composite", ascending=False)

scores = calculate_consistency_score(qb_data)
print("\nCOMPOSITE CONSISTENCY SCORES")
print("=" * 60)
print(scores)
Phase 7: Conclusions and Recommendations
Final Analysis
def generate_recommendation(df: pd.DataFrame) -> str:
    """
    Generate a quarterback recommendation based on the analysis.

    Parameters
    ----------
    df : pd.DataFrame
        Quarterback data

    Returns
    -------
    str
        Recommendation text
    """
    scores = calculate_consistency_score(df)

    lines = []
    lines.append("QUARTERBACK RECOMMENDATION REPORT")
    lines.append("=" * 60)
    lines.append("")

    # Overall winner (scores are sorted by composite, descending)
    winner = scores.iloc[0]["quarterback"]
    lines.append(f"RECOMMENDED: {winner}")
    lines.append("")

    # Key findings
    lines.append("KEY FINDINGS:")
    lines.append("-" * 40)
    for _, row in scores.iterrows():
        qb = row["quarterback"]
        lines.append(f"\n{qb}:")
        lines.append(f"  Composite Score: {row['composite']}")
        lines.append(f"  Average Performance: {row['avg_score']}")
        lines.append(f"  Consistency: {row['consistency_score']}")
        lines.append(f"  Floor Protection: {row['floor_score']}")
        lines.append(f"  Ceiling Upside: {row['ceiling_score']}")

    lines.append("")
    lines.append("CONTEXT-SPECIFIC RECOMMENDATIONS:")
    lines.append("-" * 40)
    lines.append("• Need a reliable starter: Steady Steve")
    lines.append("• Looking for boom potential: Volatile Vic")
    lines.append("• Risk-averse team: Steady Steve")
    lines.append("• Behind in the 4th quarter: Volatile Vic")
    return "\n".join(lines)

recommendation = generate_recommendation(qb_data)
print(recommendation)
Discussion Questions
- How would the analysis change if interceptions were weighted more heavily?
- In what game situations would you prefer Volatile Vic over Steady Steve?
- How might sample size (12 games vs 48 games) affect our confidence in these conclusions?
- What other statistics would you want to include in a more comprehensive analysis?
- How would you adjust this framework for comparing running backs or receivers?
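For the sample-size question, one hedged starting point is a bootstrap: resample 12-game "seasons" with replacement and see how widely the estimated standard deviation swings. This is an illustrative sketch, not part of the case study's required code:

```python
import numpy as np

rng = np.random.default_rng(0)
vic_yards = np.array([185, 342, 210, 318, 175, 355, 195, 328, 165, 360, 190, 335])

# Re-estimate the sample std from 2000 resampled 12-game seasons
boot_stds = [rng.choice(vic_yards, size=12, replace=True).std(ddof=1)
             for _ in range(2000)]
lo, hi = np.percentile(boot_stds, [2.5, 97.5])
print(f"95% bootstrap interval for Vic's std: {lo:.0f} to {hi:.0f} yards")
```

The interval is wide with only 12 games; repeating the exercise with a 48-game sample would shrink it considerably.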
Your Turn: Extensions
Option A: Multi-Quarterback Analysis
Extend this analysis to compare 4+ quarterbacks with different profiles:
- Consistent average performer
- Boom-or-bust player
- High-floor, low-ceiling player
- Rookie with a small sample
Option B: Situational Splits
Analyze how consistency changes based on:
- Home vs away games
- Against winning vs losing teams
- First half vs second half of season
Option C: Predictive Value
Investigate: Does first-half season consistency predict second-half performance?
Key Takeaways
- Averages hide variability: Two quarterbacks with identical averages can be fundamentally different players
- Consistency has value: A reliable floor often matters more than occasional ceilings
- Context matters: The "better" player depends on team needs and game situations
- Multiple metrics needed: No single statistic captures the full picture
- Standard deviation is essential: It quantifies what the mean cannot show