Key Takeaways: Statistical Foundations for Football Analytics

One-page reference for Chapter 5 concepts


Core Metrics

EPA (Expected Points Added)

EPA = EP_after_play - EP_before_play
  • Measures how much a play changed expected point outcome
  • Positive = good play, negative = bad play
  • Average around 0 by construction
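The EPA definition above is just a column subtraction in practice. A minimal sketch, assuming a play-by-play table with illustrative `ep_before`/`ep_after` columns (the names are hypothetical, not from any specific data source):

```python
import pandas as pd

# Hypothetical play-by-play data; column names are illustrative
plays = pd.DataFrame({
    "ep_before": [0.5, 2.1, -0.3],
    "ep_after":  [1.2, 1.8,  0.4],
})

# EPA = EP_after_play - EP_before_play
plays["epa"] = plays["ep_after"] - plays["ep_before"]
print(plays["epa"].round(2).tolist())  # [0.7, -0.3, 0.7]
```

WPA works identically with win-probability columns in place of expected points.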

Win Probability (WP) & WPA

WPA = WP_after_play - WP_before_play
  • WP: Probability of winning at any game state
  • WPA: How much play changed win probability
  • Context-dependent (game situation matters)

Key Probability Formulas

Conditional Probability

$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$

Bayes' Theorem

$$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$$

Expected Value

$$E[X] = \sum_{i} x_i \cdot P(x_i)$$
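Both formulas reduce to a few lines of arithmetic. A sketch with made-up probabilities (all numbers here are illustrative, not from real play data):

```python
# Expected value of a 4th-down decision:
# convert (+2.0 EP) with P = 0.55, fail (-1.5 EP) with P = 0.45
outcomes = [2.0, -1.5]
probs = [0.55, 0.45]
ev = sum(x * p for x, p in zip(outcomes, probs))
print(round(ev, 3))  # 0.425

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B), illustrative numbers
p_a, p_b_given_a, p_b = 0.3, 0.6, 0.4
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))  # 0.45
```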


Hypothesis Testing Quick Reference

| Test | Use Case |
| --- | --- |
| One-sample t-test | Is this QB's EPA different from league avg? |
| Two-sample t-test | Do home teams have higher EPA than away? |
| Paired t-test | Same player, two conditions (home/away) |
| z-test for proportions | Is this catch rate different from 65%? |
| Chi-square | Association between two categorical variables |

P-value Interpretation

  • p < 0.05: Statistically significant at the α = 0.05 level
  • p = 0.03: "If the null hypothesis is true, there is a 3% chance of a result at least this extreme"
  • A p-value is not P(hypothesis is true)

Effect Size Guidelines

| Cohen's d | Interpretation |
| --- | --- |
| < 0.2 | Negligible |
| 0.2 - 0.5 | Small |
| 0.5 - 0.8 | Medium |
| > 0.8 | Large |
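Cohen's d is the mean difference scaled by the pooled standard deviation. A minimal sketch with made-up EPA samples (the groups and values are illustrative):

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled std. dev."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Hypothetical per-play EPA samples for two groups
group1 = [0.10, 0.15, 0.05, 0.20, 0.12]
group2 = [0.02, 0.08, 0.00, 0.10, 0.05]
print(round(cohens_d(group1, group2), 2))  # 1.51 -> "large" by the table above
```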

Football EPA Thresholds

| Difference | Meaning |
| --- | --- |
| < 0.02 EPA | No practical difference |
| 0.02 - 0.05 | Small but noticeable |
| 0.05 - 0.10 | Meaningful |
| > 0.10 | Large difference |

Confidence Interval Formula

$$\bar{x} \pm t_{crit} \cdot \frac{s}{\sqrt{n}}$$

from scipy import stats

# n, mean, and se (= s / sqrt(n)) come from your sample
ci = stats.t.interval(0.95, df=n - 1, loc=mean, scale=se)

Regression Quick Reference

Linear Regression

import statsmodels.api as sm
model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.summary())

Logistic Regression

import numpy as np
from sklearn.linear_model import LogisticRegression

model = LogisticRegression().fit(X, y)
odds_ratios = np.exp(model.coef_)  # exponentiate coefficients to get odds ratios

Common Pitfalls

| Pitfall | Description | Solution |
| --- | --- | --- |
| Small sample | Wide CIs, unreliable estimates | Report sample size, use CI |
| Multiple comparisons | False positives accumulate | Bonferroni or FDR correction |
| Regression to mean | Extreme values regress toward avg | Don't overreact to outliers |
| Survivorship bias | Only analyzing successes | Include failures in analysis |
| Selection bias | Non-random sample | Acknowledge limitations |
| p-hacking | Testing until significant | Pre-register hypotheses |
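The multiple-comparisons corrections mentioned above are one function call in statsmodels. A sketch with hypothetical p-values (the values themselves are illustrative):

```python
from statsmodels.stats.multitest import multipletests

# Hypothetical p-values from testing many player splits at once
pvals = [0.001, 0.02, 0.04, 0.30, 0.049]

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
reject_bonf, p_bonf, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")

# Benjamini-Hochberg: less conservative, controls the false discovery rate
reject_fdr, p_fdr, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")

print(reject_bonf.tolist())  # only the strongest result survives Bonferroni
print(reject_fdr.tolist())   # FDR keeps an additional result
```

Note how p = 0.049, "significant" on its own, is rejected by both corrections once the other tests are accounted for.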

Quick Statistical Tests

from scipy import stats

# Two-sample t-test
t, p = stats.ttest_ind(group1, group2)

# One-sample t-test
t, p = stats.ttest_1samp(data, population_mean)

# Correlation
r, p = stats.pearsonr(x, y)

# Chi-square
chi2, p, dof, expected = stats.chi2_contingency(table)

When to Use What

| Question | Analysis |
| --- | --- |
| Is X different from average? | One-sample t-test |
| Are A and B different? | Two-sample t-test |
| Predict continuous Y | Linear regression |
| Predict binary Y | Logistic regression |
| Reduce Type I error | Multiple testing correction |
| Practical importance | Effect size (Cohen's d) |

Red Flags in Analysis

  • "Significant" with tiny effect size
  • No sample size reported
  • Cherry-picked time period
  • No confidence intervals
  • Correlation treated as causation
  • p = 0.049 after many tests

Preview: Chapter 6

Part 2 begins with Quarterback Evaluation—applying these statistical foundations to measure passing performance.