Limits of prediction 2. **"Market Efficiency in Gambling Markets"** - Sportsbook mechanics 3. **"The Profitability of NFL Betting"** - Historical analysis 4. **"Closing Line Value as a Skill Metric"** - CLV validation 5. **"Information Aggregation in Markets"** - How prices form → Further Reading: Betting Market Analysis
Sleep disruption for away team - Circadian rhythm effects - Physical recovery constraints - Particularly acute for west-to-east travel → Chapter 25: Home Field Advantage Deep Dive
2003-2006: The Amateur Revolution
**Football Outsiders** launches (2003), introducing DVOA (Defense-adjusted Value Over Average) and bringing rigorous analysis to public football discourse - **Pro Football Reference** begins comprehensive statistical archives - Academic researchers begin publishing football analytics papers → Chapter 1: Introduction to Football Analytics
2006-2010: NFL Takes Interest
Teams begin hiring dedicated analytics staff - The New England Patriots, under Bill Belichick, become known for analytically-informed decisions (though they maintain secrecy about methods) - Fourth-down analysis gains prominence, with researchers demonstrating that teams punt too often → Chapter 1: Introduction to Football Analytics
Deep methodological understanding 2. **Causal inference methods** - Isolating true injury impact 3. **ML prediction models** - Forecasting injury probability 4. **Real-time systems** - Building production injury models → Further Reading: Injuries and Their Impact
Academic market efficiency papers
Theoretical foundations 2. **Advanced statistical texts** - Modeling improvements 3. **CLV and sharp money research** - Professional approaches 4. **Information theory applications** - Signal vs noise → Further Reading: Betting Market Analysis
Statistical foundations 2. **Projection systems** - Building your own model 3. **Historical analysis** - Backtesting approaches 4. **Economic optimization** - Trade value maximization → Further Reading: Draft Analysis
Team went 3-6 remainder - Average margin: -4.2 points - Model predicted margin shift: -8.0 points - Actual shift: -7.5 points → Chapter 23: Injuries and Their Impact
Betting data provides excellent prediction benchmarks - Market-derived ratings can supplement your models - Line movement reveals information flow → Chapter 22: Betting Market Analysis
Separation (distance from defender at catch point) - Target separation (how open the receiver was) - Catch probability based on throw difficulty - Route depth and direction → Chapter 8: Receiving Analytics
Available in standard data:
Targets, receptions, yards, touchdowns - EPA on targets - Air yards and yards after catch - Completion percentage (catch rate) → Chapter 8: Receiving Analytics
Available Metrics:
Completion probability over expected - Separation at catch point - Time to throw - Rushing yards over expected - Pass rush win rate → Chapter 2: The NFL Data Ecosystem
Available Prospects:
WR A: 82 grade (best WR available) - EDGE A: 78 grade (best EDGE available) - OT A: 85 grade (best overall, not a need) → Exercises: Draft Analysis
Use SOS alongside other metrics, not in isolation - Consider both past and future SOS - Account for home/away splits in opponent difficulty - Recognize that different calculation methods serve different purposes → Chapter 16: Strength of Schedule
Better approach:
Separate components (kicking, punting, returns, coverage) - Regress toward mean for rare events (return TDs) - Account for opportunity differences → Quiz: Special Teams Analytics
understanding how betting markets work, what market prices tell us about team strength, and how to evaluate model performance against market efficiency. → Chapter 21: Game Simulation
Full probability distributions - Confidence intervals - Scenario probabilities - Path-dependent outcomes → Key Takeaways: Game Simulation
Body clock timing
Playing at 10 AM local = 7 AM on body clock; athletes not at peak 2. **Travel fatigue** - 3 timezone adjustment, likely arrived 1-2 days prior 3. **Sleep disruption** - West-to-East travel harder on circadian rhythm 4. **Game week preparation** - May have shortened practice schedule due to travel 5. → Quiz: Schedule and Rest Analysis
Books:
*Automate the Boring Stuff with Python* (free online) - *Python for Data Analysis* by Wes McKinney - *Python Data Science Handbook* by Jake VanderPlas → Prerequisites
ACL tear in 2020 could have derailed everything 2. **Chase over Sewell** - OL still struggled, Burrow was sacked 70 times combined 3. **Thin OL investment** - Nearly cost them the Super Bowl 4. **FA misses** - Waynes was a bust → Case Study: The Cincinnati Bengals Rebuild
Physical recovery time - Extra preparation for opponent - No new injuries from previous game - Strategic scheming opportunity → Chapter 26: Schedule and Rest Analysis
Bye week studies
Quantifying rest advantages 2. **TNF injury research** - Short rest health effects 3. **Circadian rhythm papers** - Travel timing science 4. **SOS methodology** - Proper adjustment techniques 5. **Market efficiency studies** - How markets price schedule → Further Reading: Schedule and Rest Analysis
Avoid repeated downloads 2. **Filter early** - Reduce memory usage 3. **Validate after loading** - Check for expected values 4. **Document versions** - Track data source dates 5. **Separate raw from processed** - Maintain data lineage → Key Takeaways: The NFL Data Ecosystem
Cap is a constraint
Every dollar matters 2. **Positions aren't equal** - Pay for premium, draft the rest 3. **Timing matters** - Windows open and close 4. **Draft > FA** - For building sustainable success 5. **QB economics dominate** - Most important roster decision → Key Takeaways: Team Building and Roster Construction
Classification:
High yards + low red zone TD% + low points = Bend but don't break - Low yards + low points = True elite defense - High yards + high points = Actually bad defense → Quiz: Defensive Analytics
Fans see themselves as impacting games 2. **Strategic timing** - Coordinated noise on opponent's snaps 3. **Season ticket loyalty** - Consistent, knowledgeable fanbase 4. **Weather tolerance** - Fans stay loud in rain/cold → Case Study: The 12th Man Effect
Massive data volume (2-3 million rows per game) - Requires specialized processing tools - Storage and memory constraints → Chapter 2: The NFL Data Ecosystem
First international game: Apply standard travel penalty (-0.5 to -1.0) - Second consecutive: Cumulative fatigue factor (+50% penalty) - If team stays overseas between games: Reduce second game penalty - If team returns home then travels again: Full double penalty - Consider jet lag direction both wa → Quiz: Schedule and Rest Analysis
Play-by-play data (1999-present) - Expected Points Added (EPA) - Win probability - Player participation - Next Gen Stats integration → Appendix C: NFL Data Sources
Records without SOS are incomplete 2. **Multiple methods exist** - Choose based on application 3. **Future matters** - Past SOS explains, future SOS predicts 4. **Division effects** - NFL scheduling creates systematic patterns 5. **Adjust metrics** - EPA, efficiency need opponent context → Key Takeaways: Strength of Schedule
Context matters
distance, situation, environment all affect performance 5. **Over expected metrics** better isolate skill from circumstance → Chapter 11: Special Teams Analytics
Context:
Week 14, Sunday Night Football - New England traveling from East coast - Temperature: 28°F - KC ranked as high-HFA venue (+0.8) - Not a divisional game → Exercises: Home Field Advantage Deep Dive
Contextual Complexity
Impact depends on the opponent - Backup quality varies dramatically - Scheme may adjust around absences - Other injuries may compound effects → Chapter 23: Injuries and Their Impact
Conversion rates are high
Teams convert 4th & 1 at ~73%, 4th & 2 at ~63% 2. **Punts don't gain much** - Net 35-40 yards on average 3. **Field position value is overrated** - Difference between own 20 and own 35 is small 4. **Failed attempts aren't catastrophic** - Opponent EP increase is manageable → Case Study: The Fourth Down Revolution
Core Idea:
Generate many random scenarios - Aggregate results to estimate probabilities - Explore full distribution of outcomes → Key Takeaways: Game Simulation
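The three steps above can be sketched in a few lines. This is a minimal illustration, not the book's simulator: the function name, the independent normal score model, and the example means and standard deviation are all assumptions chosen for readability.

```python
import random


def simulate_win_probability(home_mean, away_mean, score_sd,
                             n_sims=10_000, seed=0):
    """Estimate home win probability by simulating many games.

    Scores are drawn from independent normal distributions, which is an
    illustrative assumption rather than a fitted scoring model.
    """
    rng = random.Random(seed)

    # Step 1: generate many random scenarios.
    margins = []
    for _ in range(n_sims):
        home = rng.gauss(home_mean, score_sd)
        away = rng.gauss(away_mean, score_sd)
        margins.append(home - away)

    # Step 2: aggregate results to estimate probabilities.
    home_wins = sum(1 for m in margins if m > 0)

    # Step 3: explore the full distribution, not just the mean.
    margins.sort()
    return {
        "home_win_prob": home_wins / n_sims,
        "median_margin": margins[n_sims // 2],
        "margin_5th_pct": margins[int(0.05 * n_sims)],
        "margin_95th_pct": margins[int(0.95 * n_sims)],
    }


# Hypothetical inputs: home team expected to score 24, away team 21.
result = simulate_win_probability(24.0, 21.0, 10.0)
print(result)
```

Fixing the seed keeps runs reproducible; raising `n_sims` narrows the sampling error on the estimated probability.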
Salary difference - How many wins does 0.05 EPA/play add? - ~0.05 × 500 passes = 25 EPA ≈ 25 points/season → Quiz: Quarterback Evaluation
COVID-19 no-fans studies
Natural experiment on crowd effects 2. **Referee bias research** - Home penalty advantages 3. **Travel fatigue literature** - Circadian rhythm effects 4. **Stadium acoustics** - How noise affects communication 5. **Historical HFA trends** - Long-term perspective on changes → Further Reading: Home Field Advantage Deep Dive
Critical Situation:
Away team played Monday Night Football in Week 7 - Now playing Thursday Night in Week 8 (away) - Home team played Sunday in Week 7 → Exercises: Schedule and Rest Analysis
Crowd effects are real
~1.5 points attributable to fans 2. **Referee bias exists** - Penalty differential disappeared 3. **Travel effects persist** - Still some home advantage without fans 4. **Routine matters** - Home team still had familiarity benefits → Chapter 25: Home Field Advantage Deep Dive
Crowd noise
affects communication, false starts 2. **Travel fatigue** - especially long distances 3. **Time zones** - west-to-east travel hardest 4. **Referee bias** - small but measurable 5. **Familiarity** - knowing the venue → Chapter 15: Home Field Advantage
**All 32 NFL teams** employ analytics staff, ranging from 2-3 people to departments of 15+ - **Player tracking data** captures location, speed, and acceleration for every player on every play - **Expected Points Added (EPA)** has become the lingua franca of football analysis - **Public tools** (nflf → Chapter 1: Introduction to Football Analytics
nflfastR/nflfastpy for play-by-play - Pro Football Reference for historical - ESPN API for current stats - FantasyPros for projections → Part 7: Capstone Projects
Data Structures
Lists and list comprehensions - Dictionaries - Sets and tuples - Basic understanding of classes and objects → Prerequisites
Data to track:
Individual bet CLV - Moving average CLV (last 50 bets) - Cumulative CLV - Win rate vs expected win rate from CLV → Exercises: Betting Market Analysis
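One possible way to track the first three quantities, plus a raw win rate, is sketched below. It assumes CLV is measured in spread points as the bet line minus the closing line for the side taken (so betting +3.5 that closes at +2.5 is +1.0 CLV); the `CLVTracker` class and its method names are hypothetical. Converting CLV to an expected win rate needs a separate model and is not attempted here.

```python
from collections import deque


class CLVTracker:
    """Track closing line value (CLV) per bet, over a window, and in total.

    Assumption: lines are point spreads for the side that was bet, and
    CLV = bet_line - closing_line, so positive values mean the bettor
    got a better number than the close.
    """

    def __init__(self, window=50):
        self.recent = deque(maxlen=window)  # last `window` CLV values
        self.cumulative = 0.0
        self.n_bets = 0
        self.wins = 0

    def record(self, bet_line, closing_line, won):
        """Store one bet and return its individual CLV."""
        clv = bet_line - closing_line
        self.recent.append(clv)
        self.cumulative += clv
        self.n_bets += 1
        self.wins += int(won)
        return clv

    def moving_average(self):
        """Mean CLV over the most recent bets (up to the window size)."""
        return sum(self.recent) / len(self.recent) if self.recent else 0.0

    def win_rate(self):
        """Observed share of bets won."""
        return self.wins / self.n_bets if self.n_bets else 0.0


# Small window so the rolling behaviour is visible with three bets.
tracker = CLVTracker(window=2)
first = tracker.record(3.5, 2.5, won=True)
second = tracker.record(-2.5, -3.0, won=False)
third = tracker.record(7.0, 7.5, won=True)
print(first, second, third, tracker.moving_average(), tracker.cumulative)
```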
If teams always passed, defenses would always play pass coverage 2. **Situational needs** - Short-yardage, clock management, red zone 3. **Player limitations** - Not all QBs can handle 50+ attempts 4. **Risk management** - Passes have higher variance (sacks, INTs) 5. **Injury concerns** - Protecting → Chapter 13: Pace and Play Calling
Defensive positions and their roles - Difference between zone and man coverage - Basic understanding of blitzing - Run defense vs. pass defense → Prerequisites
Mean, median, mode, and other measures of central tendency - Standard deviation, variance, and other measures of spread - Percentiles and quartiles - Interpreting histograms, box plots, and scatter plots → Prerequisites
Designed for:
Undergraduate students in data science, statistics, or sports management - Aspiring sports analysts seeking comprehensive training - Professionals transitioning into football analytics - Fantasy sports enthusiasts seeking analytical depth → Professional Football Analytics and Visualization
Development Environment
Using Jupyter notebooks - Running Python scripts - Installing packages with pip - Basic debugging strategies → Prerequisites
Developmental Uncertainty
College-to-NFL transition varies by player - Coaching and scheme fit impact outcomes - Character and work ethic difficult to quantify → Chapter 28: Draft Analysis
DFS optimization
Linear programming 2. **Ownership dynamics** - Game theory 3. **Machine learning projections** - Advanced methods 4. **Bankroll management** - Long-term sustainability → Further Reading: Fantasy Football Analytics
Passes and kicks 2. **Asymmetric by possession** - Teams take turns facing wind 3. **Harder to prepare for** - Can't practice in wind tunnel 4. **Variable during games** - Conditions can change → Chapter 24: Weather Effects
Disadvantages:
Circular logic (opponent records include games vs each other) - Doesn't account for home/away or margin of victory - All games weighted equally → Chapter 16: Strength of Schedule
No ball tracking in current public data - Can't see player eye movement or hand placement - Doesn't record pre-snap communication → Chapter 2: The NFL Data Ecosystem
Draft Mistakes:
**Raiders (Ruggs at 12):** Overdrafted by ~30 picks based on speed alone - **Eagles (Reagor at 21):** Drafted before Jefferson; massive mistake - **Cowboys (Lamb at 17):** Perfect value selection → Case Study: Evaluating the 2020 Wide Receiver Draft Class
Draft value basics
Understanding pick economics 2. **Combine data interpretation** - What testing means 3. **Production metrics** - YPRR, Dominator, breakout age 4. **Position-specific models** - Evaluation by role → Further Reading: Draft Analysis
**Bud Wilkinson** (1950s-60s): Oklahoma's legendary coach kept meticulous records of play success rates, pioneering what we might now call success rate analysis. - **Bill Walsh** (1980s): The 49ers architect famously scripted his first 15 plays, a proto-analytical approach to game planning. - **Home → Chapter 1: Introduction to Football Analytics
Early-season SOS limitations:
Small sample sizes make opponent records unreliable - Teams haven't yet established their true quality - Early schedule often includes easier opponents by design - Injuries and roster changes haven't fully materialized - Better to use preseason power ratings than early W-L records → Quiz: Schedule and Rest Analysis
the backbone of most NFL prediction systems. You'll learn how to build rating systems that automatically adjust based on game results, handle margin of victory, and account for opponent strength. → Chapter 18: Introduction to Prediction Models
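A minimal sketch of such a rating update follows. The logistic expected-score formula is the standard Elo form; the K-factor, the home-advantage offset, and the logarithmic margin-of-victory multiplier are illustrative assumptions, not values taken from the chapter.

```python
import math


def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(home, away, home_points, away_points,
               k=20.0, home_adv=50.0):
    """Return updated (home, away) ratings after one game.

    k and home_adv are hypothetical tuning values. The margin multiplier
    log(|margin| + 1) is one simple way to reward larger wins.
    """
    exp_home = expected_score(home + home_adv, away)

    if home_points > away_points:
        actual_home = 1.0
    elif home_points < away_points:
        actual_home = 0.0
    else:
        actual_home = 0.5

    margin = abs(home_points - away_points)
    mov_mult = math.log(margin + 1.0) if margin > 0 else 1.0

    # Opponent strength enters through exp_home: beating a stronger
    # team produces a larger rating gain than beating a weaker one.
    delta = k * mov_mult * (actual_home - exp_home)
    return home + delta, away - delta


new_home, new_away = update_elo(1500.0, 1500.0, 27, 20)
print(new_home, new_away)
```

The update is zero-sum: whatever one team gains, the other loses, so the league-wide average rating stays fixed.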
Foundation of many systems 2. **FiveThirtyEight methodology posts** - Modern implementation 3. **Brier's scoring rule paper** - Proper probability evaluation 4. **Recent NFL ML papers** - Current state of the art 5. **Market efficiency papers** - Understanding betting lines → Further Reading: Introduction to Prediction Models
Elo's original work
Foundation understanding 2. **FiveThirtyEight methodology** - Modern NFL implementation 3. **Glicko system paper** - Understanding rating uncertainty 4. **DVOA methodology** - Efficiency-based alternative 5. **Market efficiency papers** - Context for rating value → Further Reading: Elo and Power Ratings
Emerging Applications:
AWS Next Gen Stats coverage metrics - ESPN win rate for pass rushers - Completion probability over expectation - Expected YAC → Quiz: Defensive Analytics
Foundation for player valuation 2. **VORP/WAR methodology** - Value over replacement concepts 3. **Market efficiency in sports** - How betting markets price information 4. **Recovery curve research** - Return-to-play performance patterns 5. **Compound injury effects** - Non-linear impact of multiple → Further Reading: Injuries and Their Impact
Equipment Effects:
Ball becomes harder and slicker - Grip aids less effective - Cleats interact differently with surface → Chapter 24: Weather Effects
Example:
A left tackle might dominate every block - But if right guard fails, run still fails - Team metrics blame all five equally → Quiz: Offensive Line Analytics
Foundation for scoring probability 2. **Win Probability Added** - Burke's original methodology 3. **Monte Carlo methods textbook** - Statistical foundations 4. **Distribution comparison tests** - Validation methodology 5. **NFL outcome prediction** - Academic benchmarks → Further Reading: Game Simulation
Expected make probability: ~60% - Make: +3 points, opponent gets ball at 25 = +3 - 0 EP ≈ +3.0 - Miss: opponent gets ball at 42 = -0.6 EP - FG EV = 0.60 * 3.0 + 0.40 * (-0.6) = 1.8 - 0.24 = **+1.56 EP** → Quiz: Special Teams Analytics
Field position matters most
"Don't give them a short field" 2. **Take the points** - A field goal is "certain" points 3. **Trust the defense** - Pin them deep and let your D work 4. **Avoid embarrassment** - Failed 4th downs look bad on film → Case Study: The Fourth Down Revolution
Find comparable by matching:
Size within 2" height, 10 lbs weight - Speed within 0.05s - Similar production profile → Exercises: Draft Analysis
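The size and speed filters can be sketched as below. The `Prospect` fields, the function name, and the sample players are hypothetical. "Similar production profile" is left out because the exercise does not define a threshold for it.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str
    height_in: float   # height in inches
    weight_lb: float   # weight in pounds
    forty_s: float     # 40-yard dash time in seconds


def find_comparables(target, pool,
                     max_height=2.0, max_weight=10.0, max_forty=0.05):
    """Return prospects within the size and speed tolerances of target."""
    return [
        p for p in pool
        if p.name != target.name
        and abs(p.height_in - target.height_in) <= max_height
        and abs(p.weight_lb - target.weight_lb) <= max_weight
        and abs(p.forty_s - target.forty_s) <= max_forty
    ]


# Made-up players for illustration only.
target = Prospect("Target WR", 73.0, 205.0, 4.45)
pool = [
    Prospect("Comp A", 74.0, 210.0, 4.47),  # within all tolerances
    Prospect("Comp B", 77.0, 225.0, 4.45),  # too tall and too heavy
    Prospect("Comp C", 72.0, 200.0, 4.60),  # too slow
]
matches = find_comparables(target, pool)
print([p.name for p in matches])
```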
Finding:
42% of carries came when ahead by 7+ (clock-killing mode) - EPA was -0.12 when ahead (low-value carries) - EPA was +0.05 in close games (actual competitive value) → Case Study: The Workhorse RB Debate
Sacks and QB hits allowed - Rushing yards and success rates - Time to throw (via tracking) - Scrambles and pressured plays → Chapter 9: Offensive Line Analytics
Importance of field position - Clock management basics - Situational football (red zone, third down, two-minute drill) - General strategic considerations → Prerequisites
using Monte Carlo methods to simulate thousands of game outcomes and generate probability distributions for any metric, from win probability to specific score predictions. → Chapter 20: Machine Learning for NFL Prediction
Temperature: 22°F - Wind: 22 mph, gusting to 35 mph - Heavy, lake-effect snow - Visibility: Poor (< 100 yards at times) - Accumulation: 4-6 inches during game → Case Study: The 2017 Wild Card Snow Game
General HFA research
Understand fundamentals 2. **NFL-specific studies** - Football context 3. **COVID data analysis** - Natural experiment 4. **Market pricing** - How to value HFA → Further Reading: Home Field Advantage Deep Dive
Not daily averages 2. **Use stadium location** - Not city center 3. **Check roof status** - Retractable roofs may close 4. **Update before game** - Weather changes 5. **Store forecasts** - For validation → Chapter 24: Weather Effects
Team A is coming off their bye week (Week 7 bye) - Team B played last week (normal rest) - Both teams normally get 6 days rest → Exercises: Schedule and Rest Analysis
Given:
Elite QB over average backup: 5 points - Pro Bowl WR1 over backup: 1.5 points - Starting RT over backup: 0.8 points - Position weights: QB=1.0, WR=0.25, RT=0.30 → Exercises: Injuries and Their Impact
Go for it
Attempt to convert the fourth down 2. **Punt** - Give the ball to the opponent deep 3. **Field goal** - Attempt a 54-yard field goal → Case Study: Should They Have Gone For It?
Go for it:
Conversion probability (4th & 3): ~55% - Convert: New 1st down at 32, EP ≈ +2.0 - Fail: Opponent at 35, EP = -0.6 - Go EV = 0.55 * 2.0 + 0.45 * (-0.6) = 1.1 - 0.27 = **+0.83 EP** → Quiz: Special Teams Analytics
Teams with poor medical staff have more injuries - Artificial turf associated with specific injuries - Age correlates with injury risk → Chapter 23: Injuries and Their Impact
Historical data for picks 15-25 at WR:
40 players drafted - 22 became starters (32+ starts) - 6 made Pro Bowl - 8 busted (career AV < 10) - Average career AV: 32 → Exercises: Draft Analysis
moving beyond the standard 2.5-3 point home advantage to understand what drives it, how it varies by team and situation, and how it has evolved over time. → Chapter 24: Weather Effects
QB: Healthy (Elite tier) - RB1: Out (Above average tier) - WR1: Questionable, 50% (Pro Bowl tier) - LT: Out (Average tier) - CB1: Questionable, 30% (Good tier) → Exercises: Injuries and Their Impact
I
Identification:
`game_id` - Unique game identifier - `play_id` - Play number within game - `posteam` / `defteam` - Possession and defensive teams → Key Takeaways: The NFL Data Ecosystem
If Green Bay extends at "elite" rates:
Total investment risk: ~$45M over contract - Expected performance decline: Regression to ~20th ranked defense - Value mismatch: Paying top-10 money for 20th ranked performance → Case Study: The Curious Case of the Turnover Machine
Ignoring VORP
Drafting QBs early when replacement QBs are adequate 2. **Chasing TDs** - TD rates regress heavily; volume is more stable 3. **Ignoring Variance** - Boom players when you're favored waste ceiling 4. **Overreacting Weekly** - One-week samples are noise; trust projections 5. **Neglecting Bankroll** - → Key Takeaways: Fantasy Football Analytics
Shows opportunity level - Indicates team's confidence in receiver - Correlates with raw production - Baseline for all volume stats → Quiz: Receiving Analytics
In this chapter, you will learn to:
Explain what distinguishes analytics from traditional statistics - Describe major milestones in the evolution of football analytics - Categorize analytical questions by type and tractability - Apply the football analytics workflow to a simple problem - Navigate the organizational structure of NFL an → Chapter 1: Introduction to Football Analytics
High volume doesn't mean high efficiency - May reflect lack of options, not quality - Doesn't account for target quality (depth, situation) - Can be inflated by garbage time - Ignores QB and scheme contribution → Quiz: Receiving Analytics
Indianapolis Colts (4-12)
Wild card via AFC South - Indoor dome team (Lucas Oil Stadium) - Pass-heavy offensive scheme - Limited cold-weather experience → Case Study: The 2017 Wild Card Snow Game
Official report: Full market adjustment - Social media rumors: Partial adjustment - Game-time announcement: Final adjustment → Chapter 23: Injuries and Their Impact
Information Uncertainty
Injury reports are often vague ("questionable") - Game-time decisions create last-minute uncertainty - Severity is rarely disclosed accurately - Recovery timelines are unpredictable → Chapter 23: Injuries and Their Impact
Raw coordinates need context - "Separation" depends on how you measure it - Quarterback decisions aren't directly observable → Chapter 2: The NFL Data Ecosystem
Interpretation:
EPA > 0 → Offense helped their scoring chances - EPA < 0 → Offense hurt their scoring chances - EPA ≈ 0 → Neutral play → Key Takeaways: The NFL Data Ecosystem
Interpreting RACR:
**RACR > 1.0**: Gaining more yards than targeted (YAC contribution) - **RACR = 1.0**: Perfect conversion of air yards to real yards - **RACR < 1.0**: Not converting air yards (drops, incompletions) → Chapter 8: Receiving Analytics
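As a small illustration, RACR can be computed as receiving yards divided by air yards on targets. The function name and the sample totals are assumptions for the sketch.

```python
def racr(receiving_yards, air_yards):
    """Receiver Air Conversion Ratio: receiving yards per targeted air yard.

    Returns None when there are no air yards, since the ratio is undefined.
    """
    if air_yards == 0:
        return None
    return receiving_yards / air_yards


# Hypothetical season totals.
yac_receiver = racr(1200, 1000)   # above 1.0: adds yards after the catch
deep_threat = racr(800, 1000)     # below 1.0: air yards not fully converted
print(yac_receiver, deep_threat)
```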
Investigation needed:
Opponent adjustment - Filter to neutral game scripts - Check turnover regression - Split by situation → Quiz: Defensive Analytics
Isolates preference from situation
removes trailing/leading bias 2. **Predicts future efficiency** - correlates with offensive quality 3. **Identifies philosophy** - true offensive identity 4. **Stable across games** - less affected by opponent or score → Chapter 13: Pace and Play Calling
J
Job security
Failed 4th downs generate criticism 2. **Outcome bias** - Failed attempts are remembered more than punts 3. **Trust issues** - Coaches don't fully trust analytics 4. **Sample size concerns** - "Our team is different" 5. **Competitive balance** - If everyone goes for it, advantage disappears → Case Study: The Fourth Down Revolution
Visibility dropped to near-zero at times - Both teams struggled to throw - Buffalo controlled time of possession - Turnovers favored home team → Case Study: The 2017 Wild Card Snow Game
**Director/VP of Analytics**: Sets strategic direction, interfaces with leadership - **Strategy Analysts**: Provide game-planning support, in-game recommendations - **Research Analysts**: Develop new metrics, build models, conduct long-term studies - **Data Engineers**: Maintain data infrastructure, → Chapter 1: Introduction to Football Analytics
Kicker evaluation:
**Binary outcomes**: Make or miss - **Distance as primary difficulty**: Can model expected % - **Points directly scored**: Clear value - **Higher sample stability**: More consistent year-to-year - **Metrics**: FG%, FG over expected, EPA → Quiz: Special Teams Analytics
L
Late Breakout Age
Production after age 21 concerns 2. **Low Dominator** - Couldn't dominate college competition 3. **Poor Conference** - FCS/weak FBS production inflated 4. **Age at Draft** - Older prospects have less development time 5. **One-Dimensional** - RBs who can't catch, slow WRs → Key Takeaways: Draft Analysis
**High-tempo teams:** 68-72 plays per game - **League average:** 62-65 plays per game - **Low-tempo teams:** 56-60 plays per game → Chapter 13: Pace and Play Calling
Legal Considerations:
Sports betting legality varies by jurisdiction - This chapter is educational, not gambling advice - Always comply with local laws → Chapter 22: Betting Market Analysis
Accurate on short, low-value throws - Poor decision-making (accurate but wrong decisions) - Bad luck on turnover-worthy plays that got intercepted - Poor supporting cast (accurate throws into tight coverage) → Quiz: Quarterback Evaluation
building on these rating foundations with gradient boosting, neural networks, and feature engineering approaches that can capture complex patterns in NFL data. → Chapter 19: Elo and Power Ratings
Fumble recoveries going against them 2. **Close game losses** - Losing one-score games at abnormal rates 3. **Injury timing** - Key players hurt at critical moments 4. **Poor situational football** - Underperforming in red zone or on 3rd down → Case Study: The 2023 Efficiency Surprises
Offensive positions and their roles - Basic formations (shotgun, under center, etc.) - Difference between run and pass plays - Basic route concepts (go, slant, out, etc.) → Prerequisites
One lineman out: ~0.5 point impact - Two linemen out: ~1.5 point impact (not 1.0) - Three linemen out: ~3.0+ point impact → Chapter 23: Injuries and Their Impact
Khan Academy Statistics and Probability - Coursera: Statistics with R Specialization - edX: Introduction to Probability → Prerequisites
Open bowl with overhanging roofs
Sound reflects back to field 2. **Metal seating sections** - Create resonance 3. **Proximity to field** - Fans are close to sidelines 4. **No gaps** - Continuous seating traps sound → Case Study: The 12th Man Effect
Free tier available - Forecast and historical - Easy integration - https://openweathermap.org/api → Chapter 24: Weather Effects
Option A: Situational Deep Dive
Focus on a specific game situation (e.g., fourth-and-1 inside opponent's 5-yard line) - Collect detailed data on outcomes - Analyze whether teams have reached optimal aggression in this situation - Deliverable: 1,500-word analysis with visualizations → Case Study: The Fourth-Down Revolution
Select one team known for fourth-down aggression (Eagles, Ravens, Lions) - Track their fourth-down decisions over 3-5 seasons - Analyze whether aggression correlated with wins - Deliverable: Team-specific report with recommendations → Case Study: The Fourth-Down Revolution
Apply the same expected value framework to two-point decisions - Estimate the optimal two-point attempt rate - Compare actual rates to optimal - Deliverable: Parallel analysis to this case study → Case Study: The Fourth-Down Revolution
Lateral movement before hitting hole - Cutback opportunities - Generally higher YPC - Requires vision and patience → Chapter 7: Rushing Analytics
Overall Offensive Performance:
Total yards: 6,432 (2nd in NFL) - EPA/play: +0.095 (5th) - Yards per play: 6.1 (4th) - First downs: 368 (6th) - Points per game: 25.8 (7th) → Case Study: The Red Zone Paradox
Overall, the market is better
Don't bet every game 2. **High-conviction disagreements have value** - Focus on 3+ point differences 3. **Efficiency metrics add value** - Continue developing this component 4. **Recent form is already priced** - Reduce weight on this feature 5. **Early season is our edge** - Emphasize projections o → Case Study: Evaluating Model Performance Against Market Efficiency
examining how teams choose between pass and run, how pace affects efficiency, and whether teams make optimal decisions. → Chapter 12: Team Efficiency Metrics
Accepted that field goals weren't failures - Reduced turnovers in red zone - Prioritized efficiency over explosiveness → Case Study: The Red Zone Paradox
Position: RB - Age: 27 - Last 3 seasons: 280 pts, 300 pts, 260 pts - Peak age for RB: 25 - Decline rate: 5% per year past peak → Exercises: Fantasy Football Analytics
Team Y (9-5) has remaining opponents with combined 30-18 record - Team Z (9-5) has remaining opponents with combined 18-30 record → Exercises: Schedule and Rest Analysis
Position Value Framework
Understanding which positions matter most 2. **Player Valuation** - Quantifying individual value over backups 3. **Probability Assessment** - Converting injury status to miss probability 4. **Compound Effects** - Accounting for multiple injury interactions 5. **Uncertainty Modeling** - Widening conf → Chapter 23: Injuries and Their Impact
Road teams practice with speakers 2. **Silent counts** - Home offense uses visual signals 3. **Defensive schemes** - Aggressive plays that benefit from crowd → Case Study: The 12th Man Effect
Not guesses, but mathematical transformations of data 2. **Evaluation is essential** - Without proper metrics, you can't distinguish skill from luck 3. **Pitfalls are everywhere** - Overfitting, leakage, variance, and small samples trap many modelers 4. **Building blocks combine** - Ratings + HFA + → Chapter 18: Introduction to Prediction Models
Understand data available 2. **Football Outsiders AGL** - See aggregate injury metrics 3. **nflfastR documentation** - Learn EPA framework 4. **Market efficiency papers** - Understand how injuries are priced → Further Reading: Injuries and Their Impact
Probability Fundamentals
Basic probability rules (complement, union, intersection) - Conditional probability and independence - Expected value and variance of random variables - Common distributions (normal, binomial, Poisson) → Prerequisites
Production Analysis
Normalize college stats for comparison 2. **Physical Testing** - Combine metrics predict athleticism, not success 3. **Profile Metrics** - Breakout age and dominator rating indicate alpha potential 4. **Position Models** - Each position requires unique evaluation criteria 5. **Draft Value** - Pick v → Chapter 28: Draft Analysis
Give up possession, gain field position 2. **Field Goal** - Attempt 3 points 3. **Go for it** - Try to convert → Chapter 13: Pace and Play Calling
Punt:
Expected net: ~35 yards - Opponent starts at own 5: EP = -0.5 - Punt EV = -(-0.5) = **+0.5 EP** → Quiz: Special Teams Analytics
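The punt figure can be set beside the field goal and go-for-it figures quoted in the other Special Teams quiz entries. The sketch below reuses those quoted probabilities and EP values; the function and dictionary names are hypothetical.

```python
def expected_value(p_success, ep_success, ep_failure):
    """Probability-weighted expected points for a two-outcome decision."""
    return p_success * ep_success + (1.0 - p_success) * ep_failure


options = {
    # 4th & 3 conversion ~55%: +2.0 EP if converted, -0.6 EP if not.
    "go_for_it": expected_value(0.55, 2.0, -0.6),
    # Field goal make ~60%: +3.0 EP if made, -0.6 EP if missed.
    "field_goal": expected_value(0.60, 3.0, -0.6),
    # Punt is treated as a single outcome worth +0.5 EP.
    "punt": 0.5,
}

best = max(options, key=options.get)
for name, ev in options.items():
    print(f"{name}: {ev:+.2f} EP")
print("best option:", best)
```

With these inputs the field goal has the highest expected points, but the ranking is sensitive to the make and conversion probabilities.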
Punter evaluation:
**Continuous outcomes**: Net yards on spectrum - **Situation-dependent**: Different goals by field position - **Field position value**: Indirect point impact - **Coverage unit interaction**: Team affects outcomes - **Metrics**: Net average, inside-20%, hangtime (if available) → Quiz: Special Teams Analytics
Python 3.8 or higher
Download from python.org - Verify installation: `python --version` → Prerequisites
Non-negotiable, took Burrow #1 2. **WR** - Took Chase over OL, paid off 3. **EDGE** - Paid Hendrickson premium 4. **CB** - Found value (Awuzie), avoided overpays → Case Study: The Cincinnati Bengals Rebuild
QB injuries are massive
No other position creates 8+ point swings 2. **Markets price efficiently** - Major injuries are quickly incorporated 3. **Backup quality varies widely** - Assessment is crucial 4. **Compound effects exist** - Add 10-15% for scheme disruption 5. **Variance increases** - Widen confidence intervals sig → Case Study: The Ripple Effect of a Franchise Quarterback Injury
measuring the home team's edge - **Historical trends** - how HFA has evolved - **Causal factors** - what creates home advantage - **Team-specific HFA** - which teams benefit most - **Venue effects** - stadium factors that matter - **Analytical applications** - using HFA in predictions → Chapter 15: Home Field Advantage
Most common precipitation type - Affects grip and footing - Ball becomes slippery - Passing accuracy decreases → Chapter 24: Weather Effects
RB Draft Premium Explanation:
RBs have shortest career spans (~4-5 years average) - Position has highest replacement rate (waiver/UDFA production common) - Offensive line affects RB production significantly - Pass-catching increasingly important - Historical bust rate is highest among skill positions - Teams can fi → Quiz: Draft Analysis
scoring inside the opponent's 20 - **Third down conversions** - the critical "money down" - **Two-minute offense** - hurry-up execution - **Goal-to-go situations** - short-field scoring - **Late and close games** - performance under pressure - **Clutch vs choke patterns** - does "clutch" exist? → Chapter 14: Situational Football
Red Zone Performance:
Red zone trips: 62 (above average) - Red zone TD rate: 49% (26th) - Red zone points per trip: 3.8 (24th) - Red zone EPA: -0.08 (29th) → Case Study: The Red Zone Paradox
Normal Sunday-to-Sunday: 6 days between games - Thursday Night: 3 days (short week) - Monday Night to Thursday: 2 days (extremely short) → Chapter 26: Schedule and Rest Analysis
Home teams won 51.0% (vs 57% historical) - Home team margin: +0.7 points (vs +2.7 historical) - Penalty differential nearly eliminated → Chapter 25: Home Field Advantage Deep Dive
Retractable Roof (4 teams):
Lucas Oil Stadium (Colts) - NRG Stadium (Texans) - Hard Rock Stadium (Dolphins, partial) → Chapter 24: Weather Effects
Risks:
Even skilled bettors lose most of the time - The house edge makes consistent profit extremely difficult - Past performance doesn't guarantee future results → Chapter 22: Betting Market Analysis
Robust RB Philosophy:
Locks up elite RB workloads early - Accepts WR depth as adequate - Works when: Elite RBs healthy, WR production distributed → Chapter 27: Fantasy Football Analytics
Build through the draft for cost-controlled talent - Pay premium only for premium positions (QB, EDGE, OT, CB) - Avoid long-term RB contracts - Manage the cap to maintain flexibility - Understand your competitive window and act accordingly → Chapter 17: Team Building and Roster Construction
More play-action from under center - Fade routes to larger receivers - Power running plays with pulling guards → Case Study: The Red Zone Paradox
Score Correlation:
High-scoring games: both teams elevated - Low-scoring games: both teams depressed - Typical correlation: ~0.1-0.15 → Key Takeaways: Game Simulation
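One way to build that mild positive correlation into a simulation is a shared "game environment" draw that lifts or depresses both scores together. The sketch below is a toy model under stated assumptions: the means, standard deviation, and function names are hypothetical, and scores are continuous rather than realistic football totals.

```python
import math
import random


def simulate_scores(n_games, rho, mean_home=24.0, mean_away=21.0, sd=10.0, seed=0):
    """Draw (home, away) score pairs whose correlation is approximately rho.

    A shared draw moves both scores together; independent draws supply the
    rest of the variance. Means and sd are illustrative assumptions.
    """
    rng = random.Random(seed)
    shared_weight = math.sqrt(rho)
    own_weight = math.sqrt(1.0 - rho)
    games = []
    for _ in range(n_games):
        shared = rng.gauss(0.0, 1.0)
        home = mean_home + sd * (shared_weight * shared + own_weight * rng.gauss(0.0, 1.0))
        away = mean_away + sd * (shared_weight * shared + own_weight * rng.gauss(0.0, 1.0))
        games.append((home, away))
    return games


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


games = simulate_scores(n_games=20000, rho=0.12)
print(f"Sample score correlation: {pearson(games):.3f}")
```

With `rho` set inside the 0.1 to 0.15 range quoted above, the sample correlation should land close to the target; ignoring the correlation would understate the variance of game totals.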
Scoring Guide:
⭐ Foundational (5-10 min each) - ⭐⭐ Intermediate (10-20 min each) - ⭐⭐⭐ Challenging (20-40 min each) - ⭐⭐⭐⭐ Advanced/Research (40+ min each) → Exercises: Introduction to Football Analytics
Scoring:
5 correct: Well prepared - 3-4 correct: Minor review needed - 0-2 correct: Significant review needed → Prerequisites
Expected wins rest of season: from 7.5 to 4.2 - Playoff probability: from 95% to 45% - Super Bowl odds: from 12% to 0.5% → Chapter 23: Injuries and Their Impact
Only see outcomes for drafted players - Undrafted successes complicate models - Draft position affects opportunity → Chapter 28: Draft Analysis
Severity of conditions
Even aggressive adjustments weren't enough 2. **Both teams' struggles** - Expected Bills to perform better relatively 3. **Variance** - Snow games have extreme variance → Case Study: The 2017 Wild Card Snow Game
Full probability distributions - Confidence intervals - Scenario analysis ("What if?" questions) - Path-dependent outcomes (how games unfold) → Chapter 21: Game Simulation
Single-point predictions have limitations:
A predicted spread of -7 doesn't tell us the probability of winning by 14+ - Win probability doesn't reveal the distribution of possible scores - Uncertainty isn't captured by point estimates → Chapter 21: Game Simulation
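To see what a point estimate leaves out, the predicted margin can be treated as the center of a distribution. The sketch below assumes actual margins are normally distributed around the prediction with a standard deviation of 13.5 points; both the normal shape and that value are assumptions, not figures from the chapter.

```python
from statistics import NormalDist

predicted_margin = 7.0  # favorite by 7, i.e. a spread of -7
margin_sd = 13.5        # ASSUMPTION: spread of actual margins around the prediction

margin = NormalDist(mu=predicted_margin, sigma=margin_sd)

p_win = 1.0 - margin.cdf(0.0)              # final margin above zero
p_win_by_14_plus = 1.0 - margin.cdf(14.0)  # blowout for the favorite
p_lose_by_7_plus = margin.cdf(-7.0)        # comfortable upset

print(f"P(win):            {p_win:.3f}")
print(f"P(win by 14+):     {p_win_by_14_plus:.3f}")
print(f"P(lose by 7+):     {p_lose_by_7_plus:.3f}")
```

The same spread yields very different tail probabilities if `margin_sd` changes, which is exactly the information a single number cannot carry.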
Expected value analysis - Historical data interpretation - Decision-making frameworks - Communication of analytical insights → Case Study: The Fourth-Down Revolution
Sleep science basics
Understanding rest importance 2. **NFL schedule overview** - How schedules are made 3. **Basic SOS calculation** - Opponent quality assessment 4. **Market pricing** - How bettors use schedule → Further Reading: Schedule and Rest Analysis
Snow:
Dramatic visual impact - Field markings obscured - Running game advantages - Passing significantly impaired → Chapter 24: Weather Effects
Strong regularization (low max_depth, high min_samples) - Early stopping with validation set - Cross-validation for hyperparameter selection - Feature selection to reduce noise → Chapter 20: Machine Learning for NFL Prediction
Special Teams Analytics
the often-neglected third phase that can swing close games. We'll examine kicking, punting, and return game evaluation using EPA frameworks. → Chapter 10: Defensive Analytics
Open ends funnel noise onto field 2. **Seismic activity** - Fans have caused detectable earthquakes 3. **Consistent sellouts** - 12th Man engagement 4. **Surface noise** - Measured at 137.6 dB (2013 record) → Chapter 25: Home Field Advantage Deep Dive
Analysis without action is wasted effort 2. **Respect the noise** — Small samples mean humility about conclusions 3. **Quantify uncertainty** — Point estimates without confidence intervals mislead 4. **Integrate, don't replace** — Analytics complements human judgment 5. **Communicate clearly** — The → Key Takeaways: Introduction to Football Analytics
Point estimates and sampling distributions - Confidence intervals and their interpretation - Hypothesis testing (null/alternative hypotheses, p-values, Type I/II errors) - t-tests and chi-square tests → Prerequisites
Calculate summary statistics (mean, median, std of EPA) - Create histograms of EPA distribution for each - Note any differences in distribution shape → Quiz: Exploratory Data Analysis for Football
Step 3: Direct comparison
Create overlapping density plots or side-by-side box plots - Calculate and compare key metrics: EPA/dropback, CPOE, success rate - Test for statistical significance if sample sizes allow → Quiz: Exploratory Data Analysis for Football
Step 3: Sample size considerations
Require minimum 10 pressure kicks - Calculate confidence intervals - Use Bayesian approach for small samples → Quiz: Special Teams Analytics
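The three sample-size steps above can be sketched together. The interval below is a Wilson score interval, and the "Bayesian approach" is illustrated as a Beta prior centered on a league-wide make rate; the prior rate (0.80), prior weight (20 kicks), and function names are hypothetical choices for illustration only.

```python
import math

MIN_PRESSURE_KICKS = 10  # minimum sample from the step above


def wilson_interval(makes, attempts, z=1.96):
    """Approximate 95% Wilson score interval for a make rate."""
    p = makes / attempts
    denom = 1.0 + z * z / attempts
    center = (p + z * z / (2 * attempts)) / denom
    half = z * math.sqrt(p * (1 - p) / attempts + z * z / (4 * attempts ** 2)) / denom
    return center - half, center + half


def shrunk_make_rate(makes, attempts, prior_rate=0.80, prior_kicks=20):
    """Posterior mean of the make rate under a Beta prior.

    ASSUMPTION: prior_rate and prior_kicks are illustrative, not league data.
    """
    alpha = prior_rate * prior_kicks
    beta = (1.0 - prior_rate) * prior_kicks
    return (alpha + makes) / (alpha + beta + attempts)


makes, attempts = 7, 10
low, high = wilson_interval(makes, attempts)
print(f"Enough kicks to report: {attempts >= MIN_PRESSURE_KICKS}")
print(f"Raw rate: {makes / attempts:.3f}, interval: ({low:.3f}, {high:.3f})")
print(f"Shrunk rate: {shrunk_make_rate(makes, attempts):.3f}")
```

A kicker who is 4-for-4 under pressure has a raw rate of 100%, but the shrunk estimate stays near the prior, which is the behavior wanted for small samples.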
Step 4: Contextual analysis
Compare in different situations (down, field position, score) - Look at supporting cast (receiver quality, O-line pressure) - Visualize with a multi-panel comparison dashboard → Quiz: Exploratory Data Analysis for Football
Release quality off the line - Route-running precision - Contested catch ability (truly isolating it) - Blocking on running plays → Chapter 8: Receiving Analytics
Created more turnovers than expected 2. **Close game success** - Won a disproportionate share of one-score games 3. **Clutch performance** - Better than average in high-leverage situations 4. **Special teams contributions** - Not captured in basic EPA → Case Study: The 2023 Efficiency Surprises
Compare 2022 Eagles (14-3) to 2022 49ers (13-4) - Analyze the 2021 AFC playoff field - Look at historical Super Bowl winners' SOS → Case Study: The 2023 Schedule Mirage
Surowiecki's "Wisdom of Crowds"
Understand collective intelligence 2. **Miller & Davidow's "Logic of Sports Betting"** - Modern betting framework 3. **Pinnacle Betting Resources** - Practical applications 4. **Academic papers on CLV** - Advanced skill measurement 5. **Kahneman's "Thinking, Fast and Slow"** - Cognitive biases → Further Reading: Betting Market Analysis
System Dependencies
College production heavily scheme-dependent - Competition level varies dramatically - Role in college may differ from NFL role → Chapter 28: Draft Analysis
Systematic > Intuitive
Models beat gut feelings long-term 2. **Evaluation is mandatory** - No testing = no credibility 3. **Variance is real** - Even perfect models have bad weeks 4. **Simple often wins** - Complexity ≠ accuracy 5. **Continuous improvement** - Update with new data → Key Takeaways: Introduction to Prediction Models
T
Tactical Effects:
Passing game generally suffers - Running game relatively favored - Kicking becomes less reliable → Chapter 24: Weather Effects
Starting QB (Elite tier): Questionable, 40% miss probability - WR1 (Pro Bowl caliber): Out - Starting RT: Questionable, 60% miss probability → Exercises: Injuries and Their Impact
how successful teams allocate resources across positions, the value of draft picks, free agency strategy, and the economics of building a competitive NFL roster. → Chapter 16: Strength of Schedule
25 interceptions (1st in NFL) - 32 total turnovers (T-3rd) - -0.02 EPA per play allowed (16th) - 45.2% success rate allowed (19th) - 240.1 pass yards per game allowed (18th) → Case Study: The Curious Case of the Turnover Machine
Season 1: 52 player-games missed - Season 2: 78 player-games missed - Season 3: 45 player-games missed - League average: 58 player-games missed → Exercises: Injuries and Their Impact
Technical Requirements:
Python implementation with pandas/numpy - Database storage (SQLite or PostgreSQL) - Automated data collection pipeline - Visualization dashboard (matplotlib/plotly) → Part 7: Capstone Projects
measuring how fast teams operate - **Play selection patterns** - pass/run ratios and tendencies - **Situational adjustments** - how decisions change with game state - **Optimal decision analysis** - identifying suboptimal calls - **Predictability and tendency exploitation** → Chapter 13: Pace and Play Calling
Textbooks:
*OpenIntro Statistics* (free online) - Excellent introduction - *Statistics* by Freedman, Pisani, and Purves - Intuitive explanations - *Introduction to Statistical Learning* - For those heading toward machine learning → Prerequisites
$224.8M in 2023, creating hard constraints - **The draft** - Primary source of cost-controlled talent - **Free agency** - Expensive but immediate talent acquisition - **Contract structure** - Guaranteed money, cap manipulation, and timing - **Position value** - Not all positions contribute equally t → Chapter 17: Team Building and Roster Construction
The salary cap is a hard constraint
Elite roster construction works within ~$225M limit 2. **Draft picks have calculable surplus value** - Rookie contracts provide cost-controlled talent 3. **Position value varies dramatically** - QB and EDGE worth more than RB and S 4. **Free agency is generally inefficient** - Buyers typically overp → Chapter 17: Team Building and Roster Construction
Massive dropoff after top 3-5 2. **Running Back** - Significant dropoff, especially for elite workloads 3. **Wide Receiver** - Moderate dropoff, deep position 4. **Quarterback** - Minimal scarcity, streaming viable → Chapter 27: Fantasy Football Analytics
Most teams punt too often 2. **Under-passing early** - 1st down pass rates often too low 3. **Over-running when ahead** - Leading teams run too much 4. **Predictable tendencies** - Formation-based tells → Key Takeaways: Pace and Play Calling
treat extreme rates with skepticism 2. **Look at non-turnover performance** for true quality assessment 3. **Historical regression** is your friend - use it 4. **Premium contracts require sustainable performance** - not luck 5. **Analytics can save millions** in avoided bad contracts → Case Study: The Curious Case of the Turnover Machine
Two Players at Same Position:
Player X: 15.0 PPG, playoff opponents ranked #25, #28, #30 vs position - Player Y: 14.0 PPG, playoff opponents ranked #5, #8, #12 vs position → Exercises: Fantasy Football Analytics
You have play-by-play data - You need offense/defense splits - You want maximum predictive power - Context matters (situational analysis) → Chapter 19: Elo and Power Ratings
Use Elo when:
You need cross-season comparisons - You want intuitive, interpretable ratings - You're building a public-facing system - Simplicity is important → Chapter 19: Elo and Power Ratings
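Part of what makes Elo interpretable is how small the update rule is. The following is a minimal sketch of the standard Elo formulas; the K-factor of 20 and the function names are assumptions, and it omits home-field and margin-of-victory adjustments.

```python
def expected_score(rating_a, rating_b):
    """Probability that team A beats team B under the standard Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, result_a, k=20.0):
    """Return both teams' new ratings after one game.

    result_a is 1.0 for an A win, 0.5 for a tie, 0.0 for an A loss.
    ASSUMPTION: k=20 is an illustrative K-factor.
    """
    delta = k * (result_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


home, away = update_elo(1500.0, 1500.0, result_a=1.0)
print(home, away)
```

Ratings are zero-sum (what one team gains the other loses), so the league average stays fixed, which is what allows ratings to be compared across seasons.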
Use SRS when:
You're analyzing a single season - You want pure point-spread predictions - Strength of schedule adjustment is crucial - Path independence matters → Chapter 19: Elo and Power Ratings
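SRS defines each team's rating as its average point margin plus the average rating of its opponents, which can be solved by iteration. The sketch below uses a damped update and re-centers the ratings on zero each pass; the three-game schedule, team labels, and function name are hypothetical.

```python
def srs(games, iterations=200):
    """Simple Rating System via damped fixed-point iteration.

    games: list of (team, opponent, margin), margin from team's perspective,
    with each game listed once.
    """
    margins, opponents = {}, {}
    for team, opp, margin in games:
        margins.setdefault(team, []).append(margin)
        margins.setdefault(opp, []).append(-margin)
        opponents.setdefault(team, []).append(opp)
        opponents.setdefault(opp, []).append(team)

    avg_margin = {t: sum(m) / len(m) for t, m in margins.items()}
    ratings = dict(avg_margin)
    for _ in range(iterations):
        updated = {}
        for t in ratings:
            sos = sum(ratings[o] for o in opponents[t]) / len(opponents[t])
            # Damping (average of old and new) avoids oscillation on some schedules.
            updated[t] = 0.5 * ratings[t] + 0.5 * (avg_margin[t] + sos)
        mean = sum(updated.values()) / len(updated)
        ratings = {t: r - mean for t, r in updated.items()}  # keep league mean at 0
    return ratings


games = [("A", "B", 7), ("B", "C", 3), ("A", "C", 10)]  # hypothetical results
ratings = srs(games)
print({team: round(r, 2) for team, r in sorted(ratings.items())})
```

Because the ratings depend only on the set of results, feeding the games in a different order gives the same answer, which is the path independence noted above.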
Uses:
Playoff probability - Division title odds - Draft position estimates - Schedule strength impact → Key Takeaways: Game Simulation
Positional scarcity determines true value 2. **PPR Favors Volume** - Reception points change player rankings 3. **Regression is Essential** - TD rates regress 75% to mean 4. **Variance is Contextual** - Match variance strategy to situation 5. **DFS = Projection + Game Theory** - Ownership matters in → Quiz: Fantasy Football Analytics
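The "regress 75% to mean" rule can be written as a one-line adjustment. In this sketch the observed and league touchdown rates are hypothetical numbers, and the function name is an assumption.

```python
def regress_to_mean(observed_rate, league_rate, regression=0.75):
    """Move an observed rate the given fraction of the way toward the league rate."""
    return observed_rate + regression * (league_rate - observed_rate)


observed_td_rate = 0.12  # hypothetical: touchdowns per target last season
league_td_rate = 0.05    # hypothetical league average
projected = regress_to_mean(observed_td_rate, league_td_rate)
print(f"Projected TD rate: {projected:.4f}")
```

With 75% regression, only a quarter of the gap between the player and the league carries into the projection.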
VORP methodology
Understanding replacement level 2. **Regression to mean in sports** - Projection accuracy 3. **Kelly Criterion applications** - Bankroll sizing 4. **DFS optimization research** - Mathematical approaches 5. **Market efficiency studies** - Finding edge → Further Reading: Fantasy Football Analytics
VORP/Replacement theory
Foundation of value 2. **Scoring system analysis** - Know your league 3. **Basic projections** - Volume × efficiency 4. **Draft strategy** - Apply knowledge → Further Reading: Fantasy Football Analytics
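The "volume × efficiency" and replacement-level ideas combine naturally. The sketch below is illustrative only: the player labels, touch counts, points per touch, and the choice of the third-ranked player as replacement level are all hypothetical.

```python
def project_points(volume, efficiency):
    """Basic projection: opportunities times fantasy points per opportunity."""
    return volume * efficiency


def vorp(projections, replacement_rank):
    """Value over replacement: points above the player at replacement_rank."""
    ranked = sorted(projections.values(), reverse=True)
    replacement_points = ranked[replacement_rank - 1]
    return {name: pts - replacement_points for name, pts in projections.items()}


# Hypothetical running backs: (touches, fantasy points per touch)
projections = {
    "RB_A": project_points(300, 0.90),
    "RB_B": project_points(250, 0.80),
    "RB_C": project_points(200, 0.75),
}
for name, value in vorp(projections, replacement_rank=3).items():
    print(f"{name}: {value:+.1f} points over replacement")
```

In a real league the replacement rank would follow from roster settings (number of teams and starters per position), which is why knowing the scoring system comes next in the list.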
Learn where to get data 2. **Basic meteorology** - Understand conditions 3. **NFL weather research** - Sport-specific effects 4. **API integration** - Build data pipelines → Further Reading: Weather Effects
Week 1-2: Data Foundation
Build college stats database - Implement conference adjustments - Create production metrics (YPRR, Dominator, etc.) → Part 7: Capstone Projects
Week 1-2: EPA Foundation
Load play-by-play data - Calculate expected points by field position - Implement basic EPA → Part 7: Capstone Projects
Week 1-2: Foundation
Set up data pipeline for historical games (2018-present) - Implement basic Elo rating system - Calculate initial power ratings → Part 7: Capstone Projects
Week 1-2: Market Data
Build odds tracking database - Implement probability conversions - Track line movements → Part 7: Capstone Projects
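For the probability-conversion step, American odds convert to implied probabilities with two short formulas, and the bookmaker's margin can then be removed. The sketch below normalizes proportionally, which is one common de-vig method among several; the function names are assumptions.

```python
def american_to_implied(odds):
    """Implied probability (including vig) from American odds."""
    if odds < 0:
        return -odds / (-odds + 100.0)
    return 100.0 / (odds + 100.0)


def remove_vig(probabilities):
    """Rescale implied probabilities so they sum to 1 (proportional method)."""
    total = sum(probabilities)
    return [p / total for p in probabilities]


both_sides = [american_to_implied(-110), american_to_implied(-110)]
print(f"Implied with vig: {both_sides[0]:.4f} each, sum {sum(both_sides):.4f}")
print(f"No-vig: {remove_vig(both_sides)}")
```

Storing the raw odds alongside the converted probabilities makes it possible to recompute with a different de-vig method later.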
Week 1-2: Projections
Build player projection model - Implement regression to mean - Create efficiency and volume forecasts → Part 7: Capstone Projects
Week 1-2: Team Profile
Calculate team power ratings - Build offensive/defensive efficiency analysis - Create personnel valuations → Part 7: Capstone Projects
Week 18: Playoff implications
Team A (home): Clinched playoff spot, likely to rest starters - Team B (away): Must win for playoffs - Both played Sunday, normal rest - Travel: 1 timezone → Exercises: Schedule and Rest Analysis
Week 3-4: Adjustments
Add home field advantage (static and dynamic) - Implement schedule factors (bye weeks, rest) - Add weather adjustments → Part 7: Capstone Projects
Week 3-4: Athletic Profiles
Process combine data - Calculate Speed Score, RAS equivalents - Build athletic comparison system → Part 7: Capstone Projects
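As a starting point for the athletic metrics, the commonly cited Speed Score formula weights a player's 40-yard dash time by body weight. The sketch below assumes that formula, (weight × 200) / (40 time)^4, and uses a hypothetical prospect; it does not attempt to reproduce RAS, whose calculation is not described here.

```python
def speed_score(weight_lbs, forty_time_s):
    """Weight-adjusted speed: (weight * 200) / forty_time ** 4."""
    return (weight_lbs * 200.0) / (forty_time_s ** 4)


# Hypothetical prospect: 220 lb with a 4.50 second 40-yard dash
print(f"Speed Score: {speed_score(220, 4.50):.1f}")
```

Because the time is raised to the fourth power, small differences in the 40-yard dash move the score far more than similar percentage differences in weight.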
Week 3-4: Deep Analysis
Evaluate coaching decisions - Analyze play-calling tendencies - Study situational performance → Part 7: Capstone Projects
Components, inputs, and outputs - **Evaluation metrics** - How to measure if your model actually works - **Common pitfalls** - Why most prediction attempts fail - **Building blocks** - The fundamental approaches we'll explore in subsequent chapters → Chapter 18: Introduction to Prediction Models
Comparing teams with similar records - Building prediction models - Projecting playoff races - Evaluating draft position tiebreakers - Adjusting efficiency metrics for opponent quality → Chapter 16: Strength of Schedule
Circadian rhythms favor "gaining" hours - 1:00 PM EST game is 10:00 AM body time for west coast team - 4:00 PM EST game is 1:00 PM body time (manageable) - But 1:00 PM PST is 4:00 PM body time for east coast team → Chapter 25: Home Field Advantage Deep Dive
Wide Receiver
Market inefficient, draft preferred 6. **Defensive Tackle** - Important but replaceable 7. **Linebacker** - Declining positional value 8. **Safety** - Good value in FA market → Key Takeaways: Team Building and Roster Construction
Direct effect on ball flight 2. **55°F threshold** - Cold effects begin below this 3. **Snow = high variance** - Outcomes are unpredictable 4. **Totals > spreads** - Weather affects scoring more than balance 5. **Denver is unique** - Only meaningful altitude adjustment 6. **Update forecasts** - Use → Quiz: Weather Effects
should be high (the "bend") 2. **Red zone TD rate** - should be low (the "don't break") 3. **3rd down conversion** - should be low (getting off field) 4. **Points per drive** - should be low despite yards → Quiz: Defensive Analytics
Current QB: Aging veteran, 0.02 EPA/play (league average) - Offensive Line: Top 10 in pass protection - Receivers: Young, developing corps - Cap Space: $50M available - Draft Capital: Picks 18 and 50 → Case Study: The Free Agent Quarterback Decision
Z
Zero RB Philosophy:
Targets elite WRs/TEs early - Banks on RB variance and waiver pickups - Works when: WR scarcity high, RB injury rates high → Chapter 27: Fantasy Football Analytics