Baseball Analytics

Master Data-Driven Baseball Analysis

Master the art of baseball analytics from sabermetrics fundamentals to advanced Statcast analysis

Resources

250 Tutorials
6 Examples
15 Datasets
37 Metrics

Learning Paths

Foundations and Data Infrastructure
6 Topics

Introduction to baseball analytics, data sources, programming tools, and statistical fundamentals

Getting Started
5 Topics

Introduction to baseball analytics and key concepts

Part I: Foundations and Data Infrastructure
6 Topics

Foundational concepts including sabermetrics, data sources, and analytics tools

Basic Statistics
5 Topics

Traditional baseball statistics and their calculations

Hitting and Offensive Analytics
8 Topics

Comprehensive analysis of hitting metrics, plate discipline, power, batted ball data, and offensive performance

Part II: Hitting and Offensive Analytics
8 Topics

Comprehensive hitting metrics, plate discipline, and offensive analysis

Advanced Metrics
5 Topics

Modern sabermetrics including WAR, FIP, and wRC+

Part III: Pitching Analytics
10 Topics

Pitching statistics, arsenal analysis, and pitcher evaluation

Pitching Analytics
10 Topics

In-depth pitching statistics, pitch design, velocity, movement, and pitcher performance evaluation

Fielding and Defensive Analytics
6 Topics

Defensive metrics, positioning, range evaluation, and comprehensive fielding analysis

Part IV: Fielding and Defensive Analytics
6 Topics

Defensive metrics, positioning, and field analytics

Statcast Data
5 Topics

Launch angle, exit velocity, and tracking data analysis

Advanced Metrics and Sabermetrics
5 Topics

WAR, win probability, run expectancy, park factors, aging curves, and projection systems

Data Collection
5 Topics

Web scraping, APIs, and data sources for baseball

Part V: Advanced Metrics and Sabermetrics
6 Topics

WAR, win probability, run expectancy, and advanced statistical methods

Part VI: Team Strategy and Game Theory
6 Topics

Lineup construction, bullpen management, and strategic analysis

Visualization
5 Topics

Creating spray charts, heat maps, and baseball visualizations

Machine Learning
5 Topics

Predictive modeling and ML applications in baseball

Part VII: Player Evaluation and Scouting
6 Topics

Draft analytics, player development, and evaluation methods

Part VIII: Contracts, Salary, and Team Building
5 Topics

Financial aspects of baseball and roster construction

Part IX: Advanced Statistical Methods
5 Topics

Machine learning, Bayesian statistics, and advanced analysis

Part X: Historical Baseball and Era Comparisons
4 Topics

Historical eras and era adjustments

Part XI: Advanced Hitting Concepts
8 Topics

Advanced hitting mechanics and analysis

Part XII: Advanced Pitching Concepts
10 Topics

Advanced pitching mechanics and strategy

Part XIII: Fielding and Defense Deep Dives
6 Topics

Comprehensive defensive analysis by position

Part XIV: Baserunning and Speed
5 Topics

Speed analytics and baserunning strategy

Part XV: In-Game Strategy and Tactics
7 Topics

Managerial decisions and in-game strategy

Part XVI: Player Development and Minor Leagues
6 Topics

Prospect development and minor league analysis

Part XVII: International Baseball
5 Topics

International leagues and player analysis

Part XVIII: Business, Economics, and Operations
7 Topics

Baseball business and economic analysis

Part XIX: Historical Eras and Evolution
6 Topics

Baseball era analysis and evolution

Part XX: Specialized Topics and Niches
7 Topics

Specialized baseball analytics topics

Part XXI: Medical, Health, and Safety
5 Topics

Injury analytics and player health

Part XXII: Technology and Innovation
6 Topics

Baseball technology and tracking systems

Part XXIII: Umpiring and Rules
4 Topics

Umpire analytics and rule analysis

Part XXIV: Specific Situations and Strategy
5 Topics

Special game situations and strategy

Advanced Methods
8 Topics

Advanced statistical and machine learning methods for sports analytics

Part XXV: Prospect Analysis and Scouting
5 Topics

Comprehensive prospect analysis

Game Theory & Strategy
7 Topics

Strategic decision-making and game theory in baseball

Part XXVI: Coaching and Instruction
4 Topics

Coaching impact and instruction methods

Draft & Prospect Analytics
2 Topics

MLB draft analysis and prospect evaluation

Part XXVII: Psychology and Mental Performance
4 Topics

Mental skills and psychology in baseball

Contracts & Salary
1 Topic

Player valuation, arbitration, and contract analysis

Quick Start Guide

1. Learn Basics: Start with fundamental statistics
2. Advanced Metrics: Explore baseball-specific analytics
3. Practice with Data: Use real datasets and examples
4. Apply Knowledge: Build your own analytics projects

Key Baseball Metrics

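Batting Average on Balls in Play (BABIP)

Batting average on balls put into play, which excludes strikeouts and home runs. League average is around .300, so a figure far from that (or from a player's own career norm) often points to luck or unsustainable performance. Variables: H = Hits, HR = Home Runs, AB = At-Bats, K = Strikeouts, SF = Sacrifice Flies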

\text{BABIP} = \frac{H - HR}{AB - K - HR + SF}
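
The BABIP formula above can be sketched in a few lines of Python. The function name and the season line are illustrative assumptions, not data for a real player.

```python
def babip(h: int, hr: int, ab: int, k: int, sf: int) -> float:
    """BABIP = (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play; BABIP is undefined")
    return (h - hr) / balls_in_play


# Hypothetical season line (made-up numbers).
example_babip = babip(h=150, hr=20, ab=500, k=100, sf=5)
print(f"{example_babip:.3f}")  # 130 / 385 -> 0.338
```

A value of .338 sits well above the roughly .300 league norm, which is the kind of gap that invites a closer look at batted-ball quality before assuming the performance will hold.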

Isolated Power (ISO)

Measures raw power by subtracting batting average from slugging percentage, so singles count for nothing and only extra bases per at-bat remain. Variables: SLG = Slugging Percentage, AVG = Batting Average, 2B = Doubles, 3B = Triples, HR = Home Runs, AB = At-Bats

\text{ISO} = \text{SLG} - \text{AVG} = \frac{2B + 2(3B) + 3(HR)}{AB}
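
As a quick check that the two forms of ISO above agree, here is a minimal Python sketch. The function names and the stat line are hypothetical.

```python
def iso_from_rates(slg: float, avg: float) -> float:
    """ISO = SLG - AVG."""
    return slg - avg


def iso_from_counts(doubles: int, triples: int, hr: int, ab: int) -> float:
    """ISO = (2B + 2*3B + 3*HR) / AB."""
    if ab <= 0:
        raise ValueError("at-bats must be positive")
    return (doubles + 2 * triples + 3 * hr) / ab


# Hypothetical line: 150 hits in 500 AB, with 30 doubles, 5 triples, 20 homers.
hits, ab, doubles, triples, hr = 150, 500, 30, 5, 20
singles = hits - doubles - triples - hr
total_bases = singles + 2 * doubles + 3 * triples + 4 * hr

iso_rates = iso_from_rates(slg=total_bases / ab, avg=hits / ab)
iso_counts = iso_from_counts(doubles, triples, hr, ab)
print(f"{iso_rates:.3f} {iso_counts:.3f}")
```

Both routes give the same number because every hit contributes one base to both SLG and AVG, and that base cancels in the subtraction.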

OPS Plus (OPS+)

Park- and league-adjusted OPS. 100 is league average, and each point above or below represents one percent better or worse than average. Variables: OBP = On-Base Percentage, SLG = Slugging Percentage, lgOBP = League OBP, lgSLG = League SLG (adjusted for park)

\text{OPS+} = 100 \times \left( \frac{\text{OBP}}{\text{lgOBP}} + \frac{\text{SLG}}{\text{lgSLG}} - 1 \right)
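
A minimal Python sketch of the OPS+ formula as written above. It assumes the league OBP and SLG passed in have already been park-adjusted. The function name and inputs are illustrative.

```python
def ops_plus(obp: float, slg: float, lg_obp: float, lg_slg: float) -> float:
    """OPS+ = 100 * (OBP / lgOBP + SLG / lgSLG - 1).

    lg_obp and lg_slg are assumed to be park-adjusted league rates.
    """
    if lg_obp <= 0 or lg_slg <= 0:
        raise ValueError("league rates must be positive")
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


# Hypothetical hitter in a hypothetical league environment.
example_ops_plus = ops_plus(obp=0.360, slg=0.480, lg_obp=0.320, lg_slg=0.400)
league_average = ops_plus(obp=0.320, slg=0.400, lg_obp=0.320, lg_slg=0.400)
print(round(example_ops_plus, 1), round(league_average, 1))
```

A hitter who exactly matches the league rates lands on 100, which is a handy sanity check for any implementation.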

Weighted Runs Above Average (wRAA)

Offensive runs above average based on wOBA. Foundation for the batting component of WAR. Variables: wOBA = Player wOBA, lgwOBA = League wOBA, wOBA Scale = ~1.15 (converts to runs), PA = Plate Appearances

\text{wRAA} = \frac{(\text{wOBA} - \text{lgwOBA})}{\text{wOBA Scale}} \times PA
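
The wRAA conversion above can be sketched as follows. The default scale of 1.15 is the approximate value quoted in the description. The wOBA inputs are made up for illustration.

```python
def wraa(woba: float, lg_woba: float, pa: int, woba_scale: float = 1.15) -> float:
    """wRAA = ((wOBA - lgwOBA) / wOBA Scale) * PA.

    woba_scale defaults to the approximate 1.15 quoted above; the real
    value changes from season to season.
    """
    if woba_scale <= 0:
        raise ValueError("wOBA scale must be positive")
    return (woba - lg_woba) / woba_scale * pa


# Hypothetical hitter: .370 wOBA in a .320 league over 600 plate appearances.
example_wraa = wraa(woba=0.370, lg_woba=0.320, pa=600)
print(f"{example_wraa:.1f} runs above average")
```

Because the result is multiplied by PA, wRAA is a counting stat: the same wOBA over fewer plate appearances yields proportionally fewer runs.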

Weighted On-Base Average (wOBA)

Weights each way of reaching base by its approximate run value. Widely regarded as one of the best single measures of offensive contribution. The linear weights are recalculated each season, so the coefficients below are approximate. Variables: BB = Walks (FanGraphs' version counts only unintentional walks), HBP = Hit By Pitch, 1B = Singles, 2B = Doubles, 3B = Triples, HR = Home Runs, AB = At-Bats, SF = Sacrifice Flies

\text{wOBA} = \frac{0.69(BB) + 0.72(HBP) + 0.88(1B) + 1.24(2B) + 1.56(3B) + 2.00(HR)}{AB + BB + SF + HBP}
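
A short Python sketch of the wOBA formula above, using the approximate weights exactly as listed. Keeping the weights in a dictionary is a design assumption that makes it easy to swap in a specific season's values. The stat line is hypothetical.

```python
# Approximate linear weights copied from the formula above; real weights
# vary by season.
APPROX_WEIGHTS = {
    "BB": 0.69, "HBP": 0.72, "1B": 0.88, "2B": 1.24, "3B": 1.56, "HR": 2.00,
}


def woba(events: dict, ab: int, sf: int, weights: dict = APPROX_WEIGHTS) -> float:
    """wOBA = sum(weight * count) / (AB + BB + SF + HBP)."""
    denominator = ab + events["BB"] + sf + events["HBP"]
    if denominator <= 0:
        raise ValueError("no qualifying plate appearances")
    numerator = sum(weights[name] * events[name] for name in weights)
    return numerator / denominator


# Hypothetical season line (made-up numbers).
line = {"BB": 60, "HBP": 5, "1B": 95, "2B": 30, "3B": 5, "HR": 20}
example_woba = woba(line, ab=500, sf=5)
print(f"{example_woba:.3f}")
```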

Weighted Runs Created Plus (wRC+)

Park- and league-adjusted runs created. 100 is league average; each point above or below represents one percent better or worse than average. Variables: wRAA = Weighted Runs Above Average, PA = Plate Appearances, lgR/PA = League Runs per PA, Park Factor = Park run factor (1.00 = neutral), lgwRC/PA = League wRC per PA

\text{wRC+} = \left( \frac{\text{wRAA}}{\text{PA}} + \text{lgR/PA} \right) \times \frac{1}{\text{Park Factor}} \times \frac{1}{\text{lgwRC/PA}} \times 100
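
The sketch below implements the wRC+ expression exactly as written above, with a multiplicative park adjustment. Published versions, such as FanGraphs', apply the park adjustment differently, so treat this as an illustration of the formula on this page rather than a replica of any site's numbers. All names and inputs are assumptions.

```python
def wrc_plus(wraa: float, pa: int, lg_r_per_pa: float,
             park_factor: float, lg_wrc_per_pa: float) -> float:
    """wRC+ = (wRAA/PA + lgR/PA) * (1/Park Factor) * (1/(lgwRC/PA)) * 100.

    park_factor is assumed to be scaled so that 1.00 is a neutral park.
    """
    if pa <= 0 or park_factor <= 0 or lg_wrc_per_pa <= 0:
        raise ValueError("PA, park factor, and lgwRC/PA must be positive")
    runs_per_pa = wraa / pa + lg_r_per_pa
    return runs_per_pa / park_factor / lg_wrc_per_pa * 100


# Hypothetical hitter: 26 runs above average over 600 PA in a neutral park.
neutral = wrc_plus(wraa=26.0, pa=600, lg_r_per_pa=0.12,
                   park_factor=1.00, lg_wrc_per_pa=0.12)
# Same production in a hypothetical hitter-friendly park scores lower.
hitter_park = wrc_plus(wraa=26.0, pa=600, lg_r_per_pa=0.12,
                       park_factor=1.05, lg_wrc_per_pa=0.12)
print(round(neutral, 1), round(hitter_park, 1))
```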

Defensive Runs Saved (DRS)

Comprehensive defensive metric from Baseball Info Solutions (now Sports Info Solutions). Measures runs saved relative to an average fielder, summed across defensive components. Variables: rPM = Plus/Minus Runs, rGDP = Double Play Runs, rARM = Outfield Arm Runs, rHR = Home Run Robbing Runs, rGFP = Good Fielding Play Runs

\text{DRS} = \text{rPM} + \text{rGDP} + \text{rARM} + \text{rHR} + \text{rGFP}

Ultimate Zone Rating (UZR)

Advanced defensive metric measuring runs saved above average, based on play-by-play data and the zone of the field where each ball is hit. Variables: RngR = Range Runs, ErrR = Error Runs, DPR = Double Play Runs (infielders), ARM = Outfield Arm Runs (outfielders)

\text{UZR} = \text{RngR} + \text{ErrR} + \text{DPR} + \text{ARM}
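
Expected Fielding Independent Pitching (xFIP)

FIP with actual home runs replaced by an expected total: fly balls allowed multiplied by the league-average HR/FB rate. Generally a better predictor of future performance than FIP. Variables: FB = Fly Balls, lgHR/FB = League HR per Fly Ball rate (~10-11%), BB = Walks, HBP = Hit By Pitch, K = Strikeouts, IP = Innings Pitched, C = FIP Constant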

\text{xFIP} = \frac{13(FB \times \text{lgHR/FB}) + 3(BB + HBP) - 2(K)}{IP} + C
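
A minimal sketch of the xFIP formula above. The defaults for the league HR/FB rate and the constant are the approximate figures quoted on this page, and the pitcher line is invented for illustration.

```python
def xfip(fb: int, bb: int, hbp: int, k: int, ip: float,
         lg_hr_per_fb: float = 0.105, c: float = 3.10) -> float:
    """xFIP = (13 * (FB * lgHR/FB) + 3 * (BB + HBP) - 2 * K) / IP + C.

    lg_hr_per_fb and c default to the approximate values quoted above;
    both change from season to season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    expected_hr = fb * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + c


# Hypothetical pitcher: 200 fly balls, 50 BB, 5 HBP, 200 K in 180 innings.
example_xfip = xfip(fb=200, bb=50, hbp=5, k=200, ip=180.0)
print(f"{example_xfip:.2f}")
```

The only difference from FIP is the `expected_hr` term, which is why a pitcher with an unusually high or low home run rate on fly balls shows the largest gap between the two metrics.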

Fielding Independent Pitching (FIP)

Estimates ERA based only on strikeouts, walks, hit batters, and home runs, the outcomes a pitcher controls most directly. Variables: HR = Home Runs, BB = Walks, HBP = Hit By Pitch, K = Strikeouts, IP = Innings Pitched, C = FIP Constant that puts FIP on the ERA scale (usually around 3.10)

\text{FIP} = \frac{13(HR) + 3(BB + HBP) - 2(K)}{IP} + C
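
The FIP formula above is sketched below, along with a small helper for a common pitfall: box scores write innings as 180.1 or 180.2 to mean one or two outs beyond 180 innings, not tenths. The helper name and the pitcher line are hypothetical.

```python
def innings_to_decimal(ip_notation: str) -> float:
    """Convert box-score innings ('180.1' = 180 and 1/3) to a true decimal."""
    whole, _, outs = ip_notation.partition(".")
    extra_outs = int(outs) if outs else 0
    if extra_outs not in (0, 1, 2):
        raise ValueError("fractional part must be 0, 1, or 2 outs")
    return int(whole) + extra_outs / 3


def fip(hr: int, bb: int, hbp: int, k: int, ip: float, c: float = 3.10) -> float:
    """FIP = (13 * HR + 3 * (BB + HBP) - 2 * K) / IP + C.

    c defaults to the approximate constant quoted above.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + c


# Hypothetical pitcher: 20 HR, 50 BB, 5 HBP, 200 K over 180.0 innings.
example_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=innings_to_decimal("180.0"))
print(f"{example_fip:.2f}")
```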

Datasets & Resources

| Dataset | Description | Format | Size |
| --- | --- | --- | --- |
| MLB Statcast Data 2023 | Complete Statcast data for the 2023 MLB season, including exit velocity, launch angle, and sprint speed | CSV | N/A |
| Lahman Baseball Database | Historical baseball statistics from 1871 to present | SQL | N/A |
| Lahman Baseball Database | Comprehensive historical baseball statistics from 1871 to present | CSV | 25MB |
| Statcast Data | MLB pitch-by-pitch tracking data including velocity, spin, and trajectory | CSV | 500MB+ |
| Baseball Savant Statcast Data | Pitch-level data including velocity, spin rate, exit velocity, launch angle, and expected stats from the MLB Statcast system | CSV | 2+ GB/season |
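
As a rough idea of how a CSV dataset like these feeds the metrics above, the sketch below reads batting rows and computes BABIP per player using only the standard library. The column names follow the Lahman Batting table (where strikeouts are stored as SO). The rows themselves are invented sample data, and the function name is an assumption.

```python
import csv
import io

# Made-up rows using Lahman-style Batting column names; not real players.
SAMPLE_CSV = """playerID,yearID,AB,H,HR,SO,SF
hitter01,2023,500,150,20,100,5
hitter02,2023,450,110,10,120,3
"""


def babip_by_player(csv_text: str) -> dict:
    """Return {playerID: BABIP or None} for each row of batting data."""
    results = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Treat blank fields as 0; older seasons can have missing values.
        ab, h, hr, so, sf = (int(row[col] or 0)
                             for col in ("AB", "H", "HR", "SO", "SF"))
        balls_in_play = ab - so - hr + sf
        results[row["playerID"]] = (
            (h - hr) / balls_in_play if balls_in_play > 0 else None
        )
    return results


player_babip = babip_by_player(SAMPLE_CSV)
for player, value in player_babip.items():
    print(player, f"{value:.3f}")
```

To run this against a downloaded file, replace the `io.StringIO` wrapper with an opened file handle. Players with multiple rows per season would need to be aggregated first.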

Ready to Master Baseball Analytics?

Start with the basics and work your way up to advanced machine learning applications.
