Introduction to Statcast

Intermediate | 10 min read | Nov 26, 2025

Statcast is the tracking system that changed how baseball is analyzed. Introduced by Major League Baseball in 2015, it uses tracking hardware installed in every ballpark to measure aspects of the game that could not previously be quantified. The system tracks everything from the trajectory of a batted ball to the precise movements of players on the field, giving a far more detailed view of athletic performance and game dynamics than the box score allows.

The implementation of Statcast has democratized access to advanced metrics that were once only available to professional teams with dedicated analytics departments. Today, fans, analysts, and amateur researchers can access this wealth of data to better understand player performance, make more informed predictions, and appreciate the subtle nuances that separate good players from great ones.

How Statcast Works: The Technology Behind the Data

Statcast's data collection has relied on two generations of tracking hardware. Since 2020 the system has been built on Hawk-Eye, an array of high-resolution optical cameras positioned around each MLB stadium. Hawk-Eye, the same technology used for line calls in tennis and ball tracking in cricket, records the ball and every player on the field at high frame rates.

The Hawk-Eye system tracks the baseball from the moment it leaves the pitcher's hand through its entire trajectory, whether the pitch ends up in the catcher's mitt, is fouled off, or is put in play. When the ball is put in play, the cameras continue tracking its flight path, measuring its exit velocity, launch angle, distance traveled, and landing point. The system is precise enough to measure distances to within inches and velocities to within tenths of a mile per hour.

From 2015 through 2019, Statcast instead paired a TrackMan Doppler radar, which tracked the ball, with a separate optical camera system that tracked the players. Hawk-Eye replaced both in 2020, so data from the two eras was collected with different hardware. That is worth keeping in mind when comparing metrics across seasons.

Key Statcast Metrics Explained

Exit Velocity

Exit velocity measures the speed of the baseball as it comes off the bat, expressed in miles per hour (mph). This metric is crucial because it correlates strongly with success at the plate. A ball hit harder has a better chance of becoming a hit, as it gives fielders less time to react and is more likely to find gaps or clear outfield walls.
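
As a rough illustration of how exit velocity is usually summarized, the sketch below computes an average, a maximum, and a hard-hit rate for one hitter. The sample values and the function name are hypothetical; the 95 mph cutoff is the hard-hit threshold used elsewhere in this article.

```python
from statistics import mean

HARD_HIT_MPH = 95.0  # hard-hit threshold used in this article


def exit_velocity_summary(launch_speeds):
    """Summarize exit velocities (mph) for one hitter, skipping untracked balls."""
    speeds = [s for s in launch_speeds if s is not None]
    hard_hit = sum(1 for s in speeds if s >= HARD_HIT_MPH)
    return {
        "avg_exit_velo": round(mean(speeds), 1),
        "max_exit_velo": max(speeds),
        "hard_hit_rate": round(100 * hard_hit / len(speeds), 1),
    }


# Hypothetical batted balls (mph); None marks a ball with no tracked contact
sample = [101.3, 88.0, 95.0, 72.4, None, 110.1]
summary = exit_velocity_summary(sample)
print(summary)
```

Untracked balls are dropped before averaging so that missing readings do not count as soft contact.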

Launch Angle

Launch angle represents the vertical angle at which the ball leaves the bat, measured in degrees. A launch angle of zero degrees means the ball was hit on a perfectly horizontal plane, positive values indicate the ball was hit in the air, and negative values indicate the ball was hit into the ground.
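
Launch angle is often bucketed into batted-ball types. The sketch below assumes the commonly cited cutoffs of below 10 degrees for ground balls, 10 to 25 for line drives, 25 to 50 for fly balls, and above 50 for pop-ups. How the exact boundary values are assigned is an assumption made here for illustration.

```python
def classify_launch_angle(angle_deg):
    """Map a launch angle in degrees to a batted-ball type."""
    if angle_deg < 10:
        return "ground ball"
    if angle_deg <= 25:   # boundary at 25 assigned to line drives (assumption)
        return "line drive"
    if angle_deg <= 50:   # boundary at 50 assigned to fly balls (assumption)
        return "fly ball"
    return "pop-up"


for angle in (-12, 0, 18, 32, 63):
    print(f"{angle:>4} degrees -> {classify_launch_angle(angle)}")
```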

Sprint Speed

Sprint speed measures a player's fastest one-second running speed during a sprint opportunity, expressed in feet per second (ft/sec). This metric provides a more accurate and objective measure of player speed than traditional stolen base totals.
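
The "fastest one-second window" idea can be sketched for a single run as follows. The input format (cumulative distance per tracking frame) and the coarse 5 frames-per-second sample are assumptions chosen for readability, not the layout of any real Statcast feed.

```python
def fastest_one_second_speed(cumulative_feet, fps):
    """Return the most distance covered in any one-second window (ft/sec).

    cumulative_feet[i] is the total distance run by frame i, with frames
    sampled at a constant rate of `fps` frames per second.
    """
    window = fps  # number of frames that span one second
    if len(cumulative_feet) <= window:
        raise ValueError("need more than one second of tracking data")
    return max(
        cumulative_feet[i + window] - cumulative_feet[i]
        for i in range(len(cumulative_feet) - window)
    )


# Hypothetical two-second run sampled at 5 frames per second
run = [0, 3, 7, 12, 17.5, 23, 28.8, 34.6, 40.2, 45.6, 50.8]
speed = fastest_one_second_speed(run, fps=5)
print(f"Fastest one-second window: {speed:.1f} ft/sec")
```

Sliding the window across the whole run means the runner's slow first steps do not drag the figure down, which is why this measure reflects top speed rather than average speed.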

Spin Rate

Spin rate measures how fast a pitched baseball is spinning as it leaves the pitcher's hand, expressed in revolutions per minute (rpm). This metric has become essential for understanding pitch effectiveness because spin, together with velocity and spin axis, determines how much a pitch moves.
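
Because spin is quoted per minute while a pitch is in the air for well under a second, it can help to convert rpm into the number of turns the ball actually makes on its way to the plate. The sketch below assumes a release point about 55 feet from home plate and constant velocity (no drag), both simplifications made for illustration.

```python
MPH_TO_FT_PER_SEC = 5280 / 3600  # feet per mile / seconds per hour


def revolutions_in_flight(spin_rpm, velocity_mph, release_distance_ft=55.0):
    """Approximate how many times a pitch rotates before reaching the plate."""
    flight_time_s = release_distance_ft / (velocity_mph * MPH_TO_FT_PER_SEC)
    return spin_rpm / 60 * flight_time_s


revs = revolutions_in_flight(spin_rpm=2400, velocity_mph=95)
print(f"{revs:.1f} revolutions")
```

Under these assumptions a 95 mph fastball at 2,400 rpm rotates only about 16 times in flight, even though the per-minute figure is in the thousands.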

Catch Probability

Catch probability is a defensive metric that estimates the likelihood that a batted ball will be caught by a fielder, based on the distance the fielder must travel, the time available to reach the ball, and the direction of travel.
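
To build intuition for how distance and time interact, the sketch below computes the average speed a fielder would need to reach a ball. This is not Statcast's catch probability model, only a simplified feasibility check; the function name and example plays are hypothetical, and the 30 ft/sec reference is the elite sprint speed benchmark from this article.

```python
ELITE_SPRINT_FT_PER_SEC = 30.0  # elite sprint speed benchmark from this article


def required_speed(distance_ft, opportunity_time_s):
    """Average speed (ft/sec) needed to cover the distance in the time available."""
    if opportunity_time_s <= 0:
        raise ValueError("opportunity time must be positive")
    return distance_ft / opportunity_time_s


# Hypothetical plays: (distance to the ball in feet, seconds available)
plays = [(90, 4.0), (110, 3.5)]
for distance, seconds in plays:
    speed = required_speed(distance, seconds)
    verdict = "reachable" if speed <= ELITE_SPRINT_FT_PER_SEC else "beyond elite speed"
    print(f"{distance} ft in {seconds} s -> {speed:.1f} ft/sec ({verdict})")
```

Real fielders also need time to react and accelerate, so the true requirement is harsher than this average-speed figure suggests.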

Statcast Metric Benchmarks

Metric              | Elite      | Above Average   | Average         | Below Average
Exit Velocity (avg) | 92+ mph    | 90-92 mph       | 87-90 mph       | <87 mph
Max Exit Velocity   | 115+ mph   | 112-115 mph     | 108-112 mph     | <108 mph
Sprint Speed        | 30+ ft/sec | 28-30 ft/sec    | 27-28 ft/sec    | <27 ft/sec
Fastball Spin Rate  | 2,500+ rpm | 2,300-2,500 rpm | 2,100-2,300 rpm | <2,100 rpm
Barrel Rate         | 10%+       | 7-10%           | 5-7%            | <5%
Hard Hit Rate       | 45%+       | 40-45%          | 35-40%          | <35%
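
The benchmark table can be turned into a small lookup, sketched below. The dictionary keys and the `tier` function are hypothetical names. Because adjacent ranges in the table share their boundary values, the sketch assumes a value exactly on a boundary belongs to the higher tier.

```python
# (elite minimum, above-average minimum, average minimum) from the benchmark table
BENCHMARKS = {
    "avg_exit_velocity_mph": (92, 90, 87),
    "max_exit_velocity_mph": (115, 112, 108),
    "sprint_speed_ft_sec": (30, 28, 27),
    "fastball_spin_rpm": (2500, 2300, 2100),
    "barrel_rate_pct": (10, 7, 5),
    "hard_hit_rate_pct": (45, 40, 35),
}


def tier(metric, value):
    """Return the benchmark tier for a metric value."""
    elite, above_average, average = BENCHMARKS[metric]
    if value >= elite:
        return "Elite"
    if value >= above_average:  # boundary values go to the higher tier (assumption)
        return "Above Average"
    if value >= average:
        return "Average"
    return "Below Average"


print(tier("avg_exit_velocity_mph", 91.2))
print(tier("sprint_speed_ft_sec", 26.4))
```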

Working with Statcast Data in Python

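import pandas as pd
import matplotlib.pyplot as plt
from pybaseball import statcast, statcast_batter, playerid_reverse_lookup

# Pull all Statcast data for a date range (one row per pitch)
data = statcast(start_dt='2024-06-01', end_dt='2024-06-30')

print(f"Total pitches tracked: {len(data)}")
print(f"Columns available: {len(data.columns)}")

# Filter for batted balls only (type 'X' = ball in play) with tracked contact
batted_balls = data[data['type'] == 'X'].dropna(
    subset=['launch_speed', 'launch_angle']).copy()

# In statcast() output, player_name holds the pitcher's name, so group on
# the batter's MLBAM ID to get hitter-level averages.
# Note: averaging estimated_ba_using_speedangle over batted balls gives xBA
# on contact; full xBA also counts strikeouts in the denominator.
batter_stats = batted_balls.groupby('batter').agg(
    avg_exit_velo=('launch_speed', 'mean'),
    avg_launch_angle=('launch_angle', 'mean'),
    xBA=('estimated_ba_using_speedangle', 'mean'),
    batted_balls=('launch_speed', 'count'),
).reset_index()

# Attach batter names by looking up the MLBAM IDs
names = playerid_reverse_lookup(batter_stats['batter'].tolist(), key_type='mlbam')
names['batter_name'] = (names['name_first'].str.title() + ' '
                        + names['name_last'].str.title())
batter_stats = batter_stats.merge(names[['key_mlbam', 'batter_name']],
                                  left_on='batter', right_on='key_mlbam', how='left')

# Filter for players with at least 50 batted balls
qualified = batter_stats[batter_stats['batted_balls'] >= 50].copy()
qualified = qualified.sort_values('avg_exit_velo', ascending=False)

print("\nTop 10 Exit Velocity Leaders:")
print(qualified[['batter_name', 'avg_exit_velo', 'avg_launch_angle', 'xBA']].head(10))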

# Pull data for a specific player (592450 is Aaron Judge's MLBAM ID)
player_data = statcast_batter('2024-06-01', '2024-06-30', 592450)
print(f"Pitches seen by Aaron Judge: {len(player_data)}")

# Create exit velocity vs launch angle scatter plot for all batted balls,
# dropping rows that are missing any of the plotted values
plot_data = batted_balls.dropna(
    subset=['launch_angle', 'launch_speed', 'estimated_ba_using_speedangle'])

plt.figure(figsize=(12, 8))
scatter = plt.scatter(plot_data['launch_angle'],
                      plot_data['launch_speed'],
                      c=plot_data['estimated_ba_using_speedangle'],
                      cmap='RdYlGn', alpha=0.6, s=20)

plt.colorbar(scatter, label='Expected Batting Average')
plt.xlabel('Launch Angle (degrees)', fontsize=12)
plt.ylabel('Exit Velocity (mph)', fontsize=12)
plt.title('Exit Velocity vs Launch Angle (June 2024)', fontsize=14, fontweight='bold')
# 95 mph is the hard-hit threshold
plt.axhline(y=95, color='red', linestyle='--', alpha=0.5,
            label='Hard-hit threshold (95 mph)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('statcast_analysis.png', dpi=300)

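# Analyze barrel rate
# The pitch-level data may not include a 'barrel' column. Savant's
# launch_speed_angle classification uses code 6 for barrels, so derive the
# flag from it when the column is missing.
if 'barrel' not in batted_balls.columns:
    batted_balls['barrel'] = (
        batted_balls['launch_speed_angle'].eq(6).fillna(False).astype(int)
    )

# Group on the batter's MLBAM ID; in statcast() output, player_name holds the
# pitcher's name. IDs can be mapped to names with playerid_reverse_lookup.
barrel_rate_by_player = batted_balls.groupby('batter').agg(
    barrels=('barrel', 'sum'),
    batted_balls=('barrel', 'count'),
).reset_index()

barrel_rate_by_player['barrel_rate'] = (
    barrel_rate_by_player['barrels'] / barrel_rate_by_player['batted_balls'] * 100
)

# Keep hitters with at least 50 batted balls
barrel_leaders = barrel_rate_by_player[barrel_rate_by_player['batted_balls'] >= 50].copy()
barrel_leaders = barrel_leaders.sort_values('barrel_rate', ascending=False)

print("\nTop 10 Barrel Rate Leaders:")
print(barrel_leaders[['batter', 'barrel_rate', 'batted_balls']].head(10))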

Working with Statcast Data in R

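library(baseballr)
library(dplyr)
library(ggplot2)

# Scrape Statcast data for a date range.
# Note: Baseball Savant caps the number of rows returned per query, so a
# full month can come back truncated. If the row count printed below looks
# low, pull a few days at a time and combine the results with bind_rows().
# With player_type = "batter", player_name holds the batter's name.
statcast_data <- scrape_statcast_savant(
  start_date = "2024-06-01",
  end_date = "2024-06-30",
  player_type = "batter"
)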

cat("Total rows retrieved:", nrow(statcast_data), "\n")

# Filter for batted balls and clean data
batted_balls <- statcast_data %>%
  filter(type == "X", !is.na(launch_speed), !is.na(launch_angle)) %>%
  mutate(
    launch_speed = as.numeric(launch_speed),
    launch_angle = as.numeric(launch_angle),
    estimated_ba_using_speedangle = as.numeric(estimated_ba_using_speedangle)
  )

# Calculate player-level statistics
player_stats <- batted_balls %>%
  group_by(player_name) %>%
  summarise(
    batted_balls = n(),
    avg_exit_velo = mean(launch_speed, na.rm = TRUE),
    max_exit_velo = max(launch_speed, na.rm = TRUE),
    avg_launch_angle = mean(launch_angle, na.rm = TRUE),
    xBA = mean(estimated_ba_using_speedangle, na.rm = TRUE),
    barrel_count = sum(barrel == 1, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  filter(batted_balls >= 50) %>%
  mutate(barrel_rate = barrel_count / batted_balls * 100) %>%
  arrange(desc(avg_exit_velo))

# Display top performers
cat("\nTop 10 Exit Velocity Leaders:\n")
print(player_stats %>%
  select(player_name, avg_exit_velo, max_exit_velo, xBA, barrel_rate) %>%
  head(10))

# Create visualization
ggplot(batted_balls %>% sample_n(min(5000, nrow(batted_balls))),
       aes(x = launch_angle, y = launch_speed, color = estimated_ba_using_speedangle)) +
  geom_point(alpha = 0.5, size = 2) +
  scale_color_gradient2(
    low = "red", mid = "yellow", high = "green",
    midpoint = 0.500,
    name = "xBA"
  ) +
  # linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
  geom_hline(yintercept = 95, linetype = "dashed", color = "red", linewidth = 1) +
  geom_vline(xintercept = c(10, 30), linetype = "dashed", color = "blue", linewidth = 0.8) +
  labs(
    title = "Exit Velocity vs Launch Angle Analysis",
    subtitle = "June 2024 Statcast Data",
    x = "Launch Angle (degrees)",
    y = "Exit Velocity (mph)"
  ) +
  theme_minimal(base_size = 14)

ggsave("statcast_r_analysis.png", width = 12, height = 8, dpi = 300)

# Analyze hard hit balls (95+ mph)
hard_hit_analysis <- batted_balls %>%
  mutate(hard_hit = launch_speed >= 95) %>%
  group_by(player_name) %>%
  summarise(
    batted_balls = n(),
    hard_hit_count = sum(hard_hit, na.rm = TRUE),
    .groups = 'drop'
  ) %>%
  filter(batted_balls >= 50) %>%
  mutate(hard_hit_rate = hard_hit_count / batted_balls * 100) %>%
  arrange(desc(hard_hit_rate))

cat("\nTop 10 Hard Hit Rate Leaders:\n")
print(hard_hit_analysis %>% select(player_name, hard_hit_rate, batted_balls) %>% head(10))

Expected Statistics: xBA, xSLG, and xwOBA

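One of the most powerful applications of Statcast data is the calculation of expected statistics. Each batted ball is assigned a hit probability based on how comparable batted balls have turned out, using exit velocity and launch angle, plus sprint speed on certain types of batted balls. Aggregating those probabilities estimates what a player's batting average, slugging percentage, or weighted on-base average would be expected to be given the quality of contact, largely independent of defense and luck.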
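
The idea behind an expected statistic can be sketched with a simple lookup: group historical batted balls into exit velocity and launch angle buckets, record how often each bucket produced a hit, then credit new batted balls with their bucket's hit rate. This is a toy illustration under stated assumptions, not MLB's actual model. The bucket sizes, the sample history, and the function names are all hypothetical, and sprint speed is ignored.

```python
from collections import defaultdict


def bucket(launch_speed, launch_angle, ev_step=5, la_step=10):
    """Assign a batted ball to a coarse (exit velocity, launch angle) cell."""
    return (int(launch_speed // ev_step), int(launch_angle // la_step))


def fit_hit_rates(history):
    """history: iterable of (launch_speed, launch_angle, was_hit) tuples."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [hits, batted balls]
    for ev, la, was_hit in history:
        cell = totals[bucket(ev, la)]
        cell[0] += int(was_hit)
        cell[1] += 1
    return {key: hits / n for key, (hits, n) in totals.items()}


def expected_ba(batted_balls, hit_rates, strikeouts=0, default=0.0):
    """Sum of hit probabilities divided by batted balls plus strikeouts."""
    expected_hits = sum(hit_rates.get(bucket(ev, la), default)
                        for ev, la in batted_balls)
    return expected_hits / (len(batted_balls) + strikeouts)


# Hypothetical history: hard line drives mostly fall in, weak grounders mostly do not
history = [
    (102, 15, True), (103, 12, True), (101, 18, False), (104, 11, True),
    (82, -5, False), (84, -8, False), (81, -2, True), (83, -9, False),
]
rates = fit_hit_rates(history)

new_balls = [(100.5, 14.0), (80.0, -3.0)]
xba_contact = expected_ba(new_balls, rates)
xba_with_ks = expected_ba(new_balls, rates, strikeouts=2)
print(f"xBA on contact: {xba_contact:.3f}")
print(f"xBA including strikeouts: {xba_with_ks:.3f}")
```

Including strikeouts in the denominator shows why a hitter's expected average depends on how often contact is made as well as on contact quality.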

Key Takeaways

  • Exit velocity correlates strongly with batting success: Harder hit balls are more likely to become hits and extra-base hits.
  • Launch angle optimization matters: The 10-30 degree range typically produces the best outcomes for hitters.
  • Sprint speed provides objective speed measurement: More reliable than stolen base totals for evaluating raw speed.
  • Spin rate affects pitch effectiveness: Higher spin generally adds movement, though its value depends on pitch type, velocity, and spin axis.
  • Expected stats estimate underlying performance: xBA, xSLG, and xwOBA reduce the influence of defense and luck to show quality of contact.
