Sprint Speed and Baserunning Metrics

Intermediate 10 min read 20 views Nov 26, 2025

# Sprint Speed & Baserunning Analytics

## Introduction to Sprint Speed

Sprint Speed is one of Statcast's best-known metrics: an objective measurement of a player's top running speed on the field. Built on the Statcast tracking system that MLB rolled out in 2015, it expresses raw speed in feet per second (ft/s) and gives a direct read on a player's athleticism and baserunning potential.

Unlike stolen base totals or subjective scouting grades, Sprint Speed is a standardized, largely context-neutral measurement, so players can be compared directly across positions and across seasons of the Statcast era. Teams use it when making decisions about baserunning strategy, defensive positioning, roster construction, and player development.

## What is Sprint Speed?

**Sprint Speed** measures a player's top running speed in feet per second (ft/s), based on the fastest one-second window of each competitive run. Statcast captures the underlying position data with tracking cameras (and, in its early seasons, radar) installed in every MLB stadium.

### Key Characteristics:

- **Unit of measurement**: Feet per second (ft/s)
- **Measurement window**: Maximum speed during any one-second interval
- **Minimum qualification**: 10 "competitive runs" (excluding home run trots)
- **Typical range**: 23-30+ ft/s for MLB players
- **League average**: Approximately 27 ft/s (varies slightly by season)

### What Constitutes a "Competitive Run"?
For a player's seasonal figure, Statcast counts two kinds of plays as competitive runs:

- Runs of two or more bases on non-home-run plays, such as going first to third on a single, scoring from second on a single, or scoring from first on a double (a runner who starts on second when an extra-base hit is struck is excluded)
- Home to first on topped or weakly hit balls

**Excluded scenarios**: Home run trots, jogging to first on walks, and any other situation where the player is not running at competitive speed.

## How Statcast Measures Sprint Speed

Statcast combines a multi-camera tracking system with position-tracking algorithms to capture and calculate Sprint Speed.

### The Technology:

1. **Camera Array**: Each MLB stadium has multiple high-resolution cameras (typically 10-12) positioned around the field
2. **Frame Rate**: Cameras capture dozens of frames per second (the exact rate depends on the camera and the tracking generation), providing granular positional data
3. **Triangulation**: Multiple camera angles allow for precise 3D positioning of the player
4. **Radar Integration**: Earlier Statcast installations paired the cameras with TrackMan radar; the current system is optical

### The Calculation Process:

1. **Position Tracking**: The system identifies the player's center of mass in each frame
2. **Distance Calculation**: Frame-by-frame position changes are converted to distance traveled
3. **Speed Computation**: Distance divided by time yields instantaneous speed
4. **Window Analysis**: The algorithm scans all one-second windows during the play
5. **Maximum Selection**: The highest one-second average becomes that play's Sprint Speed
6. **Seasonal Average**: The player's season Sprint Speed is the average of his best qualifying competitive runs (roughly the top two-thirds), not of every run

### Why a One-Second Window?
The one-second measurement window balances several considerations:

- **Acceleration Smoothing**: Minimizes noise from the acceleration phase
- **Peak Performance**: Captures true top speed rather than brief spikes
- **Statistical Reliability**: Provides consistent, reproducible measurements
- **Practical Relevance**: Reflects sustained speed over baserunning-relevant distances

## League Average Benchmarks and Speed Tiers

Understanding where a player's Sprint Speed falls relative to league norms is essential for proper evaluation.

### Historical League Averages (2016-2024):

| Season | League Average Sprint Speed |
|--------|-----------------------------|
| 2016 | 27.0 ft/s |
| 2017 | 27.1 ft/s |
| 2018 | 27.0 ft/s |
| 2019 | 27.0 ft/s |
| 2020 | 27.2 ft/s |
| 2021 | 27.1 ft/s |
| 2022 | 27.0 ft/s |
| 2023 | 27.1 ft/s |
| 2024 | 27.0 ft/s |

The league average has stayed close to **27.0 ft/s**, moving by no more than 0.2 ft/s from year to year.

### Speed Classification Tiers:

**Elite Speed (29.5+ ft/s)**
- Top 5% of MLB players
- Examples: Byron Buxton (30.5), Bobby Witt Jr. (30.3), Trea Turner (30.1), Ronald Acuña Jr. (29.8)
- Characteristics: Game-changing speed, elite stolen base threat, frequent infield hits

**Plus Speed (28.5-29.4 ft/s)**
- Top 10-20% of players
- Above-average stolen base success, strong baserunning value
- Example: Fernando Tatis Jr. (28.7)

**Above Average (27.5-28.4 ft/s)**
- Better than 50% of the league
- Solid baserunning, occasional stolen bases
- Example: Mookie Betts (28.4)

**Average (26.5-27.4 ft/s)**
- League typical
- Selective baserunning, limited stealing opportunities
- Represents the median MLB player

**Below Average (25.5-26.4 ft/s)**
- Slower than typical
- Conservative baserunning approach
- Many corner infielders and catchers

**Poor Speed (24.5-25.4 ft/s)**
- Bottom 20% of players
- Rarely attempts steals, station-to-station baserunning

**Very Poor Speed (<24.5 ft/s)**
- Bottom 5% of players
- Significant baserunning liability
- Often first basemen, designated hitters, or catchers

## Elite Speed Thresholds and Impact

Players who reach certain Sprint Speed thresholds gain tangible advantages that translate directly into on-field value.

### The 30 ft/s Club

Breaking the 30 ft/s barrier represents truly elite speed. Players in this category:

- Can beat out routine ground balls for infield hits
- Put heavy pressure on opposing defenses
- Excel at going first-to-third on singles (taking the extra base 70%+ of the time vs. a 38% league average)
- Maintain 80%+ stolen base success rates when properly utilized

**Active 30+ ft/s Players (2024)**:

- Elly De La Cruz: 30.6 ft/s
- Byron Buxton: 30.5 ft/s
- Bobby Witt Jr.: 30.3 ft/s
- Corbin Carroll: 30.1 ft/s

### Speed's Defensive Impact

Sprint Speed isn't just about baserunning; it also affects defensive range.

**Outfield**: Each 1 ft/s increase in Sprint Speed correlates with approximately:

- 10-15 feet of additional range
- 2-3% higher catch probability on "50/50" balls
- Better positioning on balls hit to the gaps

**Infield**: Speed advantages translate to:

- Expanded range on balls up the middle or in the hole
- Higher conversion rate on "tough" plays
- Improved double-play pivot execution

## Comprehensive Baserunning Metrics

Sprint Speed provides the foundation for understanding several advanced baserunning metrics that quantify a player's total baserunning value.
### BsR (Base Running Runs Above Average)

**BsR** measures the total runs a player contributes (or costs) through baserunning, relative to a league-average player. This comprehensive metric includes:

**Components**:

1. **wSB** (weighted Stolen Bases): Value from stealing bases
2. **UBR** (Ultimate Base Running): Value from all non-stolen base advancement
3. **wGDP** (weighted Grounded Into Double Play runs): A smaller adjustment for staying out of double plays

**Calculation Formula**:

```
BsR = wSB + UBR + wGDP
```

**Interpretation**:

- +5 BsR: Elite baserunner, worth about half a win from baserunning alone
- +3 BsR: Excellent baserunner
- 0 BsR: League average
- -3 BsR: Poor baserunner
- -5 BsR: Significant liability on the basepaths

**Top BsR Leaders (2024)**:

- Elly De La Cruz: +8.2 BsR
- Bobby Witt Jr.: +7.8 BsR
- Corbin Carroll: +6.4 BsR

### UBR (Ultimate Base Running)

**UBR** isolates the value of baserunning advancement outside of stolen bases, measuring:

- Taking extra bases on hits (1st-to-3rd, 2nd-to-home, etc.)
- Advancement on outs (tagging up, advancing on ground outs)
- Not getting thrown out on ill-advised advancement attempts

**What UBR Captures**:

```
UBR = (Extra Bases Taken Value) + (Outs on Bases Value) + (Advancement on Outs Value)
```

**Positive UBR Drivers**:

- High first-to-third percentage on singles
- Aggressive but smart advancement from second on singles
- Good reads on fly balls for tagging
- Avoiding unnecessary outs

**Negative UBR Factors**:

- Getting thrown out stretching singles to doubles
- Poor reads leading to being doubled off

Slow home-to-first times that lead to GIDP also cost runs, but that cost is counted in the wGDP component rather than in UBR.

### wSB (Weighted Stolen Bases)

**wSB** quantifies the run value contribution from stolen base attempts, accounting for both successes and failures.
**Formula**:

```
wSB = (SB × runSB) + (CS × runCS)
```

Where:

- **runSB**: Run value of a successful steal (~0.2 runs for 2B, ~0.15 for 3B)
- **runCS**: Run value of being caught stealing (~-0.45 runs)

This is the simplified core of the metric; the version FanGraphs publishes also subtracts a league-average baseline so that an average basestealer scores zero.

**Break-Even Success Rate**: Given the asymmetric values, the break-even stolen base success rate is approximately **70%**. With x as the success rate:

```
0 = (0.2 × x) + (-0.45 × (1-x))
0 = 0.2x - 0.45 + 0.45x
0.45 = 0.65x
x = 0.692 ≈ 70%
```

**Interpretation**:

- Players must succeed more than ~70% of the time for stealing to be value-positive
- Elite success rates (80%+) make stealing highly valuable
- Success rates below ~70% cost the team runs

## Stolen Base Success Rates and Strategy

Analytics have reshaped stolen base strategy, moving away from a blanket "green light" for fast runners toward attempts chosen by success probability.

### Modern Success Rate Benchmarks:

**League-Wide Trends**:

- MLB Average Success Rate (2024): 79.8%
- This is a large increase from historical rates of ~68%
- Driven by better base stealing analytics and more selective attempts, and helped by the 2023 rule changes (pitch timer, limits on pickoff attempts, larger bases)

### Factors Influencing Success Rate:

1. **Sprint Speed** (Primary Factor):
   - <27 ft/s: ~72% success rate
   - 27-28 ft/s: ~76% success rate
   - 28-29 ft/s: ~80% success rate
   - 29-30 ft/s: ~83% success rate
   - 30+ ft/s: ~86% success rate
2. **Lead Distance and Jump Efficiency**:
   - Primary lead: Typically 9-12 feet
   - Secondary lead: Additional 3-5 feet
   - Jump efficiency: Time from secondary lead to full sprint
3. **Pitcher Delivery Time**:
   - Slow delivery (>1.40s to home): Steal success >85%
   - Average delivery (1.30-1.40s): Success ~78%
   - Quick delivery (<1.30s): Success <70%
4. **Catcher Pop Time**:
   - Slow (>2.1s): Very favorable for steals
   - Average (1.95-2.05s): Neutral
   - Elite (<1.90s): Difficult, requires elite speed
5. **Count Leverage**:
   - Pitcher's counts (0-2, 1-2): Lower success, catcher expecting a steal
   - Hitter's counts (2-0, 3-1): Higher success, pitcher focused on throwing a strike
   - Even counts (1-1, 2-2): Neutral

### The Analytics Revolution in Stealing:

Modern teams use models that incorporate:

- Real-time pitcher delivery metrics
- Catcher arm strength and accuracy data
- Historical matchup success rates
- Game situation (score, inning, outs)
- Expected run value calculations

**Result**: Teams pick their attempts more selectively and succeed far more often, which raises the run value of each attempt.

## Home-to-First Times and Infield Hit Probability

Sprint Speed largely determines how quickly a player reaches first base on ground balls, which creates infield hit opportunities and helps avoid double plays.

### Average Home-to-First Times by Speed Tier:

| Sprint Speed | Avg Home-to-First Time | Infield Hit Rate |
|--------------|------------------------|------------------|
| 30+ ft/s | 4.0-4.1s | 8.5% |
| 29-30 ft/s | 4.1-4.2s | 6.8% |
| 28-29 ft/s | 4.2-4.3s | 5.2% |
| 27-28 ft/s | 4.3-4.4s | 3.8% |
| 26-27 ft/s | 4.4-4.5s | 2.7% |
| 25-26 ft/s | 4.5-4.6s | 1.9% |
| <25 ft/s | 4.6-4.8s | 1.2% |

### The Infield Hit Advantage:

For elite speedsters (30+ ft/s), approximately **8-9%** of all ground balls become infield hits. For slow players (<25 ft/s), that drops to **1-2%**.
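The tier table above can be turned into a small lookup. The sketch below is illustrative only: the bin edges and rates are copied from that table, while the function names `infield_hit_rate` and `expected_infield_hits` are hypothetical helpers, not part of any Statcast tooling.

```python
import bisect

# Lower bounds (ft/s) of the sprint-speed bins in the table, ascending.
# Speeds below the first floor fall in the "<25 ft/s" bin.
SPEED_FLOORS = [25.0, 26.0, 27.0, 28.0, 29.0, 30.0]

# Infield hit rate for each bin, from "<25" up to "30+" (values from the table).
INFIELD_HIT_RATES = [0.012, 0.019, 0.027, 0.038, 0.052, 0.068, 0.085]


def infield_hit_rate(sprint_speed: float) -> float:
    """Return the tabulated infield hit rate for a sprint speed in ft/s."""
    # bisect_right counts how many bin floors are <= sprint_speed,
    # which is exactly the index of the bin the speed belongs to.
    return INFIELD_HIT_RATES[bisect.bisect_right(SPEED_FLOORS, sprint_speed)]


def expected_infield_hits(sprint_speed: float, ground_balls: int) -> float:
    """Expected infield hits for a given number of ground balls."""
    return infield_hit_rate(sprint_speed) * ground_balls


if __name__ == "__main__":
    for speed in (30.2, 27.4, 24.8):
        hits = expected_infield_hits(speed, 200)
        print(f"{speed:.1f} ft/s over 200 ground balls: {hits:.1f} infield hits")
```

A step-function lookup mirrors the table exactly; a real model would more likely interpolate between bins or fit a curve to play-level data.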
**Annual Impact Example**:

- Hypothetical player with 400 ground balls per season
- Elite speed (8.5% infield hit rate): 34 infield hits
- Poor speed (1.5% infield hit rate): 6 infield hits
- **Difference**: 28 additional hits, which over 550-600 at-bats is worth roughly 45-50 points of batting average

### Batted Ball Quality Interaction:

Sprint Speed's impact varies by ground ball type:

**Weakly Hit Grounders** (<75 mph exit velocity):

- Elite speed can convert 15-20% into hits
- Slow players: <5% hit rate

**Medium Ground Balls** (75-90 mph):

- Elite speed: ~6-8% hits
- Slow players: ~1-2% hits

**Hard Ground Balls** (90+ mph):

- Speed matters less; fielding difficulty dominates
- Elite speed: ~12-15% hits
- Slow players: ~8-10% hits

### Bunt Hit Probability:

Sprint Speed strongly affects bunt success:

- 30+ ft/s: 65-75% bunt hit success
- 27-29 ft/s: 45-55% success
- <27 ft/s: 25-35% success

## Speed Score Formula and Applications

Before Statcast, Bill James developed the **Speed Score** formula to estimate player speed using traditional statistics. While less precise than Sprint Speed, it remains useful for historical analysis.
### Speed Score Formula:

```
Speed Score = (
    (SB / (1B + BB)) × 0.25
  + (SB% × 3)
  + ((Triples / (2B + 3B)) × 0.5)
  + ((Runs - HR) / (H + BB - HR)) × 0.25
  + (SB / (GDP + SB)) × 0.1
) × 20
```

Where:

- **SB**: Stolen bases
- **1B**: Singles
- **BB**: Walks
- **SB%**: Stolen base success rate (SB / (SB + CS))
- **2B**: Doubles
- **3B**: Triples (written as "Triples" in the formula)
- **Runs**: Runs scored
- **H**: Hits
- **HR**: Home runs
- **GDP**: Grounded into double plays

This is a condensed, single-expression form. James's original method scores each component separately, scales each to 0-10, and averages them, so the output of the condensed form may need rescaling before it is read against the bands below.

### Interpreting Speed Score:

- **8-10**: Elite speed (equivalent to ~29.5+ ft/s)
- **6-8**: Above average speed (~27.5-29.5 ft/s)
- **4-6**: Average speed (~26.0-27.5 ft/s)
- **2-4**: Below average speed (~24.5-26.0 ft/s)
- **0-2**: Poor speed (<24.5 ft/s)

### Correlation with Sprint Speed:

Speed Score correlates moderately well with Sprint Speed (r ≈ 0.72), but has limitations:

- Doesn't account for managerial decisions on stealing
- Influenced by batting order position (affects run scoring)
- Small sample size issues with triples
- Can be inflated by selective stealing

**Modern Use**: Primarily for analyzing players from the pre-Statcast era (before 2015).

## Age-Related Speed Decline Curves

Sprint Speed follows predictable aging patterns, with peak speeds typically occurring in players' mid-20s and gradual decline thereafter.
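As a rough illustration of this aging pattern, the sketch below projects a player's Sprint Speed from his peak. It assumes the relative-speed percentages used in this section's aging table (100% through age 26, then 98%, 95%, 90%, and 85% for ages 36 and up); the function name `projected_sprint_speed` is hypothetical, and the step function is a simplification rather than a fitted curve.

```python
# (upper age bound, fraction of peak speed retained), in ascending age order.
# Values mirror the "Relative Sprint Speed" column of the aging table.
RETENTION_BY_AGE = [
    (26, 1.00),  # ages 20-26: at or near peak
    (29, 0.98),  # ages 27-29: early decline
    (32, 0.95),  # ages 30-32: moderate decline
    (35, 0.90),  # ages 33-35: accelerated decline
]
LATE_CAREER_RETENTION = 0.85  # ages 36 and up


def projected_sprint_speed(peak_speed: float, age: int) -> float:
    """Project sprint speed (ft/s) at a given age from a player's peak speed."""
    for max_age, retention in RETENTION_BY_AGE:
        if age <= max_age:
            return peak_speed * retention
    return peak_speed * LATE_CAREER_RETENTION


if __name__ == "__main__":
    peak = 30.2  # hypothetical elite runner
    for age in (23, 26, 29, 32, 35, 38):
        print(f"Age {age}: {projected_sprint_speed(peak, age):.1f} ft/s")
```

Individual players vary widely around any such curve, so a projection like this is better suited to setting expectations than to forecasting a specific season.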
### Typical Speed Aging Curve:

| Age Range | Relative Sprint Speed | Typical Annual Change |
|-----------|-----------------------|-----------------------|
| 20-23 | Approaching peak (~100%) | +0.1 ft/s (still improving) |
| 24-26 | Peak Plateau (100%) | 0.0 ft/s |
| 27-29 | Early Decline (98%) | -0.08 ft/s per year |
| 30-32 | Moderate Decline (95%) | -0.12 ft/s per year |
| 33-35 | Accelerated Decline (90%) | -0.18 ft/s per year |
| 36+ | Steep Decline (85%) | -0.25 ft/s per year |

### Example Aging Pattern:

**Elite Speedster Career Arc**:

- Age 23: 30.2 ft/s (elite)
- Age 26: 30.1 ft/s (elite)
- Age 29: 29.5 ft/s (elite, at the threshold)
- Age 32: 28.6 ft/s (plus)
- Age 35: 27.4 ft/s (average)
- Age 38: 26.0 ft/s (below average)

### Implications for Team Building:

**Speed Premium Value Window**: Teams should maximize speed-dependent strategies during players' age 24-30 seasons, when speed remains elite or plus. After age 30, expect a measurable decline that calls for less aggressive baserunning.

**Positional Changes**: Speed decline often triggers defensive position changes:

- Center field → Corner outfield (age 30-32)
- Shortstop → Second base or third base (age 32-34)
- Speed-dependent defenders → DH (age 34+)

### Injury Impact on Speed:

Certain injuries accelerate speed decline:

- **Achilles injuries**: Average -0.6 ft/s sustained decline
- **Hamstring strains**: Temporary -0.3 ft/s, usually recovers
- **ACL tears**: Average -0.4 ft/s sustained decline
- **Hip injuries**: Variable, but often -0.3 to -0.5 ft/s

## Impact on Infield Hit Probability

Sprint Speed's most direct offensive impact comes through increased infield hit probability, which effectively boosts batting average for fast players.

### Infield Hit Regression Model:

**Probability of Infield Hit** = f(Sprint Speed, Exit Velocity, Launch Angle, Fielder Positioning, Handedness)

**Key Factors**:

1. **Sprint Speed** (Primary Driver):
   - Each +1 ft/s adds about 1.8 percentage points to infield hit rate
   - The effect compounds with weak contact
2. **Exit Velocity** (Mostly Negative Correlation):
   - <70 mph: High infield hit probability (if fair)
   - 70-85 mph: Moderate probability
   - >85 mph: Low probability (the ball reaches the fielder quickly, leaving time for the throw), although the rate ticks back up on the hardest-hit grounders, which are difficult to field cleanly
3. **Launch Angle**:
   - -15° to +5°: Optimal for infield hits (ground balls)
   - +5° to +15°: Low probability (line drives, usually caught or through the infield)
   - >+15°: No infield hit opportunity (fly balls)
4. **Batter Handedness**:
   - Left-handed batters: +2.3 percentage points of infield hit rate (shorter distance to first)
   - Right-handed batters: Baseline

### Sprint Speed × Exit Velocity Interaction:

| Sprint Speed | <70 mph EV | 70-80 mph EV | 80-90 mph EV | 90+ mph EV |
|--------------|------------|--------------|--------------|------------|
| 30+ ft/s | 22% | 8% | 6% | 11% |
| 28-30 ft/s | 16% | 5% | 4% | 9% |
| 26-28 ft/s | 11% | 3% | 2% | 7% |
| <26 ft/s | 7% | 2% | 1% | 5% |

**Insight**: The speed advantage is largest on weakly-hit balls, where fielders must charge and make difficult throws.
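The interaction table can also be read programmatically. The sketch below stores the table as a grid and adds a helper that computes the gap between the fastest and slowest rows in a given exit-velocity column. The values are copied from the table above; the names `infield_hit_probability` and `speed_gap` are hypothetical.

```python
import bisect

# Row boundaries (ft/s): rows are <26, 26-28, 28-30, 30+.
SPEED_FLOORS = [26.0, 28.0, 30.0]
# Column boundaries (mph): columns are <70, 70-80, 80-90, 90+.
EV_FLOORS = [70.0, 80.0, 90.0]

# Infield hit probabilities from the table, slowest row first.
RATES = [
    [0.07, 0.02, 0.01, 0.05],  # <26 ft/s
    [0.11, 0.03, 0.02, 0.07],  # 26-28 ft/s
    [0.16, 0.05, 0.04, 0.09],  # 28-30 ft/s
    [0.22, 0.08, 0.06, 0.11],  # 30+ ft/s
]


def infield_hit_probability(sprint_speed: float, exit_velocity: float) -> float:
    """Look up the tabulated probability for a speed (ft/s) and exit velocity (mph)."""
    row = bisect.bisect_right(SPEED_FLOORS, sprint_speed)
    col = bisect.bisect_right(EV_FLOORS, exit_velocity)
    return RATES[row][col]


def speed_gap(exit_velocity: float) -> float:
    """Difference between the fastest and slowest rows at one exit velocity."""
    col = bisect.bisect_right(EV_FLOORS, exit_velocity)
    return RATES[-1][col] - RATES[0][col]


if __name__ == "__main__":
    for ev in (65, 75, 85, 95):
        print(f"{ev} mph: fastest-vs-slowest gap = {speed_gap(ev):.2f}")
```

Comparing `speed_gap` across columns makes the insight concrete: the gap is 15 percentage points below 70 mph and between 5 and 6 points in every other column.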
### Strategic Implications:

**For Fast Players**:

- Bunting becomes a viable offensive weapon
- A contact-oriented approach can be optimal
- Legging out close plays can raise BABIP by 15-25 points

**For Slow Players**:

- Must rely on hard contact and extra-base hits
- No infield hit safety valve on poor contact
- GIDP risk elevated on ground balls

## Python Code Examples

### Example 1: Fetching Sprint Speed Data from Statcast

```python
from pybaseball import statcast_sprint_speed

# Fetch sprint speed data for the 2024 season.
# pybaseball's keyword for the minimum number of competitive runs is `min_opp`.
sprint_speed_data = statcast_sprint_speed(2024, min_opp=10)

# The leaderboard export may label the name column 'last_name, first_name'
# (an assumption about the CSV layout); expose it as 'player_name' if so.
if 'player_name' not in sprint_speed_data.columns:
    sprint_speed_data = sprint_speed_data.rename(
        columns={'last_name, first_name': 'player_name'}
    )

# Display top 10 fastest players
top_speedsters = sprint_speed_data.nlargest(10, 'sprint_speed')[
    ['player_name', 'sprint_speed', 'competitive_runs', 'bolts']
]
print("Top 10 Fastest Players (2024):")
print(top_speedsters)

# Calculate percentile rankings
sprint_speed_data['speed_percentile'] = (
    sprint_speed_data['sprint_speed'].rank(pct=True) * 100
)


# Classify players into the speed tiers defined earlier in this article
def classify_speed(speed):
    if speed >= 29.5:
        return 'Elite'
    elif speed >= 28.5:
        return 'Plus'
    elif speed >= 27.5:
        return 'Above Average'
    elif speed >= 26.5:
        return 'Average'
    elif speed >= 25.5:
        return 'Below Average'
    elif speed >= 24.5:
        return 'Poor'
    else:
        return 'Very Poor'


sprint_speed_data['speed_tier'] = sprint_speed_data['sprint_speed'].apply(classify_speed)

# Distribution of speed tiers
tier_distribution = sprint_speed_data['speed_tier'].value_counts()
print("\nSpeed Tier Distribution:")
print(tier_distribution)

# Save to CSV for further analysis
sprint_speed_data.to_csv('sprint_speed_2024.csv', index=False)
print("\nData saved to sprint_speed_2024.csv")
```

### Example 2: Analyzing Speed Distributions by Position

```python
import matplotlib.pyplot as plt
import seaborn as sns
from pybaseball import statcast_sprint_speed
from scipy import stats

# Fetch sprint speed data
sprint_data = statcast_sprint_speed(2024, min_opp=10)

# This example assumes `sprint_data` has a `position` column holding standard
# abbreviations (C, 1B, 2B, 3B, SS, LF, CF, RF, DH). If your export lacks one,
# merge positions in from another source before running the rest.

# Position order from fastest to slowest (typical)
position_order = ['CF', 'SS', '2B', 'LF', 'RF', '3B', 'DH', '1B', 'C']

# Create boxplot comparing speed by position
plt.figure(figsize=(14, 8))
sns.boxplot(data=sprint_data, x='position', y='sprint_speed',
            order=position_order, palette='viridis')
plt.axhline(y=27.0, color='red', linestyle='--', label='League Average (27.0 ft/s)')
plt.xlabel('Position', fontsize=12)
plt.ylabel('Sprint Speed (ft/s)', fontsize=12)
plt.title('Sprint Speed Distribution by Position (2024)', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.savefig('speed_by_position.png', dpi=300)
plt.show()

# Calculate summary statistics by position
position_stats = sprint_data.groupby('position')['sprint_speed'].agg(
    Count='count', Mean='mean', Median='median', StdDev='std', Min='min', Max='max'
).round(2)
print("Sprint Speed Statistics by Position:")
print(position_stats)

# Perform one-way ANOVA to test whether position differences are significant
position_groups = [
    group['sprint_speed'].dropna().values
    for _, group in sprint_data.groupby('position')
]
f_stat, p_value = stats.f_oneway(*position_groups)
print("\nANOVA Results:")
print(f"F-statistic: {f_stat:.4f}")
print(f"P-value: {p_value:.6f}")
print(f"Significant difference: {'Yes' if p_value < 0.05 else 'No'}")
```

### Example 3: Correlating Speed with Stolen Base Success

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pybaseball import statcast_sprint_speed, batting_stats
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression

# Fetch sprint speed data
sprint_data = statcast_sprint_speed(2024, min_opp=10)

# Fetch batting stats (includes SB, CS)
batting_data = batting_stats(2024, qual=50)

# Build a 'First Last' name to match FanGraphs. This assumes the Savant export
# stores names as 'Last, First' in a 'last_name, first_name' column; skip it if
# your data already has a usable 'player_name' column.
if 'player_name' not in sprint_data.columns:
    sprint_data['player_name'] = (
        sprint_data['last_name, first_name']
        .str.split(', ', n=1)
        .str[::-1]
        .str.join(' ')
    )

# Merge datasets on player name. Name joins can miss players whose names are
# spelled differently across sources (accents, suffixes); an ID join is safer.
merged_data = pd.merge(
    sprint_data[['player_name', 'sprint_speed', 'bolts']],
    batting_data[['Name', 'SB', 'CS', 'PA']],
    left_on='player_name',
    right_on='Name',
    how='inner'
)

# Calculate stolen base success rate
merged_data['SB_attempts'] = merged_data['SB'] + merged_data['CS']
merged_data['SB_success_rate'] = np.where(
    merged_data['SB_attempts'] > 0,
    merged_data['SB'] / merged_data['SB_attempts'] * 100,
    np.nan
)

# Filter to players with at least 10 SB attempts for meaningful analysis
sb_analysis = merged_data[merged_data['SB_attempts'] >= 10].copy()
print(f"Analyzing {len(sb_analysis)} players with 10+ SB attempts")

# Calculate correlation between sprint speed and SB success rate
correlation, p_value = pearsonr(
    sb_analysis['sprint_speed'],
    sb_analysis['SB_success_rate']
)
print(f"\nPearson Correlation: {correlation:.4f}")
print(f"P-value: {p_value:.6f}")

# Create scatter plot with regression line (marker size scales with attempts)
plt.figure(figsize=(12, 8))
sns.regplot(
    data=sb_analysis,
    x='sprint_speed',
    y='SB_success_rate',
    scatter_kws={'s': sb_analysis['SB_attempts'] * 3, 'alpha': 0.6},
    line_kws={'color': 'red', 'linewidth': 2}
)

# Add reference lines for league-average speed and break-even success rate
plt.axvline(x=27.0, color='gray', linestyle='--', alpha=0.5, label='Avg Speed')
plt.axhline(y=70.0, color='orange', linestyle='--', alpha=0.5, label='Break-even SB% (70%)')
plt.xlabel('Sprint Speed (ft/s)', fontsize=12)
plt.ylabel('Stolen Base Success Rate (%)', fontsize=12)
plt.title(
    f'Sprint Speed vs. Stolen Base Success Rate (2024)\nCorrelation: {correlation:.3f}',
    fontsize=14, fontweight='bold'
)
plt.legend()
plt.grid(alpha=0.3)
plt.tight_layout()
plt.savefig('speed_vs_sb_success.png', dpi=300)
plt.show()


# Analyze by (coarser) speed tiers
def speed_tier(speed):
    if speed >= 29.5:
        return 'Elite (29.5+)'
    elif speed >= 28.0:
        return 'Plus (28-29.5)'
    elif speed >= 27.0:
        return 'Above Avg (27-28)'
    else:
        return 'Below Avg (<27)'


sb_analysis['speed_category'] = sb_analysis['sprint_speed'].apply(speed_tier)
tier_stats = sb_analysis.groupby('speed_category').agg({
    'SB_success_rate': ['mean', 'median', 'count'],
    'SB_attempts': 'sum',
    'SB': 'sum'
}).round(2)
print("\nStolen Base Success by Speed Tier:")
print(tier_stats)

# Expected vs actual SB success by speed, using a simple linear regression
X = sb_analysis[['sprint_speed']].values
y = sb_analysis['SB_success_rate'].values
model = LinearRegression()
model.fit(X, y)
sb_analysis['expected_sb_rate'] = model.predict(X)
sb_analysis['sb_rate_above_expected'] = (
    sb_analysis['SB_success_rate'] - sb_analysis['expected_sb_rate']
)

# Top over-performers (best reads/jumps)
top_overperformers = sb_analysis.nlargest(10, 'sb_rate_above_expected')[
    ['player_name', 'sprint_speed', 'SB_success_rate',
     'expected_sb_rate', 'sb_rate_above_expected', 'SB_attempts']
]
print("\nTop 10 SB Success Over-Performers (Best Reads/Jumps):")
print(top_overperformers)
```

### Example 4: Visualizing Speed vs Baserunning Value

```python
import pandas as pd
import numpy as np
from pybaseball import statcast_sprint_speed
import matplotlib.pyplot as plt
import seaborn as sns

# Fetch sprint speed data (`min_opp` is the minimum number of competitive runs)
sprint_data = statcast_sprint_speed(2024, min_opp=10)

# For this example, we'll need baserunning runs (BsR, UBR, wSB)
# These are available from FanGraphs or Baseball Reference
# Here's a simulated example of merging with baserunning data

# Simulated baserunning data (in practice, scrape from FanGraphs)
np.random.seed(42) sprint_data['BsR'] = (sprint_data['sprint_speed'] - 27) * 2.5 + np.random.normal(0, 1.5, len(sprint_data)) sprint_data['UBR'] = (sprint_data['sprint_speed'] - 27) * 1.8 + np.random.normal(0, 1.2, len(sprint_data)) sprint_data['wSB'] = (sprint_data['sprint_speed'] - 27) * 0.7 + np.random.normal(0, 0.8, len(sprint_data)) # Create multi-panel visualization fig, axes = plt.subplots(2, 2, figsize=(16, 12)) # Panel 1: Sprint Speed vs BsR (Total Baserunning Runs) ax1 = axes[0, 0] scatter1 = ax1.scatter( sprint_data['sprint_speed'], sprint_data['BsR'], c=sprint_data['competitive_runs'], s=60, alpha=0.6, cmap='viridis' ) ax1.axhline(y=0, color='red', linestyle='--', alpha=0.5) ax1.axvline(x=27.0, color='gray', linestyle='--', alpha=0.5) ax1.set_xlabel('Sprint Speed (ft/s)', fontsize=11) ax1.set_ylabel('Baserunning Runs (BsR)', fontsize=11) ax1.set_title('Sprint Speed vs Total Baserunning Value', fontweight='bold') ax1.grid(alpha=0.3) plt.colorbar(scatter1, ax=ax1, label='Competitive Runs') # Panel 2: Sprint Speed vs UBR (Non-Steal Baserunning) ax2 = axes[0, 1] sns.regplot( data=sprint_data, x='sprint_speed', y='UBR', ax=ax2, scatter_kws={'alpha': 0.5}, line_kws={'color': 'red'} ) ax2.axhline(y=0, color='red', linestyle='--', alpha=0.5) ax2.axvline(x=27.0, color='gray', linestyle='--', alpha=0.5) ax2.set_xlabel('Sprint Speed (ft/s)', fontsize=11) ax2.set_ylabel('Ultimate Baserunning (UBR)', fontsize=11) ax2.set_title('Sprint Speed vs Non-Steal Baserunning', fontweight='bold') ax2.grid(alpha=0.3) # Panel 3: Sprint Speed vs wSB (Stolen Base Runs) ax3 = axes[1, 0] sns.regplot( data=sprint_data, x='sprint_speed', y='wSB', ax=ax3, scatter_kws={'alpha': 0.5, 'color': 'green'}, line_kws={'color': 'darkgreen'} ) ax3.axhline(y=0, color='red', linestyle='--', alpha=0.5) ax3.axvline(x=27.0, color='gray', linestyle='--', alpha=0.5) ax3.set_xlabel('Sprint Speed (ft/s)', fontsize=11) ax3.set_ylabel('Weighted Stolen Base Runs (wSB)', fontsize=11) ax3.set_title('Sprint 
Speed vs Stolen Base Value', fontweight='bold') ax3.grid(alpha=0.3) # Panel 4: Distribution of BsR by Speed Tier ax4 = axes[1, 1] speed_tiers = pd.cut( sprint_data['sprint_speed'], bins=[0, 26, 27, 28, 29, 35], labels=['<26', '26-27', '27-28', '28-29', '29+'] ) sprint_data['speed_bin'] = speed_tiers violin_data = sprint_data.dropna(subset=['speed_bin', 'BsR']) sns.violinplot( data=violin_data, x='speed_bin', y='BsR', ax=ax4, palette='Set2' ) ax4.axhline(y=0, color='red', linestyle='--', alpha=0.5) ax4.set_xlabel('Sprint Speed Tier (ft/s)', fontsize=11) ax4.set_ylabel('Baserunning Runs (BsR)', fontsize=11) ax4.set_title('Baserunning Value Distribution by Speed', fontweight='bold') ax4.grid(axis='y', alpha=0.3) plt.tight_layout() plt.savefig('speed_vs_baserunning_comprehensive.png', dpi=300) plt.show() # Calculate summary statistics print("Baserunning Value by Speed Tier:") speed_summary = sprint_data.groupby('speed_bin').agg({ 'BsR': ['mean', 'median', 'std'], 'UBR': ['mean', 'median'], 'wSB': ['mean', 'median'], 'player_name': 'count' }).round(2) print(speed_summary) # Identify elite baserunners exceeding expectations from sklearn.linear_model import LinearRegression X = sprint_data[['sprint_speed']].values y = sprint_data['BsR'].values model = LinearRegression() model.fit(X, y) sprint_data['expected_BsR'] = model.predict(X) sprint_data['BsR_above_expected'] = sprint_data['BsR'] - sprint_data['expected_BsR'] elite_baserunners = sprint_data.nlargest(15, 'BsR_above_expected')[ ['player_name', 'sprint_speed', 'BsR', 'expected_BsR', 'BsR_above_expected'] ] print("\nTop 15 Baserunners vs Expected (Best Instincts):") print(elite_baserunners) ``` ## R Code Examples ### Example 1: Fetching and Exploring Sprint Speed Data ```r # Load required libraries library(baseballr) library(dplyr) library(ggplot2) library(tidyr) # Fetch sprint speed data for 2024 sprint_speed_data <- statcast_sprint_speed(year = 2024, min_opp = 10) # View structure str(sprint_speed_data) # Summary 
statistics summary(sprint_speed_data$sprint_speed) # Top 10 fastest players top_speedsters <- sprint_speed_data %>% arrange(desc(sprint_speed)) %>% select(player_name, sprint_speed, competitive_runs, hp_to_1b, bolts) %>% head(10) print("Top 10 Fastest Players (2024):") print(top_speedsters) # Create speed tier classification sprint_speed_data <- sprint_speed_data %>% mutate( speed_tier = case_when( sprint_speed >= 29.5 ~ "Elite", sprint_speed >= 28.5 ~ "Plus", sprint_speed >= 27.5 ~ "Above Average", sprint_speed >= 26.5 ~ "Average", sprint_speed >= 25.5 ~ "Below Average", sprint_speed >= 24.5 ~ "Poor", TRUE ~ "Very Poor" ), speed_tier = factor(speed_tier, levels = c( "Elite", "Plus", "Above Average", "Average", "Below Average", "Poor", "Very Poor" )) ) # Distribution of speed tiers tier_distribution <- sprint_speed_data %>% count(speed_tier) %>% mutate(percentage = n / sum(n) * 100) print("Speed Tier Distribution:") print(tier_distribution) # Histogram of sprint speeds ggplot(sprint_speed_data, aes(x = sprint_speed)) + geom_histogram(binwidth = 0.5, fill = "steelblue", color = "black", alpha = 0.7) + geom_vline(xintercept = 27.0, color = "red", linetype = "dashed", size = 1) + annotate("text", x = 27.0, y = 30, label = "League Avg\n(27.0 ft/s)", color = "red", hjust = -0.1) + labs( title = "Distribution of Sprint Speed (2024)", x = "Sprint Speed (ft/s)", y = "Number of Players" ) + theme_minimal() + theme(plot.title = element_text(face = "bold", size = 14)) ggsave("sprint_speed_histogram.png", width = 10, height = 6, dpi = 300) ``` ### Example 2: Analyzing Speed by Position ```r library(baseballr) library(dplyr) library(ggplot2) library(tidyr) # Fetch sprint speed data sprint_data <- statcast_sprint_speed(year = 2024, min_opp = 10) # For positional analysis, we need to add position data # This example assumes position data is available or merged from another source # Using hypothetical position assignment for demonstration set.seed(42) positions <- c("C", "1B", 
"2B", "3B", "SS", "LF", "CF", "RF", "DH") sprint_data$position <- sample(positions, nrow(sprint_data), replace = TRUE, prob = c(0.08, 0.10, 0.12, 0.12, 0.12, 0.12, 0.12, 0.12, 0.10)) # Calculate position summaries position_stats <- sprint_data %>% group_by(position) %>% summarise( count = n(), mean_speed = mean(sprint_speed, na.rm = TRUE), median_speed = median(sprint_speed, na.rm = TRUE), sd_speed = sd(sprint_speed, na.rm = TRUE), min_speed = min(sprint_speed, na.rm = TRUE), max_speed = max(sprint_speed, na.rm = TRUE) ) %>% arrange(desc(mean_speed)) print("Sprint Speed Statistics by Position:") print(position_stats) # Create boxplot by position position_order <- c("CF", "SS", "2B", "LF", "RF", "3B", "DH", "1B", "C") ggplot(sprint_data, aes(x = factor(position, levels = position_order), y = sprint_speed, fill = position)) + geom_boxplot(alpha = 0.7, outlier.shape = 21) + geom_hline(yintercept = 27.0, color = "red", linetype = "dashed", size = 1) + labs( title = "Sprint Speed Distribution by Position (2024)", x = "Position", y = "Sprint Speed (ft/s)" ) + theme_minimal() + theme( plot.title = element_text(face = "bold", size = 14), legend.position = "none" ) + scale_fill_brewer(palette = "Set3") ggsave("speed_by_position_boxplot.png", width = 12, height = 7, dpi = 300) # ANOVA test for position differences anova_result <- aov(sprint_speed ~ position, data = sprint_data) summary(anova_result) # Tukey HSD for pairwise comparisons tukey_result <- TukeyHSD(anova_result) print("Tukey HSD Pairwise Comparisons:") print(tukey_result) ``` ### Example 3: Speed and Stolen Base Analysis ```r library(baseballr) library(dplyr) library(ggplot2) library(broom) # Fetch sprint speed data sprint_data <- statcast_sprint_speed(year = 2024, min_opp = 10) # Fetch batting statistics (includes SB, CS) batting_data <- fg_batter_leaders(2024, 2024, qual = 50) # Merge datasets merged_data <- sprint_data %>% select(player_name, sprint_speed, competitive_runs, bolts) %>% inner_join( batting_data 
      %>% select(Name, SB, CS, PA),
    by = c("player_name" = "Name")
  )

# Calculate SB metrics
sb_analysis <- merged_data %>%
  mutate(
    SB_attempts = SB + CS,
    SB_success_rate = ifelse(SB_attempts > 0, SB / SB_attempts * 100, NA)
  ) %>%
  filter(SB_attempts >= 10)  # Minimum 10 attempts for analysis

# Correlation analysis
cor_test <- cor.test(sb_analysis$sprint_speed, sb_analysis$SB_success_rate)
print(paste("Correlation:", round(cor_test$estimate, 4)))
print(paste("P-value:", format(cor_test$p.value, scientific = TRUE)))

# Linear regression model
lm_model <- lm(SB_success_rate ~ sprint_speed, data = sb_analysis)
summary(lm_model)

# Add predictions and residuals
sb_analysis <- sb_analysis %>%
  mutate(
    predicted_sb_rate = predict(lm_model),
    residual = SB_success_rate - predicted_sb_rate
  )

# Scatter plot with regression line
ggplot(sb_analysis, aes(x = sprint_speed, y = SB_success_rate)) +
  geom_point(aes(size = SB_attempts), alpha = 0.6, color = "steelblue") +
  geom_smooth(method = "lm", color = "red", se = TRUE, alpha = 0.2) +
  geom_hline(yintercept = 70, linetype = "dashed", color = "orange") +
  geom_vline(xintercept = 27.0, linetype = "dashed", color = "gray") +
  annotate("text", x = 27.0, y = 95, label = "Avg Speed", color = "gray") +
  annotate("text", x = 30, y = 70, label = "Break-even (70%)",
           color = "orange", vjust = -0.5) +
  labs(
    title = sprintf("Sprint Speed vs. Stolen Base Success Rate (2024)\nCorrelation: %.3f",
                    cor_test$estimate),
    x = "Sprint Speed (ft/s)",
    y = "Stolen Base Success Rate (%)",
    size = "SB Attempts"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 13))

ggsave("speed_vs_sb_success_rate.png", width = 11, height = 7, dpi = 300)

# Speed tier analysis
sb_analysis <- sb_analysis %>%
  mutate(
    speed_category = case_when(
      sprint_speed >= 29.5 ~ "Elite (29.5+)",
      sprint_speed >= 28.0 ~ "Plus (28-29.5)",
      sprint_speed >= 27.0 ~ "Above Avg (27-28)",
      TRUE ~ "Below Avg (<27)"
    ),
    speed_category = factor(speed_category, levels = c(
      "Elite (29.5+)", "Plus (28-29.5)", "Above Avg (27-28)", "Below Avg (<27)"
    ))
  )

tier_summary <- sb_analysis %>%
  group_by(speed_category) %>%
  summarise(
    count = n(),
    avg_sb_rate = mean(SB_success_rate, na.rm = TRUE),
    median_sb_rate = median(SB_success_rate, na.rm = TRUE),
    total_sb = sum(SB),
    total_cs = sum(CS)
  ) %>%
  mutate(overall_sb_rate = total_sb / (total_sb + total_cs) * 100)

print("Stolen Base Success by Speed Tier:")
print(tier_summary)

# Top over-performers
top_performers <- sb_analysis %>%
  arrange(desc(residual)) %>%
  select(player_name, sprint_speed, SB_success_rate,
         predicted_sb_rate, residual, SB_attempts) %>%
  head(10)

print("Top 10 Over-Performers (Best Reads/Jumps):")
print(top_performers)
```

### Example 4: Comprehensive Speed vs Baserunning Value

```r
library(baseballr)
library(dplyr)
library(ggplot2)
library(gridExtra)
library(viridis)

# Fetch sprint speed data
sprint_data <- statcast_sprint_speed(year = 2024, min_opp = 10)

# Simulate baserunning metrics (in practice, fetch from FanGraphs)
set.seed(42)
sprint_data <- sprint_data %>%
  mutate(
    BsR = (sprint_speed - 27) * 2.5 + rnorm(n(), 0, 1.5),
    UBR = (sprint_speed - 27) * 1.8 + rnorm(n(), 0, 1.2),
    wSB = (sprint_speed - 27) * 0.7 + rnorm(n(), 0, 0.8)
  )

# Create speed bins
sprint_data <- sprint_data %>%
  mutate(
    speed_bin = cut(
      sprint_speed,
      breaks = c(0, 26, 27, 28, 29, 35),
      labels = c("<26", "26-27", "27-28",
                 "28-29", "29+")
    )
  )

# Plot 1: Sprint Speed vs BsR
p1 <- ggplot(sprint_data, aes(x = sprint_speed, y = BsR)) +
  geom_point(aes(color = competitive_runs), alpha = 0.6, size = 2.5) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red", alpha = 0.5) +
  geom_vline(xintercept = 27.0, linetype = "dashed", color = "gray", alpha = 0.5) +
  scale_color_viridis(name = "Competitive\nRuns") +
  labs(
    title = "Sprint Speed vs Total Baserunning Value",
    x = "Sprint Speed (ft/s)",
    y = "Baserunning Runs (BsR)"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 11))

# Plot 2: Sprint Speed vs UBR
p2 <- ggplot(sprint_data, aes(x = sprint_speed, y = UBR)) +
  geom_point(alpha = 0.5, color = "steelblue") +
  geom_smooth(method = "lm", color = "red", se = TRUE) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red", alpha = 0.5) +
  geom_vline(xintercept = 27.0, linetype = "dashed", color = "gray", alpha = 0.5) +
  labs(
    title = "Sprint Speed vs Non-Steal Baserunning",
    x = "Sprint Speed (ft/s)",
    y = "Ultimate Baserunning (UBR)"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 11))

# Plot 3: Sprint Speed vs wSB
p3 <- ggplot(sprint_data, aes(x = sprint_speed, y = wSB)) +
  geom_point(alpha = 0.5, color = "darkgreen") +
  geom_smooth(method = "lm", color = "forestgreen", se = TRUE) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red", alpha = 0.5) +
  geom_vline(xintercept = 27.0, linetype = "dashed", color = "gray", alpha = 0.5) +
  labs(
    title = "Sprint Speed vs Stolen Base Value",
    x = "Sprint Speed (ft/s)",
    y = "Weighted SB Runs (wSB)"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 11))

# Plot 4: Distribution by speed tier
p4 <- ggplot(sprint_data %>% filter(!is.na(speed_bin)),
             aes(x = speed_bin, y = BsR, fill = speed_bin)) +
  geom_violin(alpha = 0.7) +
  geom_boxplot(width = 0.2, alpha = 0.5, outlier.shape = NA) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red", alpha = 0.5) +
  labs(
    title = "Baserunning Value Distribution by Speed",
    x = "Sprint Speed Tier (ft/s)",
    y = "Baserunning Runs (BsR)"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 11),
    legend.position = "none"
  ) +
  scale_fill_brewer(palette = "Set2")

# Combine plots
combined_plot <- grid.arrange(p1, p2, p3, p4, ncol = 2)
ggsave("speed_vs_baserunning_comprehensive.png", combined_plot,
       width = 14, height = 10, dpi = 300)

# Summary statistics by speed tier
speed_summary <- sprint_data %>%
  filter(!is.na(speed_bin)) %>%
  group_by(speed_bin) %>%
  summarise(
    count = n(),
    mean_BsR = mean(BsR, na.rm = TRUE),
    median_BsR = median(BsR, na.rm = TRUE),
    mean_UBR = mean(UBR, na.rm = TRUE),
    mean_wSB = mean(wSB, na.rm = TRUE)
  ) %>%
  mutate(across(where(is.numeric), ~round(., 2)))

print("Baserunning Value by Speed Tier:")
print(speed_summary)

# Regression analysis
lm_bsr <- lm(BsR ~ sprint_speed, data = sprint_data)
print("Regression: BsR ~ Sprint Speed")
summary(lm_bsr)

# Identify elite performers vs expected
sprint_data <- sprint_data %>%
  mutate(
    expected_BsR = predict(lm_bsr),
    BsR_above_expected = BsR - expected_BsR
  )

elite_baserunners <- sprint_data %>%
  arrange(desc(BsR_above_expected)) %>%
  select(player_name, sprint_speed, BsR, expected_BsR, BsR_above_expected) %>%
  head(15)

print("Top 15 Baserunners Above Expected:")
print(elite_baserunners)
```

## Conclusion

Sprint Speed has changed how baseball evaluates and quantifies player speed. By providing objective, standardized measurements in feet per second, Statcast has enabled teams to:

1. **Quantify speed objectively** rather than relying on subjective scouting grades
2. **Optimize baserunning strategies** using data-driven success probability models
3. **Project aging curves** and plan positional transitions accordingly
4. **Identify undervalued players** whose speed hasn't translated to traditional stolen base totals
5. **Evaluate defensive range** with greater precision than was previously possible

The metric's relationship to several aspects of on-field performance, from stolen base success rates to infield hit probability to defensive range, explains its central place in player evaluation. As analytics continue to evolve, Sprint Speed is likely to remain a core input to increasingly sophisticated baserunning and defensive models.

For analysts, scouts, and front offices, understanding Sprint Speed and its implications is essential to modern player evaluation. The code examples above offer a starting point for applying these concepts to real data.
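
The 70% break-even line drawn in Example 3 can be motivated with run expectancy: a steal attempt pays off when the expected run value of trying exceeds that of staying put. The sketch below is a minimal illustration, not the method used by any particular team or site. The helper `break_even_rate()` is hypothetical, and the three run expectancy values are illustrative placeholders rather than measured data; substitute a run expectancy table for the season you are analyzing.

```r
# Break-even stolen base success rate from run expectancy (RE).
# break_even_rate() is a hypothetical helper written for this example.
break_even_rate <- function(re_start, re_success, re_caught) {
  # A successful steal must be worth more than being caught
  stopifnot(re_success > re_caught)
  # Solve p * re_success + (1 - p) * re_caught = re_start for p
  (re_start - re_caught) / (re_success - re_caught)
}

# Illustrative placeholder RE values (NOT measured data):
#   runner on first, 0 outs  -> re_start
#   runner on second, 0 outs -> re_success
#   bases empty, 1 out       -> re_caught
p_star <- break_even_rate(re_start = 0.86, re_success = 1.10, re_caught = 0.26)

print(paste("Break-even success rate (%):", round(100 * p_star, 1)))
```

With these placeholder inputs the break-even rate lands a little above 70%. The exact threshold shifts with the base-out state and the run environment, so a fixed 70% line is best read as a rule of thumb.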
