Accessing FanGraphs Data

Beginner · 10 min read · Nov 26, 2025

FanGraphs Data Access and Analytics Guide

FanGraphs has established itself as one of the premier baseball analytics platforms, providing comprehensive statistics, advanced metrics, and data analysis tools. The site offers everything from traditional counting statistics to cutting-edge sabermetric measures, tracking data, and projection systems. Understanding how to access, interpret, and analyze FanGraphs data is essential for modern baseball analysis, player evaluation, and research.

This comprehensive guide covers FanGraphs metrics, data access methods, programming interfaces, and practical applications. Whether you're a casual fan looking to understand advanced statistics or a serious analyst building predictive models, this tutorial provides the knowledge and tools needed to leverage FanGraphs effectively.

What is FanGraphs?

FanGraphs (www.fangraphs.com) is a baseball statistics and analytics website founded in 2005 that has become an indispensable resource for analysts, writers, front office personnel, and fans. The platform distinguishes itself through several key features:

  • Comprehensive Statistics: FanGraphs hosts complete historical data dating back to baseball's early eras, including traditional stats, advanced metrics, and modern tracking data from Statcast.
  • Advanced Sabermetrics: The site calculates and displays sophisticated metrics like wOBA (weighted On-Base Average), wRC+ (weighted Runs Created Plus), FIP (Fielding Independent Pitching), and their proprietary calculation of WAR (Wins Above Replacement).
  • Projection Systems: FanGraphs provides access to multiple projection systems including ZiPS, Steamer, ATC (Average Total Cost), and THE BAT, enabling users to forecast future player performance.
  • Research Tools: The platform offers leaderboards with customizable date ranges, splits analysis showing performance in different situations, game logs, and advanced filtering options.
  • Articles and Analysis: A team of excellent writers produces daily analytical content explaining metrics, evaluating players and teams, and advancing sabermetric research.

FanGraphs differs from Baseball Reference (its main competitor) in several important ways. While both sites offer comprehensive statistics, FanGraphs emphasizes modern sabermetrics and predictive metrics, provides more granular plate discipline and batted ball data, and uses different methodologies for calculating WAR. Baseball Reference focuses more on historical context and traditional statistics, making the two sites complementary resources.

Understanding FanGraphs Metrics

FanGraphs has developed and popularized numerous advanced metrics that provide deeper insights into player performance than traditional statistics. These metrics attempt to isolate skill, remove contextual factors, and better predict future outcomes.

Key Offensive Metrics

wOBA (weighted On-Base Average)

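wOBA = (0.69×uBB + 0.72×HBP + 0.88×1B + 1.24×2B + 1.56×3B + 1.95×HR) / (AB + BB - IBB + SF + HBP)

Here uBB is unintentional walks (BB - IBB). The weights shown are approximate; FanGraphs recalculates them every season.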

wOBA measures overall offensive value by assigning appropriate weights to each offensive outcome based on their run value. Unlike OPS, which simply adds OBP and SLG (giving OBP too little weight and SLG too much), wOBA correctly weighs each outcome according to its actual run-scoring impact. The league average wOBA is typically calibrated to match league average OBP (around .320), making it intuitive to interpret.
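The formula above can be sketched in a few lines of Python. This is a minimal illustration, not FanGraphs' own code: the weights are the approximate values quoted in this guide, intentional walks are excluded from the numerator (the usual convention), and the `woba` helper and the season line are made up for the example.

```python
# Approximate linear weights quoted in this guide; real weights change every season.
WOBA_WEIGHTS = {"uBB": 0.69, "HBP": 0.72, "1B": 0.88, "2B": 1.24, "3B": 1.56, "HR": 1.95}


def woba(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr, weights=WOBA_WEIGHTS):
    """Return wOBA for one stat line (hypothetical helper, not a FanGraphs API)."""
    ubb = bb - ibb  # intentional walks are excluded from the numerator
    numerator = (
        weights["uBB"] * ubb
        + weights["HBP"] * hbp
        + weights["1B"] * singles
        + weights["2B"] * doubles
        + weights["3B"] * triples
        + weights["HR"] * hr
    )
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator if denominator else float("nan")


# Made-up season line for illustration
example_woba = woba(ab=500, bb=60, ibb=5, hbp=5, sf=5,
                    singles=90, doubles=30, triples=3, hr=25)
print(f"wOBA: {example_woba:.3f}")
```

Passing a different `weights` dictionary lets the same function use the published constants for a specific season.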

wRC+ (weighted Runs Created Plus)

wRC+ = (((wRAA/PA + League R/PA) + (League R/PA - Park Factor × League R/PA)) / (League wRC/PA, excluding pitchers)) × 100

wRC+ quantifies a player's total offensive value in a single number adjusted for park effects and league context. A wRC+ of 100 represents league average, with each point above or below representing one percent better or worse than average. A player with a 150 wRC+ has been 50% better than league average, while an 80 wRC+ indicates 20% below average. This context-neutral approach makes wRC+ ideal for comparing players across different eras and ballparks.

ISO (Isolated Power) measures raw power by subtracting batting average from slugging percentage (ISO = SLG - AVG). This isolates extra-base hit power from overall batting average. League average ISO is typically around .140, with .200+ representing elite power.

BABIP (Batting Average on Balls In Play) measures how often balls put in play fall for hits, excluding home runs and strikeouts: BABIP = (H - HR) / (AB - K - HR + SF). League average is consistently around .300, and significant deviations often indicate luck or unsustainable performance. However, elite contact hitters can sustain higher BABIPs (.330+) through skill.
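Both ISO and BABIP can be computed from a standard batting line. The sketch below uses the standard definitions (ISO as extra bases per at-bat, which equals SLG - AVG, and BABIP with sacrifice flies in the denominator); the function names and the sample numbers are illustrative assumptions.

```python
def iso(doubles, triples, hr, ab):
    """Isolated power: SLG - AVG, which reduces to extra bases per at-bat."""
    return (doubles + 2 * triples + 3 * hr) / ab


def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    return (h - hr) / (ab - k - hr + sf)


# Made-up season line: 148 hits in 500 at-bats
example_iso = iso(doubles=30, triples=3, hr=25, ab=500)
example_babip = babip(h=148, hr=25, ab=500, k=120, sf=5)
print(f"ISO: {example_iso:.3f}, BABIP: {example_babip:.3f}")
```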

Key Pitching Metrics

FIP (Fielding Independent Pitching)

FIP = ((13×HR + 3×(BB + HBP) - 2×K) / IP) + constant

FIP estimates what a pitcher's ERA should have been based solely on outcomes they control: strikeouts, walks, and home runs. By excluding balls in play, FIP removes the influence of defense and luck. The constant (typically around 3.10) calibrates FIP to the ERA scale. FIP generally predicts future ERA better than current ERA does, making it valuable for player evaluation and transactions.
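A short sketch of the FIP calculation follows. It counts hit batters alongside walks, as FanGraphs' published formula does, and uses the roughly 3.10 constant mentioned above as a default (the real constant is set per season). The helper names are hypothetical, and `innings_to_decimal` assumes the box-score convention in which 180.1 means 180 and one-third innings.

```python
def innings_to_decimal(ip):
    """Convert box-score innings (180.1 = 180 and 1/3) to a true decimal."""
    whole = int(ip)
    outs = round((ip - whole) * 10)  # .1 -> one out, .2 -> two outs
    return whole + outs / 3


def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching; `ip` must already be a true decimal."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Made-up pitcher season
example_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=innings_to_decimal(180.0))
print(f"FIP: {example_fip:.2f}")
```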

xFIP (Expected FIP)

xFIP = ((13×(Fly Balls × League HR/FB Rate) + 3×(BB + HBP) - 2×K) / IP) + constant

xFIP adjusts FIP by normalizing home run rates to league average based on fly ball rate. Since HR/FB rates can fluctuate significantly due to luck and ballpark effects, xFIP provides an even more stable predictor of future performance than FIP. If a pitcher's ERA is much better than their xFIP, regression is likely.
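The only change from FIP is that actual home runs are replaced by an expected count: the pitcher's fly balls allowed multiplied by the league HR/FB rate. The sketch below assumes a league HR/FB rate of 10.5% and the same approximate constant; both are placeholder values, and the function name is hypothetical.

```python
def xfip(fly_balls, lg_hr_per_fb, bb, hbp, k, ip, constant=3.10):
    """Expected FIP: swap actual HR for fly balls times the league HR/FB rate."""
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Made-up pitcher season with an assumed 10.5% league HR/FB rate
example_xfip = xfip(fly_balls=180, lg_hr_per_fb=0.105, bb=50, hbp=5, k=200, ip=180.0)
print(f"xFIP: {example_xfip:.2f}")
```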

SIERA (Skill-Interactive ERA) is a more sophisticated metric that accounts for strikeouts, walks, batted ball types, and the interactions among them. Unlike FIP, SIERA recognizes that these skills do not have fixed values: a ground ball, for example, is worth more to a pitcher who allows many baserunners, because it is more likely to produce a double play.

K% and BB% (strikeout and walk rates as percentages of total batters faced) provide clean measures of a pitcher's ability to miss bats and command. These rates are more stable and predictive than raw K/9 and BB/9, which can be influenced by team defense and sequencing luck.
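A small example shows why the per-batter rates are cleaner. Two hypothetical pitchers below face the same number of batters with identical strikeouts and walks, but one records fewer outs on balls in play and so throws fewer innings; his K/9 rises even though his strikeout skill is unchanged. The function name and numbers are illustrative.

```python
def rate_stats(k, bb, tbf, ip):
    """Return per-batter and per-nine-inning strikeout and walk rates."""
    return {
        "K%": k / tbf,
        "BB%": bb / tbf,
        "K/9": 9 * k / ip,
        "BB/9": 9 * bb / ip,
    }


# Same batters faced, strikeouts, and walks; only innings pitched differ
pitcher_a = rate_stats(k=200, bb=50, tbf=800, ip=200.0)
pitcher_b = rate_stats(k=200, bb=50, tbf=800, ip=185.0)

for label, stats in (("A", pitcher_a), ("B", pitcher_b)):
    print(label, {name: round(value, 3) for name, value in stats.items()})
```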

Wins Above Replacement (fWAR)

FanGraphs calculates WAR using their own methodology (abbreviated as fWAR to distinguish from Baseball Reference's bWAR). WAR attempts to summarize a player's total contribution in a single number representing wins contributed above a replacement-level player.

For position players, fWAR = (Batting Runs + Base Running Runs + Fielding Runs + Positional Adjustment + League Adjustment + Replacement Runs) / Runs Per Win
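The arithmetic is a straightforward sum of run components divided by a runs-to-wins converter. The sketch below assumes roughly 10 runs per win, a common rule of thumb; the actual value varies by season, and the component values are invented for illustration.

```python
def position_player_war(batting, baserunning, fielding, positional,
                        league, replacement, runs_per_win=10.0):
    """Sum the run components and convert runs to wins."""
    total_runs = batting + baserunning + fielding + positional + league + replacement
    return total_runs / runs_per_win


# Made-up components, all expressed in runs
example_war = position_player_war(
    batting=25.0, baserunning=2.0, fielding=5.0,
    positional=-5.0, league=2.0, replacement=20.0,
)
print(f"WAR: {example_war:.1f}")
```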

For pitchers, fWAR uses FIP rather than ERA as the foundation, making it defense-independent. Key differences between fWAR and bWAR include:

  • Pitching Basis: fWAR uses FIP; bWAR uses RA9 (runs allowed per nine innings)
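  • Defensive Metrics: fWAR has historically used UZR (Ultimate Zone Rating) and draws on Statcast-based fielding values for recent seasons; bWAR uses DRS (Defensive Runs Saved) from 2003 onward and Total Zone for earlier seasons
  • Replacement Level: Since 2013 the two sites have shared a common replacement baseline (about a .294 winning percentage), so remaining gaps come from the components rather than the baseline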

Neither version is definitively "correct" - they represent different philosophical approaches. fWAR's use of FIP makes it more predictive, while bWAR's use of actual runs allowed better reflects what actually happened.

FanGraphs Leaderboards and Export Options

The FanGraphs leaderboards are the primary interface for accessing statistical data. They offer extensive customization and filtering options that make them powerful research tools.

Using Leaderboards Effectively

To access leaderboards, navigate to the "Leaders" tab and select either "Major League" or "Minor League" statistics. The interface provides numerous options:

  • Date Range: Analyze any time period from single games to entire careers. Custom date ranges enable studying hot/cold streaks, second-half performance, or specific eras.
  • Split Categories: View performance splits by handedness (vs LHP/RHP), home/road, day/night, month, count, men on base, and many other situations.
  • Stat Types: Choose from standard batting, advanced batting, batted ball, plate discipline, pitch type, or value metrics. Multiple tabs organize the dozens of available statistics.
  • Minimum Plate Appearances/Innings: Filter for qualified players or customize thresholds to include/exclude specific players.
  • Teams and Positions: Filter by team, position, or league to narrow results.

The "Dashboard" view presents the most important metrics on a single screen, making it ideal for quick player comparisons. The "Standard" view shows traditional counting stats, while "Advanced" displays sabermetric measures like wRC+, wOBA, and WAR.

Exporting Data

FanGraphs makes data export simple and flexible. At the bottom of every leaderboard, you'll find an "Export Data" button. This generates a CSV file containing all displayed statistics for the filtered players. The export includes:

  • All visible columns from your selected view (Standard, Advanced, Batted Ball, etc.)
  • Player names, team affiliations, and identifiers
  • Statistics formatted as clean numerical values suitable for analysis

Best practices for exporting FanGraphs data:

  1. Select the appropriate stat view before exporting - you can only export visible columns
  2. Adjust player minimums to avoid cluttering datasets with small sample sizes
  3. Use custom date ranges to isolate specific time periods of interest
  4. Export to CSV for easy import into Excel, R, Python, or SQL databases
  5. Maintain consistent naming conventions when saving exported files for analysis pipelines
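Once a leaderboard has been exported, loading it for analysis might look like the sketch below. The column names and rows are invented stand-ins for an export, and the assumption that rate columns arrive as strings such as "12.5%" may not hold for every view; `load_export` is a hypothetical helper. In practice you would pass the path of the saved CSV instead of the in-memory sample.

```python
import io

import pandas as pd

# Stand-in for an exported leaderboard file (made-up players and values)
SAMPLE_CSV = """Name,Team,PA,BB%,K%,wRC+,WAR
Player A,NYY,650,12.5%,22.0%,145,6.1
Player B,LAD,120,8.0%,30.1%,95,0.3
Player C,ATL,540,9.2%,18.4%,118,3.4
"""


def load_export(source, min_pa=200):
    """Read an exported leaderboard, clean percent columns, drop small samples."""
    df = pd.read_csv(source)
    for col in df.columns:
        is_text = not pd.api.types.is_numeric_dtype(df[col])
        if is_text and df[col].astype(str).str.endswith("%").all():
            # "12.5%" -> 0.125
            df[col] = df[col].str.rstrip("%").astype(float) / 100
    return df[df["PA"] >= min_pa].reset_index(drop=True)


leaders = load_export(io.StringIO(SAMPLE_CSV))
print(leaders)
```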

WAR Calculation Differences: fWAR vs bWAR

Understanding the methodological differences between FanGraphs WAR (fWAR) and Baseball Reference WAR (bWAR) is crucial for proper interpretation and analysis. Both attempt to measure total player value, but they make different assumptions and use different components.

Position Player WAR Differences

| Component | fWAR (FanGraphs) | bWAR (Baseball Reference) |
| --- | --- | --- |
| Offensive Runs | wRAA (weighted Runs Above Average) | Batting Runs (similar methodology) |
| Base Running | Base running runs (BsR) | Base running runs (similar) |
| Defense | UZR (Ultimate Zone Rating) historically; Statcast-based fielding values for recent seasons | DRS (Defensive Runs Saved) since 2003; Total Zone for earlier seasons |
| Positional Adjustment | Spectrum based on positional scarcity | Similar positional adjustments |
| Replacement Level | Shared baseline since 2013 (about a .294 winning percentage) | Shared baseline since 2013 (about a .294 winning percentage) |

Pitcher WAR Differences

The most significant difference lies in pitching evaluation:

  • fWAR uses FIP as its foundation, focusing on strikeouts, walks, and home runs. This approach credits pitchers only for outcomes they directly control, removing defense and luck. fWAR better predicts future performance and isolates pitcher skill.
  • bWAR uses RA9-WAR based on actual runs allowed (both earned and unearned). This approach credits pitchers for their actual results, including ability to prevent hits on balls in play and strand runners. bWAR better captures what actually happened in the past.

Practical implications of these differences:

  • Pitchers with excellent defense behind them will have higher bWAR than fWAR
  • Pitchers who prevent hits on balls in play through skill will have higher bWAR
  • Pitchers with unsustainable low BABIP will see bWAR overstate their value
  • For projection and player acquisition, fWAR typically provides better guidance
  • For historical assessment of what actually occurred, bWAR may be preferred

When comparing players, it's valuable to examine both fWAR and bWAR. Large discrepancies often reveal interesting insights about defense, luck, or specific skills. Neither metric is perfect - both provide useful perspectives on player value.
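One simple way to surface those discrepancies is to put both values side by side and flag large gaps. The sketch below uses invented pitchers and an arbitrary one-win threshold; the column names and the `flag_war_gaps` helper are assumptions for illustration.

```python
import pandas as pd


def flag_war_gaps(df, threshold=1.0):
    """Add the bWAR minus fWAR gap and flag rows where it is large."""
    out = df.copy()
    out["gap"] = out["bWAR"] - out["fWAR"]
    out["flag"] = out["gap"].abs() >= threshold
    # Largest disagreements first, regardless of direction
    return out.sort_values("gap", key=lambda s: s.abs(), ascending=False)


# Made-up values for three pitchers
war = pd.DataFrame({
    "Name": ["Pitcher A", "Pitcher B", "Pitcher C"],
    "fWAR": [3.0, 4.5, 4.0],
    "bWAR": [5.0, 4.2, 2.5],
})
war_gaps = flag_war_gaps(war)
print(war_gaps)
```

A positive gap points to a pitcher whose actual run prevention beat his strikeout, walk, and home run profile, which is a cue to look at defense and BABIP before drawing conclusions.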

FanGraphs-Specific Advanced Metrics

Plate Discipline Metrics

FanGraphs provides granular plate discipline statistics that illuminate a hitter's approach and ability to identify pitches:

  • O-Swing%: Percentage of pitches outside the strike zone at which a batter swings. League average is ~30%; elite plate discipline shows O-Swing% under 25%.
  • Z-Swing%: Percentage of pitches in the strike zone at which a batter swings. League average is ~67%; aggressive hitters exceed 70%.
  • Swing%: Overall swing percentage. Typically around 45-48% league-wide.
  • O-Contact%: Percentage of swings on pitches outside the zone that result in contact. Measures ability to make contact on bad pitches. League average ~60%.
  • Z-Contact%: Percentage of swings in the zone resulting in contact. Elite contact hitters exceed 90%.
  • Contact%: Overall contact rate on swings. League average ~78%; elite contact skills show 85%+.
  • Zone%: Percentage of pitches seen in the strike zone. Reveals how pitchers approach the hitter. Star hitters often see Zone% below 45% as pitchers work around them.
  • F-Strike%: Percentage of plate appearances beginning with a first-pitch strike. Measures how often a batter falls behind in the count.
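To make the definitions concrete, the sketch below derives most of these rates from a list of pitches. Each pitch is represented as a hypothetical `(in_zone, swung, contact)` tuple; real pitch-level data would need to be mapped into that shape first, and F-Strike% is omitted because it needs plate appearance boundaries.

```python
def pct(part, whole):
    """Safe ratio that returns NaN when the denominator is zero."""
    return part / whole if whole else float("nan")


def plate_discipline(pitches):
    """Compute swing and contact rates from (in_zone, swung, contact) tuples."""
    pitches = list(pitches)
    zone = [p for p in pitches if p[0]]
    outside = [p for p in pitches if not p[0]]
    swings = [p for p in pitches if p[1]]
    z_swings = [p for p in zone if p[1]]
    o_swings = [p for p in outside if p[1]]
    return {
        "O-Swing%": pct(len(o_swings), len(outside)),
        "Z-Swing%": pct(len(z_swings), len(zone)),
        "Swing%": pct(len(swings), len(pitches)),
        "O-Contact%": pct(sum(p[2] for p in o_swings), len(o_swings)),
        "Z-Contact%": pct(sum(p[2] for p in z_swings), len(z_swings)),
        "Contact%": pct(sum(p[2] for p in swings), len(swings)),
        "Zone%": pct(len(zone), len(pitches)),
    }


# Ten made-up pitches: four in the zone, six outside
sample_pitches = [
    (True, True, True), (True, True, True), (True, True, False), (True, False, False),
    (False, True, True), (False, True, False),
    (False, False, False), (False, False, False), (False, False, False), (False, False, False),
]
discipline = plate_discipline(sample_pitches)
print({name: round(value, 3) for name, value in discipline.items()})
```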

Batted Ball Metrics

FanGraphs tracks batted ball types to provide insight into hitting approach and power potential:

  • GB% / LD% / FB%: Ground ball, line drive, and fly ball percentages. League average is roughly 45% GB, 20% LD, 35% FB. Extreme ground ball pitchers exceed 50% GB rate; fly ball power hitters show FB% over 40%.
  • HR/FB: Home run to fly ball ratio. League average fluctuates around 10-11%. Elite power hitters sustain 15-20% HR/FB rates. Significant year-to-year variance is common, making it useful for identifying regression candidates.
  • Pull% / Cent% / Oppo%: Percentages of batted balls hit to pull field, center, and opposite field. Balanced hitters show roughly even distribution; extreme pull hitters exceed 50% pull rate but may be vulnerable to defensive shifts.
  • Soft% / Med% / Hard%: Percentage of batted balls hit softly, medium, or hard. These classifications come from Baseball Info Solutions and are inferred from factors such as hang time, landing location, and trajectory rather than from measured exit velocity. Hard% above 40% indicates strong contact ability; Soft% above 20% suggests weak contact.
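The batted ball rates are simple shares of balls in play, and HR/FB lends itself to a quick regression check: compare actual home runs with what a league-average HR/FB rate would imply. The sketch below assumes a 10.5% league rate and invented counts; the function name is hypothetical.

```python
def batted_ball_profile(gb, ld, fb, hr, lg_hr_per_fb=0.105):
    """Return GB%/LD%/FB%, HR/FB, and home runs expected at the league rate."""
    total = gb + ld + fb
    return {
        "GB%": gb / total,
        "LD%": ld / total,
        "FB%": fb / total,
        "HR/FB": hr / fb,
        "expected_hr": fb * lg_hr_per_fb,
    }


# Made-up hitter: 400 batted balls, 15 home runs
profile = batted_ball_profile(gb=180, ld=80, fb=140, hr=15)
print({name: round(value, 3) for name, value in profile.items()})
```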

Statcast Integration

FanGraphs incorporates Statcast metrics, providing objective measurements of batted ball and pitch characteristics:

  • Exit Velocity: Average and maximum speed of batted balls off the bat. Elite hitters average 90+ mph; 95+ mph represents exceptional bat speed and power.
  • Launch Angle: Average angle of batted balls. Optimal launch angle for power is 15-25 degrees; line drive hitters target 10-15 degrees.
  • Barrel%: Percentage of batted balls classified as "barrels" (optimal combination of exit velocity and launch angle). Barrel% above 10% indicates elite contact quality; league average is ~6-7%.
  • HardHit%: Percentage of batted balls with 95+ mph exit velocity. Strong correlation with offensive production; 45%+ represents elite contact.
  • EV95%: Share of batted balls hit at 95+ mph, which is the same quantity as HardHit% under a different label. It should not be read as an expected batting average.
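Average exit velocity, maximum exit velocity, and HardHit% can all be derived from a list of measured exit velocities. The sketch below uses the 95 mph threshold given above and a handful of invented readings; Barrel% is left out because it depends on a combined exit velocity and launch angle definition not reproduced here.

```python
def contact_quality(exit_velocities, hard_hit_threshold=95.0):
    """Summarize exit velocity readings (mph), skipping missing values."""
    evs = [ev for ev in exit_velocities if ev is not None]
    if not evs:
        return {"avg_ev": float("nan"), "max_ev": float("nan"), "hard_hit_pct": float("nan")}
    hard_hits = sum(1 for ev in evs if ev >= hard_hit_threshold)
    return {
        "avg_ev": sum(evs) / len(evs),
        "max_ev": max(evs),
        "hard_hit_pct": hard_hits / len(evs),
    }


# Made-up readings; None stands for a batted ball without a tracked velocity
quality = contact_quality([101.2, 88.5, 95.0, 72.3, None, 108.9, 91.0, 96.4, 84.7])
print({name: round(value, 3) for name, value in quality.items()})
```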

Splits and Game Logs

FanGraphs provides comprehensive splits data showing player performance in various contexts. This granular data reveals platoon splits, situational performance, and home/road differences.

Available Split Categories

  • Platoon Splits: Performance vs LHP and RHP. Reveals vulnerability to same-handed pitching and potential platoon situations. Significant platoon splits exceed 50 points of wRC+ difference.
  • Home/Road: Identifies park effects and travel impacts. Large home/road disparities may indicate park-dependent skills or comfort levels.
  • Month: Reveals seasonal patterns, hot/cold streaks, and adjustment periods. Useful for identifying players who start slow or fade in summer heat.
  • Count: Performance in different count situations (ahead, behind, even). Elite hitters maintain performance regardless of count; pitch-to-contact hitters often struggle when behind.
  • Men On/Bases Empty: Clutch performance indicators. Large disparities may indicate pressing or rising to the occasion, though these effects are often overstated.
  • High/Medium/Low Leverage: Performance in different game situations by leverage index. True clutch performers are rare; most variance is noise.
  • Inning: Performance by inning, useful for identifying fatigue patterns in pitchers or late-game tendencies.
  • Day/Night: Some players perform significantly better under lights or in day games, potentially due to vision or circadian rhythm factors.

Game Logs

Game logs provide game-by-game performance for every game a player appeared in. Each log entry includes:

  • Date, opponent, and location
  • Traditional statistics (AB, H, R, RBI, BB, K, etc.)
  • Advanced metrics calculated for that specific game
  • Pitch counts and usage patterns for pitchers
  • Win probability added (WPA) showing impact on game outcome

Game logs are invaluable for:

  • Identifying streaks and slumps
  • Analyzing performance against specific opponents or pitchers
  • Tracking workload and fatigue indicators
  • Building time-series models for prediction
  • Verifying consistency versus volatility
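As one example of streak detection, a rolling batting average over the last few games can be computed from a game log. The sketch below assumes a DataFrame with `date`, `H`, and `AB` columns, which are illustrative names rather than the exact headers of a FanGraphs game log, and uses a three-game window purely to keep the sample small.

```python
import pandas as pd


def rolling_average(game_log, window=3):
    """Add a rolling batting average computed from summed hits and at-bats."""
    log = game_log.copy()
    log["date"] = pd.to_datetime(log["date"])
    log = log.sort_values("date").reset_index(drop=True)
    rolled = log[["H", "AB"]].rolling(window).sum()
    # Sum first, then divide, so games with more at-bats carry more weight
    log["rolling_avg"] = rolled["H"] / rolled["AB"]
    return log


# Made-up six-game log
games = pd.DataFrame({
    "date": ["2023-04-01", "2023-04-02", "2023-04-03",
             "2023-04-04", "2023-04-05", "2023-04-06"],
    "H": [2, 0, 1, 3, 0, 1],
    "AB": [4, 4, 3, 5, 4, 4],
})
rolled_log = rolling_average(games)
print(rolled_log)
```

The first `window - 1` rows are empty by design, since a full window of games is not yet available.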

Projection Systems on FanGraphs

FanGraphs hosts multiple projection systems that forecast future player performance. These systems use different methodologies but all attempt to predict next season's statistics based on historical performance, aging curves, and regression to the mean.

ZiPS (Szymborski Projection System)

Developed by Dan Szymborski, ZiPS uses a nearest-neighbor approach that identifies similar players from baseball history and projects based on their aging patterns. ZiPS strengths include:

  • Deep historical database spanning decades of player comparisons
  • Sophisticated aging curves customized by position and skill profile
  • Automatic regression to league mean based on sample size and reliability
  • Regular in-season updates as new performance data emerges
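Regression to the mean is common to all of these systems, and the basic idea can be shown with a generic shrinkage formula. The sketch below is not ZiPS's actual method: it simply blends an observed rate with the league average, weighting the observation by its sample size. The stabilizing constant `k` of 200 plate appearances is an arbitrary assumption.

```python
def regress_to_mean(observed, sample_size, league_mean, k=200):
    """Blend an observed rate with the league mean; small samples shrink more."""
    return (observed * sample_size + league_mean * k) / (sample_size + k)


# A .400 wOBA over 100 PA is pulled most of the way back toward .320
small_sample = regress_to_mean(observed=0.400, sample_size=100, league_mean=0.320)
# The same .400 over 600 PA keeps much more of its value
large_sample = regress_to_mean(observed=0.400, sample_size=600, league_mean=0.320)
print(f"100 PA: {small_sample:.3f}, 600 PA: {large_sample:.3f}")
```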

Steamer

Created by Jared Cross, Dash Davidson, and Peter Rosenbloom, Steamer emphasizes recent performance and uses a hybrid approach combining regression and similar-player analysis. Steamer characteristics:

  • Heavy weighting of recent three years of performance
  • Component-based approach projects underlying skills rather than outcomes
  • Playing time estimates based on depth charts and projected roles
  • Generally conservative, often underestimating breakouts but avoiding overconfidence in unsustainable performance

ATC (Average Total Cost)

ATC aggregates multiple projection systems by taking weighted averages, creating an ensemble forecast. The methodology:

  • Combines ZiPS, Steamer, and THE BAT projections
  • Weights systems based on historical accuracy
  • Provides range estimates showing uncertainty
  • Often most accurate due to diversification across methodologies
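The weighted-average idea can be illustrated with a few lines of code. The weights and home run projections below are invented for the example and are not ATC's actual inputs or weights; `ensemble_projection` is a hypothetical helper.

```python
def ensemble_projection(projections, weights):
    """Weighted average of one projected stat across systems."""
    total_weight = sum(weights[system] for system in projections)
    weighted_sum = sum(value * weights[system] for system, value in projections.items())
    return weighted_sum / total_weight


# Made-up home run projections and accuracy-based weights
hr_projections = {"ZiPS": 28, "Steamer": 25, "THE BAT": 30}
system_weights = {"ZiPS": 0.40, "Steamer": 0.35, "THE BAT": 0.25}

blended_hr = ensemble_projection(hr_projections, system_weights)
print(f"Blended HR projection: {blended_hr:.2f}")
```

Dividing by the total weight means the weights do not need to sum to one, so a system can be dropped without rescaling the rest by hand.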

THE BAT

A newer projection system emphasizing Statcast data and granular skill components:

  • Incorporates exit velocity, launch angle, and swing metrics
  • Uses machine learning to identify predictive patterns
  • Accounts for player development and skill changes
  • Generally more aggressive in projecting young player development

Using Projections Effectively

Best practices for working with projections:

  • No projection system is perfect - use multiple systems and compare
  • Projections work best for established players with 3+ years of data
  • Young players and rookies have wide uncertainty ranges
  • Consider projection ranges and confidence intervals, not just point estimates
  • Update projections as season progresses - early season projections are less accurate
  • Combine projections with scouting reports for complete player evaluation
  • Use projections for relative comparisons rather than precise predictions

API and Data Access

While FanGraphs doesn't provide an official public API, several methods exist for programmatic data access. The most reliable approach is using the PyBaseball library in Python or baseballr in R, which scrape FanGraphs pages and return clean DataFrames.

Python Data Access with PyBaseball


import pandas as pd
from pybaseball import fg_batting_data, fg_pitching_data
from pybaseball import batting_stats, pitching_stats
from pybaseball import playerid_lookup, statcast_batter

# Enable caching to reduce load on FanGraphs servers
from pybaseball import cache
cache.enable()

# Fetch FanGraphs batting leaderboard data
def get_batting_leaders(start_season=2023, end_season=2023, min_pa=100):
    """
    Get batting statistics from FanGraphs.

    Parameters:
    -----------
    start_season : int
        First season to include
    end_season : int
        Last season to include
    min_pa : int
        Minimum plate appearances to include

    Returns:
    --------
    DataFrame with batting statistics
    """
    # Using batting_stats function
    data = batting_stats(start_season, end_season, qual=min_pa)

    # Select key columns
    columns = [
        'Name', 'Team', 'Age', 'G', 'PA', 'AB', 'H', 'HR', 'R', 'RBI', 'SB',
        'BB%', 'K%', 'ISO', 'BABIP', 'AVG', 'OBP', 'SLG', 'wOBA', 'wRC+',
        'BsR', 'Off', 'Def', 'WAR'
    ]

    result = data[columns].copy()

    return result

# Example: Get 2023 batting leaders
batting_2023 = get_batting_leaders(2023, 2023, min_pa=400)

# Sort by wRC+ and display top performers
top_hitters = batting_2023.nlargest(10, 'wRC+')
print("Top 10 Hitters by wRC+ (2023):")
print(top_hitters[['Name', 'Team', 'PA', 'wOBA', 'wRC+', 'WAR']])

# Fetch pitching statistics
def get_pitching_leaders(start_season=2023, end_season=2023, min_ip=50):
    """
    Get pitching statistics from FanGraphs.

    Parameters:
    -----------
    start_season : int
        First season to include
    end_season : int
        Last season to include
    min_ip : int
        Minimum innings pitched to include

    Returns:
    --------
    DataFrame with pitching statistics
    """
    data = pitching_stats(start_season, end_season, qual=min_ip)

    columns = [
        'Name', 'Team', 'Age', 'W', 'L', 'SV', 'G', 'GS', 'IP',
        'K/9', 'BB/9', 'HR/9', 'BABIP', 'LOB%', 'GB%', 'HR/FB',
        'ERA', 'FIP', 'xFIP', 'SIERA', 'K%', 'BB%', 'WAR'
    ]

    result = data[columns].copy()

    return result

# Example: Get 2023 pitching leaders
pitching_2023 = get_pitching_leaders(2023, 2023, min_ip=100)

# Sort by WAR
top_pitchers = pitching_2023.nlargest(10, 'WAR')
print("\nTop 10 Pitchers by WAR (2023):")
print(top_pitchers[['Name', 'Team', 'IP', 'ERA', 'FIP', 'WAR']])

# Get specific player data
def get_player_profile(last_name, first_name):
    """
    Look up player and get their career statistics.

    Parameters:
    -----------
    last_name : str
        Player's last name
    first_name : str
        Player's first name

    Returns:
    --------
    Dictionary with player info and career stats
    """
    # Look up player ID
    player_lookup = playerid_lookup(last_name, first_name)

    if len(player_lookup) == 0:
        return {"error": f"Player {first_name} {last_name} not found"}

    player_info = player_lookup.iloc[0]

    # Get season-by-season batting stats if available
    try:
        # playerid_lookup reports the first MLB season in 'mlb_played_first'
        first_year = player_info['mlb_played_first']
        start_season = int(first_year) if pd.notna(first_year) else 2020
        career_batting = batting_stats(start_season, 2023, qual=0)
        # Match on the FanGraphs ID rather than on name substrings
        player_stats = career_batting[
            career_batting['IDfg'] == player_info['key_fangraphs']
        ]

        return {
            'player_info': player_info,
            'career_stats': player_stats
        }
    except Exception:
        # Network or parsing failures leave the stats empty instead of crashing
        return {
            'player_info': player_info,
            'career_stats': None
        }

# Example: Get Aaron Judge profile
judge_profile = get_player_profile('Judge', 'Aaron')
print("\nAaron Judge Career Information:")
print(judge_profile['player_info'][['name_first', 'name_last', 'mlb_played_first', 'key_mlbam']])

Advanced PyBaseball Usage


# Get player game logs
from pybaseball import schedule_and_record

def get_player_game_log(player_id, year):
    """
    Fetch detailed game-by-game statistics for a player.

    Parameters:
    -----------
    player_id : int
        MLB player ID (MLBAM ID)
    year : int
        Season year

    Returns:
    --------
    DataFrame with game-by-game performance
    """
    # Note: pybaseball has no direct FanGraphs game log function, so this
    # sketch approximates one by aggregating pitch-level Statcast data by game.
    start_date = f'{year}-01-01'
    end_date = f'{year}-12-31'

    statcast_data = statcast_batter(start_date, end_date, player_id)

    # Aggregate by game. 'events' is only filled in on the final pitch of a
    # plate appearance, so counting it yields plate appearances per game.
    game_logs = statcast_data.groupby('game_date').agg({
        'launch_speed': 'mean',
        'launch_angle': 'mean',
        'events': 'count',
        'estimated_woba_using_speedangle': 'mean',
        'woba_value': 'sum'
    }).reset_index()

    game_logs.columns = [
        'game_date', 'avg_exit_velo', 'avg_launch_angle',
        'plate_appearances', 'xwOBA', 'total_woba'
    ]

    return game_logs

# Working with projection data
def compare_projections(player_name, year=2024):
    """
    Compare multiple projection systems for a player.

    Parameters:
    -----------
    player_name : str
        Player name
    year : int
        Projection year

    Returns:
    --------
    DataFrame comparing projection systems
    """
    # Note: Projections must be scraped from FanGraphs pages
    # This is a conceptual framework

    projections = {
        'System': ['ZiPS', 'Steamer', 'ATC', 'THE BAT'],
        'PA': [550, 560, 555, 565],
        'HR': [28, 25, 27, 30],
        'R': [85, 82, 84, 88],
        'RBI': [82, 78, 80, 85],
        'SB': [12, 10, 11, 13],
        'AVG': [.275, .268, .272, .278],
        'wOBA': [.350, .342, .346, .355],
        'WAR': [4.2, 3.8, 4.0, 4.5]
    }

    return pd.DataFrame(projections)

# Batch processing multiple players
def analyze_team_offense(team_abbrev, year):
    """
    Analyze offensive performance for all players on a team.

    Parameters:
    -----------
    team_abbrev : str
        Team abbreviation (e.g., 'NYY', 'LAD')
    year : int
        Season year

    Returns:
    --------
    DataFrame with team offensive statistics
    """
    # Get all batting data
    all_batting = batting_stats(year, qual=0)

    # Filter to specific team
    team_batting = all_batting[all_batting['Team'] == team_abbrev].copy()

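    # Calculate team totals. Rate stats are weighted by plate appearances so
    # that part-time players do not count as much as everyday regulars.
    total_pa = team_batting['PA'].sum()
    team_summary = {
        'total_war': team_batting['WAR'].sum(),
        'avg_wrc_plus': (team_batting['wRC+'] * team_batting['PA']).sum() / total_pa,
        'total_hr': team_batting['HR'].sum(),
        'team_woba': (team_batting['wOBA'] * team_batting['PA']).sum() / total_pa,
        'player_count': len(team_batting)
    }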

    return team_batting, team_summary

# Example usage
yankees_2023, yankees_summary = analyze_team_offense('NYY', 2023)
print("\nNew York Yankees 2023 Offense Summary:")
for metric, value in yankees_summary.items():
    print(f"  {metric}: {value:.2f}" if isinstance(value, float) else f"  {metric}: {value}")

R Data Access with baseballr


library(baseballr)
library(tidyverse)
library(lubridate)

# Fetch FanGraphs batting leaderboard
get_batting_leaders <- function(start_season = 2023, end_season = 2023, min_pa = 100) {
  # Get batting data from FanGraphs
  data <- fg_batter_leaders(
    startseason = start_season,
    endseason = end_season,
    qual = min_pa,
    ind = 1  # Individual season (0 for aggregate)
  )

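  # Select key columns. Column names differ across baseballr versions
  # (for example `wRC+` versus wRC_plus), so check names(data) and adjust.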
  result <- data %>%
    select(
      Name, Team, Age, G, PA, AB, H, HR, R, RBI, SB,
      `BB%`, `K%`, ISO, BABIP, AVG, OBP, SLG, wOBA, `wRC+`,
      BsR, Off, Def, WAR
    )

  return(result)
}

# Example: Get 2023 batting leaders
batting_2023 <- get_batting_leaders(2023, 2023, 400)

# Display top hitters
top_hitters <- batting_2023 %>%
  arrange(desc(`wRC+`)) %>%
  head(10) %>%
  select(Name, Team, PA, wOBA, `wRC+`, WAR)

cat("Top 10 Hitters by wRC+ (2023):\n")
print(top_hitters)

# Fetch pitching statistics
get_pitching_leaders <- function(start_season = 2023, end_season = 2023, min_ip = 50) {
  data <- fg_pitcher_leaders(
    startseason = start_season,
    endseason = end_season,
    qual = min_ip,
    ind = 1
  )

  result <- data %>%
    select(
      Name, Team, Age, W, L, SV, G, GS, IP,
      `K/9`, `BB/9`, `HR/9`, BABIP, `LOB%`, `GB%`, `HR/FB`,
      ERA, FIP, xFIP, SIERA, `K%`, `BB%`, WAR
    )

  return(result)
}

# Example: Get 2023 pitching leaders
pitching_2023 <- get_pitching_leaders(2023, 2023, 100)

top_pitchers <- pitching_2023 %>%
  arrange(desc(WAR)) %>%
  head(10) %>%
  select(Name, Team, IP, ERA, FIP, WAR)

cat("\nTop 10 Pitchers by WAR (2023):\n")
print(top_pitchers)

# Get player game logs
get_player_game_logs <- function(playerid, year) {
  # fg_batter_game_logs returns game-by-game data, not situational splits.
  # Note: baseballr function names and arguments vary by version.

  game_logs <- fg_batter_game_logs(
    playerid = playerid,
    year = year
  )

  return(game_logs)
}

# Analyze team performance
analyze_team_offense <- function(team_abbrev, year) {
  # Get all batting data
  all_batting <- get_batting_leaders(year, year, min_pa = 0)

  # Filter to team
  team_batting <- all_batting %>%
    filter(Team == team_abbrev)

  # Calculate team summary; rate stats are weighted by plate appearances
  team_summary <- team_batting %>%
    summarise(
      total_war = sum(WAR, na.rm = TRUE),
      avg_wrc_plus = weighted.mean(`wRC+`, PA, na.rm = TRUE),
      total_hr = sum(HR, na.rm = TRUE),
      team_woba = weighted.mean(wOBA, PA, na.rm = TRUE),
      player_count = n()
    )

  return(list(
    players = team_batting,
    summary = team_summary
  ))
}

# Example: Analyze Yankees offense
yankees_2023 <- analyze_team_offense("NYY", 2023)

cat("\nNew York Yankees 2023 Offense Summary:\n")
print(yankees_2023$summary)

# Working with projection data
compare_projections <- function(player_name, year = 2024) {
  # Conceptual framework for comparing projections
  # Actual implementation requires scraping FanGraphs projection pages

  # Create example projection comparison
  projections <- tibble(
    System = c('ZiPS', 'Steamer', 'ATC', 'THE BAT'),
    PA = c(550, 560, 555, 565),
    HR = c(28, 25, 27, 30),
    R = c(85, 82, 84, 88),
    RBI = c(82, 78, 80, 85),
    SB = c(12, 10, 11, 13),
    AVG = c(.275, .268, .272, .278),
    wOBA = c(.350, .342, .346, .355),
    WAR = c(4.2, 3.8, 4.0, 4.5)
  )

  return(projections)
}

# Calculate average projection across systems
avg_projection <- compare_projections("Example Player") %>%
  summarise(across(where(is.numeric), mean)) %>%
  mutate(System = "Average")

cat("\nProjection System Comparison:\n")
print(compare_projections("Example Player"))

# Advanced analysis: Identify breakout candidates
identify_breakout_candidates <- function(year) {
  # Get current year and prior year data
  current <- get_batting_leaders(year, year, 250)
  prior <- get_batting_leaders(year - 1, year - 1, 250)

  # Join datasets
  comparison <- current %>%
    select(Name, Team, Age, wRC_current = `wRC+`, WAR_current = WAR) %>%
    inner_join(
      prior %>% select(Name, wRC_prior = `wRC+`, WAR_prior = WAR),
      by = "Name"
    ) %>%
    mutate(
      wrc_improvement = wRC_current - wRC_prior,
      war_improvement = WAR_current - WAR_prior
    ) %>%
    filter(
      wrc_improvement > 20,  # 20+ point wRC+ improvement
      Age <= 26  # Focus on young players
    ) %>%
    arrange(desc(wrc_improvement))

  return(comparison)
}

# Example: Find 2023 breakout players
breakouts_2023 <- identify_breakout_candidates(2023)

cat("\n2023 Breakout Candidates (wRC+ improvement):\n")
print(head(breakouts_2023, 10))

Comparing FanGraphs with Baseball Reference

FanGraphs and Baseball Reference are the two most popular baseball statistics websites. While they overlap significantly, each offers unique features and perspectives that make them complementary resources.

Key Differences

| Feature | FanGraphs | Baseball Reference |
| --- | --- | --- |
| WAR Calculation | fWAR (FIP-based for pitchers) | bWAR (RA9-based for pitchers) |
| Focus | Advanced metrics, projections, modern sabermetrics | Historical context, traditional stats, comprehensive archives |
| Defensive Metrics | UZR historically; Statcast-based fielding values for recent seasons | DRS, Total Zone |
| Interface | Modern, leaderboard-focused | Traditional, player page-focused |
| Plate Discipline Data | Extensive (O-Swing%, Z-Swing%, etc.) | Limited |
| Batted Ball Data | Comprehensive (GB%, FB%, Hard%, etc.) | Basic |
| Projections | Multiple systems (ZiPS, Steamer, ATC, THE BAT) | Limited projection access |
| Historical Query Tools | Limited search tools | Powerful Stathead tools (formerly Play Index) for historical queries |
| Articles | Daily sabermetric analysis and research | Minimal editorial content |
| Historical Data | Complete but less contextual | Complete with rich historical context |
| Minor Leagues | Comprehensive coverage | Basic coverage |

When to Use Each Site

Use FanGraphs for:

  • Modern player evaluation using advanced metrics
  • Projecting future performance
  • Analyzing plate discipline and batted ball profiles
  • Comparing projection systems
  • Understanding pitching with defense-independent metrics
  • Reading analytical articles and sabermetric research
  • Exporting leaderboard data for analysis
  • Minor league player evaluation

Use Baseball Reference for:

  • Historical research and career comparisons across eras
  • Comprehensive player pages with complete career statistics
  • Stathead (formerly Play Index) for complex historical queries
  • Game logs and play-by-play data
  • Traditional statistics and counting stats
  • Awards, transactions, and biographical information
  • Actual run prevention evaluation (RA9-WAR)
  • Similarity scores and Hall of Fame statistics

Ideal Approach: Use both sites for comprehensive analysis. Start with FanGraphs for modern metrics and projections, then check the picture against Baseball Reference's historical context and actual results. The two WAR calculations give a rough range for player value, and a large gap between fWAR and bWAR is itself worth investigating, since it usually traces back to how each site treats runs allowed or defense.
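To make the fWAR versus bWAR comparison concrete, here is a minimal Python sketch that computes the midpoint and the gap between the two figures and flags players where the sites disagree. The `WarComparison` class, the `flag_disagreements` helper, the 1.0-win threshold, and the player values are all illustrative assumptions, not FanGraphs or Baseball Reference data or APIs.

```python
from dataclasses import dataclass


@dataclass
class WarComparison:
    name: str
    fwar: float  # FanGraphs WAR (FIP-based for pitchers)
    bwar: float  # Baseball Reference WAR (RA9-based for pitchers)

    @property
    def midpoint(self) -> float:
        """Simple average of the two WAR figures."""
        return (self.fwar + self.bwar) / 2

    @property
    def gap(self) -> float:
        """Absolute disagreement between the two sites, in wins."""
        return abs(self.fwar - self.bwar)


def flag_disagreements(players, threshold=1.0):
    """Return players whose fWAR and bWAR differ by more than `threshold` wins.

    The 1.0-win default is an arbitrary assumption chosen for illustration.
    """
    return [p for p in players if p.gap > threshold]


# Hypothetical values for illustration only, not real player data
players = [
    WarComparison("Pitcher A", fwar=4.5, bwar=2.5),
    WarComparison("Pitcher B", fwar=3.0, bwar=3.5),
]

for p in flag_disagreements(players):
    print(f"{p.name}: fWAR {p.fwar:.1f}, bWAR {p.bwar:.1f}, midpoint {p.midpoint:.1f}")
```

With these made-up inputs only Pitcher A is flagged. For a pitcher, a gap like this often means ERA and FIP are telling different stories, which is the cue to look at defense, sequencing, and batted ball data.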

FanGraphs Statistics Glossary

Offensive Statistics

Stat | Full Name | Description | League Average
wOBA | Weighted On-Base Average | Overall offensive value, weighting each outcome by its run value | ~.320
wRC+ | Weighted Runs Created Plus | Park- and league-adjusted offensive value | 100
ISO | Isolated Power | Raw power measure (SLG - AVG) | ~.140
BABIP | Batting Average on Balls In Play | Batting average excluding HR and K | ~.300
BB% | Walk Percentage | Walks per plate appearance | ~8.5%
K% | Strikeout Percentage | Strikeouts per plate appearance | ~22%
O-Swing% | Outside Swing Percentage | Swings at pitches outside the zone | ~30%
Z-Swing% | Zone Swing Percentage | Swings at pitches in the zone | ~67%
Contact% | Contact Percentage | Contact made on swings | ~78%
Hard% | Hard Hit Percentage | Percentage of hard-hit balls | ~35%
Barrel% | Barrel Percentage | Optimal contact (EV + LA combination) | ~6-7%
BsR | Base Running Runs | Runs contributed by base running | 0
Off | Offensive Runs | Batting plus base running runs above average | 0
Def | Defensive Runs | Fielding runs above average plus positional adjustment | 0
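Several of these rate stats can be reproduced directly from counting stats. The sketch below uses the standard public formulas for ISO, BABIP, BB%, and K%; the function names and the sample stat line are hypothetical and chosen only for illustration.

```python
def iso(ab, h, doubles, triples, hr):
    """Isolated Power: SLG - AVG, i.e. extra bases per at-bat."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return (total_bases - h) / ab


def babip(ab, h, hr, so, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    return (h - hr) / (ab - so - hr + sf)


def rate(events, pa):
    """Share of plate appearances ending in a given event (used for BB% and K%)."""
    return events / pa


# Hypothetical season line, not a real player
line = dict(pa=560, ab=500, h=140, doubles=30, triples=2, hr=20, bb=50, so=110, sf=5)

print(f"ISO   {iso(line['ab'], line['h'], line['doubles'], line['triples'], line['hr']):.3f}")
print(f"BABIP {babip(line['ab'], line['h'], line['hr'], line['so'], line['sf']):.3f}")
print(f"BB%   {rate(line['bb'], line['pa']):.1%}")
print(f"K%    {rate(line['so'], line['pa']):.1%}")
```

wOBA and wRC+ are deliberately left out: they depend on season-specific linear weights and park factors that FanGraphs publishes, so they cannot be computed from a stat line alone.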

Pitching Statistics

Stat | Full Name | Description | League Average
FIP | Fielding Independent Pitching | ERA estimator using only K, BB, HBP, and HR | ~4.00
xFIP | Expected FIP | FIP with normalized HR/FB rate | ~4.00
SIERA | Skill-Interactive ERA | Advanced ERA estimator | ~4.00
K/9 | Strikeouts per 9 Innings | Strikeout rate | ~8.5
BB/9 | Walks per 9 Innings | Walk rate | ~3.0
K% | Strikeout Percentage | Strikeouts per batter faced | ~22%
BB% | Walk Percentage | Walks per batter faced | ~8%
K-BB% | Strikeout Minus Walk Percentage | Net K vs BB rate | ~14%
LOB% | Left On Base Percentage | Strand rate for runners | ~72%
GB% | Ground Ball Percentage | Ground balls per batted ball | ~45%
FB% | Fly Ball Percentage | Fly balls per batted ball | ~35%
HR/FB | Home Run per Fly Ball | Home runs as % of fly balls | ~10-11%
WHIP | Walks + Hits per Inning Pitched | Base runners allowed per inning | ~1.30
Soft% | Soft Contact Percentage | Weakly hit balls | ~20%
Hard% | Hard Contact Percentage | Hard-hit balls allowed | ~35%
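The simpler pitching metrics above are arithmetic on counting stats. This sketch implements the standard FIP formula along with K/9, BB/9, WHIP, and K-BB%. The default FIP constant of 3.10 is an assumption: the real constant changes each season so that league FIP matches league ERA. The function names and sample line are hypothetical.

```python
def fip(hr, bb, hbp, so, ip, constant=3.10):
    """Fielding Independent Pitching.

    (13*HR + 3*(BB + HBP) - 2*K) / IP + constant. The constant is
    season-specific; 3.10 here is a placeholder assumption.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant


def per_nine(events, ip):
    """Events per nine innings (used for K/9 and BB/9)."""
    return 9 * events / ip


def whip(bb, h, ip):
    """Walks plus hits per inning pitched."""
    return (bb + h) / ip


def k_minus_bb_pct(so, bb, tbf):
    """Strikeout rate minus walk rate, per batter faced."""
    return (so - bb) / tbf


# Hypothetical season line, not a real pitcher.
# `ip` must be decimal innings: box-score "180.1" means 180 1/3, not 180.1.
p = dict(ip=180.0, h=150, hr=20, bb=50, hbp=5, so=200, tbf=720)

print(f"FIP   {fip(p['hr'], p['bb'], p['hbp'], p['so'], p['ip']):.2f}")
print(f"K/9   {per_nine(p['so'], p['ip']):.1f}")
print(f"BB/9  {per_nine(p['bb'], p['ip']):.1f}")
print(f"WHIP  {whip(p['bb'], p['h'], p['ip']):.2f}")
print(f"K-BB% {k_minus_bb_pct(p['so'], p['bb'], p['tbf']):.1%}")
```

xFIP and SIERA are omitted because they need league HR/FB rates and regression coefficients that are not part of a single pitcher's stat line.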

Value and Contextual Statistics

Stat | Full Name | Description | Scale
WAR | Wins Above Replacement | Total player value in wins | 0 = replacement, 2 = average regular, 5 = All-Star, 8+ = MVP
WPA | Win Probability Added | Impact on game win probability | 0 = average; team totals sum to (wins - losses) / 2
RE24 | Run Expectancy based on 24 base-out states | Runs added by changing game states | 0 = average
REW | Run Expectancy Wins | RE24 converted to wins | Similar to WPA
LI | Leverage Index | Game situation importance | 1.0 = average, 2.0 = 2x pressure
Clutch | Clutch Score | Performance in high-leverage situations relative to the player's own context-neutral performance | 0 = neutral, positive = clutch
Dollars | Dollar Value | Estimated market value in $ | Based on $/WAR conversion
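As a rough illustration of the WAR scale and the Dollars conversion, the sketch below maps a WAR figure onto the tiers from the table and multiplies it by a dollars-per-win rate. The tier labels between the anchor points and the 8.0 million dollars per WAR default are assumptions for illustration; FanGraphs' own Dollars figure uses its own season-specific rate.

```python
def war_tier(war):
    """Map WAR onto the rough scale from the table above."""
    if war >= 8:
        return "MVP-caliber"
    if war >= 5:
        return "All-Star"
    if war >= 2:
        return "average regular or better"
    if war >= 0:
        return "above replacement, below average"
    return "below replacement"


def dollar_value(war, millions_per_war=8.0):
    """Estimated market value in millions of dollars.

    The 8.0 default is a placeholder assumption, not FanGraphs' actual rate.
    """
    return war * millions_per_war


for war in (0.5, 2.4, 5.5, 8.3):
    print(f"{war:.1f} WAR: {war_tier(war)}, ~${dollar_value(war):.1f}M")
```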

Best Practices and Tips

Effective FanGraphs Usage

  • Use Custom Date Ranges: Analyze recent performance (last 30 days, second half) to identify trends and changes in approach or skill level.
  • Compare Multiple Metrics: Don't rely on single statistics. Cross-reference wOBA with wRC+, ISO, and plate discipline metrics for a complete picture.
  • Check Sample Sizes: Small samples create noise. Require 200+ PA for batting, 50+ IP for pitching before drawing conclusions.
  • Use FIP Family for Pitchers: FIP, xFIP, and SIERA provide better predictive power than ERA for evaluating pitching talent.
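  • Investigate Discrepancies: A large gap between ERA and FIP suggests ERA is likely to regress toward FIP. For a pitcher, a high BABIP allowed alongside a low Hard% points to bad luck; for a hitter, the same combination points to good luck that may not last.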
  • Respect Projection Ranges: Point estimates are less useful than understanding uncertainty ranges around projections.
  • Export for Analysis: Download CSV files for statistical modeling, visualization, and custom analysis in Python/R.
  • Read the Glossary: FanGraphs provides detailed explanations of every metric, so make sure you understand what you're measuring.
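The sample-size and ERA versus FIP checks above can be combined into one filter. In this sketch the 50 IP floor comes from the guideline above, while the 0.75-run gap threshold, the record layout (`Name`, `IP`, `ERA`, `FIP` keys), and the pitcher rows are assumptions made up for illustration.

```python
MIN_IP = 50  # sample-size floor for pitchers, per the guideline above


def era_fip_gaps(pitchers, min_ip=MIN_IP, min_gap=0.75):
    """Flag pitchers with enough innings whose ERA and FIP disagree.

    `min_gap` (in runs per nine) is an arbitrary illustrative threshold.
    Returns (name, ERA - FIP, direction) tuples, largest gap first.
    """
    flagged = []
    for p in pitchers:
        if p["IP"] < min_ip:
            continue  # too few innings to draw conclusions
        gap = p["ERA"] - p["FIP"]
        if abs(gap) >= min_gap:
            direction = "ERA likely to fall" if gap > 0 else "ERA likely to rise"
            flagged.append((p["Name"], round(gap, 2), direction))
    return sorted(flagged, key=lambda row: abs(row[1]), reverse=True)


# Hypothetical rows, shaped like an exported leaderboard
pitchers = [
    {"Name": "Pitcher A", "IP": 120, "ERA": 4.80, "FIP": 3.60},
    {"Name": "Pitcher B", "IP": 30, "ERA": 1.50, "FIP": 4.00},   # below IP floor
    {"Name": "Pitcher C", "IP": 150, "ERA": 3.40, "FIP": 3.55},  # gap too small
    {"Name": "Pitcher D", "IP": 90, "ERA": 2.50, "FIP": 4.10},
]

result = era_fip_gaps(pitchers)
for row in result:
    print(row)
```

Pitcher B has the largest gap in the sample data but is excluded by the innings floor, which is exactly the kind of small-sample noise the filter is meant to screen out.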

Common Pitfalls to Avoid

  • Overvaluing Wins and RBI: These heavily context-dependent stats poorly measure individual value.
  • Ignoring Defense: Defensive value can be substantial. At roughly 10 runs per win, a +15 run defender adds about 1.5 WAR.
  • Treating WAR as Exact: WAR is an estimate with uncertainty. A 4.2 WAR player isn't meaningfully better than 3.9.
  • Cherry-Picking Metrics: Don't select the single stat that supports your narrative. Evaluate the full set of relevant metrics.
  • Misunderstanding FIP: FIP predicts future ERA but isn't necessarily "better" than actual ERA for past evaluation.
  • Ignoring Batted Ball Data: Exit velocity, launch angle, and barrel rate reveal skills traditional stats miss.
  • Forgetting Context: Park effects, league differences, and era matter. Use wRC+ for hitters and ERA- (FanGraphs) or ERA+ (Baseball Reference) for pitchers to make fair comparisons.
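Two of the pitfalls above reduce to simple arithmetic: converting runs to wins, and treating small WAR differences as noise. The sketch below assumes the common rule of thumb of about 10 runs per win and a 0.5-win tolerance. Both numbers are illustrative assumptions, not published FanGraphs constants, and the function names are hypothetical.

```python
RUNS_PER_WIN = 10.0  # rule-of-thumb assumption; the actual value varies by season


def runs_to_wins(runs, runs_per_win=RUNS_PER_WIN):
    """Convert runs above average into wins."""
    return runs / runs_per_win


def meaningfully_different(war_a, war_b, tolerance=0.5):
    """Treat WAR gaps at or below `tolerance` as within estimation noise.

    The 0.5-win tolerance is an assumption for illustration only.
    """
    return abs(war_a - war_b) > tolerance


print(runs_to_wins(15))                   # a +15 run defender in wins
print(meaningfully_different(4.2, 3.9))   # the 4.2 vs 3.9 example above
print(meaningfully_different(6.0, 3.9))
```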

Key Takeaways

  • FanGraphs is one of the premier sources for modern baseball analytics, offering comprehensive statistics, advanced metrics, projection systems, and data export capabilities.
  • Understanding core FanGraphs metrics like wOBA, wRC+, FIP, and WAR is essential for modern player evaluation and analysis.
  • The difference between fWAR (FIP-based) and bWAR (RA9-based) reflects different philosophies about pitcher evaluation, and both provide value.
  • FanGraphs excels at plate discipline data, batted ball profiles, and predictive metrics that reveal underlying skills beyond results.
  • Multiple projection systems (ZiPS, Steamer, ATC, THE BAT) provide different perspectives on future performance, and averaging across systems is often more reliable than trusting any single one.
  • PyBaseball (Python) and baseballr (R) enable programmatic access to FanGraphs data for statistical analysis and modeling.
  • FanGraphs and Baseball Reference are complementary: use FanGraphs for predictive analysis and modern metrics, and Baseball Reference for historical context.
  • Effective FanGraphs usage requires understanding sample size, metric limitations, and using multiple statistics for comprehensive evaluation.
  • Data export capabilities make FanGraphs invaluable for research, creating a bridge between web-based exploration and advanced statistical analysis.
  • Regular engagement with FanGraphs articles and glossary entries deepens understanding of sabermetric principles and analytical best practices.
