Accessing FanGraphs Data

Beginner 10 min read Nov 26, 2025

FanGraphs Data Access and Analytics Guide

FanGraphs has established itself as one of the premier baseball analytics platforms, providing comprehensive statistics, advanced metrics, and data analysis tools. The site offers everything from traditional counting statistics to cutting-edge sabermetric measures, tracking data, and projection systems. Understanding how to access, interpret, and analyze FanGraphs data is essential for modern baseball analysis, player evaluation, and research.

This comprehensive guide covers FanGraphs metrics, data access methods, programming interfaces, and practical applications. Whether you're a casual fan looking to understand advanced statistics or a serious analyst building predictive models, this tutorial provides the knowledge and tools needed to leverage FanGraphs effectively.

What is FanGraphs?

FanGraphs (www.fangraphs.com) is a baseball statistics and analytics website founded in 2005 that has become an indispensable resource for analysts, writers, front office personnel, and fans. The platform distinguishes itself through several key features:

Comprehensive Statistics: FanGraphs hosts complete historical data dating back to baseball's early eras, including traditional stats, advanced metrics, and modern tracking data from Statcast.
Advanced Sabermetrics: The site calculates and displays sophisticated metrics like wOBA (weighted On-Base Average), wRC+ (weighted Runs Created Plus), FIP (Fielding Independent Pitching), and their proprietary calculation of WAR (Wins Above Replacement).
Projection Systems: FanGraphs provides access to multiple projection systems including ZiPS, Steamer, ATC (Average Total Cost), and THE BAT, enabling users to forecast future player performance.
Research Tools: The platform offers leaderboards with customizable date ranges, splits analysis showing performance in different situations, game logs, and advanced filtering options.
Articles and Analysis: A team of excellent writers produces daily analytical content explaining metrics, evaluating players and teams, and advancing sabermetric research.

FanGraphs differs from Baseball Reference (its main competitor) in several important ways. While both sites offer comprehensive statistics, FanGraphs emphasizes modern sabermetrics and predictive metrics, provides more granular plate discipline and batted ball data, and uses different methodologies for calculating WAR. Baseball Reference focuses more on historical context and traditional statistics, making the two sites complementary resources.

Understanding FanGraphs Metrics

FanGraphs has developed and popularized numerous advanced metrics that provide deeper insights into player performance than traditional statistics. These metrics attempt to isolate skill, remove contextual factors, and better predict future outcomes.

Key Offensive Metrics

wOBA (weighted On-Base Average)

wOBA = (0.69×uBB + 0.72×HBP + 0.88×1B + 1.24×2B + 1.56×3B + 1.95×HR) / (AB + BB - IBB + SF + HBP)

wOBA measures overall offensive value by weighting each offensive outcome by its run value. Here uBB means unintentional walks (BB - IBB), and the weights shown are approximate: FanGraphs recalculates them every season. Unlike OPS, which simply adds OBP and SLG (giving OBP too little weight and SLG too much), wOBA weights each outcome according to its actual run-scoring impact. League average wOBA is scaled to match league average OBP (around .320), making it intuitive to interpret.
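As a rough illustration of the formula above, the following sketch computes wOBA from counting stats. It assumes the approximate weights quoted in the formula and unintentional walks (BB - IBB) in the numerator; the function name and the stat line are hypothetical, and real season weights will differ slightly.

```python
def woba(bb, ibb, hbp, singles, doubles, triples, hr, ab, sf):
    """Return wOBA from counting stats using approximate linear weights."""
    ubb = bb - ibb  # unintentional walks
    numerator = (
        0.69 * ubb + 0.72 * hbp + 0.88 * singles
        + 1.24 * doubles + 1.56 * triples + 1.95 * hr
    )
    denominator = ab + bb - ibb + sf + hbp
    if denominator == 0:
        return float("nan")  # no qualifying plate appearances
    return numerator / denominator


# Hypothetical stat line, not a real player
example_woba = woba(bb=60, ibb=5, hbp=5, singles=100, doubles=30,
                    triples=3, hr=25, ab=550, sf=5)
print(f"wOBA: {example_woba:.3f}")
```

For exact values, the season-specific weights would need to be substituted for the constants used here.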

wRC+ (weighted Runs Created Plus)

wRC+ = (((wRAA/PA + League R/PA) + (League R/PA - Park Factor × League R/PA)) / League wRC/PA) × 100

wRC+ quantifies a player's total offensive value in a single number adjusted for park effects and league context. In the formula, the park factor is expressed as a ratio (1.00 is neutral), and the denominator is the wRC per plate appearance of the player's league, excluding pitchers. A wRC+ of 100 represents league average, with each point above or below representing one percent better or worse than average. A player with a 150 wRC+ has been 50% better than league average, while an 80 wRC+ indicates 20% below average. This park- and league-adjusted approach makes wRC+ well suited to comparing players across different eras and ballparks.
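To make the park and league adjustment concrete, here is a minimal sketch of the commonly published form of the calculation: wRAA per plate appearance plus league runs per plate appearance, plus a park correction, divided by league wRC per plate appearance. All input values are hypothetical, and the park factor is assumed to be a ratio where 1.00 is neutral.

```python
def wrc_plus(wraa, pa, lg_r_per_pa, park_factor, lg_wrc_per_pa):
    """Sketch of wRC+ from wRAA, league run rates, and a park factor."""
    runs_per_pa = wraa / pa + lg_r_per_pa
    # Hitter-friendly parks (factor > 1) produce a negative correction
    park_adjustment = lg_r_per_pa - park_factor * lg_r_per_pa
    return 100 * (runs_per_pa + park_adjustment) / lg_wrc_per_pa


# Hypothetical inputs: the same hitter in a neutral and a hitter-friendly park
neutral = wrc_plus(wraa=30, pa=600, lg_r_per_pa=0.12,
                   park_factor=1.00, lg_wrc_per_pa=0.12)
friendly = wrc_plus(wraa=30, pa=600, lg_r_per_pa=0.12,
                    park_factor=1.05, lg_wrc_per_pa=0.12)
print(f"neutral park: {neutral:.1f}, hitter-friendly park: {friendly:.1f}")
```

The same raw production earns a lower wRC+ in the hitter-friendly park, which is the point of the adjustment.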

ISO (Isolated Power) measures raw power by subtracting batting average from slugging percentage (ISO = SLG - AVG). This isolates extra-base hit power from overall batting average. League average ISO is typically around .140, with .200+ representing elite power.

BABIP (Batting Average on Balls In Play) measures how often balls put in play fall for hits, excluding home runs and strikeouts: BABIP = (H - HR) / (AB - K - HR + SF). League average is consistently around .300, and significant deviations often indicate luck or unsustainable performance. However, some hitters sustain higher BABIPs (.330+) through skill, typically a combination of speed and consistently hard, line-drive contact.
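The two definitions above reduce to simple arithmetic. The sketch below uses the standard BABIP formula, (H - HR) / (AB - K - HR + SF), with a hypothetical stat line.

```python
def iso(slg, avg):
    """Isolated power: slugging percentage minus batting average."""
    return slg - avg


def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        return float("nan")
    return (h - hr) / balls_in_play


# Hypothetical stat line
print(f"ISO:   {iso(slg=0.480, avg=0.273):.3f}")
print(f"BABIP: {babip(h=150, hr=25, ab=550, k=120, sf=5):.3f}")
```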

Key Pitching Metrics

FIP (Fielding Independent Pitching)

FIP = ((13×HR + 3×(BB + HBP) - 2×K) / IP) + constant

FIP estimates what a pitcher's ERA should have been based solely on outcomes the pitcher largely controls: strikeouts, walks, hit batters, and home runs. By excluding balls in play, FIP removes most of the influence of defense and batted-ball luck. The constant (typically around 3.10, recalculated each season) calibrates FIP to the league ERA scale. FIP generally predicts future ERA better than current ERA does, making it valuable for player evaluation and transactions.
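A short sketch of the FIP arithmetic follows, using the standard form that groups hit batters with walks. The constant of 3.10 is only the typical value mentioned above, and the helper for innings is an assumption about input format: it converts baseball notation such as 180.1 (180 and one-third innings) into a true decimal.

```python
def innings_to_decimal(ip):
    """Convert baseball innings notation (e.g. 180.1 = 180 1/3) to a decimal."""
    whole, _, frac = str(ip).partition(".")
    outs = int(frac) if frac else 0
    if outs not in (0, 1, 2):
        raise ValueError(f"unexpected innings notation: {ip}")
    return int(whole) + outs / 3


def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching with an assumed league constant."""
    innings = innings_to_decimal(ip)
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


# Hypothetical pitcher season
example_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=180.0)
print(f"FIP: {example_fip:.2f}")
```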

xFIP (Expected FIP)

xFIP = ((13×(Fly Balls × League HR/FB Rate) + 3×(BB + HBP) - 2×K) / IP) + constant

xFIP adjusts FIP by normalizing home run rates to league average based on fly ball rate. Since HR/FB rates can fluctuate significantly due to luck and ballpark effects, xFIP provides an even more stable predictor of future performance than FIP. If a pitcher's ERA is much better than their xFIP, regression is likely.
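The normalization described above can be sketched by replacing actual home runs with expected home runs, that is, fly balls allowed multiplied by the league HR/FB rate. The league rate and constant below are illustrative assumptions, not values for any specific season.

```python
def xfip(fly_balls, lg_hr_per_fb, bb, hbp, k, ip, constant=3.10):
    """Expected FIP: FIP with home runs replaced by fly balls x league HR/FB."""
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical pitcher who allowed 200 fly balls but only 12 home runs
actual_hr = 12
fip_value = (13 * actual_hr + 3 * (50 + 5) - 2 * 200) / 180 + 3.10
xfip_value = xfip(fly_balls=200, lg_hr_per_fb=0.10, bb=50, hbp=5, k=200, ip=180)
print(f"FIP: {fip_value:.2f}  xFIP: {xfip_value:.2f}")
```

Here xFIP comes out higher than FIP because the pitcher allowed fewer home runs than a league-average HR/FB rate would imply.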

SIERA (Skill-Interactive ERA) is a more sophisticated metric that accounts for batted ball types, strikeouts, walks, and their interactions. Unlike FIP, SIERA recognizes that the value of a strikeout, walk, or ground ball depends on the rest of the pitcher's profile - for example, high-strikeout pitchers tend to allow weaker contact, and extra ground balls help more when a pitcher puts many runners on base.

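K% and BB% (strikeout and walk rates as percentages of total batters faced) provide clean measures of a pitcher's ability to miss bats and command the strike zone. These rates are more stable and predictive than raw K/9 and BB/9, which use innings as the denominator: a pitcher who allows more baserunners faces more batters per inning, so K/9 and BB/9 can be inflated by hits allowed, team defense, and sequencing luck.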
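The difference between the two denominators is easiest to see with numbers. In this sketch, two hypothetical pitchers have identical strikeouts and innings, so their K/9 is the same, but one faced more batters to record those outs.

```python
def k_per_9(k, ip):
    """Strikeouts per nine innings."""
    return 9 * k / ip


def k_pct(k, tbf):
    """Strikeouts as a share of total batters faced."""
    return k / tbf


# Hypothetical pitchers: same K and IP, different batters faced
pitchers = {
    "efficient": {"k": 200, "ip": 180, "tbf": 720},
    "hittable": {"k": 200, "ip": 180, "tbf": 780},
}
for name, p in pitchers.items():
    print(f"{name}: K/9 = {k_per_9(p['k'], p['ip']):.1f}, "
          f"K% = {k_pct(p['k'], p['tbf']):.1%}")
```

K/9 rates both pitchers at 10.0, while K% separates them because the second pitcher needed 60 more batters to get the same number of outs.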

Wins Above Replacement (fWAR)

FanGraphs calculates WAR using their own methodology (abbreviated as fWAR to distinguish from Baseball Reference's bWAR). WAR attempts to summarize a player's total contribution in a single number representing wins contributed above a replacement-level player.

For position players, fWAR = (Batting Runs + Base Running Runs + Fielding Runs + Positional Adjustment + League Adjustment + Replacement Runs) / Runs Per Win
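The position-player formula is a sum of run components divided by a runs-to-wins conversion. The sketch below uses hypothetical run values and assumes roughly 10 runs per win, which is a common rule of thumb rather than the exact season-specific figure.

```python
def position_player_war(batting, baserunning, fielding, positional,
                        league, replacement, runs_per_win=10.0):
    """Sum the run components and convert runs to wins."""
    total_runs = (batting + baserunning + fielding
                  + positional + league + replacement)
    return total_runs / runs_per_win


# Hypothetical full-season run values
war = position_player_war(batting=25.0, baserunning=2.0, fielding=5.0,
                          positional=-2.5, league=2.0, replacement=20.0)
print(f"WAR: {war:.1f}")
```

Note that the replacement component is positive: it credits the player for the playing time a replacement-level player would have filled at a lower level of production.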

For pitchers, fWAR uses FIP rather than ERA as the foundation, making it defense-independent. Key differences between fWAR and bWAR include:

Pitching Basis: fWAR uses FIP; bWAR uses RA9 (runs allowed per nine innings)
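Defensive Metrics: fWAR has historically used UZR (Ultimate Zone Rating) and, for recent seasons, Statcast-based fielding values; bWAR uses DRS (Defensive Runs Saved), with Total Zone for seasons before DRS is available
Replacement Level: Since 2013 both sites have used a common replacement level (a .294 winning percentage, about 1,000 WAR per MLB season), though they split that total between position players and pitchers slightly differently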

Neither version is definitively "correct" - they represent different philosophical approaches. fWAR's use of FIP makes it more predictive, while bWAR's use of actual runs allowed better reflects what actually happened.

FanGraphs Leaderboards and Export Options

The FanGraphs leaderboards are the primary interface for accessing statistical data. They offer extensive customization and filtering options that make them powerful research tools.

Using Leaderboards Effectively

To access leaderboards, navigate to the "Leaders" tab and select either "Major League" or "Minor League" statistics. The interface provides numerous options:

Date Range: Analyze any time period from single games to entire careers. Custom date ranges enable studying hot/cold streaks, second-half performance, or specific eras.
Split Categories: View performance splits by handedness (vs LHP/RHP), home/road, day/night, month, count, men on base, and many other situations.
Stat Types: Choose from standard batting, advanced batting, batted ball, plate discipline, pitch type, or value metrics. Multiple tabs organize the dozens of available statistics.
Minimum Plate Appearances/Innings: Filter for qualified players or customize thresholds to include/exclude specific players.
Teams and Positions: Filter by team, position, or league to narrow results.

The "Dashboard" view presents the most important metrics on a single screen, making it ideal for quick player comparisons. The "Standard" view shows traditional counting stats, while "Advanced" displays sabermetric measures like wRC+, wOBA, and WAR.

Exporting Data

FanGraphs makes data export simple and flexible. At the bottom of every leaderboard, you'll find an "Export Data" button. This generates a CSV file containing all displayed statistics for the filtered players. The export includes:

All visible columns from your selected view (Standard, Advanced, Batted Ball, etc.)
Player names, team affiliations, and identifiers
Statistics formatted as clean numerical values suitable for analysis

Best practices for exporting FanGraphs data:

Select the appropriate stat view before exporting - you can only export visible columns
Adjust player minimums to avoid cluttering datasets with small sample sizes
Use custom date ranges to isolate specific time periods of interest
Export to CSV for easy import into Excel, R, Python, or SQL databases
Maintain consistent naming conventions when saving exported files for analysis pipelines
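Once a leaderboard is exported, loading it usually takes only a few lines. The sketch below reads a small in-memory sample that stands in for an export file; the column names and the percent-sign formatting are assumptions about what an export might contain, so they should be checked against the actual file.

```python
import io

import pandas as pd

# Stand-in for an exported leaderboard file (hypothetical players and columns)
SAMPLE_EXPORT = """Name,Team,PA,BB%,K%,wRC+,WAR
Player A,AAA,600,12.5%,20.1%,145,5.2
Player B,BBB,450,8.0%,25.3%,98,1.4
Player C,CCC,120,5.5%,30.0%,70,-0.2
"""


def load_leaderboard(source, min_pa=0):
    """Read an exported leaderboard and convert percent strings to fractions."""
    df = pd.read_csv(source)
    for col in df.columns:
        # Only touch percent columns that were read in as text
        if col.endswith("%") and not pd.api.types.is_numeric_dtype(df[col]):
            cleaned = df[col].str.replace("%", "", regex=False).str.strip()
            df[col] = cleaned.astype(float) / 100
    return df[df["PA"] >= min_pa].reset_index(drop=True)


leaders = load_leaderboard(io.StringIO(SAMPLE_EXPORT), min_pa=200)
print(leaders)
```

Passing a file path instead of the `io.StringIO` object works the same way, since `pd.read_csv` accepts either.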

WAR Calculation Differences: fWAR vs bWAR

Understanding the methodological differences between FanGraphs WAR (fWAR) and Baseball Reference WAR (bWAR) is crucial for proper interpretation and analysis. Both attempt to measure total player value, but they make different assumptions and use different components.

Position Player WAR Differences

Component	fWAR (FanGraphs)	bWAR (Baseball Reference)
Offensive Runs	wRAA (weighted Runs Above Average)	Batting Runs (similar methodology)
Base Running	Base running runs (BsR)	Base running runs (similar)
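Defense	UZR (Ultimate Zone Rating) historically; Statcast-based fielding values for recent seasons	DRS (Defensive Runs Saved); TZR (Total Zone) for earlier seasons
Positional Adjustment	Run values by position based on defensive difficulty	Similar positional adjustments
Replacement Level	.294 winning percentage (about 1,000 WAR per MLB season), shared since 2013	Same .294 baseline; the split between hitters and pitchers differs slightly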

Pitcher WAR Differences

The most significant difference lies in pitching evaluation:

fWAR uses FIP as its foundation, focusing on strikeouts, walks, and home runs. This approach credits pitchers only for outcomes they directly control, removing defense and luck. fWAR better predicts future performance and isolates pitcher skill.
bWAR uses RA9-WAR based on actual runs allowed (both earned and unearned). This approach credits pitchers for their actual results, including ability to prevent hits on balls in play and strand runners. bWAR better captures what actually happened in the past.

Practical implications of these differences:

Pitchers with excellent defense behind them tend to have higher bWAR than fWAR whenever bWAR's team-defense adjustment does not fully offset that support
Pitchers who prevent hits on balls in play through skill will have higher bWAR
Pitchers with an unsustainably low BABIP will see bWAR overstate their value
For projection and player acquisition, fWAR typically provides better guidance
For historical assessment of what actually occurred, bWAR may be preferred

When comparing players, it's valuable to examine both fWAR and bWAR. Large discrepancies often reveal interesting insights about defense, luck, or specific skills. Neither metric is perfect - both provide useful perspectives on player value.

FanGraphs-Specific Advanced Metrics

Plate Discipline Metrics

FanGraphs provides granular plate discipline statistics that illuminate a hitter's approach and ability to identify pitches:

O-Swing%: Percentage of pitches outside the strike zone at which a batter swings. League average is ~30%; elite plate discipline shows O-Swing% under 25%.
Z-Swing%: Percentage of pitches in the strike zone at which a batter swings. League average is ~67%; aggressive hitters exceed 70%.
Swing%: Overall swing percentage. Typically around 45-48% league-wide.
O-Contact%: Percentage of swings on pitches outside the zone that result in contact. Measures ability to make contact on bad pitches. League average ~60%.
Z-Contact%: Percentage of swings in the zone resulting in contact. Elite contact hitters exceed 90%.
Contact%: Overall contact rate on swings. League average ~78%; elite contact skills show 85%+.
Zone%: Percentage of pitches seen in the strike zone. Reveals how pitchers approach the hitter. Star hitters often see Zone% below 45% as pitchers work around them.
F-Strike%: Percentage of plate appearances beginning with a first-pitch strike. Measures how often a batter falls behind in the count.
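The overall rates in this list are weighted combinations of the zone-specific rates, which is a useful consistency check when working with exported data. The sketch below assumes a Zone% of 42%, a Z-Contact% of 87%, and the other league-average figures quoted above; all are illustrative inputs expressed as fractions.

```python
def overall_swing_pct(zone_pct, z_swing_pct, o_swing_pct):
    """Swing% as a zone-weighted mix of Z-Swing% and O-Swing%."""
    return zone_pct * z_swing_pct + (1 - zone_pct) * o_swing_pct


def overall_contact_pct(zone_pct, z_swing_pct, o_swing_pct,
                        z_contact_pct, o_contact_pct):
    """Contact% as a swing-weighted mix of Z-Contact% and O-Contact%."""
    zone_swings = zone_pct * z_swing_pct
    out_swings = (1 - zone_pct) * o_swing_pct
    total_swings = zone_swings + out_swings
    if total_swings == 0:
        return float("nan")
    return (zone_swings * z_contact_pct
            + out_swings * o_contact_pct) / total_swings


swing = overall_swing_pct(zone_pct=0.42, z_swing_pct=0.67, o_swing_pct=0.30)
contact = overall_contact_pct(0.42, 0.67, 0.30,
                              z_contact_pct=0.87, o_contact_pct=0.60)
print(f"Swing%: {swing:.1%}  Contact%: {contact:.1%}")
```

With these inputs, the derived rates land close to the league-wide Swing% and Contact% figures given in the list.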

Batted Ball Metrics

FanGraphs tracks batted ball types to provide insight into hitting approach and power potential:

GB% / LD% / FB%: Ground ball, line drive, and fly ball percentages. League average is roughly 45% GB, 20% LD, 35% FB. Extreme ground ball pitchers exceed 50% GB rate; fly ball power hitters show FB% over 40%.
HR/FB: Home run to fly ball ratio. League average fluctuates around 10-11%. Elite power hitters sustain 15-20% HR/FB rates. Significant year-to-year variance is common, making it useful for identifying regression candidates.
Pull% / Cent% / Oppo%: Percentages of batted balls hit to pull field, center, and opposite field. Balanced hitters show roughly even distribution; extreme pull hitters exceed 50% pull rate but may be vulnerable to defensive shifts.
Soft% / Med% / Hard%: Percentage of batted balls hit softly, medium, or hard. These classifications come from a data provider's algorithm based on factors such as hang time, landing location, and trajectory, rather than from measured exit velocity. Hard% above 40% indicates strong contact ability; Soft% above 20% suggests weak contact.

Statcast Integration

FanGraphs incorporates Statcast metrics, providing objective measurements of batted ball and pitch characteristics:

Exit Velocity: Average and maximum speed of batted balls off the bat. League average is roughly 88-89 mph; strong hitters average 90+ mph, and only a handful of hitters average 95+ mph, which is also the threshold for an individual hard-hit ball.
Launch Angle: Average angle of batted balls. Optimal launch angle for power is 15-25 degrees; line drive hitters target 10-15 degrees.
Barrel%: Percentage of batted balls classified as "barrels" (optimal combination of exit velocity and launch angle). Barrel% above 10% indicates elite contact quality; league average is ~6-7%.
HardHit%: Percentage of batted balls with 95+ mph exit velocity. Strong correlation with offensive production; 45%+ represents elite contact.
EV95%: The label used in some Statcast tables for the share of batted balls hit at 95+ mph, which is the same quantity as HardHit%.
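Several of these measures can be reproduced directly from a list of exit velocities. The sketch below computes average exit velocity, maximum exit velocity, and the share of batted balls at or above the 95 mph hard-hit threshold from hypothetical readings; Barrel% is left out because it also depends on launch angle.

```python
def contact_quality(exit_velocities, hard_hit_threshold=95.0):
    """Summarize a list of exit velocities (mph) for one hitter."""
    if not exit_velocities:
        raise ValueError("need at least one batted ball")
    n = len(exit_velocities)
    hard_hit = sum(1 for ev in exit_velocities if ev >= hard_hit_threshold)
    return {
        "avg_ev": sum(exit_velocities) / n,
        "max_ev": max(exit_velocities),
        "hard_hit_pct": hard_hit / n,
    }


# Hypothetical batted balls
summary = contact_quality([101.2, 88.4, 95.0, 76.3, 108.9, 92.1, 99.5, 83.0])
print(summary)
```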

Splits and Game Logs

FanGraphs provides comprehensive splits data showing player performance in various contexts. This granular data reveals platoon splits, situational performance, and home/road differences.

Available Split Categories

Platoon Splits: Performance vs LHP and RHP. Reveals vulnerability to same-handed pitching and potential platoon situations. Significant platoon splits exceed 50 points of wRC+ difference.
Home/Road: Identifies park effects and travel impacts. Large home/road disparities may indicate park-dependent skills or comfort levels.
Month: Reveals seasonal patterns, hot/cold streaks, and adjustment periods. Useful for identifying players who start slow or fade in summer heat.
Count: Performance in different count situations (ahead, behind, even). All hitters perform worse when behind in the count, so the useful comparison is how much a hitter loses relative to league norms; free swingers who fall behind often tend to be hurt the most.
Men On/Bases Empty: Clutch performance indicators. Large disparities may indicate pressing or rising to the occasion, though these effects are often overstated.
High/Medium/Low Leverage: Performance in different game situations by leverage index. True clutch performers are rare; most variance is noise.
Inning: Performance by inning, useful for identifying fatigue patterns in pitchers or late-game tendencies.
Day/Night: Some players perform significantly better under lights or in day games, potentially due to vision or circadian rhythm factors.

Game Logs

Game logs provide game-by-game performance for every game a player appeared in. Each log entry includes:

Date, opponent, and location
Traditional statistics (AB, H, R, RBI, BB, K, etc.)
Advanced metrics calculated for that specific game
Pitch counts and usage patterns for pitchers
Win probability added (WPA) showing impact on game outcome

Game logs are invaluable for:

Identifying streaks and slumps
Analyzing performance against specific opponents or pitchers
Tracking workload and fatigue indicators
Building time-series models for prediction
Verifying consistency versus volatility
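Streaks and slumps are usually identified with rolling windows over a game log. The following sketch uses a small hypothetical log with `Date`, `AB`, and `H` columns (assumed names) and computes batting average over the trailing three games by summing hits and at-bats rather than averaging daily averages.

```python
import pandas as pd

# Hypothetical six-game log
game_log = pd.DataFrame({
    "Date": pd.to_datetime(["2023-04-01", "2023-04-02", "2023-04-03",
                            "2023-04-04", "2023-04-05", "2023-04-06"]),
    "AB": [4, 3, 5, 4, 4, 3],
    "H": [1, 0, 3, 2, 0, 1],
})


def add_rolling_avg(log, window=3):
    """Add a trailing batting average over the last `window` games."""
    log = log.sort_values("Date").reset_index(drop=True)
    totals = log[["H", "AB"]].rolling(window, min_periods=window).sum()
    log[f"AVG_{window}g"] = totals["H"] / totals["AB"]
    return log


rolling_log = add_rolling_avg(game_log, window=3)
print(rolling_log)
```

The first two rows are left empty because a full three-game window is not yet available.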

Projection Systems on FanGraphs

FanGraphs hosts multiple projection systems that forecast future player performance. These systems use different methodologies but all attempt to predict next season's statistics based on historical performance, aging curves, and regression to the mean.

ZiPS (sZymborski Projection System)

Developed by Dan Szymborski, ZiPS uses a nearest-neighbor approach that identifies similar players from baseball history and projects based on their aging patterns. ZiPS strengths include:

Deep historical database spanning decades of player comparisons
Sophisticated aging curves customized by position and skill profile
Automatic regression to league mean based on sample size and reliability
Regular in-season updates as new performance data emerges

Steamer

Created by Jared Cross, Dash Davidson, and Peter Rosenbloom, Steamer emphasizes recent performance: it weights past seasons and regresses them toward league average, with the weights and regression amounts fitted from historical data. Steamer characteristics:

Heavy weighting of recent three years of performance
Component-based approach projects underlying skills rather than outcomes
Playing time estimates based on depth charts and projected roles
Generally conservative, often underestimating breakouts but avoiding overconfidence in unsustainable performance

ATC (Average Total Cost)

ATC aggregates multiple projection systems by taking weighted averages, creating an ensemble forecast. The methodology:

Combines ZiPS, Steamer, and THE BAT projections
Weights systems based on historical accuracy
Provides range estimates showing uncertainty
Often most accurate due to diversification across methodologies
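The ensemble idea can be sketched as a weighted average of the individual systems' forecasts. The weights below are purely hypothetical placeholders, not ATC's actual weights, and the home run projections are made-up numbers for a single imaginary player.

```python
def blend_projections(projections, weights):
    """Weighted average of one stat across projection systems."""
    total_weight = sum(weights[system] for system in projections)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(value * weights[system]
                       for system, value in projections.items())
    return weighted_sum / total_weight


# Hypothetical home run projections and weights
hr_projections = {"ZiPS": 28, "Steamer": 25, "THE BAT": 30}
system_weights = {"ZiPS": 0.40, "Steamer": 0.35, "THE BAT": 0.25}

blended_hr = blend_projections(hr_projections, system_weights)
print(f"Blended HR projection: {blended_hr:.1f}")
```

Dividing by the total weight means the weights do not have to sum to one, which is convenient when a system has no projection for a given player and is left out of the dictionary.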

THE BAT

A newer projection system developed by Derek Carty that emphasizes granular skill components, with a hitter variant (THE BAT X) that adds Statcast data:

Incorporates exit velocity, launch angle, and swing metrics
Uses machine learning to identify predictive patterns
Accounts for player development and skill changes
Generally more aggressive in projecting young player development

Using Projections Effectively

Best practices for working with projections:

No projection system is perfect - use multiple systems and compare
Projections work best for established players with 3+ years of data
Young players and rookies have wide uncertainty ranges
Consider projection ranges and confidence intervals, not just point estimates
Update projections as season progresses - early season projections are less accurate
Combine projections with scouting reports for complete player evaluation
Use projections for relative comparisons rather than precise predictions

API and Data Access

While FanGraphs doesn't provide an official public API, several methods exist for programmatic data access. The most reliable approach is using the PyBaseball library in Python or baseballr in R, which scrape FanGraphs pages and return clean DataFrames.

Python Data Access with PyBaseball


import pandas as pd
from pybaseball import batting_stats, pitching_stats
from pybaseball import playerid_lookup, statcast_batter

# Enable caching to reduce load on FanGraphs servers
from pybaseball import cache
cache.enable()

# Fetch FanGraphs batting leaderboard data
def get_batting_leaders(start_season=2023, end_season=2023, min_pa=100):
    """
    Get batting statistics from FanGraphs.

    Parameters:
    -----------
    start_season : int
        First season to include
    end_season : int
        Last season to include
    min_pa : int
        Minimum plate appearances to include

    Returns:
    --------
    DataFrame with batting statistics
    """
    # Using batting_stats function
    data = batting_stats(start_season, end_season, qual=min_pa)

    # Select key columns
    columns = [
        'Name', 'Team', 'Age', 'G', 'PA', 'AB', 'H', 'HR', 'R', 'RBI', 'SB',
        'BB%', 'K%', 'ISO', 'BABIP', 'AVG', 'OBP', 'SLG', 'wOBA', 'wRC+',
        'BsR', 'Off', 'Def', 'WAR'
    ]

    result = data[columns].copy()

    return result

# Example: Get 2023 batting leaders
batting_2023 = get_batting_leaders(2023, 2023, min_pa=400)

# Sort by wRC+ and display top performers
top_hitters = batting_2023.nlargest(10, 'wRC+')
print("Top 10 Hitters by wRC+ (2023):")
print(top_hitters[['Name', 'Team', 'PA', 'wOBA', 'wRC+', 'WAR']])

# Fetch pitching statistics
def get_pitching_leaders(start_season=2023, end_season=2023, min_ip=50):
    """
    Get pitching statistics from FanGraphs.

    Parameters:
    -----------
    start_season : int
        First season to include
    end_season : int
        Last season to include
    min_ip : int
        Minimum innings pitched to include

    Returns:
    --------
    DataFrame with pitching statistics
    """
    data = pitching_stats(start_season, end_season, qual=min_ip)

    columns = [
        'Name', 'Team', 'Age', 'W', 'L', 'SV', 'G', 'GS', 'IP',
        'K/9', 'BB/9', 'HR/9', 'BABIP', 'LOB%', 'GB%', 'HR/FB',
        'ERA', 'FIP', 'xFIP', 'SIERA', 'K%', 'BB%', 'WAR'
    ]

    result = data[columns].copy()

    return result

# Example: Get 2023 pitching leaders
pitching_2023 = get_pitching_leaders(2023, 2023, min_ip=100)

# Sort by WAR
top_pitchers = pitching_2023.nlargest(10, 'WAR')
print("\nTop 10 Pitchers by WAR (2023):")
print(top_pitchers[['Name', 'Team', 'IP', 'ERA', 'FIP', 'WAR']])

# Get specific player data
def get_player_profile(last_name, first_name):
    """
    Look up player and get their career statistics.

    Parameters:
    -----------
    last_name : str
        Player's last name
    first_name : str
        Player's first name

    Returns:
    --------
    Dictionary with player info and career stats
    """
    # Look up player ID
    player_lookup = playerid_lookup(last_name, first_name)

    if len(player_lookup) == 0:
        return {"error": f"Player {first_name} {last_name} not found"}

    player_info = player_lookup.iloc[0]

    # playerid_lookup reports the first MLB season in 'mlb_played_first'
    first_year = player_info['mlb_played_first']
    start_year = int(first_year) if pd.notna(first_year) else 2020

    # Get career batting stats if available
    try:
        career_batting = batting_stats(start_year, 2023, qual=0)
        # Match on literal name fragments (regex disabled for safety)
        player_stats = career_batting[
            career_batting['Name'].str.contains(first_name, regex=False) &
            career_batting['Name'].str.contains(last_name, regex=False)
        ]

        return {
            'player_info': player_info,
            'career_stats': player_stats
        }
    except Exception:
        # Network or parsing failure: still return the lookup result
        return {
            'player_info': player_info,
            'career_stats': None
        }

# Example: Get Aaron Judge profile
judge_profile = get_player_profile('Judge', 'Aaron')
print("\nAaron Judge Career Information:")
print(judge_profile['player_info'][['name_first', 'name_last', 'mlb_played_first', 'key_mlbam']])

Advanced PyBaseball Usage


# Imports for this section (repeated so the block runs on its own)
import pandas as pd
from pybaseball import batting_stats, statcast_batter

# Get player game logs

def get_player_game_log(player_id, year):
    """
    Fetch detailed game-by-game statistics for a player.

    Parameters:
    -----------
    player_id : int
        MLB player ID (MLBAM ID)
    year : int
        Season year

    Returns:
    --------
    DataFrame with game-by-game performance
    """
    # Note: pybaseball has no direct FanGraphs game-log function, so this
    # sketch aggregates Statcast pitch-level data by game instead.
    start_date = f'{year}-01-01'
    end_date = f'{year}-12-31'

    statcast_data = statcast_batter(start_date, end_date, player_id)

    # Aggregate by game. 'count' tallies non-null values, so 'events'
    # counts plate-appearance-ending pitches, not only batted balls.
    game_logs = statcast_data.groupby('game_date').agg(
        avg_exit_velo=('launch_speed', 'mean'),
        avg_launch_angle=('launch_angle', 'mean'),
        pa_events=('events', 'count'),
        xwOBA=('estimated_woba_using_speedangle', 'mean'),
        total_woba=('woba_value', 'sum'),
    ).reset_index()

    return game_logs

# Working with projection data
def compare_projections(player_name, year=2024):
    """
    Compare multiple projection systems for a player.

    Parameters:
    -----------
    player_name : str
        Player name
    year : int
        Projection year

    Returns:
    --------
    DataFrame comparing projection systems
    """
    # Note: Projections must be scraped from FanGraphs pages
    # This is a conceptual framework

    projections = {
        'System': ['ZiPS', 'Steamer', 'ATC', 'THE BAT'],
        'PA': [550, 560, 555, 565],
        'HR': [28, 25, 27, 30],
        'R': [85, 82, 84, 88],
        'RBI': [82, 78, 80, 85],
        'SB': [12, 10, 11, 13],
        'AVG': [.275, .268, .272, .278],
        'wOBA': [.350, .342, .346, .355],
        'WAR': [4.2, 3.8, 4.0, 4.5]
    }

    return pd.DataFrame(projections)

# Batch processing multiple players
def analyze_team_offense(team_abbrev, year):
    """
    Analyze offensive performance for all players on a team.

    Parameters:
    -----------
    team_abbrev : str
        Team abbreviation (e.g., 'NYY', 'LAD')
    year : int
        Season year

    Returns:
    --------
    DataFrame with team offensive statistics
    """
    # Get all batting data
    all_batting = batting_stats(year, qual=0)

    # Filter to specific team
    team_batting = all_batting[all_batting['Team'] == team_abbrev].copy()

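    # Calculate team totals; rate stats are weighted by plate appearances
    # so that part-time players do not distort the team figure
    pa = team_batting['PA']
    total_pa = pa.sum()
    team_summary = {
        'total_war': team_batting['WAR'].sum(),
        'avg_wrc_plus': (team_batting['wRC+'] * pa).sum() / total_pa if total_pa else float('nan'),
        'total_hr': team_batting['HR'].sum(),
        'team_woba': (team_batting['wOBA'] * pa).sum() / total_pa if total_pa else float('nan'),
        'player_count': len(team_batting)
    }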

    return team_batting, team_summary

# Example usage
yankees_2023, yankees_summary = analyze_team_offense('NYY', 2023)
print("\nNew York Yankees 2023 Offense Summary:")
for metric, value in yankees_summary.items():
    print(f"  {metric}: {value:.2f}" if isinstance(value, float) else f"  {metric}: {value}")

R Data Access with baseballr


library(baseballr)
library(tidyverse)
library(lubridate)

# Fetch FanGraphs batting leaderboard
get_batting_leaders <- function(start_season = 2023, end_season = 2023, min_pa = 100) {
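  # Get batting data from FanGraphs. Argument and column names differ
  # across baseballr versions; check ?fg_batter_leaders for your release.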
  data <- fg_batter_leaders(
    startseason = start_season,
    endseason = end_season,
    qual = min_pa,
    ind = 1  # Individual season (0 for aggregate)
  )

  # Select key columns
  result <- data %>%
    select(
      Name, Team, Age, G, PA, AB, H, HR, R, RBI, SB,
      `BB%`, `K%`, ISO, BABIP, AVG, OBP, SLG, wOBA, `wRC+`,
      BsR, Off, Def, WAR
    )

  return(result)
}

# Example: Get 2023 batting leaders
batting_2023 <- get_batting_leaders(2023, 2023, 400)

# Display top hitters
top_hitters <- batting_2023 %>%
  arrange(desc(`wRC+`)) %>%
  head(10) %>%
  select(Name, Team, PA, wOBA, `wRC+`, WAR)

cat("Top 10 Hitters by wRC+ (2023):\n")
print(top_hitters)

# Fetch pitching statistics
get_pitching_leaders <- function(start_season = 2023, end_season = 2023, min_ip = 50) {
  data <- fg_pitcher_leaders(
    startseason = start_season,
    endseason = end_season,
    qual = min_ip,
    ind = 1
  )

  result <- data %>%
    select(
      Name, Team, Age, W, L, SV, G, GS, IP,
      `K/9`, `BB/9`, `HR/9`, BABIP, `LOB%`, `GB%`, `HR/FB`,
      ERA, FIP, xFIP, SIERA, `K%`, `BB%`, WAR
    )

  return(result)
}

# Example: Get 2023 pitching leaders
pitching_2023 <- get_pitching_leaders(2023, 2023, 100)

top_pitchers <- pitching_2023 %>%
  arrange(desc(WAR)) %>%
  head(10) %>%
  select(Name, Team, IP, ERA, FIP, WAR)

cat("\nTop 10 Pitchers by WAR (2023):\n")
print(top_pitchers)

# Get player game logs
get_player_game_logs <- function(playerid, year) {
  # Fetch game-by-game batting logs for a FanGraphs player ID.
  # Note: this call returns game logs, not split tables, and
  # baseballr's coverage of FanGraphs pages varies by version.

  splits <- fg_batter_game_logs(
    playerid = playerid,
    year = year
  )

  return(splits)
}

# Analyze team performance
analyze_team_offense <- function(team_abbrev, year) {
  # Get all batting data
  all_batting <- get_batting_leaders(year, year, min_pa = 0)

  # Filter to team
  team_batting <- all_batting %>%
    filter(Team == team_abbrev)

  # Calculate team summary
  team_summary <- team_batting %>%
    summarise(
      total_war = sum(WAR, na.rm = TRUE),
      avg_wrc_plus = mean(`wRC+`, na.rm = TRUE),
      total_hr = sum(HR, na.rm = TRUE),
      team_woba = mean(wOBA, na.rm = TRUE),
      player_count = n()
    )

  return(list(
    players = team_batting,
    summary = team_summary
  ))
}

# Example: Analyze Yankees offense
yankees_2023 <- analyze_team_offense("NYY", 2023)

cat("\nNew York Yankees 2023 Offense Summary:\n")
print(yankees_2023$summary)

# Working with projection data
compare_projections <- function(player_name, year = 2024) {
  # Conceptual framework for comparing projections
  # Actual implementation requires scraping FanGraphs projection pages

  # Create example projection comparison
  projections <- tibble(
    System = c('ZiPS', 'Steamer', 'ATC', 'THE BAT'),
    PA = c(550, 560, 555, 565),
    HR = c(28, 25, 27, 30),
    R = c(85, 82, 84, 88),
    RBI = c(82, 78, 80, 85),
    SB = c(12, 10, 11, 13),
    AVG = c(.275, .268, .272, .278),
    wOBA = c(.350, .342, .346, .355),
    WAR = c(4.2, 3.8, 4.0, 4.5)
  )

  return(projections)
}

# Calculate average projection across systems
avg_projection <- compare_projections("Example Player") %>%
  summarise(across(where(is.numeric), mean)) %>%
  mutate(System = "Average")

cat("\nProjection System Comparison:\n")
print(compare_projections("Example Player"))

cat("\nAverage Across Systems:\n")
print(avg_projection)

# Advanced analysis: Identify breakout candidates
identify_breakout_candidates <- function(year) {
  # Get current year and prior year data
  current <- get_batting_leaders(year, year, 250)
  prior <- get_batting_leaders(year - 1, year - 1, 250)

  # Join datasets
  comparison <- current %>%
    select(Name, Team, Age, wRC_current = `wRC+`, WAR_current = WAR) %>%
    inner_join(
      prior %>% select(Name, wRC_prior = `wRC+`, WAR_prior = WAR),
      by = "Name"
    ) %>%
    mutate(
      wrc_improvement = wRC_current - wRC_prior,
      war_improvement = WAR_current - WAR_prior
    ) %>%
    filter(
      wrc_improvement > 20,  # 20+ point wRC+ improvement
      Age <= 26  # Focus on young players
    ) %>%
    arrange(desc(wrc_improvement))

  return(comparison)
}

# Example: Find 2023 breakout players
breakouts_2023 <- identify_breakout_candidates(2023)

cat("\n2023 Breakout Candidates (wRC+ improvement):\n")
print(head(breakouts_2023, 10))

Comparing FanGraphs with Baseball Reference

FanGraphs and Baseball Reference are the two most popular baseball statistics websites. While they overlap significantly, each offers unique features and perspectives that make them complementary resources.

Key Differences

Feature	FanGraphs	Baseball Reference
WAR Calculation	fWAR (FIP-based for pitchers)	bWAR (RA9-based for pitchers)
Focus	Advanced metrics, projections, modern sabermetrics	Historical context, traditional stats, comprehensive archives
Defensive Metrics	UZR historically; Statcast-based fielding values for recent seasons	DRS; TZR (Total Zone) for earlier seasons
Interface	Modern, leaderboard-focused	Traditional, player page-focused
Plate Discipline Data	Extensive (O-Swing%, Z-Swing%, etc.)	Limited
Batted Ball Data	Comprehensive (GB%, FB%, Hard%, etc.)	Basic
Projections	Multiple systems (ZiPS, Steamer, ATC, THE BAT)	Limited projection access
Historical Query Tools	Limited search tools	Stathead (formerly Play Index) for powerful historical queries
Articles	Daily sabermetric analysis and research	Minimal editorial content
Historical Data	Complete but less contextual	Complete with rich historical context
Minor Leagues	Comprehensive coverage	Basic coverage

When to Use Each Site

Use FanGraphs for:

Modern player evaluation using advanced metrics
Projecting future performance
Analyzing plate discipline and batted ball profiles
Comparing projection systems
Understanding pitching with defense-independent metrics
Reading analytical articles and sabermetric research
Exporting leaderboard data for analysis
Minor league player evaluation

Use Baseball Reference for:

Historical research and career comparisons across eras
Comprehensive player pages with complete career statistics
Stathead (formerly Play Index) for complex historical queries
Game logs and play-by-play data
Traditional statistics and counting stats
Awards, transactions, and biographical information
Evaluating pitchers on actual runs allowed (bWAR is RA9-based)
Similarity scores and Hall of Fame statistics

Ideal Approach: Use both sites for comprehensive analysis. Start with FanGraphs for modern metrics and projections, then check the results against Baseball Reference's historical context and actual outcomes. The two WAR calculations give a useful range for player value; the truth often lies between fWAR and bWAR.
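The idea of treating fWAR and bWAR as a range can be sketched in a few lines of Python. This is a minimal illustration, not a method either site publishes: the function name `war_range` and the sample values are hypothetical.

```python
def war_range(fwar: float, bwar: float) -> dict:
    """Summarize two WAR estimates as a range instead of a single number."""
    low, high = sorted((fwar, bwar))
    return {
        "low": low,
        "high": high,
        "midpoint": (low + high) / 2,  # simple compromise estimate
        "spread": high - low,          # how much the two systems disagree
    }


# Hypothetical pitcher: FIP-based fWAR of 4.0, RA9-based bWAR of 5.5
summary = war_range(4.0, 5.5)
print(summary)
```

A wide spread is a prompt to look at why the systems disagree (for pitchers, usually the gap between FIP and runs actually allowed) rather than a reason to average blindly.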

FanGraphs Statistics Glossary

Offensive Statistics

Stat	Full Name	Description	League Average
wOBA	Weighted On-Base Average	Overall offensive value with proper weighting	~.320
wRC+	Weighted Runs Created Plus	Park and league-adjusted offensive value	100
ISO	Isolated Power	Raw power measure (SLG - AVG)	~.140
BABIP	Batting Average on Balls In Play	BA excluding HR and K	~.300
BB%	Walk Percentage	Walks per plate appearance	~8.5%
K%	Strikeout Percentage	Strikeouts per plate appearance	~22%
O-Swing%	Outside Swing Percentage	Swings on pitches outside zone	~30%
Z-Swing%	Zone Swing Percentage	Swings on pitches in zone	~67%
Contact%	Contact Percentage	Contact made on swings	~78%
Hard%	Hard Hit Percentage	Percentage of hard-hit balls	~35%
Barrel%	Barrel Percentage	Optimal contact (EV + LA combination)	~6-7%
BsR	Base Running Runs	Runs contributed by base running	0
Off	Offensive Runs	Batting plus base running runs above average	0
Def	Defensive Runs	Fielding runs above average plus positional adjustment	0
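Several of the rate statistics in this table can be reproduced from basic counting stats. The sketch below uses the standard definitions of ISO, BABIP, BB%, and K%; the function name `batting_rates` and the stat line are hypothetical and exist only for illustration.

```python
def batting_rates(ab, h, doubles, triples, hr, bb, k, sf, pa):
    """Compute ISO, BABIP, BB%, and K% from basic counting stats."""
    iso = (doubles + 2 * triples + 3 * hr) / ab   # algebraically equal to SLG - AVG
    babip = (h - hr) / (ab - k - hr + sf)         # hits on balls in play only
    return {"ISO": iso, "BABIP": babip, "BB%": bb / pa, "K%": k / pa}


# Hypothetical season line (not a real player)
line = batting_rates(ab=500, h=140, doubles=30, triples=2, hr=25,
                     bb=55, k=110, sf=5, pa=565)
for stat, value in line.items():
    print(f"{stat}: {value:.3f}")
```

Metrics such as wOBA and wRC+ need season-specific weights and park factors, so they are better pulled from FanGraphs directly than recomputed by hand.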

Pitching Statistics

Stat	Full Name	Description	League Average
FIP	Fielding Independent Pitching	ERA estimator using K, BB, HBP, and HR only	~4.00
xFIP	Expected FIP	FIP with normalized HR/FB rate	~4.00
SIERA	Skill-Interactive ERA	Advanced ERA estimator	~4.00
K/9	Strikeouts per 9 Innings	Strikeout rate	~8.5
BB/9	Walks per 9 Innings	Walk rate	~3.0
K%	Strikeout Percentage	Strikeouts per batter faced	~22%
BB%	Walk Percentage	Walks per batter faced	~8%
K-BB%	Strikeout Minus Walk Percentage	Net K vs BB rate	~14%
LOB%	Left On Base Percentage	Strand rate for runners	~72%
GB%	Ground Ball Percentage	Ground balls per ball in play	~45%
FB%	Fly Ball Percentage	Fly balls per ball in play	~35%
HR/FB	Home Run per Fly Ball	Home runs as % of fly balls	~10-11%
WHIP	Walks + Hits per Inning Pitched	Base runners allowed per inning	~1.30
Soft%	Soft Contact Percentage	Weakly hit balls	~20%
Hard%	Hard Contact Percentage	Hard hit balls allowed	~35%
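To make the pitching definitions concrete, here is a small sketch of FIP, K/9, BB/9, and WHIP. The FIP formula is the standard one; the constant of 3.10 is an assumption, since the real constant changes each season to put FIP on the league ERA scale. The stat line is hypothetical.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching. The constant is an assumed placeholder;
    look up the actual value for the season being analyzed."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def per_nine(count, ip):
    """Scale a counting stat to a per-nine-innings rate (K/9, BB/9)."""
    return 9 * count / ip


def whip(bb, hits, ip):
    """Walks plus hits per inning pitched."""
    return (bb + hits) / ip


# Hypothetical season: 180 IP, 20 HR, 50 BB, 5 HBP, 200 K, 150 H
season_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=180.0)
k9 = per_nine(200, 180.0)
bb9 = per_nine(50, 180.0)
season_whip = whip(bb=50, hits=150, ip=180.0)
print(f"FIP {season_fip:.2f}, K/9 {k9:.1f}, BB/9 {bb9:.1f}, WHIP {season_whip:.2f}")
```

Innings here are decimal values; a displayed figure like 180.1 means 180 and one third innings and would need converting before use.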

Value and Contextual Statistics

Stat	Full Name	Description	Scale
WAR	Wins Above Replacement	Total player value in wins	0 = replacement, 2 = average, 5 = All-Star, 8+ = MVP
WPA	Win Probability Added	Impact on game win probability	Team total = (wins - losses) / 2
RE24	Run Expectancy based on 24 base-out states	Runs added by changing game states	0 = average
REW	Run Expectancy Wins	RE24 converted to wins	Similar to WPA
LI	Leverage Index	Game situation importance	1.0 = average, 2.0 = 2x pressure
Clutch	Clutch Score	Performance in high-leverage situations	0 = neutral, positive = clutch
Dollars	Dollar Value	Estimated market value in $	Based on $/WAR conversion
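Two entries in this table are simple arithmetic on other stats. The sketch below assumes the FanGraphs definition of Clutch as WPA / pLI minus WPA/LI, where pLI is the player's average leverage index, and treats Dollars as WAR times a market rate. The rate of 8.0 million dollars per WAR is a hypothetical placeholder, as are the inputs.

```python
def clutch_score(wpa, pli, wpa_li):
    """Clutch = (WPA / pLI) - WPA/LI.
    wpa_li is the context-neutral WPA/LI stat, not wpa divided by pli."""
    return wpa / pli - wpa_li


def dollar_value(war, dollars_per_war_millions=8.0):
    """Estimated market value in millions; the rate is an assumed placeholder."""
    return war * dollars_per_war_millions


# Hypothetical player: 3.0 WPA, 1.2 average leverage, 2.0 WPA/LI, 4.0 WAR
clutch = clutch_score(wpa=3.0, pli=1.2, wpa_li=2.0)
value = dollar_value(4.0)
print(f"Clutch: {clutch:+.2f}, Dollars: ${value:.1f}M")
```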

Best Practices and Tips

Effective FanGraphs Usage

Use Custom Date Ranges: Analyze recent performance (last 30 days, second half) to identify trends and changes in approach or skill level.
Compare Multiple Metrics: Don't rely on a single statistic. Cross-reference wOBA with wRC+, ISO, and plate discipline metrics for a complete picture.
Check Sample Sizes: Small samples create noise. Require 200+ PA for batting, 50+ IP for pitching before drawing conclusions.
Use FIP Family for Pitchers: FIP, xFIP, and SIERA provide better predictive power than ERA for evaluating pitching talent.
Investigate Discrepancies: A large gap between ERA and FIP suggests regression is likely. For a pitcher, a high BABIP allowed alongside a low Hard% points to bad luck; for a hitter, the same combination points to good luck.
Respect Projection Ranges: Point estimates are less useful than understanding uncertainty ranges around projections.
Export for Analysis: Download CSV files for statistical modeling, visualization, and custom analysis in Python/R.
Read the Glossary: FanGraphs provides detailed explanations of every metric - understand what you're measuring.
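The sample-size and ERA-versus-FIP checks above are easy to automate once leaderboard data has been exported. The following is a minimal sketch in plain Python: the 50 IP minimum comes from the guideline above, while the 0.75 run gap threshold, the function name, and the three pitchers are hypothetical.

```python
def flag_regression_candidates(rows, min_ip=50, min_gap=0.75):
    """Return pitchers with enough innings whose ERA and FIP disagree sharply."""
    flagged = []
    for row in rows:
        if row["ip"] < min_ip:
            continue  # too small a sample to draw conclusions
        gap = row["era"] - row["fip"]
        if abs(gap) >= min_gap:
            direction = "ERA likely to fall" if gap > 0 else "ERA likely to rise"
            flagged.append((row["name"], round(gap, 2), direction))
    return flagged


# Hypothetical exported rows (not real players)
pitchers = [
    {"name": "Pitcher A", "ip": 160.0, "era": 4.80, "fip": 3.60},
    {"name": "Pitcher B", "ip": 30.0, "era": 1.90, "fip": 4.10},
    {"name": "Pitcher C", "ip": 175.0, "era": 3.40, "fip": 3.50},
]
flagged = flag_regression_candidates(pitchers)
print(flagged)
```

Pitcher B has the largest gap but is skipped because 30 innings is below the minimum, which is exactly the kind of noisy result the sample-size rule is meant to filter out.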

Common Pitfalls to Avoid

Overvaluing Wins and RBI: These heavily context-dependent stats poorly measure individual value.
Ignoring Defense: Defensive value is substantial. At roughly 10 runs per win, a +15 run defender is worth about 1.5 WAR.
Treating WAR as Exact: WAR is an estimate with uncertainty. A 4.2 WAR player isn't meaningfully better than a 3.9 WAR player.
Cherry-Picking Metrics: Don't select the single stat that supports your narrative - use comprehensive evaluation.
Misunderstanding FIP: FIP predicts future ERA but isn't necessarily "better" than actual ERA for past evaluation.
Ignoring Batted Ball Data: Exit velocity, launch angle, and barrel rate reveal skills traditional stats miss.
Forgetting Context: Park effects, league differences, and era matter. Use wRC+ for hitters and ERA- (or ERA+ on Baseball Reference) for pitchers to make fair comparisons.
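Two of the pitfalls above involve quick arithmetic: converting runs to wins and deciding whether a WAR gap is large enough to matter. The sketch below assumes the common rule of thumb of about 10 runs per win, which is what the +15 runs to 1.5 WAR example implies. The 0.5 WAR margin is a hypothetical threshold, not an official error bar.

```python
def runs_to_wins(runs, runs_per_win=10.0):
    """Convert runs above average to wins using an assumed runs-per-win rate."""
    return runs / runs_per_win


def meaningfully_different(war_a, war_b, margin=0.5):
    """Treat WAR gaps smaller than the (assumed) margin as noise."""
    return abs(war_a - war_b) >= margin


print(runs_to_wins(15))                  # a +15 run defender
print(meaningfully_different(4.2, 3.9))  # small gap
print(meaningfully_different(6.0, 3.9))  # large gap
```

The actual runs-per-win value shifts with the run environment each season, so treat 10 as a convenient approximation.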

Key Takeaways

FanGraphs is the premier source for modern baseball analytics, offering comprehensive statistics, advanced metrics, projection systems, and data export capabilities.
Understanding core FanGraphs metrics like wOBA, wRC+, FIP, and WAR is essential for modern player evaluation and analysis.
The difference between fWAR (FIP-based) and bWAR (runs-based) reflects different philosophies about pitcher evaluation - both provide value.
FanGraphs excels at plate discipline data, batted ball profiles, and predictive metrics that reveal underlying skills beyond results.
Multiple projection systems (ZiPS, Steamer, ATC, THE BAT) provide different perspectives on future performance, and averaging them often works better than relying on any single system.
PyBaseball (Python) and baseballr (R) enable programmatic access to FanGraphs data for statistical analysis and modeling.
FanGraphs and Baseball Reference are complementary - FanGraphs for predictive analysis and modern metrics, Baseball Reference for historical context.
Effective FanGraphs usage requires understanding sample size, metric limitations, and using multiple statistics for comprehensive evaluation.
Data export capabilities make FanGraphs invaluable for research, creating a bridge between web-based exploration and advanced statistical analysis.
Regular engagement with FanGraphs articles and glossary entries deepens understanding of sabermetric principles and analytical best practices.
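As a closing illustration of the ensemble idea, the sketch below averages one player's projected wRC+ across systems and reports how far apart they are. The projection values and the function name `ensemble` are hypothetical; real values would come from the FanGraphs projection pages or an exported CSV.

```python
from statistics import mean


def ensemble(projections):
    """Average a metric across projection systems and report the range."""
    values = list(projections.values())
    return {"mean": mean(values), "low": min(values), "high": max(values)}


# Hypothetical wRC+ projections for one player
wrc_plus = {"ZiPS": 120.0, "Steamer": 116.0, "ATC": 118.0, "THE BAT": 122.0}
result = ensemble(wrc_plus)
print(f"Ensemble wRC+: {result['mean']:.0f} "
      f"(range {result['low']:.0f} to {result['high']:.0f})")
```

An equal-weight mean is the simplest possible ensemble; weighting systems by past accuracy is a natural next step but needs historical projection data to justify the weights.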
