Using StatsBomb Free Data

Beginner · 10 min read · Nov 27, 2025
StatsBomb provides high-quality event data for several major competitions through its open data initiative. This guide covers how to access and work with that data.

## Accessing StatsBomb Open Data

StatsBomb's open data is published in their GitHub repository and can be loaded from Python with the statsbombpy package:

```python
from statsbombpy import sb

# Get available competitions (each row carries a competition_id and season_id)
competitions = sb.competitions()
print(competitions)

# Get matches for a specific competition and season
# (the IDs here are examples; take real ones from the competitions table)
matches = sb.matches(competition_id=11, season_id=90)

# Get event data for a match
# (use a match_id listed in the matches table of the competition you chose)
events = sb.events(match_id=3788741)
```

## Understanding the Data Structure

StatsBomb data includes detailed event information, with 360-degree context available for some matches:

- Player positions at the time of each event
- Pass end locations and recipients
- Shot outcome details and expected goals (xG)
- Pressure events and defensive actions
- Freeze frames showing player positions

## Common Analysis Tasks

### Calculating Pass Networks

```python
import pandas as pd

# Keep pass events only
passes = events[events['type'] == 'Pass']

# A missing pass_outcome means the pass was completed
passes = passes[passes['pass_outcome'].isna()]

# Count completed passes for each passer-recipient pair
pass_network = (
    passes.groupby(['player', 'pass_recipient'])
    .size()
    .reset_index(name='passes')
)
```

### Analyzing Shot Locations

```python
import matplotlib.pyplot as plt

# Copy the filtered frame so adding columns does not trigger
# pandas' SettingWithCopyWarning
shots = events[events['type'] == 'Shot'].copy()

# location is an [x, y] pair; split it into separate columns
shots['x'] = shots['location'].apply(lambda loc: loc[0])
shots['y'] = shots['location'].apply(lambda loc: loc[1])

# Plot shot map with marker size scaled by xG
plt.scatter(shots['x'], shots['y'], s=shots['shot_statsbomb_xg'] * 500, alpha=0.5)
plt.show()
```

## Data Limitations

While StatsBomb open data is excellent for learning and analysis, be aware of the following:

- Competition coverage is limited compared to commercial offerings
- Data is not updated in real time
- Some advanced metrics require additional processing
- The license restricts commercial use

StatsBomb open data remains one of the best resources for learning soccer analytics and building proof-of-concept models.
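
If you work from the JSON files in the GitHub repository instead of statsbombpy, the shot analysis above can be approximated without pandas. The sketch below assumes the raw event layout, where names are nested (`type.name`, `player.name`) and shot details sit under a `shot` key (`shot.statsbomb_xg`, `shot.outcome.name`). The helper `xg_by_player` and the sample events are hypothetical and written for illustration only; they are not part of any StatsBomb library, and the sample is not real match data.

```python
from collections import defaultdict


def xg_by_player(events):
    """Count shots and goals and sum xG per player from raw event dicts."""
    totals = defaultdict(lambda: {"shots": 0, "goals": 0, "xg": 0.0})
    for event in events:
        # Assumed raw layout: the event type name is nested under "type".
        if event.get("type", {}).get("name") != "Shot":
            continue
        shot = event.get("shot", {})
        player = event.get("player", {}).get("name", "Unknown")
        row = totals[player]
        row["shots"] += 1
        row["xg"] += shot.get("statsbomb_xg", 0.0)
        if shot.get("outcome", {}).get("name") == "Goal":
            row["goals"] += 1
    return dict(totals)


# Tiny hand-made sample in the assumed raw layout (not real match data).
sample_events = [
    {"type": {"name": "Pass"}, "player": {"name": "Player A"}},
    {"type": {"name": "Shot"}, "player": {"name": "Player A"},
     "shot": {"statsbomb_xg": 0.25, "outcome": {"name": "Goal"}}},
    {"type": {"name": "Shot"}, "player": {"name": "Player A"},
     "shot": {"statsbomb_xg": 0.5, "outcome": {"name": "Saved"}}},
    {"type": {"name": "Shot"}, "player": {"name": "Player B"},
     "shot": {"statsbomb_xg": 0.125, "outcome": {"name": "Off T"}}},
]

summary = xg_by_player(sample_events)

# Print players ordered by total xG, highest first.
for name, row in sorted(summary.items(), key=lambda kv: kv[1]["xg"], reverse=True):
    print(f"{name}: {row['shots']} shots, {row['goals']} goals, {row['xg']:.2f} xG")
```

To run this on a real match, load one of the repository's event files with `json.load` and pass the resulting list to `xg_by_player`. Comparing each player's goals with their summed xG is a common first check on finishing over a match or season.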
