Introduction to Tracking Data

Beginner, 10 min read, Nov 27, 2025
Tracking data captures the (x, y) coordinates of every player and the ball at high frequency (typically 10-25 Hz), enabling spatial and physical analysis of soccer matches that event data alone cannot support.

## What is Tracking Data?

Unlike event data, which records discrete actions such as passes and shots, tracking data provides continuous positional information:

- Player positions at 10-25 frames per second
- Ball position and possession status
- Player identifiers and team assignments
- Timestamps synchronized with the match clock

## Data Structure

Tracking data typically comes in CSV or JSON format. In the long layout shown here, each row holds one player (or the ball) in one frame:

```python
import pandas as pd

# Example tracking data structure: two frames sampled at 25 Hz (0.04 s apart)
tracking_df = pd.DataFrame({
    'frame': [1, 1, 1, 2, 2, 2],
    'timestamp': [0.0, 0.0, 0.0, 0.04, 0.04, 0.04],
    'player_id': ['P1', 'P2', 'Ball', 'P1', 'P2', 'Ball'],
    'x': [52.3, 48.1, 60.2, 52.4, 48.3, 60.5],
    'y': [34.2, 28.5, 35.1, 34.3, 28.6, 35.2],
    'team': ['Home', 'Away', None, 'Home', 'Away', None]
})
```

## Basic Analysis: Calculating Speed

Speed is the distance moved between consecutive frames divided by the time between them. The result is in coordinate units per second, so it is metres per second only if `x` and `y` are in metres. The first frame has no previous position, so its speed is `NaN`.

```python
import numpy as np

def calculate_player_speed(tracking_data, player_id):
    # Select one player's rows and make sure they are in time order
    player_data = tracking_data[tracking_data['player_id'] == player_id].copy()
    player_data = player_data.sort_values('timestamp')

    # Calculate distance between consecutive frames
    player_data['dx'] = player_data['x'].diff()
    player_data['dy'] = player_data['y'].diff()
    player_data['distance'] = np.sqrt(player_data['dx']**2 + player_data['dy']**2)

    # Calculate speed (distance per second)
    player_data['dt'] = player_data['timestamp'].diff()
    player_data['speed'] = player_data['distance'] / player_data['dt']

    return player_data
```

## Pitch Control Models

Tracking data enables pitch control analysis, which estimates which team controls each area of the pitch. The simplified version below takes the rows for a single frame, gives every player a Gaussian influence that decays with distance, and returns the home team's share of the total influence in each grid cell:

```python
import numpy as np

def calculate_pitch_control(frame_data, sigma=10):
    # Simplified pitch control using Gaussian influence.
    # The grid assumes a 105 x 68 pitch with the origin at one corner.
    x_grid, y_grid = np.meshgrid(np.linspace(0, 105, 50), np.linspace(0, 68, 32))
    home_control = np.zeros_like(x_grid)
    away_control = np.zeros_like(x_grid)

    for _, player in frame_data.iterrows():
        # Influence falls off with squared distance from the player
        influence = np.exp(-((x_grid - player['x'])**2 + (y_grid - player['y'])**2) / sigma**2)
        if player['team'] == 'Home':
            home_control += influence
        elif player['team'] == 'Away':
            away_control += influence
        # Rows without a team (the ball) are ignored

    # Home share of total influence per cell: 1 = home, 0 = away
    pitch_control = home_control / (home_control + away_control)
    return pitch_control
```

## Physical Performance Metrics

Tracking data enables detailed physical analysis:

- Total distance covered
- High-speed running distance
- Sprint counts and distances
- Acceleration and deceleration events
- Heat maps and position distributions

## Accessing Tracking Data

Public tracking data is limited but available from:

- Metrica Sports (sample matches on GitHub)
- SkillCorner (limited open data)
- Signality (research datasets)
- Last Row/Friends of Tracking tutorials

Tracking data opens up advanced analytics, but analyzing it effectively requires significant computational resources and domain expertise.
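To make the physical performance metrics listed earlier more concrete, here is a minimal sketch that derives total distance, high-speed running distance, and sprint counts from one player's frames. The function name `physical_summary` and the two speed thresholds are assumptions chosen for illustration; providers and clubs use different cut-offs. It also assumes coordinates in metres and timestamps in seconds.

```python
import numpy as np
import pandas as pd

# Illustrative thresholds in m/s (assumed values, not a standard)
HIGH_SPEED_THRESHOLD = 5.5
SPRINT_THRESHOLD = 7.0


def physical_summary(player_frames, high_speed=HIGH_SPEED_THRESHOLD,
                     sprint=SPRINT_THRESHOLD):
    """Summarise one player's frames (columns: timestamp, x, y)."""
    frames = player_frames.sort_values('timestamp')

    # Distance moved since the previous frame (NaN for the first frame)
    step = np.hypot(frames['x'].diff(), frames['y'].diff())
    speed = step / frames['timestamp'].diff()

    # NaN compares as False, so the first frame never counts as a sprint
    is_sprinting = speed >= sprint
    # A new sprint begins wherever the player crosses the threshold upwards
    sprint_starts = is_sprinting & ~is_sprinting.shift(1, fill_value=False)

    return {
        'total_distance': float(step.sum()),
        'high_speed_distance': float(step[speed >= high_speed].sum()),
        'sprint_distance': float(step[is_sprinting].sum()),
        'sprint_count': int(sprint_starts.sum()),
        'max_speed': float(speed.max()),
    }


# Toy input sampled once per second so the arithmetic is easy to follow
example = pd.DataFrame({
    'timestamp': [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    'x': [0.0, 2.0, 8.0, 16.0, 18.0, 26.0],
    'y': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
})
summary = physical_summary(example)
```

Distances are attributed to the frame in which they were covered, so a sprint's distance is the sum of the steps taken while the speed was at or above the threshold. Real data is noisier than this toy input, and speeds are usually smoothed before thresholds are applied.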
