Speed and Distance Metrics

Beginner · 10 min read · Nov 27, 2025
# Speed and Distance Metrics

## Overview

Speed and distance metrics have become essential components of modern sports analytics, particularly with the advent of GPS tracking, optical tracking systems, and wearable technology. These metrics provide insights into player workload, tactical positioning, fitness levels, and game intensity across various sports.

## Key Metrics

### 1. Total Distance Covered

**Definition**: The total ground covered by a player during a match or training session.

**Typical Values by Sport**:
- Soccer: 9-13 km per match
- Basketball: 4-5 km per game
- American Football: 1.5-2 km (skill positions), 0.5-1 km (linemen)
- Rugby: 6-8 km per match

**Analysis Considerations**:
- Position-specific variations
- Game duration and format
- Playing style and tactics

### 2. High-Speed Running (HSR)

**Definition**: Distance covered above a sport-specific speed threshold.

**Common Thresholds**:
- Soccer: >19.8 km/h (5.5 m/s)
- Rugby: >18 km/h
- Australian Football: >18 km/h

**Significance**: Indicates high-intensity efforts and physical demands.

### 3. Sprint Distance

**Definition**: Distance covered at maximum or near-maximum speed.

**Typical Thresholds**:
- Soccer: >25.2 km/h (7 m/s)
- Basketball: >21 km/h
- Rugby: >25 km/h

### 4. Average Speed

**Definition**: Total distance divided by active playing time.

**Calculation**:
```
Average Speed (km/h) = Total Distance (km) / Playing Time (hours)
```

**Contextual Factors**:
- Ball-in-play time vs. total time
- Position-specific roles
- Game state (winning/losing)

### 5. Maximum Speed (Peak Speed)

**Definition**: Highest instantaneous speed recorded during a session.

**Elite Benchmarks**:
- Soccer: 32-36 km/h
- American Football: 35-37 km/h
- Rugby: 33-36 km/h

### 6. Acceleration and Deceleration Metrics

**High-Intensity Accelerations**: Number of efforts above 3 m/s²

**High-Intensity Decelerations**: Number of efforts below -3 m/s²

**Importance**: Accelerating and braking are often more demanding than running at constant speed, and decelerations in particular are relevant when assessing injury risk.

### 7. Metabolic Power and Equivalent Distance

**Metabolic Power**: Estimates energy cost by treating accelerated running on flat ground as equivalent to constant-speed running up a slope (di Prampero et al., 2005).

**Equivalent Distance**: The distance a player would have covered at a steady pace using the same total energy.

**Calculation**:
```
ES = a / g
EM = sqrt(ES^2 + 1)
EC (J/kg/m) = (155.4·ES^5 - 30.4·ES^4 - 43.3·ES^3 + 46.3·ES^2 + 19.5·ES + 3.6) × EM
Power (W/kg) = EC × v
```

Where ES = equivalent slope, EM = equivalent body mass multiplier, EC = energy cost of running, a = forward acceleration (m/s²), g = gravity (9.81 m/s²), v = speed (m/s)

### 8. Distance Per Minute (Intensity Index)

**Definition**: Distance covered per minute of playing time.

**Calculation**:
```
Distance Per Minute (m/min) = Total Distance (m) / Minutes Played
```

**Advantages**: Normalizes for playing time, useful for substitute comparisons.

## Position-Specific Differences

### Soccer

| Position | Total Distance (km) | HSR (m) | Sprint Distance (m) | Avg Speed (km/h) |
|----------|---------------------|---------|---------------------|------------------|
| Central Midfielder | 11-12 | 800-1200 | 200-400 | 7.0-7.5 |
| Wide Midfielder/Winger | 10-11 | 1200-1600 | 400-600 | 6.5-7.0 |
| Full-Back | 10-11 | 900-1300 | 300-500 | 6.5-7.0 |
| Central Defender | 9-10 | 400-700 | 100-250 | 6.0-6.5 |
| Striker | 9-10 | 900-1300 | 350-550 | 6.0-6.5 |

### Basketball

| Position | Distance (km) | Avg Speed (km/h) | Max Speed (km/h) | HSR Distance (m) |
|----------|---------------|------------------|------------------|------------------|
| Point Guard | 4.5-5.2 | 4.8-5.2 | 28-30 | 600-800 |
| Shooting Guard | 4.3-5.0 | 4.7-5.1 | 27-29 | 550-750 |
| Small Forward | 4.0-4.8 | 4.5-4.9 | 26-28 | 500-700 |
| Power Forward | 3.8-4.5 | 4.3-4.7 | 25-27 | 450-650 |
| Center | 3.5-4.2 | 4.0-4.5 | 24-26 | 400-600 |

### American Football

| Position | Distance per Play (m) | Max Speed (km/h) | High-Speed Distance per Game (m) |
|----------|-----------------------|------------------|----------------------------------|
| Wide Receiver | 15-25 | 35-37 | 800-1200 |
| Cornerback | 12-20 | 34-36 | 700-1100 |
| Running Back | 8-15 | 32-35 | 600-1000 |
| Linebacker | 5-12 | 30-33 | 400-800 |
| Offensive Line | 2-5 | 22-26 | 100-300 |

## Data Collection Methods

### 1. GPS/GNSS Tracking

- **Frequency**: 10-18 Hz
- **Accuracy**: ±5% for total distance, ±10% for high-speed running
- **Advantages**: Direct measurement, player-specific data
- **Limitations**: Satellite signal quality, indoor use limitations

### 2. Optical Tracking Systems

- **Technology**: Multi-camera systems with computer vision
- **Frequency**: 25 Hz
- **Accuracy**: ±2-3% for distance metrics
- **Advantages**: High accuracy, no wearable required
- **Limitations**: Installation cost, venue-specific

### 3. Local Positioning Systems (LPS)

- **Technology**: Ultra-wideband or radio frequency anchors
- **Frequency**: 20-50 Hz
- **Advantages**: Indoor capability, high accuracy
- **Limitations**: Infrastructure requirements

### 4. Inertial Measurement Units (IMU)

- **Sensors**: Accelerometers, gyroscopes, magnetometers
- **Advantages**: Captures acceleration/deceleration detail
- **Limitations**: Drift in distance calculations

## Python Implementation

### Basic Speed and Distance Analysis

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


class SpeedDistanceAnalyzer:
    """
    Comprehensive analyzer for player speed and distance metrics
    """

    def __init__(self, speed_threshold_hsr=5.5, speed_threshold_sprint=7.0):
        """
        Initialize analyzer with speed thresholds

        Parameters:
        -----------
        speed_threshold_hsr : float
            High-speed running threshold in m/s (default: 5.5 m/s = 19.8 km/h)
        speed_threshold_sprint : float
            Sprint threshold in m/s (default: 7.0 m/s = 25.2 km/h)
        """
        self.hsr_threshold = speed_threshold_hsr
        self.sprint_threshold = speed_threshold_sprint

    def calculate_distance(self, positions, timestamps):
        """
        Calculate total distance from position data

        Parameters:
        -----------
        positions : numpy.ndarray
            Array of (x, y) coordinates in meters
        timestamps : numpy.ndarray
            Array of timestamps in seconds (unused; kept for a uniform interface)

        Returns:
        --------
        float : Total distance in meters
        """
        distances = np.sqrt(np.sum(np.diff(positions, axis=0)**2, axis=1))
        return np.sum(distances)

    def calculate_speed(self, positions, timestamps):
        """
        Calculate instantaneous speed from position data

        Parameters:
        -----------
        positions : numpy.ndarray
            Array of (x, y) coordinates
        timestamps : numpy.ndarray
            Array of timestamps in seconds

        Returns:
        --------
        numpy.ndarray : Speed values in m/s (one per segment, length n - 1)
        """
        distances = np.sqrt(np.sum(np.diff(positions, axis=0)**2, axis=1))
        # Cast to float so the 0.001 replacement is not truncated for integer timestamps
        time_diff = np.diff(timestamps).astype(float)
        # Avoid division by zero
        time_diff[time_diff == 0] = 0.001
        speeds = distances / time_diff
        return speeds

    def calculate_acceleration(self, speeds, timestamps):
        """
        Calculate acceleration from speed data

        Parameters:
        -----------
        speeds : numpy.ndarray
            Array of speed values in m/s (length n - 1)
        timestamps : numpy.ndarray
            Array of timestamps in seconds (length n)

        Returns:
        --------
        numpy.ndarray : Acceleration values in m/s² (length n - 2)
        """
        speed_diff = np.diff(speeds)
        # speeds[i] belongs to the segment ending at timestamps[i + 1]
        time_diff = np.diff(timestamps[1:]).astype(float)
        time_diff[time_diff == 0] = 0.001
        accelerations = speed_diff / time_diff
        return accelerations

    def categorize_speed_zones(self, speeds):
        """
        Categorize speeds into zones

        Parameters:
        -----------
        speeds : numpy.ndarray
            Array of speed values in m/s

        Returns:
        --------
        dict : Number of samples recorded in each speed zone
               (multiply by the sampling interval to get time in zone)
        """
        # Speed zones (in m/s)
        zones = {
            'walking': (0, 2.0),        # 0-7.2 km/h
            'jogging': (2.0, 4.0),      # 7.2-14.4 km/h
            'running': (4.0, 5.5),      # 14.4-19.8 km/h
            'high_speed': (5.5, 7.0),   # 19.8-25.2 km/h
            'sprint': (7.0, np.inf)     # >25.2 km/h
        }

        results = {}
        for zone, (low, high) in zones.items():
            mask = (speeds >= low) & (speeds < high)
            results[zone] = int(np.sum(mask))
        return results

    def calculate_hsr_distance(self, positions, speeds, timestamps):
        """
        Calculate high-speed running distance

        Parameters:
        -----------
        positions : numpy.ndarray
            Array of (x, y) coordinates
        speeds : numpy.ndarray
            Array of speed values in m/s
        timestamps : numpy.ndarray
            Array of timestamps

        Returns:
        --------
        float : HSR distance in meters
        """
        # Sum only the segments travelled at or above the threshold, so that
        # gaps between separate high-speed bouts are not counted as distance
        segment_distances = np.sqrt(np.sum(np.diff(positions, axis=0)**2, axis=1))
        return float(np.sum(segment_distances[speeds >= self.hsr_threshold]))

    def calculate_sprint_distance(self, positions, speeds, timestamps):
        """
        Calculate sprint distance

        Parameters:
        -----------
        positions : numpy.ndarray
            Array of (x, y) coordinates
        speeds : numpy.ndarray
            Array of speed values in m/s
        timestamps : numpy.ndarray
            Array of timestamps

        Returns:
        --------
        float : Sprint distance in meters
        """
        segment_distances = np.sqrt(np.sum(np.diff(positions, axis=0)**2, axis=1))
        return float(np.sum(segment_distances[speeds >= self.sprint_threshold]))

    def detect_high_intensity_events(self, accelerations, threshold=3.0):
        """
        Detect high-intensity acceleration and deceleration events

        Note: this counts samples beyond the threshold, so one sustained
        effort spanning several samples is counted more than once.

        Parameters:
        -----------
        accelerations : numpy.ndarray
            Array of acceleration values in m/s²
        threshold : float
            Threshold for high-intensity events (default: 3.0 m/s²)

        Returns:
        --------
        dict : Counts of acceleration and deceleration events
        """
        high_accel = int(np.sum(accelerations > threshold))
        high_decel = int(np.sum(accelerations < -threshold))

        return {
            'high_accelerations': high_accel,
            'high_decelerations': high_decel,
            'total_high_intensity': high_accel + high_decel
        }

    def calculate_metabolic_power(self, speeds, accelerations, terrain_angle=0):
        """
        Calculate metabolic power (di Prampero et al., 2005)

        Parameters:
        -----------
        speeds : numpy.ndarray
            Speed values in m/s
        accelerations : numpy.ndarray
            Acceleration values in m/s²
        terrain_angle : float
            Terrain angle in degrees (default: 0)

        Returns:
        --------
        numpy.ndarray : Metabolic power in W/kg
        """
        g = 9.81  # gravity (m/s²)

        # Equivalent slope: a / g on flat ground; the terrain gradient is
        # added as an approximation for sloped surfaces
        ES = accelerations / g + np.tan(np.radians(terrain_angle))
        # The polynomial below was fitted for gradients of roughly -0.45 to
        # +0.45; clip to avoid extrapolating on noisy acceleration data
        ES = np.clip(ES, -0.45, 0.45)

        # Equivalent body mass multiplier
        EM = np.sqrt((accelerations / g)**2 + 1)

        # Energy cost (J/kg/m)
        EC = (155.4 * ES**5 - 30.4 * ES**4 - 43.3 * ES**3 +
              46.3 * ES**2 + 19.5 * ES + 3.6) * EM

        # Metabolic power (W/kg)
        power = EC * speeds
        return power

    def analyze_player_session(self, positions, timestamps, player_name,
                               session_duration_minutes):
        """
        Comprehensive session analysis for a player

        Parameters:
        -----------
        positions : numpy.ndarray
            Array of (x, y) coordinates
        timestamps : numpy.ndarray
            Array of timestamps in seconds
        player_name : str
            Player identifier
        session_duration_minutes : float
            Total session duration in minutes

        Returns:
        --------
        dict : Comprehensive metrics
        """
        # Calculate basic metrics
        total_distance = self.calculate_distance(positions, timestamps)
        speeds = self.calculate_speed(positions, timestamps)
        accelerations = self.calculate_acceleration(speeds, timestamps)

        # Speed statistics (mean of samples; equals distance / time only
        # when the sampling interval is constant)
        avg_speed = np.mean(speeds)
        max_speed = np.max(speeds)

        # Distance metrics
        hsr_distance = self.calculate_hsr_distance(positions, speeds, timestamps)
        sprint_distance = self.calculate_sprint_distance(positions, speeds, timestamps)

        # Speed zones
        speed_zones = self.categorize_speed_zones(speeds)

        # High-intensity events
        hi_events = self.detect_high_intensity_events(accelerations)

        # Metabolic power (speeds trimmed to match the acceleration array)
        metabolic_power = self.calculate_metabolic_power(speeds[:-1], accelerations)
        avg_metabolic_power = np.mean(metabolic_power)

        # Intensity metrics
        distance_per_minute = total_distance / session_duration_minutes

        return {
            'player': player_name,
            'session_duration_min': session_duration_minutes,
            'total_distance_m': total_distance,
            'total_distance_km': total_distance / 1000,
            'average_speed_ms': avg_speed,
            'average_speed_kmh': avg_speed * 3.6,
            'max_speed_ms': max_speed,
            'max_speed_kmh': max_speed * 3.6,
            'hsr_distance_m': hsr_distance,
            'sprint_distance_m': sprint_distance,
            'distance_per_minute': distance_per_minute,
            'high_accelerations': hi_events['high_accelerations'],
            'high_decelerations': hi_events['high_decelerations'],
            'avg_metabolic_power_wkg': avg_metabolic_power,
            'speed_zones': speed_zones
        }

    def compare_positions(self, player_data_dict):
        """
        Compare metrics across different positions

        Parameters:
        -----------
        player_data_dict : dict
            Dictionary with position as key and list of player metrics as values

        Returns:
        --------
        pandas.DataFrame : Comparative statistics
        """
        results = []

        for position, players in player_data_dict.items():
            df = pd.DataFrame(players)

            summary = {
                'position': position,
                'n_players': len(players),
                'avg_total_distance_km': df['total_distance_km'].mean(),
                'std_total_distance_km': df['total_distance_km'].std(),
                'avg_hsr_m': df['hsr_distance_m'].mean(),
                'std_hsr_m': df['hsr_distance_m'].std(),
                'avg_sprint_m': df['sprint_distance_m'].mean(),
                'std_sprint_m': df['sprint_distance_m'].std(),
                'avg_max_speed_kmh': df['max_speed_kmh'].mean(),
                'avg_high_accelerations': df['high_accelerations'].mean(),
                'avg_high_decelerations': df['high_decelerations'].mean()
            }
            results.append(summary)

        return pd.DataFrame(results)

    def visualize_speed_profile(self, speeds, timestamps, player_name):
        """
        Visualize speed profile over time

        Parameters:
        -----------
        speeds : numpy.ndarray
            Speed values in m/s
        timestamps : numpy.ndarray
            Timestamps in seconds
        player_name : str
            Player identifier
        """
        fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 8))

        # Speed over time
        ax1.plot(timestamps[1:], speeds * 3.6, color='#2E86AB', linewidth=0.8)
        ax1.axhline(y=self.hsr_threshold * 3.6, color='orange', linestyle='--',
                    label=f'HSR Threshold ({self.hsr_threshold * 3.6:.1f} km/h)')
        ax1.axhline(y=self.sprint_threshold * 3.6, color='red', linestyle='--',
                    label=f'Sprint Threshold ({self.sprint_threshold * 3.6:.1f} km/h)')
        ax1.set_xlabel('Time (seconds)')
        ax1.set_ylabel('Speed (km/h)')
        ax1.set_title(f'Speed Profile: {player_name}')
        ax1.legend()
        ax1.grid(True, alpha=0.3)

        # Speed distribution
        ax2.hist(speeds * 3.6, bins=50, color='#2E86AB', alpha=0.7, edgecolor='black')
        ax2.axvline(x=self.hsr_threshold * 3.6, color='orange', linestyle='--', linewidth=2)
        ax2.axvline(x=self.sprint_threshold * 3.6, color='red', linestyle='--', linewidth=2)
        ax2.set_xlabel('Speed (km/h)')
        ax2.set_ylabel('Frequency')
        ax2.set_title('Speed Distribution')
        ax2.grid(True, alpha=0.3)

        plt.tight_layout()
        return fig

    def visualize_position_comparison(self, comparison_df):
        """
        Visualize position-based comparisons

        Parameters:
        -----------
        comparison_df : pandas.DataFrame
            DataFrame from compare_positions method
        """
        fig, axes = plt.subplots(2, 2, figsize=(15, 10))

        positions = comparison_df['position']

        # Total distance
        axes[0, 0].bar(positions, comparison_df['avg_total_distance_km'],
                       color='#2E86AB', alpha=0.7)
        axes[0, 0].errorbar(positions, comparison_df['avg_total_distance_km'],
                            yerr=comparison_df['std_total_distance_km'],
                            fmt='none', color='black', capsize=5)
        axes[0, 0].set_ylabel('Distance (km)')
        axes[0, 0].set_title('Average Total Distance by Position')
        axes[0, 0].tick_params(axis='x', rotation=45)
        axes[0, 0].grid(True, alpha=0.3)

        # HSR distance
        axes[0, 1].bar(positions, comparison_df['avg_hsr_m'],
                       color='orange', alpha=0.7)
        axes[0, 1].errorbar(positions, comparison_df['avg_hsr_m'],
                            yerr=comparison_df['std_hsr_m'],
                            fmt='none', color='black', capsize=5)
        axes[0, 1].set_ylabel('Distance (m)')
        axes[0, 1].set_title('Average High-Speed Running Distance by Position')
        axes[0, 1].tick_params(axis='x', rotation=45)
        axes[0, 1].grid(True, alpha=0.3)

        # Sprint distance
        axes[1, 0].bar(positions, comparison_df['avg_sprint_m'],
                       color='red', alpha=0.7)
        axes[1, 0].errorbar(positions, comparison_df['avg_sprint_m'],
                            yerr=comparison_df['std_sprint_m'],
                            fmt='none', color='black', capsize=5)
        axes[1, 0].set_ylabel('Distance (m)')
        axes[1, 0].set_title('Average Sprint Distance by Position')
        axes[1, 0].tick_params(axis='x', rotation=45)
        axes[1, 0].grid(True, alpha=0.3)

        # Max speed
        axes[1, 1].bar(positions, comparison_df['avg_max_speed_kmh'],
                       color='#A23B72', alpha=0.7)
        axes[1, 1].set_ylabel('Speed (km/h)')
        axes[1, 1].set_title('Average Maximum Speed by Position')
        axes[1, 1].tick_params(axis='x', rotation=45)
        axes[1, 1].grid(True, alpha=0.3)

        plt.tight_layout()
        return fig


# Example Usage
if __name__ == "__main__":
    # Generate sample tracking data
    np.random.seed(42)

    # Simulate a player's movement during a match (90 minutes, 10 Hz sampling)
    num_samples = 90 * 60 * 10  # 54,000 samples
    timestamps = np.linspace(0, 90 * 60, num_samples)

    # Simulate position data (simplified random walk with varying speeds).
    # Step sizes are in meters per 0.1 s sample. A random walk is far noisier
    # than real movement, so peak speed and acceleration counts from this
    # demo are not realistic; real tracking data should be smoothed first.
    positions = np.zeros((num_samples, 2))
    for i in range(1, num_samples):
        # Random movement with occasional sprints
        if np.random.random() > 0.95:  # 5% chance of sprint
            step = np.random.randn(2) * 0.6
        else:
            step = np.random.randn(2) * 0.15
        positions[i] = positions[i-1] + step

    # Initialize analyzer
    analyzer = SpeedDistanceAnalyzer()

    # Analyze session
    results = analyzer.analyze_player_session(
        positions=positions,
        timestamps=timestamps,
        player_name="Player A",
        session_duration_minutes=90
    )

    print("=== Player Session Analysis ===")
    print(f"Player: {results['player']}")
    print(f"Total Distance: {results['total_distance_km']:.2f} km")
    print(f"Average Speed: {results['average_speed_kmh']:.2f} km/h")
    print(f"Max Speed: {results['max_speed_kmh']:.2f} km/h")
    print(f"HSR Distance: {results['hsr_distance_m']:.0f} m")
    print(f"Sprint Distance: {results['sprint_distance_m']:.0f} m")
    print(f"Distance per Minute: {results['distance_per_minute']:.1f} m/min")
    print(f"High Accelerations: {results['high_accelerations']}")
    print(f"High Decelerations: {results['high_decelerations']}")
    print(f"Avg Metabolic Power: {results['avg_metabolic_power_wkg']:.2f} W/kg")

    # Visualize
    speeds = analyzer.calculate_speed(positions, timestamps)
    fig = analyzer.visualize_speed_profile(speeds, timestamps, "Player A")
    plt.savefig('speed_profile.png', dpi=300, bbox_inches='tight')
    plt.close()

    print("\nSpeed profile visualization saved as 'speed_profile.png'")
```

### Advanced Analysis: Team-Wide Comparison

```python
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
import seaborn as sns

# Assumes SpeedDistanceAnalyzer from the previous block is defined or imported


def load_tracking_data(file_path):
    """
    Load tracking data from CSV
    Expected columns: player_id, timestamp, x, y, speed, position
    """
    df = pd.read_csv(file_path)
    return df


def calculate_team_metrics(tracking_df, match_duration=90):
    """
    Calculate comprehensive metrics for all players
    """
    analyzer = SpeedDistanceAnalyzer()
    team_results = []

    for player_id in tracking_df['player_id'].unique():
        # Sort by time so consecutive rows are consecutive samples
        player_df = (tracking_df[tracking_df['player_id'] == player_id]
                     .sort_values('timestamp'))

        positions = player_df[['x', 'y']].values
        timestamps = player_df['timestamp'].values
        position = player_df['position'].iloc[0]

        metrics = analyzer.analyze_player_session(
            positions=positions,
            timestamps=timestamps,
            player_name=player_id,
            session_duration_minutes=match_duration
        )
        metrics['position'] = position
        team_results.append(metrics)

    return pd.DataFrame(team_results)


def cluster_player_profiles(metrics_df):
    """
    Cluster players based on their physical profiles
    """
    features = ['total_distance_km', 'hsr_distance_m', 'sprint_distance_m',
                'max_speed_kmh', 'high_accelerations', 'high_decelerations']

    X = metrics_df[features].values

    # Standardize features
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)

    # K-means clustering (n_init set explicitly for consistent behavior
    # across scikit-learn versions)
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
    metrics_df['cluster'] = kmeans.fit_predict(X_scaled)

    return metrics_df, kmeans


def visualize_team_heatmap(metrics_df):
    """
    Create heatmap of team metrics
    """
    metrics_cols = ['total_distance_km', 'hsr_distance_m', 'sprint_distance_m',
                    'max_speed_kmh', 'high_accelerations', 'high_decelerations']

    plt.figure(figsize=(12, 8))
    sns.heatmap(metrics_df[metrics_cols].T,
                xticklabels=metrics_df['player'],
                yticklabels=metrics_cols,
                cmap='YlOrRd', annot=True, fmt='.1f',
                cbar_kws={'label': 'Value'})
    plt.title('Team Physical Performance Heatmap')
    plt.xlabel('Player')
    plt.ylabel('Metric')
    plt.tight_layout()
    return plt.gcf()
```

## R Implementation

### Comprehensive Speed and Distance Analysis

```r
library(tidyverse)  # dplyr, tibble, ggplot2
library(gridExtra)  # grid.arrange
# The R6 package must be installed (it is called below as R6::R6Class)

# Speed and Distance Analyzer Class
SpeedDistanceAnalyzer <- R6::R6Class(
  "SpeedDistanceAnalyzer",

  public = list(
    hsr_threshold = NULL,
    sprint_threshold = NULL,

    initialize = function(hsr_threshold = 5.5, sprint_threshold = 7.0) {
      self$hsr_threshold <- hsr_threshold
      self$sprint_threshold <- sprint_threshold
    },

    calculate_distance = function(positions) {
      # Calculate total distance from x, y coordinates
      dx <- diff(positions$x)
      dy <- diff(positions$y)
      distances <- sqrt(dx^2 + dy^2)
      sum(distances, na.rm = TRUE)
    },

    calculate_speed = function(positions, timestamps) {
      # Calculate instantaneous speed (one value per segment, length n - 1)
      dx <- diff(positions$x)
      dy <- diff(positions$y)
      distances <- sqrt(dx^2 + dy^2)
      time_diff <- diff(timestamps)
      time_diff[time_diff == 0] <- 0.001
      speeds <- distances / time_diff
      speeds
    },

    calculate_acceleration = function(speeds, timestamps) {
      # Calculate acceleration from speed data.
      # Expects the full timestamp vector (length n); the first timestamp is
      # dropped here so the intervals line up with diff(speeds).
      speed_diff <- diff(speeds)
      time_diff <- diff(timestamps[-1])
      time_diff[time_diff == 0] <- 0.001
      accelerations <- speed_diff / time_diff
      accelerations
    },

    categorize_speed_zones = function(speeds, time_diff) {
      # Time (seconds) spent in each speed zone, assuming a roughly
      # constant sampling interval
      zones <- tibble(
        walking = sum(speeds < 2.0) * mean(time_diff),
        jogging = sum(speeds >= 2.0 & speeds < 4.0) * mean(time_diff),
        running = sum(speeds >= 4.0 & speeds < 5.5) * mean(time_diff),
        high_speed = sum(speeds >= 5.5 & speeds < 7.0) * mean(time_diff),
        sprint = sum(speeds >= 7.0) * mean(time_diff)
      )
      zones
    },

    calculate_hsr_distance = function(speeds, time_diff) {
      # Calculate high-speed running distance
      hsr_mask <- speeds >= self$hsr_threshold
      sum(speeds[hsr_mask] * time_diff[hsr_mask], na.rm = TRUE)
    },

    calculate_sprint_distance = function(speeds, time_diff) {
      # Calculate sprint distance
      sprint_mask <- speeds >= self$sprint_threshold
      sum(speeds[sprint_mask] * time_diff[sprint_mask], na.rm = TRUE)
    },

    detect_high_intensity_events = function(accelerations, threshold = 3.0) {
      # Detect high-intensity acceleration and deceleration events
      list(
        high_accelerations = sum(accelerations > threshold, na.rm = TRUE),
        high_decelerations = sum(accelerations < -threshold, na.rm = TRUE),
        total_high_intensity = sum(abs(accelerations) > threshold, na.rm = TRUE)
      )
    },

    analyze_player_session = function(tracking_data, player_id, session_duration_min) {
      # Comprehensive player session analysis
      player_data <- tracking_data %>%
        filter(player == player_id) %>%
        arrange(timestamp)

      positions <- player_data %>% select(x, y)
      timestamps <- player_data$timestamp

      # Calculate metrics
      total_distance <- self$calculate_distance(positions)
      speeds <- self$calculate_speed(positions, timestamps)
      time_diff <- diff(timestamps)

      avg_speed <- mean(speeds, na.rm = TRUE)
      max_speed <- max(speeds, na.rm = TRUE)

      # speeds and time_diff both have length n - 1, so they are passed as-is
      hsr_distance <- self$calculate_hsr_distance(speeds, time_diff)
      sprint_distance <- self$calculate_sprint_distance(speeds, time_diff)

      speed_zones <- self$categorize_speed_zones(speeds, time_diff)

      # Pass the full timestamp vector; calculate_acceleration trims it
      accelerations <- self$calculate_acceleration(speeds, timestamps)
      hi_events <- self$detect_high_intensity_events(accelerations)

      distance_per_minute <- total_distance / session_duration_min

      # Return tibble with results
      tibble(
        player = player_id,
        session_duration_min = session_duration_min,
        total_distance_m = total_distance,
        total_distance_km = total_distance / 1000,
        average_speed_ms = avg_speed,
        average_speed_kmh = avg_speed * 3.6,
        max_speed_ms = max_speed,
        max_speed_kmh = max_speed * 3.6,
        hsr_distance_m = hsr_distance,
        sprint_distance_m = sprint_distance,
        distance_per_minute = distance_per_minute,
        high_accelerations = hi_events$high_accelerations,
        high_decelerations = hi_events$high_decelerations,
        total_high_intensity = hi_events$total_high_intensity
      )
    },

    visualize_speed_profile = function(tracking_data, player_id) {
      # Create speed profile visualization
      player_data <- tracking_data %>%
        filter(player == player_id) %>%
        arrange(timestamp)

      positions <- player_data %>% select(x, y)
      timestamps <- player_data$timestamp
      speeds <- self$calculate_speed(positions, timestamps) * 3.6  # Convert to km/h

      plot_data <- tibble(
        time = timestamps[-1],
        speed = speeds
      )

      # linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4)
      p1 <- ggplot(plot_data, aes(x = time, y = speed)) +
        geom_line(color = "#2E86AB", linewidth = 0.5) +
        geom_hline(yintercept = self$hsr_threshold * 3.6,
                   color = "orange", linetype = "dashed", linewidth = 1) +
        geom_hline(yintercept = self$sprint_threshold * 3.6,
                   color = "red", linetype = "dashed", linewidth = 1) +
        labs(
          title = paste("Speed Profile:", player_id),
          x = "Time (seconds)",
          y = "Speed (km/h)"
        ) +
        theme_minimal() +
        theme(
          plot.title = element_text(size = 14, face = "bold"),
          axis.title = element_text(size = 12)
        )

      p2 <- ggplot(plot_data, aes(x = speed)) +
        geom_histogram(bins = 50, fill = "#2E86AB", color = "black", alpha = 0.7) +
        geom_vline(xintercept = self$hsr_threshold * 3.6,
                   color = "orange", linetype = "dashed", linewidth = 1) +
        geom_vline(xintercept = self$sprint_threshold * 3.6,
                   color = "red", linetype = "dashed", linewidth = 1) +
        labs(
          title = "Speed Distribution",
          x = "Speed (km/h)",
          y = "Frequency"
        ) +
        theme_minimal() +
        theme(
          plot.title = element_text(size = 14, face = "bold"),
          axis.title = element_text(size = 12)
        )

      grid.arrange(p1, p2, nrow = 2)
    }
  )
)

# Position comparison function
compare_positions <- function(team_metrics) {
  # Compare metrics across positions
  position_summary <- team_metrics %>%
    group_by(position) %>%
    summarise(
      n_players = n(),
      avg_total_distance_km = mean(total_distance_km, na.rm = TRUE),
      sd_total_distance_km = sd(total_distance_km, na.rm = TRUE),
      avg_hsr_m = mean(hsr_distance_m, na.rm = TRUE),
      sd_hsr_m = sd(hsr_distance_m, na.rm = TRUE),
      avg_sprint_m = mean(sprint_distance_m, na.rm = TRUE),
      sd_sprint_m = sd(sprint_distance_m, na.rm = TRUE),
      avg_max_speed_kmh = mean(max_speed_kmh, na.rm = TRUE),
      avg_high_accelerations = mean(high_accelerations, na.rm = TRUE),
      avg_high_decelerations = mean(high_decelerations, na.rm = TRUE)
    )

  position_summary
}

# Visualization function for position comparison
visualize_position_comparison <- function(position_summary) {
  # Create multi-panel position comparison plot
  p1 <- ggplot(position_summary, aes(x = position, y = avg_total_distance_km)) +
    geom_bar(stat = "identity", fill = "#2E86AB", alpha = 0.7) +
    geom_errorbar(aes(ymin = avg_total_distance_km - sd_total_distance_km,
                      ymax = avg_total_distance_km + sd_total_distance_km),
                  width = 0.2) +
    labs(title = "Total Distance by Position", y = "Distance (km)", x = "") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))

  p2 <- ggplot(position_summary, aes(x = position, y = avg_hsr_m)) +
    geom_bar(stat = "identity", fill = "orange", alpha = 0.7) +
    geom_errorbar(aes(ymin = avg_hsr_m - sd_hsr_m,
                      ymax = avg_hsr_m + sd_hsr_m),
                  width = 0.2) +
    labs(title = "HSR Distance by Position", y = "Distance (m)", x = "") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))

  p3 <- ggplot(position_summary, aes(x = position, y = avg_sprint_m)) +
    geom_bar(stat = "identity", fill = "red", alpha = 0.7) +
    geom_errorbar(aes(ymin = avg_sprint_m - sd_sprint_m,
                      ymax = avg_sprint_m + sd_sprint_m),
                  width = 0.2) +
    labs(title = "Sprint Distance by Position", y = "Distance (m)", x = "") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))

  p4 <- ggplot(position_summary, aes(x = position, y = avg_max_speed_kmh)) +
    geom_bar(stat = "identity", fill = "#A23B72", alpha = 0.7) +
    labs(title = "Max Speed by Position", y = "Speed (km/h)", x = "") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))

  grid.arrange(p1, p2, p3, p4, nrow = 2, ncol = 2)
}

# Example usage
# analyzer <- SpeedDistanceAnalyzer$new()
# tracking_data <- read_csv("player_tracking_data.csv")
# results <- analyzer$analyze_player_session(tracking_data, "Player_1", 90)
# print(results)
```
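The R methods above derive HSR and sprint distance by weighting each speed sample by its time interval. For readers working from the Python class, a comparable per-zone calculation could be sketched as follows. This is a minimal illustration, not part of the class: the names `ZONE_EDGES` and `zone_time_and_distance` are hypothetical, the zone bounds are copied from the speed zones used earlier in this article, and zero-duration segments are assumed to carry no movement.

```python
import numpy as np

# Zone bounds in m/s, copied from the speed zones used earlier in the article.
# ZONE_EDGES and zone_time_and_distance are illustrative names only.
ZONE_EDGES = {
    "walking": (0.0, 2.0),
    "jogging": (2.0, 4.0),
    "running": (4.0, 5.5),
    "high_speed": (5.5, 7.0),
    "sprint": (7.0, np.inf),
}


def zone_time_and_distance(positions, timestamps, zones=ZONE_EDGES):
    """Return {zone: (seconds, meters)} from per-segment speed and duration."""
    positions = np.asarray(positions, dtype=float)
    timestamps = np.asarray(timestamps, dtype=float)

    # One distance and one duration per consecutive pair of samples
    seg_dist = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    seg_time = np.diff(timestamps)

    # Assumption: segments with zero duration are treated as speed 0
    speeds = np.divide(seg_dist, seg_time,
                       out=np.zeros_like(seg_dist), where=seg_time > 0)

    result = {}
    for name, (low, high) in zones.items():
        mask = (speeds >= low) & (speeds < high)
        # Summing per-segment values handles irregular sampling intervals
        result[name] = (float(seg_time[mask].sum()), float(seg_dist[mask].sum()))
    return result


# Tiny example: three one-second segments covered at 1, 3 and 8 m/s
example = zone_time_and_distance(
    positions=[[0, 0], [1, 0], [4, 0], [12, 0]],
    timestamps=[0.0, 1.0, 2.0, 3.0],
)
print(example)
```

Summing the segments that fall in each zone, rather than multiplying a sample count by the mean interval, keeps the totals correct when the sampling rate varies or samples are dropped.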
### Statistical Testing for Position Differences

```r
library(car)  # leveneTest

# ANOVA for position differences
test_position_differences <- function(team_metrics, metric_name) {
  # Perform one-way ANOVA
  formula_str <- paste(metric_name, "~ position")
  anova_model <- aov(as.formula(formula_str), data = team_metrics)

  # Check assumptions
  shapiro_test <- shapiro.test(residuals(anova_model))
  levene_test <- leveneTest(as.formula(formula_str), data = team_metrics)

  # Post-hoc tests (Tukey HSD)
  tukey_results <- TukeyHSD(anova_model)

  list(
    anova_summary = summary(anova_model),
    shapiro_test = shapiro_test,
    levene_test = levene_test,
    tukey_hsd = tukey_results
  )
}

# Example: Test if total distance differs by position
# results <- test_position_differences(team_metrics, "total_distance_km")
# print(results$anova_summary)
# print(results$tukey_hsd)
```

## Practical Applications

### 1. Load Monitoring

**Acute:Chronic Workload Ratio (ACWR)**:

```python
import numpy as np


def calculate_acwr(distances, window_acute=7, window_chronic=28):
    """
    Calculate Acute:Chronic Workload Ratio

    Parameters:
    -----------
    distances : list or array
        Daily distance values, oldest first
    window_acute : int
        Acute workload window (days)
    window_chronic : int
        Chronic workload window (days)

    Returns:
    --------
    float : ACWR value (0 if the chronic load is zero)
    """
    distances = np.asarray(distances, dtype=float)
    acute_load = np.mean(distances[-window_acute:])
    chronic_load = np.mean(distances[-window_chronic:])

    if chronic_load == 0:
        return 0.0

    acwr = acute_load / chronic_load
    return acwr

# Injury risk zones
# ACWR < 0.8: Low risk (possible detraining)
# ACWR 0.8-1.3: Optimal range
# ACWR 1.3-1.5: Caution (load rising faster than usual)
# ACWR > 1.5: High injury risk
```

### 2. Fitness Benchmarking

Compare player metrics to position-specific benchmarks to identify:
- Players underperforming physically
- Recovery needs
- Training adaptations

### 3. Return-to-Play Protocols

Track players returning from injury:
- Progressive increase in distance/intensity
- Monitor HSR and sprint volumes
- Ensure ACWR stays within safe ranges

### 4. Game Analysis

- Identify periods of high/low intensity
- Assess tactical demands
- Evaluate substitution timing
- Compare home vs. away performance

### 5. Opponent Analysis

Analyze opposition physical profiles to:
- Predict game tempo
- Identify physically vulnerable players
- Adjust tactical approach

## Best Practices

### Data Quality

1. **Sampling Rate**: Minimum 10 Hz for accurate speed metrics
2. **Calibration**: Regular system calibration and validation
3. **Data Cleaning**: Remove outliers, smooth noise, handle missing values
4. **Contextual Factors**: Account for weather, pitch conditions, altitude

### Interpretation

1. **Individualization**: Establish player-specific baselines
2. **Context Matters**: Consider game state, opposition, tactics
3. **Trend Analysis**: Focus on patterns over time, not single sessions
4. **Multi-Metric Approach**: Don't rely on single metrics

### Communication

1. **Visualizations**: Clear, actionable graphics for coaches
2. **Actionable Insights**: Translate data into training recommendations
3. **Regular Reporting**: Consistent format and frequency
4. **Player Education**: Help athletes understand their data

## Limitations and Considerations

### Technical Limitations

- GPS accuracy decreases at very high speeds
- Indoor tracking requires different technology
- Occlusion issues with optical systems
- Battery life and wearable comfort

### Analytical Limitations

- Speed thresholds are somewhat arbitrary and sport/position-specific
- Total distance doesn't capture directional changes
- Distance and speed metrics don't account for technical/tactical actions
- Individual variation in running economy

### Implementation Challenges

- Cost of technology and infrastructure
- Staff training requirements
- Data management and storage
- Integration with existing workflows

## Future Directions

### Emerging Technologies

1. **Computer Vision**: Markerless tracking from broadcast footage
2. **IMU Integration**: Better acceleration/deceleration capture
3. **Real-Time Analytics**: Live dashboards during matches
4. **Biomechanical Integration**: Combine movement with load/strain

### Advanced Analytics

1. **Machine Learning**: Predictive models for injury, performance
2. **Network Analysis**: Spatial relationships and team dynamics
3. **Contextual Metrics**: Possession-adjusted, opposition-adjusted values
4. **Fatigue Modeling**: Real-time fatigue estimation

### Personalization

1. **Individual Thresholds**: Player-specific speed zones
2. **Metabolic Phenotyping**: Account for individual efficiency
3. **Genetic Factors**: Integration with genetic/physiological data

## References

1. Buchheit, M., & Simpson, B. M. (2017). Player-tracking technology: Half-full or half-empty glass? International Journal of Sports Physiology and Performance, 12(s2), S2-35.
2. Carling, C., Bloomfield, J., Nelsen, L., & Reilly, T. (2008). The role of motion analysis in elite soccer: Contemporary performance measurement techniques and work rate data. Sports Medicine, 38(10), 839-862.
3. Di Prampero, P. E., Fusi, S., Sepulcri, L., Morin, J. B., Belli, A., & Antonutto, G. (2005). Sprint running: A new energetic approach. Journal of Experimental Biology, 208(14), 2809-2816.
4. Gabbett, T. J. (2016). The training-injury prevention paradox: Should athletes be training smarter and harder? British Journal of Sports Medicine, 50(5), 273-280.
5. Malone, J. J., Lovell, R., Varley, M. C., & Coutts, A. J. (2017). Unpacking the black box: Applications and considerations for using GPS devices in sport. International Journal of Sports Physiology and Performance, 12(s2), S2-18.
6. Osgnach, C., Poser, S., Bernardini, R., Rinaldo, R., & Di Prampero, P. E. (2010). Energy cost and metabolic power in elite soccer: A new match analysis approach. Medicine & Science in Sports & Exercise, 42(1), 170-178.
7. Sweeting, A. J., Cormack, S. J., Morgan, S., & Aughey, R. J. (2017). When is a sprint a sprint? A review of the analysis of team-sport athlete activity profile. Frontiers in Physiology, 8, 432.
