Chapter 23: Key Takeaways - Network Analysis in Football

Quick Reference Summary

This chapter covered applying network analysis to football, from passing networks to coaching trees and recruiting pipelines.


Core Concepts

Network Components

Component  | Definition               | Football Example
-----------|--------------------------|-----------------------------
Node       | Entity in network        | Player, coach, school
Edge       | Connection between nodes | Pass, mentorship, recruit
Weight     | Strength of connection   | Target count, years together
Directed   | One-way connection       | Pass (QB → WR)
Undirected | Two-way connection       | Blocking assignment

Network Types in Football

Network Type  | Nodes      | Edges       | Use Case
--------------|------------|-------------|--------------------
Passing       | Players    | Passes      | Offensive structure
Coaching      | Coaches    | Mentorship  | Scheme tracing
Recruiting    | Schools    | Recruits    | Pipeline analysis
Play Sequence | Play types | Transitions | Pattern detection

Essential Formulas

Network Density

Density = E / [N × (N-1)]   (directed)
Density = 2E / [N × (N-1)] (undirected)

Where: E = edges, N = nodes
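
As a quick sanity check, the directed formula can be compared against NetworkX's built-in `nx.density`. This is a minimal sketch on a hypothetical toy network (one QB, three receivers); the node names are illustrative only.

```python
import networkx as nx

# Hypothetical toy passing network: one QB throwing to three receivers.
G = nx.DiGraph()
G.add_edges_from([("QB", "WR1"), ("QB", "WR2"), ("QB", "TE")])

n = G.number_of_nodes()  # N = 4
e = G.number_of_edges()  # E = 3

# Directed density: E / [N * (N - 1)]
manual_density = e / (n * (n - 1))

# nx.density uses the directed formula for a DiGraph and the
# undirected formula (2E / [N * (N - 1)]) for a Graph.
builtin_density = nx.density(G)
undirected_density = nx.density(G.to_undirected())

print(manual_density, builtin_density, undirected_density)
```

The same three edges give a density of 0.25 when treated as directed and 0.5 when treated as undirected, so the graph type should be fixed before densities are compared across teams.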

Target Share

Target Share = Player Targets / Total Team Targets

Herfindahl-Hirschman Index (Concentration)

HHI = Σ (share_i)²

Range: 1/n (perfectly spread) to 1 (completely concentrated)
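
The HHI formula can be sketched in a few lines. The helper name `herfindahl_index` and the target counts are hypothetical, chosen only to illustrate the two ends of the range.

```python
def herfindahl_index(targets):
    """Return the HHI (sum of squared shares) for a list of target counts."""
    total = sum(targets)
    if total == 0:
        raise ValueError("HHI is undefined when there are no targets")
    return sum((t / total) ** 2 for t in targets)

# Hypothetical target counts for four receivers (60%, 20%, 12%, 8% shares).
concentrated = herfindahl_index([60, 20, 12, 8])

# A perfectly even split across n = 4 receivers gives the lower bound 1/n.
even = herfindahl_index([25, 25, 25, 25])

print(round(concentrated, 4), even)
```

Because the lower bound depends on the number of receivers, HHI values are easiest to compare between offenses that target a similar number of players.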

In-Degree Centrality

C_in(v) = deg_in(v) / (N - 1)

Where: deg_in(v) = number of incoming edges
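
The formula counts distinct incoming edges, so in a passing network it measures how many different passers targeted a player, not how many targets he received. A minimal sketch on a hypothetical graph, checked against `nx.in_degree_centrality`:

```python
import networkx as nx

# Hypothetical network: the QB targets two receivers, and the RB
# throws one pass to WR1 (for example, on a trick play).
G = nx.DiGraph()
G.add_edges_from([("QB", "WR1"), ("QB", "WR2"), ("RB", "WR1")])

n = G.number_of_nodes()

# C_in(v) = deg_in(v) / (N - 1)
manual = {v: G.in_degree(v) / (n - 1) for v in G.nodes()}
builtin = nx.in_degree_centrality(G)

print(manual["WR1"], builtin["WR1"])
```

To measure target volume instead, the weighted in-degree `G.in_degree(v, weight="weight")` sums the edge weights rather than counting edges.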

PageRank (simplified)

PR(v) = (1-d)/N + d × Σ[PR(u)/out_degree(u)]

Where: d = damping factor (~0.85)
       u = nodes pointing to v
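
The simplified formula can be iterated directly. The sketch below is one possible implementation, not necessarily the one NetworkX uses; it adds one assumption beyond the formula: rank held by nodes with no outgoing edges (most receivers never throw) is spread evenly over all nodes so the scores keep summing to 1. The function name and the edge list are hypothetical.

```python
def simple_pagerank(edges, d=0.85, iterations=100):
    """Iterate PR(v) = (1 - d)/N + d * sum(PR(u) / out_degree(u)).

    Assumption: rank sitting on nodes without outgoing edges is
    redistributed evenly so that the scores always sum to 1.
    """
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    out_neighbors = {v: [] for v in nodes}
    for u, v in edges:
        out_neighbors[u].append(v)

    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Total rank currently held by nodes with no outgoing edges
        dangling = sum(rank[v] for v in nodes if not out_neighbors[v])
        new_rank = {v: (1 - d) / n + d * dangling / n for v in nodes}
        # Each node passes an equal slice of its rank along every out-edge
        for u in nodes:
            for v in out_neighbors[u]:
                new_rank[v] += d * rank[u] / len(out_neighbors[u])
        rank = new_rank
    return rank

# Hypothetical edges; WR2 -> WR1 stands in for a lateral or trick play.
edges = [("QB", "WR1"), ("QB", "WR2"), ("WR2", "WR1")]
scores = simple_pagerank(edges)
print(scores)
```

WR1 ends up with the highest score because it receives rank from both the QB and WR2, which is the "valued by key players" idea behind PageRank.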

Code Patterns

Building a Passing Network

import networkx as nx
import pandas as pd

def build_passing_network(passes: pd.DataFrame) -> nx.DiGraph:
    """Build passing network from play data."""
    G = nx.DiGraph()

    # Aggregate passes
    agg = passes.groupby(['passer', 'receiver']).agg({
        'play_id': 'count',
        'yards': 'sum'
    }).reset_index()

    # Add edges
    for _, row in agg.iterrows():
        G.add_edge(
            row['passer'],
            row['receiver'],
            weight=row['play_id'],
            yards=row['yards']
        )

    return G
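
An alternative sketch that avoids the explicit loop uses `nx.from_pandas_edgelist`. The sample data below is hypothetical and assumes the same column names (`play_id`, `passer`, `receiver`, `yards`) as the function above.

```python
import networkx as nx
import pandas as pd

# Hypothetical play-level data: one row per pass attempt.
passes = pd.DataFrame({
    "play_id": [1, 2, 3, 4],
    "passer": ["QB", "QB", "QB", "QB"],
    "receiver": ["WR1", "WR1", "TE", "WR1"],
    "yards": [12, 0, 7, 25],
})

# One row per passer-receiver pair: pass count becomes the edge weight.
agg = (
    passes.groupby(["passer", "receiver"])
    .agg(weight=("play_id", "count"), yards=("yards", "sum"))
    .reset_index()
)

# Build the directed graph straight from the aggregated edge list.
G = nx.from_pandas_edgelist(
    agg,
    source="passer",
    target="receiver",
    edge_attr=["weight", "yards"],
    create_using=nx.DiGraph,
)

print(G["QB"]["WR1"])
```

Naming the count column `weight` during aggregation keeps the edge attribute consistent with NetworkX functions that look for `weight` by default.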

Calculate Target Share

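def calculate_target_share(G: nx.DiGraph, qb: str) -> pd.DataFrame:
    """Calculate target share for each receiver the QB has thrown to."""
    # Materialize the edge view once so it can be reused below
    out_edges = list(G.out_edges(qb, data=True))
    total = sum(d['weight'] for _, _, d in out_edges)

    # Avoid division by zero (and a KeyError on sort) when the QB has no targets
    if total == 0:
        return pd.DataFrame(columns=['receiver', 'targets', 'share'])

    results = [
        {'receiver': v, 'targets': d['weight'],
         'share': d['weight'] / total}
        for _, v, d in out_edges
    ]
    return pd.DataFrame(results).sort_values('share', ascending=False)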

Centrality Calculation

def calculate_centralities(G: nx.Graph) -> pd.DataFrame:
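    """Calculate degree, betweenness, and PageRank centrality per node."""
    # Each NetworkX call returns a dict keyed in G.nodes() order,
    # so the value lists below line up with the 'node' column.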
    return pd.DataFrame({
        'node': list(G.nodes()),
        'degree': list(nx.degree_centrality(G).values()),
        'betweenness': list(nx.betweenness_centrality(G).values()),
        'pagerank': list(nx.pagerank(G).values())
    })

Community Detection

# The `community` module is provided by the python-louvain package
from community import community_louvain

def detect_communities(G: nx.Graph) -> dict:
    """Detect communities using Louvain algorithm."""
    if G.is_directed():
        G = G.to_undirected()
    return community_louvain.best_partition(G)
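
If python-louvain is not installed, NetworkX ships its own community detection routines. The sketch below swaps in `greedy_modularity_communities`, which is a different algorithm from Louvain and may produce different groupings; the toy graph is hypothetical.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical graph: two tight triangles joined by a single bridge (C-D).
G = nx.Graph()
G.add_edges_from([
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("D", "E"), ("E", "F"), ("D", "F"),
    ("C", "D"),
])

# Returns a list of node sets, one per detected community.
communities = greedy_modularity_communities(G)

# Convert to the {node: community_id} shape that best_partition returns.
partition = {
    node: idx
    for idx, group in enumerate(communities)
    for node in group
}

print(partition)
```

Converting the result to a node-to-community dictionary keeps downstream code (for example, node coloring) independent of which algorithm produced the partition.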

Centrality Interpretation

Football Context

Metric      | High Value Means      | Player Example
------------|-----------------------|------------------------
In-Degree   | Heavily targeted      | WR1, primary option
Out-Degree  | Distributes ball      | QB, playmaker
Betweenness | Connects offense      | Slot WR, versatile TE
PageRank    | Valued by key players | Go-to receiver
Closeness   | Quick access to all   | Underneath route runner

Receiver Role Classification

Role             | Network Signature
-----------------|------------------------------------
Primary Target   | High in-degree, high PageRank
Secondary Option | Medium metrics across board
Versatile Player | High betweenness
Specialist       | High metrics in specific situations
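
The signatures above can be turned into a simple rule-based classifier. This is only a sketch: the function name, the assumption that scores are normalized to a 0-1 range, and the `high` and `medium` thresholds are all hypothetical and would need tuning on real data. Because the Specialist role depends on situational splits, overall metrics alone cannot identify it, so the fallback label flags the player for situational review instead.

```python
def classify_receiver(in_degree, pagerank, betweenness, high=0.6, medium=0.3):
    """Map normalized (0-1) centrality scores to a role label.

    The thresholds are illustrative assumptions, not values from the chapter.
    """
    if in_degree >= high and pagerank >= high:
        return "Primary Target"
    if betweenness >= high:
        return "Versatile Player"
    if all(score >= medium for score in (in_degree, pagerank, betweenness)):
        return "Secondary Option"
    # Specialists only show up in situational splits (e.g., red zone, third down)
    return "Needs situational review"

print(classify_receiver(0.8, 0.7, 0.2))
print(classify_receiver(0.2, 0.2, 0.9))
```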

Common Network Patterns

Concentrated Passing Attack

QB ──────────────────► WR1 (60% share)
   ├────────────────► WR2 (20% share)
   ├──────────────► RB  (12% share)
   └────────────► TE   (8% share)

HHI: ~0.42 (concentrated)

Spread Passing Attack

QB ──────────────────► WR1 (28% share)
   ├────────────────► WR2 (24% share)
   ├──────────────► WR3 (22% share)
   ├────────────► RB  (14% share)
   └──────────► TE  (12% share)

HHI: ~0.22 (spread)

Visualization Guidelines

Node Sizing

  • Size by importance (targets, influence)
  • QB typically largest or central
  • Receivers sized by involvement

Edge Styling

  • Width by weight (frequency)
  • Color by type (completion vs. incompletion)
  • Arrow direction for passes

Layout Options

  • Spring layout for general networks
  • Hierarchical for coaching trees
  • Geographic for recruiting
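
The sizing and styling guidelines above can be sketched as a helper that derives drawing parameters from edge weights. The function name, the scaling constants, and the toy graph are hypothetical; involvement is assumed here to mean passes thrown plus passes received.

```python
import networkx as nx

def style_passing_network(G, min_size=300, size_scale=100, width_scale=0.5):
    """Return layout positions, node sizes, and edge widths for drawing."""
    # Spring layout for a general network; the seed makes it reproducible
    pos = nx.spring_layout(G, seed=42)

    # Involvement = weighted passes thrown + weighted passes received
    involvement = {
        v: G.in_degree(v, weight="weight") + G.out_degree(v, weight="weight")
        for v in G.nodes()
    }
    node_sizes = [min_size + size_scale * involvement[v] for v in G.nodes()]

    # Edge width proportional to pass frequency
    edge_widths = [
        width_scale * d.get("weight", 1) for _, _, d in G.edges(data=True)
    ]
    return pos, node_sizes, edge_widths

# Hypothetical network: 6 targets to WR1, 2 to the TE.
G = nx.DiGraph()
G.add_edge("QB", "WR1", weight=6)
G.add_edge("QB", "TE", weight=2)

pos, node_sizes, edge_widths = style_passing_network(G)
print(node_sizes, edge_widths)
```

These values can then be passed to `nx.draw_networkx(G, pos, node_size=node_sizes, width=edge_widths, arrows=True)`, which requires matplotlib. With this sizing rule the QB comes out largest, since every pass in the network runs through him.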

Common Pitfalls

1. Ignoring Edge Direction

Wrong: Treating passing as undirected

G = nx.Graph()  # Wrong for passes

Right: Use directed graph

G = nx.DiGraph()  # Correct
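
A short sketch shows what is lost when direction is dropped. The throwback edge from WR1 to the QB is a hypothetical trick play, included only to make the difference visible.

```python
import networkx as nx

# (passer, receiver, pass count); the second edge is a hypothetical throwback.
edges = [("QB", "WR1", 8), ("WR1", "QB", 1)]

directed = nx.DiGraph()
directed.add_weighted_edges_from(edges)

undirected = nx.Graph()
undirected.add_weighted_edges_from(edges)

# The DiGraph keeps both edges with their own weights.
print(directed.number_of_edges(), directed["QB"]["WR1"]["weight"])

# The Graph collapses them into one edge, and the later weight overwrites the earlier one.
print(undirected.number_of_edges(), undirected["QB"]["WR1"]["weight"])
```

In the undirected version the eight QB-to-WR1 passes are silently replaced by the single throwback, so both direction and volume are lost.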

2. Unweighted Analysis

Wrong: Ignoring target frequency

nx.degree_centrality(G)  # Counts connections only

Right: Use weighted metrics

nx.pagerank(G, weight='weight')  # 'weight' is the attribute set in build_passing_network
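
The difference between unweighted and weighted metrics is easy to see on a small example. In this hypothetical network the two receivers look identical by unweighted in-degree centrality, while the weighted in-degree separates them by target volume.

```python
import networkx as nx

# Hypothetical network: WR1 gets three times as many targets as WR2.
G = nx.DiGraph()
G.add_edge("QB", "WR1", weight=60)
G.add_edge("QB", "WR2", weight=20)

# Unweighted: each receiver has exactly one incoming edge.
unweighted = nx.in_degree_centrality(G)

# Weighted: sum the 'weight' attribute over incoming edges.
weighted = dict(G.in_degree(weight="weight"))

print(unweighted)
print(weighted)
```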

3. Missing Context

Wrong: Raw centrality without interpretation

Right: Contextualize for football

# High betweenness + slot position = versatile option

Analysis Checklist

Building Networks

  • [ ] Identify appropriate node types
  • [ ] Choose directed vs. undirected
  • [ ] Define edge weights meaningfully
  • [ ] Handle missing data

Calculating Metrics

  • [ ] Use weighted centrality when available
  • [ ] Calculate multiple metrics for comparison
  • [ ] Normalize for network size
  • [ ] Interpret in football context

Visualization

  • [ ] Choose appropriate layout
  • [ ] Size/color nodes meaningfully
  • [ ] Scale edge widths properly
  • [ ] Add clear labels

Analysis

  • [ ] Compare to league/team averages
  • [ ] Track changes over time
  • [ ] Identify outliers
  • [ ] Connect to performance outcomes

Quick Reference Tables

Centrality Comparison

Metric      | Computation         | Best For
------------|---------------------|-------------------
Degree      | Count connections   | Volume
Betweenness | Path frequency      | Connectors
Closeness   | Average distance    | Accessibility
Eigenvector | Neighbor importance | Quality
PageRank    | Weighted influence  | Overall importance

Network Statistics

Statistic       | Good Range | Interpretation
----------------|------------|----------------
Density         | 0.1-0.4    | Connection rate
Clustering      | 0.3-0.6    | Local grouping
Avg Path Length | 2-4        | Connectivity
Components      | 1          | Fully connected
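
The four statistics in the table can be collected with a small helper. This is a sketch under two assumptions: clustering, path length, and components are computed on the undirected version of the graph, and the average path length is reported as `None` when the graph is not connected. The function name and toy graph are hypothetical, and the helper expects a non-empty graph.

```python
import networkx as nx

def network_summary(G):
    """Return density, clustering, average path length, and component count."""
    # Direction is ignored for everything except density
    U = G.to_undirected()
    connected = nx.is_connected(U)
    return {
        "density": nx.density(G),
        "clustering": nx.average_clustering(U),
        # Average shortest path is only defined on a connected graph
        "avg_path_length": (
            nx.average_shortest_path_length(U) if connected else None
        ),
        "components": nx.number_connected_components(U),
    }

# Hypothetical star-shaped passing network: every pass comes from the QB.
G = nx.DiGraph()
G.add_edges_from([("QB", "WR1"), ("QB", "WR2"), ("QB", "TE")])

summary = network_summary(G)
print(summary)
```

A pure star like this one has zero clustering, because no two receivers are connected to each other, which is typical of single-game passing networks with one passer.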

Next Steps

After mastering network analysis, proceed to:

  • Chapter 24: Computer Vision and Tracking Data
  • Chapter 25: Natural Language Processing for Scouting
  • Chapter 27: Building a Complete Analytics System