Chapter 23: Key Takeaways - Network Analysis in Football
Quick Reference Summary
This chapter covered applying network analysis to football, from passing networks to coaching trees and recruiting pipelines.
Core Concepts
Network Components
| Component | Definition | Football Example |
|---|---|---|
| Node | Entity in network | Player, coach, school |
| Edge | Connection between nodes | Pass, mentorship, recruit |
| Weight | Strength of connection | Target count, years together |
| Directed | One-way connection | Pass (QB → WR) |
| Undirected | Two-way connection | Blocking assignment |
Network Types in Football
| Network Type | Nodes | Edges | Use Case |
|---|---|---|---|
| Passing | Players | Passes | Offensive structure |
| Coaching | Coaches | Mentorship | Scheme tracing |
| Recruiting | Schools | Recruits | Pipeline analysis |
| Play Sequence | Play types | Transitions | Pattern detection |
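As an illustration of the play-sequence row, the transitions can be sketched by counting consecutive play-type pairs: each pair is a directed edge and its count is the edge weight. The play labels and the function name `transition_counts` are hypothetical, and this sketch ignores drive boundaries, which a real analysis would likely want to respect.

```python
from collections import Counter


def transition_counts(plays):
    """Count directed transitions between consecutive play types."""
    # Pair each play with the play that immediately follows it
    return Counter(zip(plays, plays[1:]))


# Hypothetical play sequence (not real data)
plays = ["run", "pass", "run", "pass", "pass", "run"]
edges = transition_counts(plays)

for (src, dst), count in sorted(edges.items()):
    print(f"{src} -> {dst}: {count}")
```

Each `(source, target)` key with its count could then be added as a weighted edge to a directed graph.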
Essential Formulas
Network Density
Density = E / [N × (N-1)] (directed)
Density = 2E / [N × (N-1)] (undirected)
Where: E = edges, N = nodes
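The two density formulas can be checked by hand with a small helper. This is a minimal sketch in plain Python; the function name `density` and the five-node example are illustrative assumptions.

```python
def density(n_nodes: int, n_edges: int, directed: bool = True) -> float:
    """Share of possible edges that actually exist."""
    if n_nodes < 2:
        return 0.0  # no edges are possible with fewer than two nodes
    # Directed: N * (N - 1) ordered pairs; undirected: half as many
    possible = n_nodes * (n_nodes - 1)
    if not directed:
        possible = possible / 2
    return n_edges / possible


# Hypothetical network: a QB plus four receivers, four passing edges
directed_density = density(5, 4, directed=True)     # 4 / 20
undirected_density = density(5, 4, directed=False)  # 2 * 4 / 20
```

NetworkX exposes the same calculation as `nx.density(G)`, which picks the formula based on whether `G` is directed.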
Target Share
Target Share = Player Targets / Total Team Targets
Herfindahl-Hirschman Index (Concentration)
HHI = Σ (share_i)²
Range: 1/n (targets spread evenly across n receivers) to 1 (all targets to one receiver)
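A short sketch of the HHI calculation from raw target counts, which also demonstrates both ends of the range. The function name `hhi` and the counts are illustrative assumptions.

```python
def hhi(targets):
    """Herfindahl-Hirschman Index from a list of per-receiver target counts."""
    total = sum(targets)
    if total == 0:
        raise ValueError("no targets recorded")
    # Square each receiver's share of the total and sum the squares
    return sum((t / total) ** 2 for t in targets)


even_split = hhi([10, 10, 10, 10])  # four receivers, equal shares: 1/4
one_target = hhi([40, 0, 0, 0])     # every target to one receiver: 1
```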
In-Degree Centrality
C_in(v) = deg_in(v) / (N - 1)
Where: deg_in(v) = number of incoming edges
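The formula can be applied directly to an edge list. The sketch below counts distinct incoming connections and ignores edge weights; the player labels (including a running back who throws a pass) are hypothetical.

```python
def in_degree_centrality(edges):
    """In-degree centrality: distinct incoming edges divided by (N - 1)."""
    nodes = {n for edge in edges for n in edge}
    if len(nodes) < 2:
        return {n: 0.0 for n in nodes}
    # Track which distinct nodes point at each node
    incoming = {n: set() for n in nodes}
    for source, target in edges:
        incoming[target].add(source)
    return {n: len(src) / (len(nodes) - 1) for n, src in incoming.items()}


# Hypothetical directed passing edges (passer, receiver)
edges = [("QB", "WR1"), ("QB", "WR2"), ("QB", "TE"), ("RB", "WR1")]
centrality = in_degree_centrality(edges)
```

With five nodes, WR1 receives from two distinct passers and scores 2 / 4, while the passers themselves score 0. `nx.in_degree_centrality(G)` performs the same normalization on a NetworkX `DiGraph`.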
PageRank (simplified)
PR(v) = (1-d)/N + d × Σ[PR(u)/out_degree(u)]
Where: d = damping factor (~0.85)
u = nodes pointing to v
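The simplified formula can be iterated until the scores settle. The sketch below is plain Python under two stated assumptions: a fixed number of iterations instead of a convergence test, and rank from nodes with no outgoing edges (receivers who never throw) spread evenly over all nodes, a detail the simplified formula leaves unspecified.

```python
def pagerank(edges, d=0.85, iterations=100):
    """Power-iteration PageRank on a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    n = len(nodes)
    out = {v: [] for v in nodes}
    for source, target in edges:
        out[source].append(target)

    pr = {v: 1.0 / n for v in nodes}  # start from a uniform distribution
    for _ in range(iterations):
        # Assumption: rank held by nodes without out-edges is shared evenly
        dangling = sum(pr[u] for u in nodes if not out[u])
        new = {v: (1 - d) / n + d * dangling / n for v in nodes}
        # Each node passes its rank equally along its outgoing edges
        for u in nodes:
            for v in out[u]:
                new[v] += d * pr[u] / len(out[u])
        pr = new
    return pr


# Hypothetical networks: a three-node cycle and a QB with two receivers
cycle = pagerank([("A", "B"), ("B", "C"), ("C", "A")])
star = pagerank([("QB", "WR1"), ("QB", "WR2")])
```

In the cycle every node ends with the same score, and in the star both receivers outrank the QB because rank flows along pass direction.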
Code Patterns
Building a Passing Network
import networkx as nx
import pandas as pd


def build_passing_network(passes: pd.DataFrame) -> nx.DiGraph:
    """Build a directed passing network from play-level data."""
    G = nx.DiGraph()

    # Aggregate to one row per passer-receiver pair:
    # number of plays (targets) and total yards
    agg = passes.groupby(['passer', 'receiver']).agg({
        'play_id': 'count',
        'yards': 'sum'
    }).reset_index()

    # Add one edge per pair; 'weight' holds the target count
    for _, row in agg.iterrows():
        G.add_edge(
            row['passer'],
            row['receiver'],
            weight=row['play_id'],
            yards=row['yards']
        )

    return G
Calculate Target Share
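def calculate_target_share(G: nx.DiGraph, qb: str) -> pd.DataFrame:
    """Calculate each receiver's share of the passer's targets."""
    # Materialize the edge view so it can be iterated more than once
    out_edges = list(G.out_edges(qb, data=True))
    total = sum(d['weight'] for _, _, d in out_edges)

    # No recorded targets: return an empty frame instead of dividing by zero
    if total == 0:
        return pd.DataFrame(columns=['receiver', 'targets', 'share'])

    results = [
        {'receiver': v, 'targets': d['weight'],
         'share': d['weight'] / total}
        for _, v, d in out_edges
    ]

    return pd.DataFrame(results).sort_values('share', ascending=False)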
Centrality Calculation
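def calculate_centralities(G: nx.Graph) -> pd.DataFrame:
    """Calculate degree, betweenness, and PageRank centrality per node."""
    # Each NetworkX call returns a dict keyed by node
    degree = nx.degree_centrality(G)
    betweenness = nx.betweenness_centrality(G)
    pagerank = nx.pagerank(G)  # uses the 'weight' edge attribute by default

    # Look values up by node so every column is aligned with 'node'
    nodes = list(G.nodes())
    return pd.DataFrame({
        'node': nodes,
        'degree': [degree[n] for n in nodes],
        'betweenness': [betweenness[n] for n in nodes],
        'pagerank': [pagerank[n] for n in nodes]
    })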
Community Detection
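from community import community_louvain  # provided by the python-louvain package


def detect_communities(G: nx.Graph) -> dict:
    """Detect communities using the Louvain algorithm."""
    # best_partition expects an undirected graph, so drop edge direction first
    if G.is_directed():
        G = G.to_undirected()

    return community_louvain.best_partition(G)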
Centrality Interpretation
Football Context
| Metric | High Value Means | Player Example |
|---|---|---|
| In-Degree | Heavily targeted | WR1, primary option |
| Out-Degree | Distributes ball | QB, playmaker |
| Betweenness | Connects offense | Slot WR, versatile TE |
| PageRank | Valued by key players | Go-to receiver |
| Closeness | Quick access to all | Underneath route runner |
Receiver Role Classification
| Role | Network Signature |
|---|---|
| Primary Target | High in-degree, high PageRank |
| Secondary Option | Medium metrics across board |
| Versatile Player | High betweenness |
| Specialist | High metrics in specific situations |
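The role table can be turned into a rough rule-based classifier. This is a sketch only: it assumes each metric has already been converted to a percentile rank between 0 and 1 within the team, and the thresholds (`high=0.75`, `medium=0.40`) are hypothetical, not values from the chapter. The Specialist role depends on situational splits, so this season-level sketch leaves such players unclassified.

```python
def classify_receiver(in_degree, pagerank, betweenness,
                      high=0.75, medium=0.40):
    """Map percentile-ranked metrics (0 to 1) to a role label.

    The thresholds are illustrative assumptions, not established cutoffs.
    """
    if in_degree >= high and pagerank >= high:
        return "Primary Target"
    if betweenness >= high:
        return "Versatile Player"
    if min(in_degree, pagerank, betweenness) >= medium:
        return "Secondary Option"
    # Specialists need situation-specific metrics this sketch does not see
    return "Unclassified"


# Hypothetical percentile ranks
role_a = classify_receiver(0.90, 0.95, 0.50)
role_b = classify_receiver(0.50, 0.50, 0.90)
role_c = classify_receiver(0.50, 0.60, 0.45)
```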
Common Network Patterns
Concentrated Passing Attack
QB ──────────────────► WR1 (60% share)
├────────────────► WR2 (20% share)
├──────────────► RB (12% share)
└────────────► TE (8% share)
HHI: ~0.42 (concentrated)
Spread Passing Attack
QB ──────────────────► WR1 (28% share)
├────────────────► WR2 (24% share)
├──────────────► WR3 (22% share)
├────────────► RB (14% share)
└──────────► TE (12% share)
HHI: ~0.22 (spread)
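The two HHI values follow from squaring and summing the shares shown in each diagram:

```latex
\begin{align*}
\text{HHI}_{\text{concentrated}} &= 0.60^2 + 0.20^2 + 0.12^2 + 0.08^2 \\
  &= 0.3600 + 0.0400 + 0.0144 + 0.0064 = 0.4208 \approx 0.42 \\[4pt]
\text{HHI}_{\text{spread}} &= 0.28^2 + 0.24^2 + 0.22^2 + 0.14^2 + 0.12^2 \\
  &= 0.0784 + 0.0576 + 0.0484 + 0.0196 + 0.0144 = 0.2184 \approx 0.22
\end{align*}
```

For reference, an even split would give 1/4 = 0.25 with four targets and 1/5 = 0.20 with five, so the spread attack sits close to its minimum while the concentrated attack is well above its own.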
Visualization Guidelines
Node Sizing
- Size by importance (targets, influence)
- QB typically largest or central
- Receivers sized by involvement
Edge Styling
- Width by weight (frequency)
- Color by type (completion vs. incompletion)
- Arrow direction for passes
Layout Options
- Spring layout for general networks
- Hierarchical for coaching trees
- Geographic for recruiting
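Sizing nodes and edges by importance usually means mapping raw counts onto a drawable range. The helper below is a minimal min-max scaling sketch; the name `scale`, the target counts, and the output ranges are illustrative assumptions.

```python
def scale(values, lo, hi):
    """Linearly rescale values so the smallest maps to lo and the largest to hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # All values equal: use the midpoint so nothing is drawn at an extreme
        return [(lo + hi) / 2 for _ in values]
    return [lo + (v - vmin) / (vmax - vmin) * (hi - lo) for v in values]


# Hypothetical target counts for four receivers
targets = [60, 20, 12, 8]
edge_widths = scale(targets, 1.0, 8.0)  # line widths for the passing edges
node_sizes = scale(targets, 300, 2000)  # marker sizes for the receiver nodes
```

Lists like these can be passed to the `width` and `node_size` arguments of `nx.draw_networkx`, provided they are ordered the same way as the graph's edges and nodes.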
Common Pitfalls
1. Ignoring Edge Direction
Wrong: Treating passing as undirected
G = nx.Graph() # Wrong for passes
Right: Use directed graph
G = nx.DiGraph() # Correct
2. Unweighted Analysis
Wrong: Ignoring target frequency
nx.degree_centrality(G) # Counts connections only
Right: Use weighted metrics
nx.pagerank(G, weight='weight') # Must match the edge attribute name set when building the network
3. Missing Context
Wrong: Raw centrality without interpretation
Right: Contextualize for football
# High betweenness + slot position = versatile option
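To make the unweighted-analysis pitfall concrete, the sketch below compares counting connections with summing target counts on the same edges. The target counts are hypothetical and the function name `in_degree` is illustrative.

```python
from collections import defaultdict

# Hypothetical (passer, receiver, targets) edges
passes = [("QB", "WR1", 60), ("QB", "WR2", 20), ("QB", "RB", 12), ("QB", "TE", 8)]


def in_degree(edges, weighted=False):
    """Incoming edge count per receiver, or summed targets when weighted."""
    totals = defaultdict(int)
    for _, receiver, targets in edges:
        totals[receiver] += targets if weighted else 1
    return dict(totals)


unweighted = in_degree(passes)               # every receiver looks identical
weighted = in_degree(passes, weighted=True)  # WR1 clearly dominates
```

Unweighted, all four receivers have an in-degree of 1; weighted, WR1's 60 targets stand out. On a NetworkX `DiGraph`, `G.in_degree(weight='weight')` gives the weighted version.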
Analysis Checklist
Building Networks
- [ ] Identify appropriate node types
- [ ] Choose directed vs. undirected
- [ ] Define edge weights meaningfully
- [ ] Handle missing data
Calculating Metrics
- [ ] Use weighted centrality when available
- [ ] Calculate multiple metrics for comparison
- [ ] Normalize for network size
- [ ] Interpret in football context
Visualization
- [ ] Choose appropriate layout
- [ ] Size/color nodes meaningfully
- [ ] Scale edge widths properly
- [ ] Add clear labels
Analysis
- [ ] Compare to league/team averages
- [ ] Track changes over time
- [ ] Identify outliers
- [ ] Connect to performance outcomes
Quick Reference Tables
Centrality Comparison
| Metric | Computation | Best For |
|---|---|---|
| Degree | Count connections | Volume |
| Betweenness | Path frequency | Connectors |
| Closeness | Average distance | Accessibility |
| Eigenvector | Neighbor importance | Quality |
| PageRank | Weighted influence | Overall importance |
Network Statistics
| Statistic | Good Range | Interpretation |
|---|---|---|
| Density | 0.1-0.4 | Connection rate |
| Clustering | 0.3-0.6 | Local grouping |
| Avg Path Length | 2-4 | Connectivity |
| Components | 1 | Single connected network (no isolated groups) |
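The component count in the last row can be checked with a breadth-first search that ignores edge direction. This is a plain-Python sketch with hypothetical player labels; the isolated kicker node is added only to show a second component.

```python
from collections import defaultdict, deque


def count_components(nodes, edges):
    """Count connected components, treating every edge as undirected."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        components += 1  # found a node no earlier search reached
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return components


edges = [("QB", "WR1"), ("QB", "WR2")]
connected = count_components(["QB", "WR1", "WR2"], edges)
with_isolate = count_components(["QB", "WR1", "WR2", "K"], edges)
```

For a NetworkX `DiGraph`, `nx.number_weakly_connected_components(G)` returns the same count.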
Next Steps
After mastering network analysis, proceed to:
- Chapter 24: Computer Vision and Tracking Data
- Chapter 25: Natural Language Processing for Scouting
- Chapter 27: Building a Complete Analytics System