In This Chapter
- Learning Objectives
- Introduction
- 14.1 Principles of Effective Comparison
- 14.2 Radar Charts for Player Profiles
- 14.3 Bar Chart Comparisons
- 14.4 Dumbbell and Slope Charts
- 14.5 Bump Charts for Ranking Changes
- 14.6 Small Multiples for Conference Comparisons
- 14.7 Similarity Analysis and Clustering
- 14.8 Advanced Comparison Techniques
- 14.9 Best Practices for Comparison Visualizations
- 14.10 Integrated Comparison Dashboard
- Chapter Summary
- Key Terms
- Practice Exercises
- Further Reading
Chapter 14: Player and Team Comparison Charts
Learning Objectives
By the end of this chapter, you will be able to:
- Design effective comparison visualizations for football analytics
- Create radar charts for multi-dimensional player profiles
- Build ranking visualizations that show change over time
- Implement similarity analysis with visual clustering
- Develop small multiples for conference-wide comparisons
- Apply best practices for fair and unbiased comparisons
Introduction
Comparison is the essence of sports analytics. Every meaningful question in college football involves comparison: Is this quarterback better than that one? How does our defense rank against the conference? Which recruit offers the best value? These questions demand visualizations specifically designed to reveal similarities, differences, and relative positions.
This chapter explores the specialized chart types and design principles that make player and team comparisons clear, fair, and actionable. Unlike general-purpose charts, comparison visualizations must handle multiple entities simultaneously while maintaining visual clarity and preventing unfair or misleading representations.
The stakes are high. Comparison charts influence recruiting decisions, playing time allocation, and strategic planning. A poorly designed comparison might obscure a player's true value or create misleading impressions about team performance. Our goal is to create visualizations that tell honest stories and support sound decision-making.
14.1 Principles of Effective Comparison
Before diving into specific chart types, we must establish foundational principles that govern all comparison visualizations.
The Comparison Framework
Effective comparisons require three elements:
Common Baseline: All entities must be measured against the same standard. Comparing a quarterback's completion percentage to a running back's yards per carry makes no sense—they lack a common baseline.
Consistent Scale: Visual encodings must be consistent across all compared entities. If one inch represents 10 yards for Team A, it must represent 10 yards for Team B.
Relevant Context: Raw numbers without context mislead. A quarterback with 3,500 passing yards sounds impressive until you learn the conference average is 3,800 yards.
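To make the "relevant context" idea concrete, here is a minimal sketch that expresses a raw total relative to a peer group. The function name `contextualize` and the conference totals are hypothetical, chosen only so that the group average matches the 3,800-yard figure in the example above.

```python
from statistics import mean, pstdev


def contextualize(value, peer_values):
    """Express a raw stat relative to a peer group (hypothetical helper)."""
    avg = mean(peer_values)
    spread = pstdev(peer_values)  # population standard deviation
    return {
        "diff_from_avg": value - avg,
        "pct_of_avg": 100 * value / avg if avg else float("nan"),
        "z_score": (value - avg) / spread if spread else 0.0,
    }


# Illustrative conference passing-yard totals (not real data); mean is 3,800
conference_yards = [3500, 3800, 4100, 3600, 4000]
context = contextualize(3500, conference_yards)
print(context)
```

A negative difference and z-score signal that the 3,500-yard season sits below the conference norm, which the raw total alone does not reveal.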
Comparison Types in Football Analytics
Different questions require different comparison structures:
import matplotlib.pyplot as plt
import numpy as np
from typing import List, Dict, Tuple, Optional
from dataclasses import dataclass
from enum import Enum


class ComparisonType(str, Enum):
    """Classification of comparison visualizations.

    Members are also strings, so ComparisonType.TEMPORAL == "temporal".
    """
    # One-to-one: Direct comparison between two entities
    ONE_TO_ONE = "one_to_one"
    # One-to-many: Single entity against a group
    ONE_TO_MANY = "one_to_many"
    # Many-to-many: Multiple entities compared simultaneously
    MANY_TO_MANY = "many_to_many"
    # Temporal: Same entity compared across time periods
    TEMPORAL = "temporal"
    # Hierarchical: Comparisons at different levels (player, team, conference)
    HIERARCHICAL = "hierarchical"
Fairness in Comparison
Sports comparisons carry ethical responsibility. Consider these guidelines:
Position Equivalence: Compare players at the same position or with equivalent roles. Setting a slot receiver's yards per catch against a tight end's is unfair because the two positions play different roles in the offense.
Opportunity Adjustment: Account for differences in playing time, number of attempts, or team context. A backup quarterback's efficiency might exceed the starter's, but it was earned over far fewer attempts and far fewer pressure situations.
Era Adjustment: When comparing across seasons, account for rule changes, schedule differences, and evolving strategies that affect raw statistics.
def normalize_for_comparison(stats: Dict[str, float],
                             games_played: int,
                             team_plays: int) -> Dict[str, float]:
    """
    Normalize statistics for fair comparison.

    Converts counting stats to per-game or per-play rates,
    enabling comparison between players with different opportunities.
    """
    normalized = {}

    # Per-game normalization for counting stats
    # (skipped when no games were played, to avoid dividing by zero)
    counting_stats = ['passing_yards', 'rushing_yards', 'receptions',
                      'tackles', 'sacks']
    if games_played > 0:
        for stat in counting_stats:
            if stat in stats:
                normalized[f'{stat}_per_game'] = stats[stat] / games_played

    # Share of the team's plays for which the player was on the field
    if 'snaps' in stats and team_plays > 0:
        normalized['snap_percentage'] = stats['snaps'] / team_plays * 100

    # Rate stats remain unchanged
    rate_stats = ['completion_pct', 'yards_per_attempt', 'success_rate']
    for stat in rate_stats:
        if stat in stats:
            normalized[stat] = stats[stat]

    return normalized
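Per-game rates handle differences in opportunity but not differences between eras. One possible sketch of an era adjustment, assuming you already know the league-wide average for each season, is to index each value so that 100 means "league average in that season." The function name `era_index` and the yards-per-attempt figures below are hypothetical.

```python
def era_index(value, league_avg):
    """Scale a stat so that 100 equals the league average of its own season.

    Hypothetical helper: assumes league_avg is positive and was computed
    from the same season as value.
    """
    if league_avg <= 0:
        raise ValueError("league_avg must be positive")
    return 100 * value / league_avg


# Illustrative numbers only: the later season has the higher raw value,
# but the earlier season stands further above its own league average.
earlier = era_index(8.0, league_avg=7.0)   # yards per attempt, earlier era
later = era_index(8.4, league_avg=7.8)     # yards per attempt, later era
print(round(earlier, 1), round(later, 1))
```

Indexing of this kind keeps the comparison on a common baseline even when rule changes or strategy shifts move the league average.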
14.2 Radar Charts for Player Profiles
Radar charts (also called spider charts) excel at showing multi-dimensional player profiles. They display multiple metrics simultaneously, creating a visual "fingerprint" that reveals a player's strengths and weaknesses at a glance.
Anatomy of a Radar Chart
A radar chart arranges metrics around a central point, with each metric extending outward along its own axis. Values closer to the center indicate lower performance; values toward the edge indicate higher performance.
class PlayerRadarChart:
    """
    Create radar charts for player profile visualization.

    Radar charts display multiple metrics simultaneously,
    creating a visual fingerprint of player capabilities.
    """

    def __init__(self, metrics: List[str],
                 max_values: Optional[Dict[str, float]] = None):
        """
        Initialize radar chart with metric definitions.

        Args:
            metrics: List of metric names to display
            max_values: Optional dict of maximum values for each metric
                (for normalization). Metrics without an entry fall back
                to the largest value among the plotted players in
                create_comparison, or to 1.2x the player's own value in
                create_single_player.
        """
        self.metrics = metrics
        self.max_values = max_values or {}
        self.num_metrics = len(metrics)

        # Calculate angles for each metric
        self.angles = np.linspace(0, 2 * np.pi, self.num_metrics,
                                  endpoint=False).tolist()
        # Close the polygon
        self.angles += self.angles[:1]

        # Color palette for multiple players
        self.colors = ['#264653', '#2a9d8f', '#e9c46a', '#f4a261', '#e76f51']

    def _normalize_values(self, values: Dict[str, float],
                          fallback_max: Optional[Dict[str, float]] = None
                          ) -> List[float]:
        """
        Normalize values to 0-1 scale for plotting.

        Assumes higher is better; invert "lower is better" metrics
        before passing them in.
        """
        fallback_max = fallback_max or {}
        normalized = []
        for metric in self.metrics:
            raw_value = values.get(metric, 0)
            # Prefer the configured maximum, then the data maximum.
            # With neither, 1.2x the raw value plots the metric at about
            # 83%, which is a placeholder rather than a real comparison.
            max_val = self.max_values.get(
                metric, fallback_max.get(metric, raw_value * 1.2))
            if max_val > 0:
                norm_val = raw_value / max_val
            else:
                norm_val = 0
            # Clamp to [0, 1]
            normalized.append(min(max(norm_val, 0), 1))
        # Close the polygon
        normalized += normalized[:1]
        return normalized

    def create_single_player(self, player_name: str,
                             values: Dict[str, float],
                             figsize: Tuple[int, int] = (8, 8)) -> plt.Figure:
        """
        Create radar chart for a single player.

        Args:
            player_name: Name for title
            values: Dict mapping metric names to values
            figsize: Figure dimensions

        Returns:
            matplotlib Figure object
        """
        fig, ax = plt.subplots(figsize=figsize, subplot_kw=dict(polar=True))

        # Normalize values
        norm_values = self._normalize_values(values)

        # Plot the radar
        ax.plot(self.angles, norm_values, 'o-', linewidth=2,
                color=self.colors[0])
        ax.fill(self.angles, norm_values, alpha=0.25, color=self.colors[0])

        # Set metric labels
        ax.set_xticks(self.angles[:-1])
        ax.set_xticklabels(self.metrics, size=10)

        # Set radial limits and labels
        ax.set_ylim(0, 1)
        ax.set_yticks([0.25, 0.5, 0.75, 1.0])
        ax.set_yticklabels(['25%', '50%', '75%', '100%'], size=8, alpha=0.7)

        # Title
        ax.set_title(f'{player_name}\nPlayer Profile', size=14,
                     fontweight='bold', y=1.08)
        plt.tight_layout()
        return fig

    def create_comparison(self, players: Dict[str, Dict[str, float]],
                          figsize: Tuple[int, int] = (10, 10)) -> plt.Figure:
        """
        Create radar chart comparing multiple players.

        Args:
            players: Dict mapping player names to their metric values
            figsize: Figure dimensions

        Returns:
            matplotlib Figure object
        """
        fig, ax = plt.subplots(figsize=figsize, subplot_kw=dict(polar=True))

        # Shared per-metric maxima so every player is scaled the same way
        data_max = {
            metric: max((values.get(metric, 0) for values in players.values()),
                        default=0)
            for metric in self.metrics
        }

        # Plot each player
        for idx, (player_name, values) in enumerate(players.items()):
            norm_values = self._normalize_values(values, data_max)
            color = self.colors[idx % len(self.colors)]
            ax.plot(self.angles, norm_values, 'o-', linewidth=2,
                    label=player_name, color=color)
            ax.fill(self.angles, norm_values, alpha=0.1, color=color)

        # Set metric labels
        ax.set_xticks(self.angles[:-1])
        ax.set_xticklabels(self.metrics, size=10)

        # Set radial limits
        ax.set_ylim(0, 1)
        ax.set_yticks([0.25, 0.5, 0.75, 1.0])
        ax.set_yticklabels(['25%', '50%', '75%', '100%'], size=8, alpha=0.7)

        # Legend
        ax.legend(loc='upper right', bbox_to_anchor=(1.3, 1.0))

        # Title
        ax.set_title('Player Comparison', size=14, fontweight='bold', y=1.08)
        plt.tight_layout()
        return fig
Designing Effective Radar Charts
Several design decisions affect radar chart effectiveness:
Metric Selection: Choose 5-8 metrics for optimal readability. Too few metrics waste the radar format's multi-dimensional advantage; too many create visual clutter.
Metric Ordering: Place related metrics adjacent to each other. For a quarterback, group passing efficiency metrics together and mobility metrics together. This creates more meaningful shapes.
Normalization Strategy: All metrics must be normalized to a common scale (typically 0-1 or 0-100). Decide whether to normalize against:
- League maximum (how does this player compare to the best?)
- League average (how does this player compare to typical?)
- Position average (how does this player compare to peers?)
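The choice of reference changes the number that ends up on the chart. The sketch below applies three possible scalings to the same value; the helper names and the completion percentages are hypothetical, and the percentile version is one reasonable way to read "compare to peers," not the only one.

```python
from bisect import bisect_right


def vs_maximum(value, peers):
    """Fraction of the best value in the group (1.0 = the leader)."""
    return value / max(peers)


def vs_average(value, peers):
    """Ratio to the group mean (1.0 = average; can exceed 1.0)."""
    return value / (sum(peers) / len(peers))


def vs_peers_percentile(value, peers):
    """Share of peers at or below this value (0-1)."""
    ordered = sorted(peers)
    return bisect_right(ordered, value) / len(ordered)


# Illustrative completion percentages for five quarterbacks (not real data)
peers = [58.0, 61.5, 64.0, 67.5, 70.0]
value = 64.0

print(round(vs_maximum(value, peers), 3))
print(round(vs_average(value, peers), 3))
print(vs_peers_percentile(value, peers))
```

Note that the ratio to the average is not bounded by 1, so it needs clamping or rescaling before it can be drawn on a 0-1 radar axis.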
def create_qb_radar_metrics():
    """
    Define standard quarterback radar chart metrics.

    Metrics are ordered to group related skills together:
    - Passing accuracy/efficiency
    - Volume/impact
    - Mobility
    - Decision-making
    """
    metrics = [
        'Completion %',    # Accuracy
        'Yards/Attempt',   # Efficiency
        'TD Rate',         # Scoring
        'Passing Yards',   # Volume
        'QBR',             # Overall rating
        'Rush Yards',      # Mobility
        'Sack Rate',       # Protection/awareness (inverted: lower is better)
        'INT Rate'         # Decision-making (inverted: lower is better)
    ]
    # For inverted metrics, transform before normalization
    # e.g., sack_rate_normalized = 1 - (sack_rate / max_sack_rate)
    return metrics
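The inversion mentioned in the comment above can be sketched as a small preprocessing step. The function name `orient_higher_is_better`, the `INVERTED` set, and the sample maxima are hypothetical; the transform itself is the `1 - (value / max)` form from the comment.

```python
# Metrics where a lower raw value is better (assumed for this sketch)
INVERTED = {"Sack Rate", "INT Rate"}


def orient_higher_is_better(values, max_values, inverted=INVERTED):
    """Scale each metric to 0-1 and flip the inverted ones."""
    oriented = {}
    for metric, raw in values.items():
        scaled = min(max(raw / max_values[metric], 0.0), 1.0)
        oriented[metric] = 1.0 - scaled if metric in inverted else scaled
    return oriented


# Illustrative values only
raw_values = {"Completion %": 60.0, "Sack Rate": 2.0}
maxima = {"Completion %": 80.0, "Sack Rate": 8.0}
profile = orient_higher_is_better(raw_values, maxima)
print(profile)
```

After this step a low sack rate plots near the outer edge, so a larger shape consistently means better performance on every axis.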
Radar Chart Limitations
Radar charts have notable weaknesses:
Area Perception: People judge the areas of irregular polygons poorly. A player whose polygon looks larger may not actually be better overall.
Metric Order Dependency: The same data can produce different visual impressions depending on metric arrangement.
Scale Sensitivity: Small differences in normalization approach can create dramatically different visualizations.
Use radar charts for qualitative profile comparisons, not precise quantitative judgments. They answer "what type of player is this?" better than "which player is better?"
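The order dependency is easy to verify numerically. With equally spaced axes, each pair of adjacent spokes forms a triangle of area 0.5 · r_i · r_(i+1) · sin(2π/n), so the enclosed area depends on which values sit next to each other. The sketch below uses a hypothetical six-metric profile; `radar_area` is an illustrative helper, not part of the chart class.

```python
import math


def radar_area(radii):
    """Area enclosed by a radar polygon with equally spaced axes."""
    n = len(radii)
    wedge = math.sin(2 * math.pi / n)
    return 0.5 * wedge * sum(radii[i] * radii[(i + 1) % n] for i in range(n))


# Same six normalized values, two different axis orders
alternating = [1.0, 0.2, 1.0, 0.2, 1.0, 0.2]
grouped = [1.0, 1.0, 1.0, 0.2, 0.2, 0.2]

print(round(radar_area(alternating), 3))
print(round(radar_area(grouped), 3))
```

Grouping the strong metrics together more than doubles the area here even though the underlying data are identical, which is why area should never be read as an overall score.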
14.3 Bar Chart Comparisons
Bar charts remain the workhorse of comparison visualization. Their simplicity makes them accessible to any audience, while their versatility handles diverse comparison scenarios.
Horizontal Bar Charts for Rankings
When comparing many entities on a single metric, horizontal bar charts excel:
class RankingBarChart:
    """
    Create horizontal bar charts for rankings and comparisons.

    Horizontal orientation provides space for entity labels
    and enables natural top-to-bottom ranking display.
    """

    def __init__(self):
        self.colors = {
            'primary': '#264653',
            'highlight': '#e76f51',
            'average': '#e9c46a',
            'background': '#f8f9fa'
        }

    def create_simple_ranking(self,
                              entities: List[str],
                              values: List[float],
                              title: str,
                              xlabel: str,
                              highlight: Optional[List[str]] = None,
                              reference_line: Optional[float] = None,
                              figsize: Tuple[int, int] = (10, 8)) -> plt.Figure:
        """
        Create a simple horizontal bar ranking chart.

        Args:
            entities: List of entity names (teams, players)
            values: Corresponding values
            title: Chart title
            xlabel: X-axis label
            highlight: List of entities to highlight
            reference_line: Optional vertical reference (e.g., average)
            figsize: Figure dimensions

        Returns:
            matplotlib Figure object
        """
        # Sort by value
        sorted_pairs = sorted(zip(entities, values),
                              key=lambda x: x[1], reverse=True)
        sorted_entities, sorted_values = zip(*sorted_pairs)

        fig, ax = plt.subplots(figsize=figsize)

        # Create bars
        y_positions = range(len(sorted_entities))

        # Determine colors
        bar_colors = []
        for entity in sorted_entities:
            if highlight and entity in highlight:
                bar_colors.append(self.colors['highlight'])
            else:
                bar_colors.append(self.colors['primary'])

        bars = ax.barh(y_positions, sorted_values, color=bar_colors,
                       edgecolor='white', linewidth=0.5)

        # Reference line
        if reference_line is not None:
            ax.axvline(reference_line, color=self.colors['average'],
                       linestyle='--', linewidth=2, label='Average')
            ax.legend(loc='lower right')

        # Labels
        ax.set_yticks(y_positions)
        ax.set_yticklabels(sorted_entities)
        ax.set_xlabel(xlabel, fontsize=11)
        ax.set_title(title, fontsize=14, fontweight='bold')

        # Value labels on bars
        for bar, value in zip(bars, sorted_values):
            ax.text(bar.get_width() + 0.02 * max(sorted_values),
                    bar.get_y() + bar.get_height()/2,
                    f'{value:.1f}', va='center', fontsize=9)

        # Clean up
        ax.invert_yaxis()  # Top rank at top
        ax.spines['top'].set_visible(False)
        ax.spines['right'].set_visible(False)
        plt.tight_layout()
        return fig

    def create_diverging_bar(self,
                             entities: List[str],
                             values: List[float],
                             title: str,
                             xlabel: str,
                             baseline: float = 0,
                             figsize: Tuple[int, int] = (10, 8)) -> plt.Figure:
        """
        Create diverging bar chart for above/below comparisons.

        Useful for showing deviation from average, positive/negative
        metrics, or improvement/decline from baseline.

        Args:
            entities: List of entity names
            values: Corresponding values
            title: Chart title
            xlabel: X-axis label
            baseline: Center point for divergence
            figsize: Figure dimensions

        Returns:
            matplotlib Figure object
        """
        # Sort by value
        sorted_pairs = sorted(zip(entities, values),
                              key=lambda x: x[1], reverse=True)
        sorted_entities, sorted_values = zip(*sorted_pairs)

        fig, ax = plt.subplots(figsize=figsize)
        y_positions = range(len(sorted_entities))

        # Color based on value relative to baseline
        colors = ['#2a9d8f' if v >= baseline else '#e76f51'
                  for v in sorted_values]

        bars = ax.barh(y_positions,
                       [v - baseline for v in sorted_values],
                       color=colors, edgecolor='white', linewidth=0.5)

        # Baseline
        ax.axvline(0, color='black', linewidth=1)

        # Labels
        ax.set_yticks(y_positions)
        ax.set_yticklabels(sorted_entities)
        ax.set_xlabel(f'{xlabel} (vs. {baseline:.1f})', fontsize=11)
        ax.set_title(title, fontsize=14, fontweight='bold')

        # Value labels (placed just beyond the end of each bar)
        offset = 0.02 * max(abs(v - baseline) for v in sorted_values)
        for bar, value in zip(bars, sorted_values):
            width = bar.get_width()
            x_pos = width + offset if width >= 0 else width - offset
            ha = 'left' if width >= 0 else 'right'
            ax.text(x_pos, bar.get_y() + bar.get_height()/2,
                    f'{value:.1f}', va='center', ha=ha, fontsize=9)

        ax.invert_yaxis()
        ax.spines['top'].set_visible(False)
        ax.spines['right'].set_visible(False)
        plt.tight_layout()
        return fig
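A diverging bar chart is only as meaningful as its baseline. As a minimal sketch, assuming the baseline is the group mean, the data preparation can be done before any plotting; the helper name `deviations_from_mean` and the points-per-game figures are hypothetical.

```python
def deviations_from_mean(values_by_team):
    """Return the group mean and each team's deviation, largest first."""
    baseline = sum(values_by_team.values()) / len(values_by_team)
    ranked = sorted(values_by_team.items(), key=lambda kv: kv[1], reverse=True)
    return baseline, [(team, value - baseline) for team, value in ranked]


# Illustrative points per game (not real data)
points_per_game = {"Team A": 30.0, "Team B": 20.0, "Team C": 10.0}
baseline, deviations = deviations_from_mean(points_per_game)
print(baseline, deviations)
```

The resulting mean can be passed as the `baseline` argument of `create_diverging_bar`, which then colors teams above and below it differently.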
Grouped Bar Charts for Multi-Metric Comparison
When comparing entities across multiple metrics, grouped bar charts place related bars side by side:
def create_grouped_comparison(entities: List[str],
                              metrics: Dict[str, List[float]],
                              title: str,
                              figsize: Tuple[int, int] = (12, 6)) -> plt.Figure:
    """
    Create grouped bar chart for multi-metric comparison.

    Args:
        entities: List of entity names
        metrics: Dict mapping metric names to lists of values
            (one value per entity)
        title: Chart title
        figsize: Figure dimensions

    Returns:
        matplotlib Figure object
    """
    num_entities = len(entities)
    num_metrics = len(metrics)

    # Calculate bar positions
    bar_width = 0.8 / num_metrics
    x = np.arange(num_entities)

    fig, ax = plt.subplots(figsize=figsize)
    colors = ['#264653', '#2a9d8f', '#e9c46a', '#f4a261', '#e76f51']

    for idx, (metric_name, values) in enumerate(metrics.items()):
        # Center the group of bars on each entity's tick
        offset = (idx - num_metrics/2 + 0.5) * bar_width
        ax.bar(x + offset, values, bar_width * 0.9,
               label=metric_name, color=colors[idx % len(colors)])

    ax.set_xticks(x)
    ax.set_xticklabels(entities, rotation=45, ha='right')
    ax.set_title(title, fontsize=14, fontweight='bold')
    ax.legend(loc='upper right')
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    plt.tight_layout()
    return fig
Stacked Bar Charts for Composition
When showing how metrics combine to form totals, stacked bars reveal both the total and composition:
def create_stacked_composition(entities: List[str],
                               components: Dict[str, List[float]],
                               title: str,
                               ylabel: str,
                               figsize: Tuple[int, int] = (12, 6)) -> plt.Figure:
    """
    Create stacked bar chart showing composition breakdown.

    Useful for showing how total yards break down by play type,
    or how scoring breaks down by quarter.

    Args:
        entities: List of entity names
        components: Dict mapping component names to lists of values
        title: Chart title
        ylabel: Y-axis label
        figsize: Figure dimensions

    Returns:
        matplotlib Figure object
    """
    fig, ax = plt.subplots(figsize=figsize)
    colors = ['#264653', '#2a9d8f', '#e9c46a', '#f4a261', '#e76f51']

    x = np.arange(len(entities))
    # Running total that each new component is stacked on top of
    bottom = np.zeros(len(entities))

    for idx, (component_name, values) in enumerate(components.items()):
        ax.bar(x, values, bottom=bottom, label=component_name,
               color=colors[idx % len(colors)])
        bottom += np.array(values)

    ax.set_xticks(x)
    ax.set_xticklabels(entities, rotation=45, ha='right')
    ax.set_ylabel(ylabel)
    ax.set_title(title, fontsize=14, fontweight='bold')
    ax.legend(loc='upper right')
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    plt.tight_layout()
    return fig
14.4 Dumbbell and Slope Charts
Some comparisons involve paired values—before and after, expected versus actual, or two different time periods. Dumbbell and slope charts specialize in these scenarios.
Dumbbell Charts for Paired Comparisons
Dumbbell charts (also called DNA charts or connected dot plots) show two values per entity with a line connecting them:
class DumbbellChart:
    """
    Create dumbbell charts for paired comparisons.

    Ideal for before/after, expected/actual, or two-period comparisons.
    The connecting line emphasizes the gap between values.
    """

    def __init__(self):
        self.colors = {
            'start': '#264653',
            'end': '#e76f51',
            'connector': '#adb5bd'
        }

    def create(self, entities: List[str],
               values_1: List[float],
               values_2: List[float],
               label_1: str,
               label_2: str,
               title: str,
               xlabel: str,
               figsize: Tuple[int, int] = (10, 8)) -> plt.Figure:
        """
        Create dumbbell chart comparing two values per entity.

        Args:
            entities: List of entity names
            values_1: First set of values
            values_2: Second set of values
            label_1: Label for first value set
            label_2: Label for second value set
            title: Chart title
            xlabel: X-axis label
            figsize: Figure dimensions

        Returns:
            matplotlib Figure object
        """
        # Sort by difference (largest improvement at top)
        differences = [v2 - v1 for v1, v2 in zip(values_1, values_2)]
        sorted_indices = sorted(range(len(entities)),
                                key=lambda i: differences[i], reverse=True)
        sorted_entities = [entities[i] for i in sorted_indices]
        sorted_v1 = [values_1[i] for i in sorted_indices]
        sorted_v2 = [values_2[i] for i in sorted_indices]

        fig, ax = plt.subplots(figsize=figsize)
        y_positions = range(len(sorted_entities))

        # Draw connectors
        for y, v1, v2 in zip(y_positions, sorted_v1, sorted_v2):
            ax.plot([v1, v2], [y, y], color=self.colors['connector'],
                    linewidth=2, zorder=1)

        # Draw points
        ax.scatter(sorted_v1, y_positions, s=100,
                   color=self.colors['start'], zorder=2, label=label_1)
        ax.scatter(sorted_v2, y_positions, s=100,
                   color=self.colors['end'], zorder=2, label=label_2)

        # Labels
        ax.set_yticks(y_positions)
        ax.set_yticklabels(sorted_entities)
        ax.set_xlabel(xlabel, fontsize=11)
        ax.set_title(title, fontsize=14, fontweight='bold')
        ax.legend(loc='lower right')
        ax.invert_yaxis()
        ax.spines['top'].set_visible(False)
        ax.spines['right'].set_visible(False)
        plt.tight_layout()
        return fig

    def create_with_change_labels(self,
                                  entities: List[str],
                                  values_1: List[float],
                                  values_2: List[float],
                                  title: str,
                                  xlabel: str,
                                  figsize: Tuple[int, int] = (12, 8)) -> plt.Figure:
        """
        Create dumbbell chart with change magnitude labels.

        Shows the actual difference value next to each dumbbell,
        useful for precise communication of changes.
        """
        differences = [v2 - v1 for v1, v2 in zip(values_1, values_2)]
        sorted_indices = sorted(range(len(entities)),
                                key=lambda i: differences[i], reverse=True)
        sorted_entities = [entities[i] for i in sorted_indices]
        sorted_v1 = [values_1[i] for i in sorted_indices]
        sorted_v2 = [values_2[i] for i in sorted_indices]
        sorted_diff = [differences[i] for i in sorted_indices]

        fig, ax = plt.subplots(figsize=figsize)
        y_positions = range(len(sorted_entities))

        for y, v1, v2, diff in zip(y_positions, sorted_v1, sorted_v2, sorted_diff):
            # Connector color based on direction
            color = '#2a9d8f' if diff > 0 else '#e76f51' if diff < 0 else '#adb5bd'
            ax.plot([v1, v2], [y, y], color=color, linewidth=3, zorder=1)

            # Change label
            mid_x = (v1 + v2) / 2
            label = f'+{diff:.1f}' if diff > 0 else f'{diff:.1f}'
            ax.text(mid_x, y - 0.3, label, ha='center', fontsize=8,
                    color=color, fontweight='bold')

        ax.scatter(sorted_v1, y_positions, s=80, color='#264653',
                   zorder=2, label='Before')
        ax.scatter(sorted_v2, y_positions, s=80, color='#f4a261',
                   zorder=2, label='After')

        ax.set_yticks(y_positions)
        ax.set_yticklabels(sorted_entities)
        ax.set_xlabel(xlabel, fontsize=11)
        ax.set_title(title, fontsize=14, fontweight='bold')
        ax.legend(loc='lower right')
        ax.invert_yaxis()
        ax.spines['top'].set_visible(False)
        ax.spines['right'].set_visible(False)
        plt.tight_layout()
        return fig
Slope Charts for Change Over Time
Slope charts show values at two time points with connecting lines. The line slopes reveal improvement or decline:
class SlopeChart:
    """
    Create slope charts for temporal comparisons.

    The slope angle immediately communicates direction
    and magnitude of change between two time periods.
    """

    def __init__(self):
        self.colors = ['#264653', '#2a9d8f', '#e9c46a', '#f4a261',
                       '#e76f51', '#7209b7', '#3a0ca3', '#4361ee']

    def create(self, entities: List[str],
               values_1: List[float],
               values_2: List[float],
               period_1: str,
               period_2: str,
               title: str,
               ylabel: str,
               highlight: Optional[List[str]] = None,
               figsize: Tuple[int, int] = (8, 10)) -> plt.Figure:
        """
        Create slope chart showing change between two periods.

        Args:
            entities: List of entity names
            values_1: Values at first time period
            values_2: Values at second time period
            period_1: Label for first period
            period_2: Label for second period
            title: Chart title
            ylabel: Y-axis label
            highlight: Entities to emphasize
            figsize: Figure dimensions

        Returns:
            matplotlib Figure object
        """
        fig, ax = plt.subplots(figsize=figsize)
        x_positions = [0, 1]

        for idx, (entity, v1, v2) in enumerate(zip(entities, values_1, values_2)):
            is_highlighted = bool(highlight) and entity in highlight
            color = self.colors[idx % len(self.colors)]
            alpha = 1.0 if is_highlighted else 0.4
            linewidth = 3 if is_highlighted else 1.5

            # Line
            ax.plot(x_positions, [v1, v2], color=color,
                    linewidth=linewidth, alpha=alpha)

            # Points
            ax.scatter(x_positions, [v1, v2], color=color,
                       s=80 if is_highlighted else 40,
                       alpha=alpha, zorder=5)

            # Labels
            if is_highlighted or len(entities) <= 10:
                ax.text(-0.05, v1, f'{entity}: {v1:.1f}',
                        ha='right', va='center', fontsize=9,
                        color=color, alpha=alpha)
                ax.text(1.05, v2, f'{v2:.1f}',
                        ha='left', va='center', fontsize=9,
                        color=color, alpha=alpha)

        # Period labels
        ax.set_xticks(x_positions)
        ax.set_xticklabels([period_1, period_2], fontsize=12)
        ax.set_xlim(-0.3, 1.3)
        ax.set_ylabel(ylabel, fontsize=11)
        ax.set_title(title, fontsize=14, fontweight='bold')
        ax.spines['top'].set_visible(False)
        ax.spines['right'].set_visible(False)
        ax.spines['bottom'].set_visible(False)
        plt.tight_layout()
        return fig
14.5 Bump Charts for Ranking Changes
When tracking rankings over multiple time periods, bump charts reveal position changes:
class BumpChart:
    """
    Create bump charts for ranking changes over time.

    Bump charts track position changes rather than absolute values,
    making them ideal for standings, power rankings, and polls.
    """

    def __init__(self):
        self.colors = ['#264653', '#2a9d8f', '#e9c46a', '#f4a261',
                       '#e76f51', '#7209b7', '#3a0ca3', '#4361ee',
                       '#4cc9f0', '#80b918', '#aacc00', '#55a630']

    def create(self, rankings: Dict[str, List[Optional[int]]],
               periods: List[str],
               title: str,
               highlight: Optional[List[str]] = None,
               show_top_n: Optional[int] = None,
               figsize: Tuple[int, int] = (12, 8)) -> plt.Figure:
        """
        Create bump chart showing ranking changes over time.

        Args:
            rankings: Dict mapping entity names to lists of rankings
                (one ranking per period, lower is better,
                None for periods in which the entity is unranked)
            periods: List of period labels
            title: Chart title
            highlight: Entities to emphasize
            show_top_n: Only show entities that were in top N at some point
            figsize: Figure dimensions

        Returns:
            matplotlib Figure object
        """
        fig, ax = plt.subplots(figsize=figsize)
        x_positions = range(len(periods))

        # Filter to top N if specified
        if show_top_n:
            included = set()
            for entity, ranks in rankings.items():
                if any(r <= show_top_n for r in ranks if r is not None):
                    included.add(entity)
            rankings = {k: v for k, v in rankings.items() if k in included}

        for idx, (entity, ranks) in enumerate(rankings.items()):
            is_highlighted = bool(highlight) and entity in highlight
            color = self.colors[idx % len(self.colors)]
            alpha = 1.0 if is_highlighted else 0.3
            linewidth = 4 if is_highlighted else 2

            # Filter out None values (unranked periods)
            valid_points = [(x, r) for x, r in zip(x_positions, ranks)
                            if r is not None]
            if valid_points:
                xs, rs = zip(*valid_points)
                ax.plot(xs, rs, color=color, linewidth=linewidth,
                        alpha=alpha, marker='o',
                        markersize=8 if is_highlighted else 5)

                # End label
                if is_highlighted or len(rankings) <= 10:
                    ax.text(xs[-1] + 0.1, rs[-1], entity,
                            va='center', fontsize=10 if is_highlighted else 8,
                            color=color, alpha=alpha,
                            fontweight='bold' if is_highlighted else 'normal')

        # Configure axes
        ax.set_xticks(x_positions)
        ax.set_xticklabels(periods, fontsize=10)

        # Largest rank that appears anywhere; entities that are never
        # ranked (all None) are ignored instead of raising an error
        ranked_values = [r for ranks in rankings.values()
                         for r in ranks if r is not None]
        max_rank = max(ranked_values) if ranked_values else 1
        ax.set_ylim(max_rank + 0.5, 0.5)  # Invert: rank 1 at top
        ax.set_ylabel('Ranking', fontsize=11)

        # Add rank gridlines
        for rank in range(1, max_rank + 1):
            ax.axhline(rank, color='gray', linewidth=0.5, alpha=0.3)

        ax.set_title(title, fontsize=14, fontweight='bold')
        ax.spines['top'].set_visible(False)
        ax.spines['right'].set_visible(False)
        plt.tight_layout()
        return fig
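The `rankings` argument expects one rank per period for every entity, with `None` where the entity is unranked. Poll data usually arrives the other way around, as an ordered list of teams per week. A minimal sketch of the conversion follows; the helper name `polls_to_rankings` and the team names are hypothetical.

```python
def polls_to_rankings(weekly_polls):
    """Convert ordered weekly polls into {team: [rank or None per week]}."""
    teams = sorted({team for poll in weekly_polls for team in poll})
    return {
        team: [poll.index(team) + 1 if team in poll else None
               for poll in weekly_polls]
        for team in teams
    }


# Illustrative two-week poll in which only the top two teams are listed
weekly_polls = [
    ["Team A", "Team B"],   # week 1: A is #1, B is #2
    ["Team B", "Team C"],   # week 2: B is #1, C is #2, A drops out
]
rankings = polls_to_rankings(weekly_polls)
print(rankings)
```

Teams that drop out of the poll keep a `None` for that week, which the bump chart skips when drawing the line.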
Conference Standings Bump Chart
A common application shows how conference standings evolve through the season:
def create_conference_standings_bump(team_records: Dict[str, List[Tuple[int, int]]],
                                     weeks: List[str],
                                     conference_name: str) -> plt.Figure:
    """
    Create bump chart for conference standings over a season.

    Args:
        team_records: Dict mapping team names to lists of (wins, losses) tuples
        weeks: List of week labels
        conference_name: Conference name for title

    Returns:
        matplotlib Figure
    """
    # Calculate standings (win percentage, with tiebreaker by total wins)
    weekly_rankings = {team: [] for team in team_records}

    for week_idx in range(len(weeks)):
        week_standings = []
        for team, records in team_records.items():
            wins, losses = records[week_idx]
            total = wins + losses
            pct = wins / total if total > 0 else 0
            week_standings.append((team, pct, wins))

        # Sort by win pct (desc), then total wins (desc).
        # The sort is stable, so teams tied on both keys keep their
        # input order and still receive distinct ranks.
        week_standings.sort(key=lambda x: (-x[1], -x[2]))

        for rank, (team, _, _) in enumerate(week_standings, 1):
            weekly_rankings[team].append(rank)

    bump = BumpChart()
    return bump.create(
        rankings=weekly_rankings,
        periods=weeks,
        title=f'{conference_name} Standings by Week',
        show_top_n=10
    )
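The standings function above gives every team a distinct rank, so two teams with identical records are separated only by their order in the input. If ties should be shown as ties, one option is standard competition ranking (1, 2, 2, 4). The sketch below is an assumption about how that could be done, not part of the function above; `competition_ranks` and the records are hypothetical.

```python
def competition_ranks(records):
    """Rank teams by win percentage, then wins; tied teams share a rank.

    records: dict mapping team name to a (wins, losses) tuple.
    """
    def sort_key(record):
        wins, losses = record
        total = wins + losses
        return (wins / total if total else 0.0, wins)

    ordered = sorted(records.items(), key=lambda kv: sort_key(kv[1]),
                     reverse=True)
    ranks = {}
    previous_key, previous_rank = None, 0
    for position, (team, record) in enumerate(ordered, start=1):
        key = sort_key(record)
        # Reuse the previous rank on a tie; otherwise use the position
        rank = previous_rank if key == previous_key else position
        ranks[team] = rank
        previous_key, previous_rank = key, rank
    return ranks


# Illustrative records (not real data)
records = {"Team A": (5, 1), "Team B": (4, 2), "Team C": (4, 2), "Team D": (2, 4)}
ranks = competition_ranks(records)
print(ranks)
```

On a bump chart, shared ranks make tied teams overlap at the same point, so labels or small offsets may be needed to keep both visible.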
14.6 Small Multiples for Conference Comparisons
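When many groups must be compared on the same metric, a single chart becomes crowded. Small multiples repeat one chart structure per group on shared axes, so patterns can be compared panel by panel: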
class SmallMultiples:
    """
    Create small multiple displays for repeated comparisons.

    Small multiples show the same chart structure repeated for
    different subsets, enabling pattern recognition across groups.
    """

    def __init__(self, ncols: int = 4):
        self.ncols = ncols
        self.colors = {
            'primary': '#264653',
            'secondary': '#adb5bd',
            'highlight': '#e76f51',
            'average': '#2a9d8f'
        }

    def create_distribution_multiples(self,
                                      groups: Dict[str, List[float]],
                                      title: str,
                                      xlabel: str,
                                      reference_value: Optional[float] = None,
                                      figsize_per_chart: Tuple[float, float] = (3, 2.5)) -> plt.Figure:
        """
        Create small multiples of distribution plots.

        Args:
            groups: Dict mapping group names to lists of values
            title: Overall title
            xlabel: X-axis label for each chart
            reference_value: Optional reference line
            figsize_per_chart: Size of each individual chart

        Returns:
            matplotlib Figure
        """
        num_groups = len(groups)
        nrows = (num_groups + self.ncols - 1) // self.ncols
        fig_width = figsize_per_chart[0] * self.ncols
        fig_height = figsize_per_chart[1] * nrows

        # squeeze=False always returns a 2D array of axes,
        # even for a single row or a single panel
        fig, axes = plt.subplots(nrows, self.ncols,
                                 figsize=(fig_width, fig_height),
                                 squeeze=False)
        axes_flat = axes.flatten()

        # Find global range for consistent scaling
        all_values = [v for values in groups.values() for v in values]
        x_min, x_max = min(all_values), max(all_values)

        for idx, (group_name, values) in enumerate(groups.items()):
            ax = axes_flat[idx]
            # Same range in every panel so the bins line up across groups
            ax.hist(values, bins=15, range=(x_min, x_max),
                    color=self.colors['primary'],
                    alpha=0.7, edgecolor='white')

            if reference_value is not None:
                ax.axvline(reference_value, color=self.colors['average'],
                           linestyle='--', linewidth=2)

            group_mean = np.mean(values)
            ax.axvline(group_mean, color=self.colors['highlight'],
                       linewidth=2, label=f'Mean: {group_mean:.1f}')
            ax.legend(fontsize=7)  # Display the mean label

            ax.set_xlim(x_min, x_max)
            ax.set_title(group_name, fontsize=10, fontweight='bold')
            ax.set_xlabel(xlabel, fontsize=8)
            ax.tick_params(labelsize=8)
            ax.spines['top'].set_visible(False)
            ax.spines['right'].set_visible(False)

        # Hide unused subplots
        for idx in range(len(groups), len(axes_flat)):
            axes_flat[idx].set_visible(False)

        fig.suptitle(title, fontsize=14, fontweight='bold', y=1.02)
        plt.tight_layout()
        return fig

    def create_trend_multiples(self,
                               groups: Dict[str, Dict[str, List[float]]],
                               time_labels: List[str],
                               title: str,
                               ylabel: str,
                               figsize_per_chart: Tuple[float, float] = (3, 2.5)) -> plt.Figure:
        """
        Create small multiples of trend lines.

        Args:
            groups: Dict mapping group names to dicts of series
                (each series is a list of values over time)
            time_labels: Labels for time axis
            title: Overall title
            ylabel: Y-axis label
            figsize_per_chart: Size of each chart

        Returns:
            matplotlib Figure
        """
        num_groups = len(groups)
        nrows = (num_groups + self.ncols - 1) // self.ncols
        fig_width = figsize_per_chart[0] * self.ncols
        fig_height = figsize_per_chart[1] * nrows

        # sharey=True keeps the vertical scale consistent across panels
        fig, axes = plt.subplots(nrows, self.ncols,
                                 figsize=(fig_width, fig_height),
                                 sharey=True, squeeze=False)
        axes_flat = axes.flatten()

        line_colors = ['#264653', '#2a9d8f', '#e9c46a', '#f4a261', '#e76f51']

        for idx, (group_name, series_dict) in enumerate(groups.items()):
            ax = axes_flat[idx]
            for series_idx, (series_name, values) in enumerate(series_dict.items()):
                color = line_colors[series_idx % len(line_colors)]
                ax.plot(values, color=color, linewidth=2,
                        label=series_name, marker='o', markersize=4)

            ax.set_xticks(range(len(time_labels)))
            ax.set_xticklabels(time_labels, fontsize=7, rotation=45)
            ax.set_title(group_name, fontsize=10, fontweight='bold')
            ax.tick_params(labelsize=8)

            # Label the y-axis once per row, on the leftmost panel
            if idx % self.ncols == 0:
                ax.set_ylabel(ylabel, fontsize=8)
            if idx == 0:
                ax.legend(fontsize=7, loc='upper left')

            ax.spines['top'].set_visible(False)
            ax.spines['right'].set_visible(False)

        # Hide unused
        for idx in range(len(groups), len(axes_flat)):
            axes_flat[idx].set_visible(False)

        fig.suptitle(title, fontsize=14, fontweight='bold', y=1.02)
        plt.tight_layout()
        return fig
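Two small calculations sit behind any small-multiples layout: the grid shape and, for histograms, a set of bin edges shared by every panel. The sketch below shows both with the standard library only; `grid_shape` and `shared_bin_edges` are hypothetical helper names, and the group values are illustrative.

```python
import math


def grid_shape(n_panels, ncols):
    """Rows and columns needed to fit n_panels charts in ncols columns."""
    return math.ceil(n_panels / ncols), ncols


def shared_bin_edges(groups, bins=15):
    """Evenly spaced bin edges spanning the values of every group."""
    all_values = [v for values in groups.values() for v in values]
    low, high = min(all_values), max(all_values)
    width = (high - low) / bins
    return [low + i * width for i in range(bins + 1)]


# Illustrative data: 14 conferences would need a 4 x 4 grid
shape = grid_shape(14, ncols=4)
edges = shared_bin_edges({"East": [0, 10], "West": [5, 20]}, bins=4)
print(shape, edges)
```

A list of edges like this can be passed as the `bins` argument of `ax.hist`, so that a bar in one panel covers exactly the same interval as the matching bar in every other panel.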
14.7 Similarity Analysis and Clustering
Beyond explicit comparisons, analytics often seeks to find similar players or teams—useful for recruiting, opponent preparation, and player development.
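At its core, a similarity search standardizes each metric and then measures the distance between players. The following is a minimal standard-library sketch of that idea, assuming Euclidean distance on z-scores; the helper names and the quarterback numbers (completion %, yards per attempt, rushing yards per game) are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Convert each column to z-scores (population standard deviation)."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def most_similar(names, rows, target, k=2):
    """Names of the k players closest to target in standardized space."""
    scaled = dict(zip(names, standardize(rows)))
    target_vec = scaled[target]
    distances = [(math.dist(target_vec, vec), name)
                 for name, vec in scaled.items() if name != target]
    return [name for _, name in sorted(distances)[:k]]


# Illustrative profiles (not real data)
names = ["QB A", "QB B", "QB C", "QB D"]
rows = [
    [68.0, 8.5, 10.0],
    [67.0, 8.3, 12.0],
    [58.0, 6.9, 55.0],
    [60.0, 7.1, 48.0],
]
closest = most_similar(names, rows, target="QB A", k=1)
print(closest)
```

Standardizing first matters: without it, the metric with the largest raw numbers (here rushing yards) would dominate the distance.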
Player Similarity Visualization
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import dendrogram, linkage


class SimilarityAnalysis:
    """
    Visualize player and team similarities using clustering.

    Uses hierarchical clustering and heatmaps to reveal
    which entities are most similar based on multiple metrics.
    """

    def __init__(self):
        self.colors = {
            'low': '#264653',
            'high': '#e76f51',
            'mid': '#f8f9fa'
        }

    def create_similarity_heatmap(self,
                                  entities: List[str],
                                  features: np.ndarray,
                                  title: str,
                                  figsize: Tuple[int, int] = (10, 8)) -> plt.Figure:
        """
        Create heatmap showing pairwise similarities.

        Args:
            entities: List of entity names
            features: 2D array of shape (n_entities, n_features)
            title: Chart title
            figsize: Figure dimensions

        Returns:
            matplotlib Figure
        """
        # Normalize features so no single metric dominates the distance
        from sklearn.preprocessing import StandardScaler
        scaler = StandardScaler()
        features_scaled = scaler.fit_transform(features)

        # Calculate distance matrix
        distances = squareform(pdist(features_scaled, metric='euclidean'))

        # Convert to similarity: 1 for identical profiles, 0 for the
        # most distant pair in this population
        max_dist = distances.max()
        if max_dist > 0:
            similarities = 1 - (distances / max_dist)
        else:
            # All profiles are identical; avoid dividing by zero
            similarities = np.ones_like(distances)

        fig, ax = plt.subplots(figsize=figsize)
        im = ax.imshow(similarities, cmap='RdYlGn', aspect='auto')

        ax.set_xticks(range(len(entities)))
        ax.set_yticks(range(len(entities)))
        ax.set_xticklabels(entities, rotation=45, ha='right', fontsize=9)
        ax.set_yticklabels(entities, fontsize=9)

        plt.colorbar(im, ax=ax, label='Similarity', shrink=0.8)
        ax.set_title(title, fontsize=14, fontweight='bold')
        plt.tight_layout()
        return fig
    def create_dendrogram(self,
                          entities: List[str],
                          features: np.ndarray,
                          title: str,
                          figsize: Tuple[int, int] = (12, 6)) -> plt.Figure:
        """
        Create dendrogram showing hierarchical clustering.

        The tree structure reveals natural groupings among entities.

        Args:
            entities: List of entity names
            features: 2D array of shape (n_entities, n_features)
            title: Chart title
            figsize: Figure dimensions

        Returns:
            matplotlib Figure
        """
        from sklearn.preprocessing import StandardScaler
        scaler = StandardScaler()
        features_scaled = scaler.fit_transform(features)

        # Hierarchical clustering
        linkage_matrix = linkage(features_scaled, method='ward')

        fig, ax = plt.subplots(figsize=figsize)
        dendrogram(linkage_matrix, labels=entities, ax=ax,
                   leaf_rotation=45, leaf_font_size=9)

        ax.set_title(title, fontsize=14, fontweight='bold')
        ax.set_ylabel('Distance', fontsize=11)
        plt.tight_layout()
        return fig
    def find_similar_players(self,
                             target: str,
                             entities: List[str],
                             features: np.ndarray,
                             top_n: int = 5) -> List[Tuple[str, float]]:
        """
        Find players most similar to a target player.

        Args:
            target: Name of target player
            entities: List of all player names
            features: Feature matrix
            top_n: Number of similar players to return

        Returns:
            List of (player_name, similarity_score) tuples
        """
        from sklearn.preprocessing import StandardScaler
        scaler = StandardScaler()
        features_scaled = scaler.fit_transform(features)

        target_idx = entities.index(target)
        target_features = features_scaled[target_idx]

        similarities = []
        for idx, name in enumerate(entities):
            if idx != target_idx:
                dist = np.linalg.norm(features_scaled[idx] - target_features)
                sim = 1 / (1 + dist)  # Convert distance to similarity
                similarities.append((name, sim))

        similarities.sort(key=lambda x: x[1], reverse=True)
        return similarities[:top_n]
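To see how the distance-to-similarity conversion in `find_similar_players` behaves, here is a small NumPy-only sketch of the same steps (z-score each metric, take Euclidean distance to the target, map distance to `1 / (1 + d)`). The player names and numbers are made up for illustration, and the manual z-scoring stands in for `StandardScaler`, which also uses the population standard deviation.

```python
import numpy as np

# Hypothetical quarterbacks and made-up stats, for illustration only
players = ["QB A", "QB B", "QB C", "QB D"]
# Columns: completion %, yards per attempt, TD rate
features = np.array([
    [65.0, 8.0, 6.0],
    [64.0, 7.9, 5.8],
    [55.0, 6.0, 3.0],
    [70.0, 9.5, 8.0],
])

# Z-score each column so no single metric dominates the distance
std = features.std(axis=0)
std[std == 0] = 1.0  # constant columns carry no information
scaled = (features - features.mean(axis=0)) / std

target_idx = players.index("QB A")
# Euclidean distance from the target to every player
dists = np.linalg.norm(scaled - scaled[target_idx], axis=1)
sims = 1.0 / (1.0 + dists)  # same mapping as find_similar_players

# Rank everyone except the target, most similar first
order = [i for i in np.argsort(dists) if i != target_idx]
for i in order:
    print(f"{players[i]}: distance={dists[i]:.2f}, similarity={sims[i]:.2f}")
```

The score lies in (0, 1] and reaches 1 only for identical profiles, so it is a ranking aid, not a percentage of "sameness". It also depends on the population used for standardizing: adding or removing players changes every score.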
Visualizing Similar Player Comparisons
def create_similar_player_comparison(target: str,
                                     similar_players: List[Tuple[str, float]],
                                     player_stats: Dict[str, Dict[str, float]],
                                     metrics: List[str],
                                     figsize: Tuple[int, int] = (12, 8)) -> plt.Figure:
    """
    Create visual comparison between target player and similar players.

    Combines radar chart overlay with similarity scores.

    Args:
        target: Target player name
        similar_players: List of (name, similarity) tuples
        player_stats: Dict mapping names to stat dicts
        metrics: List of metrics to include
        figsize: Figure dimensions

    Returns:
        matplotlib Figure
    """
    fig = plt.figure(figsize=figsize)

    # Radar chart on left
    ax_radar = fig.add_subplot(121, polar=True)
    # Similarity bars on right
    ax_bars = fig.add_subplot(122)

    # Setup radar: one angle per metric, repeated at the end to close the shape
    num_metrics = len(metrics)
    angles = np.linspace(0, 2 * np.pi, num_metrics, endpoint=False).tolist()
    angles += angles[:1]

    colors = ['#264653', '#2a9d8f', '#e9c46a', '#f4a261', '#e76f51']

    # Find max values for normalization; fall back to 1.0 when a metric's
    # maximum is zero so the division below cannot fail
    all_stats = [player_stats[target]] + [player_stats[p[0]] for p in similar_players]
    max_vals = {m: max(s.get(m, 0) for s in all_stats) or 1.0 for m in metrics}

    # Plot target player
    target_vals = [player_stats[target].get(m, 0) / max_vals[m]
                   for m in metrics]
    target_vals += target_vals[:1]
    ax_radar.plot(angles, target_vals, 'o-', linewidth=3,
                  color='#264653', label=target)
    ax_radar.fill(angles, target_vals, alpha=0.25, color='#264653')

    # Plot similar players (top three only, to keep the overlay readable)
    for idx, (name, sim) in enumerate(similar_players[:3]):
        vals = [player_stats[name].get(m, 0) / max_vals[m] for m in metrics]
        vals += vals[:1]
        color = colors[(idx + 1) % len(colors)]
        ax_radar.plot(angles, vals, 'o-', linewidth=2,
                      color=color, alpha=0.7, label=f'{name} ({sim:.0%})')

    ax_radar.set_xticks(angles[:-1])
    ax_radar.set_xticklabels(metrics, size=9)
    ax_radar.set_ylim(0, 1)
    ax_radar.legend(loc='upper right', bbox_to_anchor=(1.3, 1.1), fontsize=9)
    ax_radar.set_title('Profile Comparison', fontsize=12, fontweight='bold')

    # Similarity bar chart
    names = [p[0] for p in similar_players]
    scores = [p[1] for p in similar_players]
    y_pos = range(len(names))

    ax_bars.barh(y_pos, scores, color='#2a9d8f', edgecolor='white')
    ax_bars.set_yticks(y_pos)
    ax_bars.set_yticklabels(names)
    ax_bars.set_xlabel('Similarity Score', fontsize=10)
    ax_bars.set_title(f'Most Similar to {target}', fontsize=12, fontweight='bold')
    ax_bars.set_xlim(0, 1)

    for bar, score in zip(ax_bars.patches, scores):
        ax_bars.text(bar.get_width() + 0.02, bar.get_y() + bar.get_height()/2,
                     f'{score:.0%}', va='center', fontsize=9)

    ax_bars.invert_yaxis()
    ax_bars.spines['top'].set_visible(False)
    ax_bars.spines['right'].set_visible(False)

    plt.tight_layout()
    return fig
14.8 Advanced Comparison Techniques
Percentile Ranks and Distributions
Showing where an entity falls within the distribution provides essential context:
class PercentileVisualization:
    """
    Visualize values in context of their distributions.

    Shows both absolute values and relative standing,
    providing the context needed for meaningful comparison.
    """

    def create_percentile_chart(self,
                                player_name: str,
                                player_stats: Dict[str, float],
                                population_stats: Dict[str, List[float]],
                                metrics: List[str],
                                figsize: Tuple[int, int] = (12, 6)) -> plt.Figure:
        """
        Create chart showing player percentiles across metrics.

        Args:
            player_name: Target player name
            player_stats: Dict of player's stats
            population_stats: Dict mapping metrics to lists of all values
            metrics: Metrics to display
            figsize: Figure dimensions

        Returns:
            matplotlib Figure
        """
        from scipy import stats

        fig, ax = plt.subplots(figsize=figsize)

        percentiles = []
        for metric in metrics:
            player_val = player_stats.get(metric, 0)
            pop_vals = population_stats.get(metric, [0])
            pct = stats.percentileofscore(pop_vals, player_val)
            percentiles.append(pct)

        y_positions = range(len(metrics))

        # Background zones
        ax.axvspan(0, 25, color='#fee2e2', alpha=0.5, zorder=0)
        ax.axvspan(25, 50, color='#fef3c7', alpha=0.5, zorder=0)
        ax.axvspan(50, 75, color='#d1fae5', alpha=0.5, zorder=0)
        ax.axvspan(75, 100, color='#a7f3d0', alpha=0.5, zorder=0)

        # Percentile bars
        colors = ['#e76f51' if p < 25 else '#e9c46a' if p < 50
                  else '#2a9d8f' if p < 75 else '#264653' for p in percentiles]
        bars = ax.barh(y_positions, percentiles, color=colors,
                       edgecolor='white', linewidth=0.5, height=0.6)

        # Value labels
        for bar, pct, metric in zip(bars, percentiles, metrics):
            raw_val = player_stats.get(metric, 0)
            ax.text(bar.get_width() + 2, bar.get_y() + bar.get_height()/2,
                    f'{pct:.0f}th ({raw_val:.1f})', va='center', fontsize=9)

        # Zone labels
        ax.text(12.5, len(metrics) + 0.5, 'Below Avg', ha='center',
                fontsize=8, alpha=0.7)
        ax.text(37.5, len(metrics) + 0.5, 'Average', ha='center',
                fontsize=8, alpha=0.7)
        ax.text(62.5, len(metrics) + 0.5, 'Above Avg', ha='center',
                fontsize=8, alpha=0.7)
        ax.text(87.5, len(metrics) + 0.5, 'Elite', ha='center',
                fontsize=8, alpha=0.7)

        ax.set_yticks(y_positions)
        ax.set_yticklabels(metrics)
        ax.set_xlim(0, 105)
        ax.set_xlabel('Percentile Rank', fontsize=11)
        ax.set_title(f'{player_name} - Percentile Profile',
                     fontsize=14, fontweight='bold')
        ax.axvline(50, color='gray', linestyle='--', linewidth=1, alpha=0.5)

        ax.spines['top'].set_visible(False)
        ax.spines['right'].set_visible(False)

        plt.tight_layout()
        return fig
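A percentile rank depends on how ties are counted, which matters in the small populations typical of a single conference or position group. The sketch below, in plain NumPy, shows three common conventions side by side. The function name `percentile_rank` and the sample values are assumptions for illustration; SciPy's `percentileofscore` exposes similar choices through its `kind` parameter.

```python
import numpy as np


def percentile_rank(population, value, kind="mean"):
    """Percentile rank of value within population (0-100).

    kind="strict": share of values strictly below value
    kind="weak":   share of values at or below value
    kind="mean":   average of the two, which splits ties evenly
    """
    pop = np.asarray(population, dtype=float)
    if pop.size == 0:
        raise ValueError("population is empty")
    strict = np.mean(pop < value) * 100
    weak = np.mean(pop <= value) * 100
    if kind == "strict":
        return strict
    if kind == "weak":
        return weak
    if kind == "mean":
        return (strict + weak) / 2
    raise ValueError(f"unknown kind: {kind}")


# Made-up yards-per-attempt values for ten quarterbacks
ypa = [6.1, 6.8, 7.0, 7.2, 7.2, 7.5, 7.9, 8.1, 8.4, 9.0]
for kind in ("strict", "weak", "mean"):
    print(kind, percentile_rank(ypa, 7.2, kind))
```

With ten players and one tie, the same 7.2 yards per attempt lands anywhere from roughly the 30th to the 50th percentile depending on the convention, so a chart should state which one it uses.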
Swarm Plots for Distribution Context
Show individual data points to provide distribution context:
def create_swarm_comparison(groups: Dict[str, List[float]],
                            highlight_values: Dict[str, float] = None,
                            title: str = None,
                            ylabel: str = None,
                            figsize: Tuple[int, int] = (10, 6)) -> plt.Figure:
    """
    Create swarm plot showing distributions with optional highlights.

    Each point represents one observation, avoiding overlap,
    revealing the full distribution while highlighting specific values.

    Args:
        groups: Dict mapping group names to lists of values
        highlight_values: Optional dict mapping group names to a value
            to highlight within that group
        title: Chart title
        ylabel: Y-axis label
        figsize: Figure dimensions

    Returns:
        matplotlib Figure
    """
    import pandas as pd
    import seaborn as sns

    # Prepare data in long format: one row per observation
    plot_data = []
    for group, values in groups.items():
        for val in values:
            plot_data.append({'group': group, 'value': val})
    df = pd.DataFrame(plot_data)

    # Fix the category order so highlight markers line up with their group
    group_order = list(groups.keys())

    fig, ax = plt.subplots(figsize=figsize)

    sns.swarmplot(x='group', y='value', data=df, order=group_order, ax=ax,
                  color='#264653', alpha=0.5, size=5)

    # Add boxplot summary
    sns.boxplot(x='group', y='value', data=df, order=group_order, ax=ax,
                color='white', width=0.3,
                boxprops=dict(alpha=0.7),
                showfliers=False)

    # Highlight specific values at the x position of their own group
    if highlight_values:
        for group, val in highlight_values.items():
            if group not in group_order:
                continue
            idx = group_order.index(group)
            ax.scatter([idx], [val], s=200, color='#e76f51',
                       zorder=10, marker='*', edgecolors='white', linewidths=1)
            ax.annotate(f'{val:.1f}', (idx, val),
                        xytext=(10, 5), textcoords='offset points',
                        fontsize=10, fontweight='bold', color='#e76f51')

    ax.set_xlabel('')
    ax.set_ylabel(ylabel, fontsize=11)
    ax.set_title(title, fontsize=14, fontweight='bold')

    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)

    plt.tight_layout()
    return fig
14.9 Best Practices for Comparison Visualizations
Avoiding Common Pitfalls
Cherry-Picking Metrics: Including only metrics that favor one side creates misleading comparisons. Always include a balanced set of relevant metrics.
Inconsistent Baselines: Starting one bar chart at zero and another at a non-zero value exaggerates or minimizes differences.
Misleading Scales: Truncating axes can make small differences appear dramatic. Use full scales unless explicitly communicating relative change.
Missing Context: Raw values without baselines, averages, or percentiles leave audiences unable to judge significance.
def validate_comparison_fairness(entities: List[str],
                                 stats: Dict[str, Dict[str, float]],
                                 metrics: List[str]) -> Dict[str, str]:
    """
    Check comparison visualization for potential fairness issues.

    Returns warnings about potential problems.
    """
    warnings = {}

    # Check for missing data
    for entity in entities:
        entity_stats = stats.get(entity, {})
        missing = [m for m in metrics if m not in entity_stats]
        if missing:
            warnings[entity] = f"Missing metrics: {missing}"

    # Check for extreme outliers that might distort visualization.
    # With np.std's population formula, the largest possible z-score among
    # n values is sqrt(n - 1), so this 3-sigma flag needs at least 11 entities.
    for metric in metrics:
        # Use only entities that report the metric; a missing value is not
        # a zero and is already covered by the missing-data warning above
        present = {e: stats[e][metric] for e in entities
                   if metric in stats.get(e, {})}
        if not present:
            continue
        values = list(present.values())
        mean_val = np.mean(values)
        std_val = np.std(values)
        for entity, val in present.items():
            if std_val > 0 and abs(val - mean_val) > 3 * std_val:
                warnings[f"{entity}_{metric}"] = f"Extreme outlier: {val:.1f} (mean: {mean_val:.1f})"

    return warnings
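The "Misleading Scales" pitfall above can be quantified by comparing the ratio of the drawn bar lengths with the ratio of the underlying values. This is a minimal sketch; the helper name `bar_length_ratio` and the completion percentages are made up for illustration.

```python
def bar_length_ratio(value_a: float, value_b: float, baseline: float = 0.0) -> float:
    """Ratio of drawn bar lengths when the axis starts at baseline."""
    if min(value_a, value_b) <= baseline:
        raise ValueError("baseline must be below both values")
    return (value_a - baseline) / (value_b - baseline)


# Made-up completion percentages for two quarterbacks
qb_a, qb_b = 68.0, 64.0

true_ratio = bar_length_ratio(qb_a, qb_b)                 # axis starts at 0
truncated = bar_length_ratio(qb_a, qb_b, baseline=60.0)   # axis starts at 60

print(f"Zero baseline: bar A is {true_ratio:.2f}x bar B")  # 1.06x
print(f"Baseline 60:   bar A is {truncated:.2f}x bar B")   # 2.00x
```

A four-point gap, about 6% in relative terms, is drawn as a bar twice as long once the axis starts at 60. If a truncated axis is needed to show fine differences, a dot plot avoids implying that bar length encodes the value.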
Design Checklist
Before publishing any comparison visualization:
- Verify data accuracy: Double-check all values
- Confirm equivalence: Ensure entities are genuinely comparable
- Check normalization: Verify adjustments for opportunity/playing time
- Test scales: Confirm axes start at appropriate values
- Add context: Include averages, percentiles, or reference points
- Review balance: Ensure metric selection doesn't favor any entity
- Clarify methodology: Document how similarity or rankings are calculated
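The "Check normalization" item can be sketched as a conversion from counting stats to per-opportunity rates, with a minimum sample size below which no rate is reported. The function name `per_opportunity`, the threshold of 100, and the season totals are illustrative assumptions; a suitable threshold depends on the metric and position.

```python
from typing import Dict, Optional


def per_opportunity(totals: Dict[str, float],
                    opportunities: float,
                    min_opportunities: float = 100) -> Optional[Dict[str, float]]:
    """Convert counting stats to per-opportunity rates.

    Returns None when the sample is below min_opportunities, since
    rates from small samples are too noisy to compare fairly.
    """
    if opportunities < min_opportunities:
        return None
    return {name: total / opportunities for name, total in totals.items()}


# Made-up season totals: a full-time starter, a part-time starter,
# and a player with only a handful of attempts
starter = per_opportunity({"pass_yards": 3200, "pass_td": 28}, opportunities=400)
backup = per_opportunity({"pass_yards": 1200, "pass_td": 10}, opportunities=150)
small_sample = per_opportunity({"pass_yards": 180, "pass_td": 2}, opportunities=20)

print(starter)
print(backup)
print(small_sample)
```

The raw totals (3,200 versus 1,200 yards) suggest a large gap, yet both players average 8.0 yards per attempt. The third player is excluded instead of being charted on 20 attempts.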
14.10 Integrated Comparison Dashboard
Combining multiple comparison views creates comprehensive analysis:
def create_player_comparison_dashboard(player_a: str,
                                       player_b: str,
                                       stats_a: Dict[str, float],
                                       stats_b: Dict[str, float],
                                       metrics: List[str],
                                       population_stats: Dict[str, List[float]]) -> plt.Figure:
    """
    Create comprehensive comparison dashboard for two players.

    Combines:
    - Radar chart overlay
    - Metric-by-metric bar comparison
    - Percentile context

    Args:
        player_a, player_b: Player names
        stats_a, stats_b: Player statistics
        metrics: Metrics to compare
        population_stats: Population distributions for percentiles

    Returns:
        matplotlib Figure
    """
    from scipy import stats as sp_stats

    fig = plt.figure(figsize=(16, 10))

    # Define grid
    gs = fig.add_gridspec(2, 3, height_ratios=[1, 1], width_ratios=[1, 1, 1],
                          hspace=0.3, wspace=0.3)

    # Colors
    color_a = '#264653'
    color_b = '#e76f51'

    # === Radar Chart (top left) ===
    ax_radar = fig.add_subplot(gs[0, 0], polar=True)

    angles = np.linspace(0, 2 * np.pi, len(metrics), endpoint=False).tolist()
    angles += angles[:1]

    # Normalize to max in population; fall back to 1 when a metric has no
    # population data or its maximum is zero, so the division cannot fail
    max_vals = {m: max(population_stats.get(m) or [1]) or 1 for m in metrics}

    vals_a = [stats_a.get(m, 0) / max_vals[m] for m in metrics]
    vals_a += vals_a[:1]
    vals_b = [stats_b.get(m, 0) / max_vals[m] for m in metrics]
    vals_b += vals_b[:1]

    ax_radar.plot(angles, vals_a, 'o-', linewidth=2, color=color_a, label=player_a)
    ax_radar.fill(angles, vals_a, alpha=0.2, color=color_a)
    ax_radar.plot(angles, vals_b, 'o-', linewidth=2, color=color_b, label=player_b)
    ax_radar.fill(angles, vals_b, alpha=0.2, color=color_b)

    ax_radar.set_xticks(angles[:-1])
    ax_radar.set_xticklabels(metrics, size=8)
    ax_radar.set_ylim(0, 1)
    ax_radar.legend(loc='upper right', fontsize=8)
    ax_radar.set_title('Profile Overlay', fontsize=11, fontweight='bold')

    # === Bar Comparison (top middle and right) ===
    ax_bars = fig.add_subplot(gs[0, 1:])

    x = np.arange(len(metrics))
    width = 0.35

    values_a = [stats_a.get(m, 0) for m in metrics]
    values_b = [stats_b.get(m, 0) for m in metrics]

    bars_a = ax_bars.bar(x - width/2, values_a, width, label=player_a, color=color_a)
    bars_b = ax_bars.bar(x + width/2, values_b, width, label=player_b, color=color_b)

    ax_bars.set_xticks(x)
    ax_bars.set_xticklabels(metrics, rotation=45, ha='right', fontsize=9)
    ax_bars.legend()
    ax_bars.set_title('Raw Value Comparison', fontsize=11, fontweight='bold')
    ax_bars.spines['top'].set_visible(False)
    ax_bars.spines['right'].set_visible(False)

    # === Percentile Comparison (bottom) ===
    ax_pct = fig.add_subplot(gs[1, :])

    pct_a = [sp_stats.percentileofscore(population_stats.get(m, [0]), stats_a.get(m, 0))
             for m in metrics]
    pct_b = [sp_stats.percentileofscore(population_stats.get(m, [0]), stats_b.get(m, 0))
             for m in metrics]

    y = np.arange(len(metrics))
    height = 0.35

    ax_pct.barh(y - height/2, pct_a, height, label=player_a, color=color_a)
    ax_pct.barh(y + height/2, pct_b, height, label=player_b, color=color_b)

    ax_pct.axvline(50, color='gray', linestyle='--', alpha=0.5)
    ax_pct.set_yticks(y)
    ax_pct.set_yticklabels(metrics)
    ax_pct.set_xlabel('Percentile Rank')
    ax_pct.set_xlim(0, 100)
    ax_pct.legend(loc='lower right')
    ax_pct.set_title('Percentile Comparison', fontsize=11, fontweight='bold')
    ax_pct.spines['top'].set_visible(False)
    ax_pct.spines['right'].set_visible(False)

    # Overall title
    fig.suptitle(f'{player_a} vs {player_b}', fontsize=16, fontweight='bold', y=0.98)

    return fig
Chapter Summary
Player and team comparison charts require thoughtful design to be both accurate and useful:
- Principles First: Establish common baselines, consistent scales, and relevant context before selecting chart types.
- Radar Charts: Excellent for qualitative profile comparisons, but be aware of perceptual limitations with area judgments.
- Bar Charts: The versatile workhorse. Use horizontal for rankings, grouped for multi-metric, and diverging for above/below comparisons.
- Dumbbell and Slope Charts: Specialized for paired comparisons showing change between two points.
- Bump Charts: Track ranking changes over time, perfect for standings and polls.
- Small Multiples: Enable pattern recognition across many entities using consistent chart structures.
- Similarity Analysis: Use clustering and heatmaps to find and visualize similar players or teams.
- Percentile Context: Always show where values fall within distributions to enable meaningful interpretation.
- Fairness: Ensure comparisons account for position, opportunity, and era differences.
The next chapter explores interactive dashboards that combine these comparison techniques with user controls for dynamic exploration.
Key Terms
- Bump Chart: Visualization showing ranking changes over time with connecting lines
- Dendrogram: Tree diagram showing hierarchical clustering relationships
- Diverging Bar Chart: Bar chart centered on a baseline showing positive/negative deviation
- Dumbbell Chart: Paired dot chart connected by lines showing two values per entity
- Percentile Rank: Position within a distribution, expressed as the percentage of values that fall below a given value (or at or below it, depending on how ties are counted)
- Radar Chart: Multi-axis chart showing values radiating from center point
- Similarity Matrix: Grid showing pairwise similarity scores between entities
- Slope Chart: Two-column chart with connecting lines showing change between periods
- Small Multiples: Repeated chart structure across subsets enabling pattern comparison
Practice Exercises
- Create a radar chart comparing three quarterbacks using completion %, yards per attempt, TD rate, INT rate, and rush yards.
- Build a bump chart showing conference standings evolution through a 12-week season.
- Design a dumbbell chart comparing team offensive efficiency from first half to second half of season.
- Implement similarity analysis to find the 5 players most similar to a target based on performance metrics.
- Create a comprehensive comparison dashboard for two teams including multiple chart types.
Further Reading
- Tufte, E. (2001). The Visual Display of Quantitative Information
- Few, S. (2012). Show Me the Numbers: Designing Tables and Graphs to Enlighten
- Cairo, A. (2016). The Truthful Art: Data, Charts, and Maps for Communication
- Knaflic, C. (2015). Storytelling with Data