Hot Hand Analysis

Beginner · 10 min read · Nov 27, 2025

The Hot Hand in Basketball: Does It Really Exist?

The "hot hand" is one of basketball's most enduring debates—the belief that players experience streaks where they're more likely to make shots after recent successes. For decades, this phenomenon has divided researchers, coaches, and fans. What began as a seemingly straightforward question has evolved into a complex investigation spanning psychology, statistics, and sports science. This comprehensive guide explores the hot hand debate from its origins to the latest research, providing both the analytical tools to study it and insights into its practical implications.

What is the Hot Hand?

The hot hand refers to the widespread belief that basketball players experience temporary periods of increased shooting accuracy following successful shots. Players and fans often describe someone who makes several consecutive shots as "hot," "in the zone," or "feeling it." The implicit assumption is that recent success predicts future success—that a player who has just made three shots in a row is more likely to make the fourth than their baseline shooting percentage would suggest.

This belief manifests in several ways on the basketball court:

  • Shot Selection: Teammates pass more frequently to players perceived as "hot"
  • Defensive Strategy: Opponents defend "hot" players more aggressively
  • Coaching Decisions: Coaches design plays for players on hot streaks
  • Player Confidence: Players may shoot more confidently after making consecutive shots
  • Momentum Narrative: Broadcasters and fans attribute team performance to individual hot streaks

The hot hand hypothesis directly contradicts the assumption of independence in probability theory—that each shot is an independent event unaffected by previous outcomes. If the hot hand exists, basketball shooting would exhibit positive autocorrelation, making it fundamentally different from random coin flips.
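The independence assumption is easy to check in simulation. A minimal sketch (the 45% make rate and sample size are illustrative), confirming that independently generated shots show essentially zero lag-1 autocorrelation:

```python
import numpy as np

rng = np.random.default_rng(0)
shots = rng.random(100_000) < 0.45           # independent shots, 45% make rate
lag1 = np.corrcoef(shots[:-1], shots[1:])[0, 1]
print(f"Lag-1 autocorrelation: {lag1:.3f}")  # ~0.000 for independent shooting
```

If the hot hand is real, the same calculation on actual shot sequences should yield a positive value instead.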

History of the Hot Hand Debate

The scientific investigation of the hot hand began in the 1980s, launching one of the most fascinating and contentious debates in sports analytics.

The Cornell Free Throw Study (Early 1980s)

The earliest systematic study of the hot hand came from Cornell University, where researchers analyzed free throw shooting patterns. Their preliminary findings suggested that free throw success might not exhibit the streak shooting that players and coaches believed existed. This study, though less well-known, laid the groundwork for more comprehensive investigations.

The Gilovich, Vallone, and Tversky Study (1985)

The landmark paper that defined the hot hand debate for decades was published in 1985 by psychologists Thomas Gilovich, Robert Vallone, and Amos Tversky in the journal Cognitive Psychology. Their paper, "The Hot Hand in Basketball: On the Misperception of Random Sequences," became one of the most cited works in behavioral economics and sports psychology.

The researchers conducted three main analyses:

  • Philadelphia 76ers Field Goal Analysis: They examined shooting records from 48 home games during the 1980-81 season, analyzing whether players shot better after hits than after misses
  • Boston Celtics Free Throw Analysis: They studied two seasons of free throw data, looking for positive correlations between consecutive free throws
  • Cornell University Controlled Experiment: They conducted a controlled shooting study with Cornell men's and women's varsity basketball players

Their conclusion was striking: They found no evidence for the hot hand. The shooting patterns they observed were statistically indistinguishable from random sequences. Players were not more likely to make shots after hits than after misses. The hot hand, they argued, was a "cognitive illusion"—a misperception of randomness caused by humans' tendency to see patterns in random data.

The Cognitive Illusion Framework

Gilovich and colleagues explained the hot hand belief through several psychological mechanisms:

  • Representativeness Heuristic: People expect small samples to resemble the underlying probability distribution more than they should. A streak of makes seems "non-random" even when it's perfectly consistent with chance
  • Confirmation Bias: People remember instances that confirm their beliefs (hot streaks) more readily than disconfirming evidence
  • Gambler's Fallacy (Reverse): After seeing hits, people irrationally expect more hits, the opposite of expecting a "balancing out"
  • Selective Memory: Dramatic hot streaks are memorable and salient, making them seem more common than they are

The Acceptance Period (1985-2010s)

For nearly three decades, the Gilovich study's findings were widely accepted in academic circles. The hot hand became a classic example in behavioral economics and psychology courses, illustrating how cognitive biases lead people to perceive patterns in randomness. The concept was discussed in popular books like Daniel Kahneman's Thinking, Fast and Slow as a prime example of human irrationality.

However, some researchers and practitioners remained skeptical. Basketball insiders—players, coaches, and experienced analysts—consistently maintained that the hot hand was real, despite the statistical evidence against it.

The Great Reversal (2014-Present)

Beginning in the mid-2010s, new research began challenging the original findings, leading to what some call "the hot hand debate's revenge."

Miller and Sanjurjo's Statistical Critique (2018): Joshua Miller and Adam Sanjurjo published a groundbreaking paper in Econometrica identifying a subtle but critical statistical bias in the original hot hand research. They demonstrated that when analyzing conditional probabilities in finite sequences, there's an inherent selection bias that makes truly random sequences appear to show negative correlation (the opposite of a hot hand).

The technical issue: the standard analysis takes each player's finite sequence, computes the proportion of shots made immediately after a hit (or a streak of hits), and averages those proportions across sequences. That average is biased downward. Streaky sequences, which have high hit-after-hit proportions, also contain many hit-after-hit opportunities, yet they get the same single vote in the average as sequences where a lone hit was followed by a miss. The result is a measured probability of hitting after a hit that falls below the base rate, even in purely random data.

Key Insight: The original studies may have found "no hot hand" not because the hot hand doesn't exist, but because their methodology was biased toward finding a cold hand even in random data. When corrected for this bias, some of the same data showed evidence of a small but real hot hand effect.

Modern Tracking Data Analysis: With the advent of detailed player tracking data (shot location, defender distance, shot difficulty), researchers gained new tools to investigate the hot hand while controlling for confounding variables. Studies using NBA SportVU data and other tracking systems have found modest but statistically significant hot hand effects, particularly when accounting for shot difficulty.

The Original Gilovich, Vallone, and Tversky Study: Deep Dive

Understanding the original study is crucial for appreciating the full debate. Here's a detailed look at their methodology and findings:

Philadelphia 76ers Field Goal Analysis

The researchers obtained shot-by-shot records for nine Philadelphia 76ers players over 48 home games in the 1980-81 season. For each player, they calculated:

  • Field Goal Percentage after Hits: The shooting percentage on shots immediately following one, two, or three consecutive made field goals
  • Field Goal Percentage after Misses: The shooting percentage on shots immediately following one, two, or three consecutive missed field goals

Results: For most players, the differences were negligible and not statistically significant. In fact, some players showed slightly lower shooting percentages after hits than after misses, the opposite of the hot hand hypothesis. For the team as a whole:

  • FG% after one hit: 51%
  • FG% after one miss: 54%
  • FG% after two hits: 50%
  • FG% after two misses: 53%
  • FG% after three hits: 46%
  • FG% after three misses: 56%

These results suggested that, if anything, players shot worse after makes than after misses.

Boston Celtics Free Throw Analysis

Free throws provided an ideal test case because they eliminate many confounding variables: shot location is fixed, there's no defense, and each shooter has two consecutive attempts. The researchers analyzed two seasons of Boston Celtics free throw data, examining whether the outcome of the first free throw predicted the outcome of the second.

Results: They found no positive correlation. Players made their second free throw at approximately the same rate regardless of whether they made or missed the first. The correlation coefficient was near zero, failing to support the hot hand hypothesis.

Cornell Controlled Experiment

To eliminate additional confounding variables, the researchers conducted a controlled shooting study with Cornell varsity basketball players. Players took 100 shots under standardized conditions while researchers recorded the sequence.

Results: Analysis of the shooting sequences showed no evidence of positive serial correlation. The number and length of streaks were consistent with binomial random sequences given each player's overall shooting percentage.

Survey of Beliefs

Importantly, the researchers also surveyed basketball fans and players about their beliefs. An overwhelming majority (91% of fans, 100% of players surveyed) believed in the hot hand, demonstrating a stark disconnect between perception and statistical reality (as measured at the time).

The Miller-Sanjurjo Critique: A Statistical Revolution

The 2018 paper by Joshua Miller and Adam Sanjurjo, "Surprised by the Hot Hand Fallacy? A Truth in the Law of Small Numbers," fundamentally changed the debate by identifying a subtle but crucial statistical bias.

The Selection Bias Problem

Miller and Sanjurjo demonstrated that when you condition on observing a hit (or a streak of hits) in a finite sequence, you create a selection bias. Here's an intuitive explanation:

Imagine flipping a fair coin three times and, for each sequence, computing the proportion of flips that land heads immediately after a heads, the same statistic the hot hand studies computed for shots.

Each individual flip is still 50/50; the coin has no memory. But averaging that proportion across sequences does not give 50%. Of the eight possible three-flip sequences, six contain a heads in the first two flips and therefore yield a defined proportion: THT (0), THH (1), HTT (0), HTH (0), HHT (1/2), and HHH (1). Their average is 2.5/6 ≈ 41.7%, well below 50%.

The mathematics shows that when you estimate the probability of a hit after a hit this way in finite sequences, the expected value of the estimate falls below the base rate, even for a truly random process. The bias shrinks as sequences get longer but grows with streak length, and it is exactly the kind of negative autocorrelation that the method of Gilovich and colleagues would report.
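One way to see the bias concretely: enumerate all eight three-flip sequences of a fair coin, compute for each the proportion of heads immediately following a heads, and average across the sequences that have at least one heads in the first two flips. A minimal sketch:

```python
from itertools import product

props = []
for seq in product([0, 1], repeat=3):        # all 8 three-flip sequences
    after_heads = [seq[i] for i in range(1, 3) if seq[i - 1] == 1]
    if after_heads:                          # needs a heads in the first two flips
        props.append(sum(after_heads) / len(after_heads))

print(len(props))                            # 6 eligible sequences
print(sum(props) / len(props))               # 0.4166..., below the true 0.5
```

Each flip is unbiased, yet the averaged per-sequence proportion is not: that gap is the selection bias.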

Reanalysis of Original Data

When Miller and Sanjurjo reanalyzed the original Gilovich data accounting for this bias, they found evidence of a small positive hot hand effect. The "cold hand" pattern originally observed was actually consistent with a slightly hot hand when the selection bias was corrected.

The Size of the Effect

The reanalysis suggested that the hot hand effect, while real, is modest—typically an increase of 1-3 percentage points in shooting probability after a hit. This is small enough to have been masked by the selection bias but large enough to be meaningful over the course of a game or season.

Recent Research and Modern Findings

The post-2015 era has produced a wealth of new hot hand research using modern data and methods:

NBA SportVU Tracking Data Studies

Researchers using NBA player tracking data have found evidence for hot hand effects, particularly when controlling for shot difficulty:

  • Bocskocsky, Ezekowitz, and Stein (2014): Using SportVU data, they found that players shoot approximately 2% better after a made shot than after a miss, even when controlling for shot location and defender distance
  • Shot Selection Confound: Players tend to take more difficult shots when they're "hot," which can mask the effect in raw data. When adjusting for shot difficulty, the hot hand becomes more apparent
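The shot-selection confound can be illustrated with a toy simulation: assume, purely hypothetically, that being "hot" adds 4 percentage points of accuracy but also shifts players toward harder shots. In the raw comparison the effect vanishes; stratifying by shot difficulty recovers it. All numbers below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical generative model: being "hot" adds 4 points of accuracy,
# but hot players take hard shots more often, and hard shots cost 10 points.
hot = rng.random(n) < 0.5
hard = rng.random(n) < np.where(hot, 0.7, 0.3)     # shot-selection confound
made = rng.random(n) < (0.45 + 0.04 * hot - 0.10 * hard)

raw_diff = made[hot].mean() - made[~hot].mean()

# Stratify by shot difficulty, then average the within-stratum differences
adj_diff = np.mean([made[hot & (hard == h)].mean() - made[~hot & (hard == h)].mean()
                    for h in (True, False)])

print(f"Raw hot-minus-cold difference:    {raw_diff:+.3f}")
print(f"Difficulty-adjusted difference:   {adj_diff:+.3f}")  # ~ +0.04
```

In this toy setup the confound cancels the effect almost exactly in the raw data, which is the pattern the tracking-data studies had to untangle.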

Three-Point Shooting Studies

Three-point shooting has been a particularly fruitful area for hot hand research:

  • Studies of NBA three-point contests (where confounding variables are minimized) show evidence of positive autocorrelation in shooting
  • Analysis of three-point attempts shows that players shoot better from deep after making previous three-pointers, with the effect strongest for elite shooters

Free Throw Research

Modern free throw studies with larger sample sizes have found modest positive correlations between consecutive free throws, suggesting a small hot hand effect even in this highly controlled situation.

International Basketball

Studies of European leagues and international competitions have generally confirmed the findings from NBA research, suggesting the hot hand is not specific to American basketball.

Individual Differences

Recent research suggests the hot hand effect varies by player:

  • Skill Level: Elite shooters may exhibit stronger hot hand effects than average shooters
  • Shot Type: The effect appears stronger for jump shots than for shots at the rim
  • Game Situation: Hot hand effects may be more pronounced in certain game contexts (close games, high-pressure situations)

Bayesian Approaches to the Hot Hand

Bayesian statistical methods offer a powerful framework for investigating the hot hand, allowing researchers to quantify uncertainty and incorporate prior beliefs systematically.

Bayesian vs. Frequentist Approaches

Traditional hot hand studies used frequentist statistics, which test whether data are consistent with a null hypothesis of no effect. Bayesian approaches instead estimate the probability that the hot hand exists and quantify its magnitude, given the observed data.

Hierarchical Bayesian Models

Modern Bayesian hot hand studies often use hierarchical models that account for:

  • Player-Level Variation: Each player has their own baseline shooting probability and potential hot hand effect
  • Shot-Level Variation: Shot difficulty varies based on location, defense, and game context
  • Game-Level Variation: Player performance varies across games due to fatigue, matchups, and other factors

Bayesian Findings

Bayesian analyses of shooting data generally support a small but real hot hand effect:

  • Posterior probability distributions for the hot hand effect typically center on small positive values (1-3 percentage points)
  • The probability that the hot hand effect is positive (rather than zero or negative) is typically high (70-90% depending on the study)
  • There's substantial heterogeneity across players—some show strong hot hand effects, others show none

Prior Sensitivity

Bayesian analyses must specify prior beliefs about the hot hand. Studies show that results are generally robust to reasonable prior specifications, with data dominating the prior when sample sizes are large enough.
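As a minimal sketch of the Bayesian logic (a simple conjugate Beta-Binomial comparison, not the hierarchical models described above), one can put Beta priors on the make probability after a hit and after a miss, update each with observed counts, and read off the posterior probability that the hot hand effect is positive. The counts here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical counts: (makes, attempts) after a made shot and after a miss
makes_after_hit, n_after_hit = 120, 250      # 48.0% observed
makes_after_miss, n_after_miss = 130, 300    # 43.3% observed

# Beta(1, 1) priors; conjugate updating gives Beta posteriors
post_hit = rng.beta(1 + makes_after_hit,
                    1 + n_after_hit - makes_after_hit, 100_000)
post_miss = rng.beta(1 + makes_after_miss,
                     1 + n_after_miss - makes_after_miss, 100_000)

effect = post_hit - post_miss                # posterior of the hot hand effect
print(f"Posterior mean effect: {effect.mean():+.3f}")
print(f"P(effect > 0) = {(effect > 0).mean():.2f}")
```

With counts like these the posterior favors a positive effect without being conclusive, which mirrors the hedged findings in the literature.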

Python Code for Hot Hand Analysis

Example 1: Simulating Random Shooting Sequences

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

def simulate_shooting_sequence(n_shots=100, base_prob=0.45, hot_hand_effect=0.0):
    """
    Simulate a shooting sequence with optional hot hand effect.

    Parameters:
    n_shots: Number of shots to simulate
    base_prob: Baseline shooting probability
    hot_hand_effect: Increase in probability after a make (0 = no hot hand)

    Returns:
    Array of shot outcomes (1 = make, 0 = miss)
    """
    shots = np.zeros(n_shots, dtype=int)

    # First shot uses base probability
    shots[0] = np.random.random() < base_prob

    # Subsequent shots depend on previous outcome
    for i in range(1, n_shots):
        if shots[i-1] == 1:  # Previous shot was a make
            prob = min(base_prob + hot_hand_effect, 1.0)
        else:  # Previous shot was a miss
            prob = base_prob

        shots[i] = np.random.random() < prob

    return shots


def analyze_hot_hand(shots):
    """
    Analyze a shooting sequence for hot hand effect.

    Parameters:
    shots: Array of shot outcomes (1 = make, 0 = miss)

    Returns:
    Dictionary with analysis results
    """
    n_shots = len(shots)

    # Calculate overall shooting percentage
    overall_pct = shots.mean()

    # Calculate shooting percentage after makes and misses
    after_make = []
    after_miss = []

    for i in range(1, n_shots):
        if shots[i-1] == 1:  # Previous was a make
            after_make.append(shots[i])
        else:  # Previous was a miss
            after_miss.append(shots[i])

    pct_after_make = np.mean(after_make) if after_make else 0
    pct_after_miss = np.mean(after_miss) if after_miss else 0

    # Calculate shooting percentage after streaks
    after_2_makes = []
    after_2_misses = []
    after_3_makes = []
    after_3_misses = []

    for i in range(2, n_shots):
        if shots[i-1] == 1 and shots[i-2] == 1:
            after_2_makes.append(shots[i])
        elif shots[i-1] == 0 and shots[i-2] == 0:
            after_2_misses.append(shots[i])

    for i in range(3, n_shots):
        if shots[i-1] == 1 and shots[i-2] == 1 and shots[i-3] == 1:
            after_3_makes.append(shots[i])
        elif shots[i-1] == 0 and shots[i-2] == 0 and shots[i-3] == 0:
            after_3_misses.append(shots[i])

    pct_after_2_makes = np.mean(after_2_makes) if after_2_makes else 0
    pct_after_2_misses = np.mean(after_2_misses) if after_2_misses else 0
    pct_after_3_makes = np.mean(after_3_makes) if after_3_makes else 0
    pct_after_3_misses = np.mean(after_3_misses) if after_3_misses else 0

    # Calculate autocorrelation
    autocorr = np.corrcoef(shots[:-1], shots[1:])[0, 1]

    return {
        'overall_pct': overall_pct,
        'pct_after_make': pct_after_make,
        'pct_after_miss': pct_after_miss,
        'pct_after_2_makes': pct_after_2_makes,
        'pct_after_2_misses': pct_after_2_misses,
        'pct_after_3_makes': pct_after_3_makes,
        'pct_after_3_misses': pct_after_3_misses,
        'autocorr': autocorr,
        'n_after_make': len(after_make),
        'n_after_miss': len(after_miss),
        'n_after_2_makes': len(after_2_makes),
        'n_after_3_makes': len(after_3_makes)
    }


# Simulation: No hot hand (random shooting)
print("Simulation 1: No Hot Hand (Random Shooting)")
print("=" * 50)
random_shots = simulate_shooting_sequence(n_shots=1000, base_prob=0.45, hot_hand_effect=0.0)
random_analysis = analyze_hot_hand(random_shots)

print(f"Overall FG%: {random_analysis['overall_pct']:.3f}")
print(f"FG% after 1 make: {random_analysis['pct_after_make']:.3f}")
print(f"FG% after 1 miss: {random_analysis['pct_after_miss']:.3f}")
print(f"FG% after 2 makes: {random_analysis['pct_after_2_makes']:.3f}")
print(f"FG% after 3 makes: {random_analysis['pct_after_3_makes']:.3f}")
print(f"Autocorrelation: {random_analysis['autocorr']:.3f}")

# Simulation: With hot hand effect
print("\n\nSimulation 2: With Hot Hand Effect (+5%)")
print("=" * 50)
hot_shots = simulate_shooting_sequence(n_shots=1000, base_prob=0.45, hot_hand_effect=0.05)
hot_analysis = analyze_hot_hand(hot_shots)

print(f"Overall FG%: {hot_analysis['overall_pct']:.3f}")
print(f"FG% after 1 make: {hot_analysis['pct_after_make']:.3f}")
print(f"FG% after 1 miss: {hot_analysis['pct_after_miss']:.3f}")
print(f"FG% after 2 makes: {hot_analysis['pct_after_2_makes']:.3f}")
print(f"FG% after 3 makes: {hot_analysis['pct_after_3_makes']:.3f}")
print(f"Autocorrelation: {hot_analysis['autocorr']:.3f}")

# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Random shooting
categories = ['Overall', 'After Make', 'After Miss', 'After 2 Makes', 'After 3 Makes']
random_values = [
    random_analysis['overall_pct'],
    random_analysis['pct_after_make'],
    random_analysis['pct_after_miss'],
    random_analysis['pct_after_2_makes'],
    random_analysis['pct_after_3_makes']
]

axes[0].bar(categories, random_values, color='steelblue', alpha=0.7, edgecolor='navy')
axes[0].axhline(y=0.45, color='red', linestyle='--', linewidth=2, label='Base Rate (45%)')
axes[0].set_ylabel('Shooting Percentage', fontsize=12, fontweight='bold')
axes[0].set_title('Random Shooting (No Hot Hand)', fontsize=13, fontweight='bold')
axes[0].set_ylim([0, 0.7])
axes[0].legend()
axes[0].tick_params(axis='x', rotation=45)

# Plot 2: Hot hand shooting
hot_values = [
    hot_analysis['overall_pct'],
    hot_analysis['pct_after_make'],
    hot_analysis['pct_after_miss'],
    hot_analysis['pct_after_2_makes'],
    hot_analysis['pct_after_3_makes']
]

axes[1].bar(categories, hot_values, color='orangered', alpha=0.7, edgecolor='darkred')
axes[1].axhline(y=0.45, color='red', linestyle='--', linewidth=2, label='Base Rate (45%)')
axes[1].set_ylabel('Shooting Percentage', fontsize=12, fontweight='bold')
axes[1].set_title('With Hot Hand Effect (+5%)', fontsize=13, fontweight='bold')
axes[1].set_ylim([0, 0.7])
axes[1].legend()
axes[1].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.savefig('hot_hand_simulation.png', dpi=300, bbox_inches='tight')
plt.show()

Example 2: Analyzing Real Shooting Data

import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

def load_shooting_data(csv_file):
    """
    Load shooting data from CSV file.
    Expected columns: player, game_id, shot_number, made (1/0), shot_type
    """
    df = pd.read_csv(csv_file)
    return df


def calculate_conditional_probabilities(shots, streak_length=1):
    """
    Calculate shooting percentage after streaks of makes/misses.
    Note: this naive estimate is subject to the Miller-Sanjurjo selection
    bias in short sequences; it does not correct for it.

    Parameters:
    shots: Array of shot outcomes (1 = make, 0 = miss)
    streak_length: Length of streak to condition on (1, 2, or 3)

    Returns:
    Tuple of (prob_after_makes, prob_after_misses, difference, p_value)
    """
    n = len(shots)
    after_make_streak = []
    after_miss_streak = []

    for i in range(streak_length, n):
        # Check if previous streak_length shots were all makes
        if all(shots[i-j] == 1 for j in range(1, streak_length + 1)):
            after_make_streak.append(shots[i])

        # Check if previous streak_length shots were all misses
        if all(shots[i-j] == 0 for j in range(1, streak_length + 1)):
            after_miss_streak.append(shots[i])

    if len(after_make_streak) == 0 or len(after_miss_streak) == 0:
        return None, None, None, None

    prob_after_makes = np.mean(after_make_streak)
    prob_after_misses = np.mean(after_miss_streak)
    difference = prob_after_makes - prob_after_misses

    # Two-sample proportion test
    n1, n2 = len(after_make_streak), len(after_miss_streak)
    p1, p2 = prob_after_makes, prob_after_misses

    pooled_prob = (sum(after_make_streak) + sum(after_miss_streak)) / (n1 + n2)
    se = np.sqrt(pooled_prob * (1 - pooled_prob) * (1/n1 + 1/n2))

    if se > 0:
        z_stat = (p1 - p2) / se
        p_value = 2 * (1 - stats.norm.cdf(abs(z_stat)))
    else:
        p_value = 1.0

    return prob_after_makes, prob_after_misses, difference, p_value


def runs_test(shots):
    """
    Perform runs test for randomness.
    Tests whether the sequence of makes/misses is random.

    Returns:
    Tuple of (test_statistic, p_value)
    """
    n = len(shots)
    n_makes = sum(shots)
    n_misses = n - n_makes

    # Count runs (sequences of consecutive same outcomes)
    runs = 1
    for i in range(1, n):
        if shots[i] != shots[i-1]:
            runs += 1

    # Expected runs and variance under null hypothesis
    expected_runs = (2 * n_makes * n_misses) / n + 1
    var_runs = (2 * n_makes * n_misses * (2 * n_makes * n_misses - n)) / (n**2 * (n - 1))

    if var_runs > 0:
        z_stat = (runs - expected_runs) / np.sqrt(var_runs)
        p_value = 2 * (1 - stats.norm.cdf(abs(z_stat)))
    else:
        z_stat = 0
        p_value = 1.0

    return z_stat, p_value


# Example: Analyze hypothetical player shooting data
# (reuses simulate_shooting_sequence defined in Example 1)
np.random.seed(42)

# Create sample data for multiple players
players = ['Player A', 'Player B', 'Player C']
results = []

for player in players:
    # Simulate shooting with slight hot hand effect
    base_prob = np.random.uniform(0.40, 0.50)
    hot_effect = np.random.uniform(-0.01, 0.03)
    shots = simulate_shooting_sequence(n_shots=500, base_prob=base_prob, hot_hand_effect=hot_effect)

    # Analyze shooting
    overall_pct = shots.mean()

    # Conditional probabilities after 1 make/miss
    p_after_1_make, p_after_1_miss, diff_1, pval_1 = calculate_conditional_probabilities(shots, 1)

    # Conditional probabilities after 2 makes/misses
    p_after_2_make, p_after_2_miss, diff_2, pval_2 = calculate_conditional_probabilities(shots, 2)

    # Runs test
    runs_z, runs_p = runs_test(shots)

    # Autocorrelation
    autocorr = np.corrcoef(shots[:-1], shots[1:])[0, 1]

    results.append({
        'Player': player,
        'Overall FG%': overall_pct,
        'FG% After Make': p_after_1_make,
        'FG% After Miss': p_after_1_miss,
        'Difference': diff_1,
        'P-value': pval_1,
        'Autocorr': autocorr,
        'Runs Test P-value': runs_p
    })

# Create results DataFrame
results_df = pd.DataFrame(results)
print("\nHot Hand Analysis Results")
print("=" * 80)
print(results_df.to_string(index=False))

# Interpretation
print("\n\nInterpretation:")
print("-" * 80)
for _, row in results_df.iterrows():
    print(f"\n{row['Player']}:")
    print(f"  Overall shooting: {row['Overall FG%']:.1%}")
    print(f"  After make: {row['FG% After Make']:.1%} | After miss: {row['FG% After Miss']:.1%}")
    print(f"  Difference: {row['Difference']:.1%} (p = {row['P-value']:.3f})")

    if row['P-value'] < 0.05:
        if row['Difference'] > 0:
            print(f"  ✓ Significant HOT HAND effect detected")
        else:
            print(f"  ✓ Significant COLD HAND effect detected")
    else:
        print(f"  ✗ No significant hot hand effect (consistent with randomness)")

Example 3: Miller-Sanjurjo Bias Correction

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def calculate_naive_conditional_prob(sequences, condition_length=1):
    """
    GVT-style estimator: for each sequence, compute the proportion of hits
    immediately following a streak of `condition_length` hits, then average
    those per-sequence proportions. Averaging proportions across finite
    sequences (rather than pooling all qualifying shots) is what produces
    the Miller-Sanjurjo selection bias.
    """
    per_sequence_props = []

    for seq in sequences:
        next_outcomes = []
        for i in range(condition_length, len(seq)):
            # Check if previous condition_length outcomes were hits
            if all(seq[i-j] == 1 for j in range(1, condition_length + 1)):
                next_outcomes.append(seq[i])
        # Sequences with no qualifying streak are dropped, as in the original studies
        if next_outcomes:
            per_sequence_props.append(np.mean(next_outcomes))

    if not per_sequence_props:
        return None

    return float(np.mean(per_sequence_props))


def estimate_selection_bias(base_prob, sequence_length, condition_length=1,
                            n_sims=10000, seed=None):
    """
    Estimate the Miller-Sanjurjo bias empirically: simulate random sequences
    with no hot hand and measure how far the naive estimator falls below the
    true probability. The bias has no simple closed form; it depends on both
    the sequence length and the streak length.
    """
    rng = np.random.default_rng(seed)
    sims = [(rng.random(sequence_length) < base_prob).astype(int)
            for _ in range(n_sims)]
    return base_prob - calculate_naive_conditional_prob(sims, condition_length)


def calculate_bias_corrected_prob(sequences, base_prob, condition_length=1):
    """
    Bias-corrected conditional probability: add the estimated selection bias
    back to the naive estimate.
    """
    naive_prob = calculate_naive_conditional_prob(sequences, condition_length)

    if naive_prob is None:
        return None

    bias = estimate_selection_bias(base_prob, len(sequences[0]), condition_length)
    corrected_prob = naive_prob + bias

    return corrected_prob, naive_prob, bias


# Demonstrate the bias with simulations
print("Demonstrating Miller-Sanjurjo Selection Bias")
print("=" * 70)
print("\nGenerating 10,000 truly random sequences (50% probability)...")

n_simulations = 10000
sequence_length = 20
true_prob = 0.50

# Generate random sequences
random_sequences = []
for _ in range(n_simulations):
    seq = (np.random.random(sequence_length) < true_prob).astype(int)
    random_sequences.append(seq)

# Calculate naive conditional probability
naive_prob_1 = calculate_naive_conditional_prob(random_sequences, condition_length=1)
naive_prob_2 = calculate_naive_conditional_prob(random_sequences, condition_length=2)
naive_prob_3 = calculate_naive_conditional_prob(random_sequences, condition_length=3)

print(f"\nTrue probability: {true_prob:.3f}")
print(f"\nNaive conditional probabilities (uncorrected):")
print(f"  P(Hit | 1 previous hit):  {naive_prob_1:.3f}")
print(f"  P(Hit | 2 previous hits): {naive_prob_2:.3f}")
print(f"  P(Hit | 3 previous hits): {naive_prob_3:.3f}")

# The shortfall of the naive estimate below the true probability is the
# Miller-Sanjurjo bias. It has no simple closed form: it depends on the
# sequence length as well as the streak length, so in practice it is
# estimated from matched random simulations like the ones above.
bias_1 = true_prob - naive_prob_1
bias_2 = true_prob - naive_prob_2
bias_3 = true_prob - naive_prob_3

print(f"\nObserved bias (Miller-Sanjurjo):")
print(f"  After 1 hit:  {bias_1:.3f}")
print(f"  After 2 hits: {bias_2:.3f}")
print(f"  After 3 hits: {bias_3:.3f}")

print("\nKey insight: Even for truly random sequences (no hot hand),")
print("the naive method shows probabilities BELOW the base rate, and the")
print("shortfall grows with streak length. This creates a bias toward")
print("finding a 'cold hand' even in random data.")

# Visualization
fig, ax = plt.subplots(figsize=(10, 6))

conditions = ['1 Hit', '2 Hits', '3 Hits']
naive_values = [naive_prob_1, naive_prob_2, naive_prob_3]
corrected_values = [
    naive_prob_1 + bias_1,  # equals true_prob by construction in this demo;
    naive_prob_2 + bias_2,  # on real data, the bias added back would come
    naive_prob_3 + bias_3   # from simulations, not from the data itself
]

x = np.arange(len(conditions))
width = 0.35

bars1 = ax.bar(x - width/2, naive_values, width, label='Naive (Biased)',
               color='coral', edgecolor='darkred', alpha=0.7)
bars2 = ax.bar(x + width/2, corrected_values, width, label='Bias-Corrected',
               color='lightblue', edgecolor='darkblue', alpha=0.7)

ax.axhline(y=true_prob, color='green', linestyle='--', linewidth=2,
           label=f'True Probability ({true_prob:.2f})')

ax.set_ylabel('Conditional Probability', fontsize=12, fontweight='bold')
ax.set_xlabel('Condition', fontsize=12, fontweight='bold')
ax.set_title('Miller-Sanjurjo Bias in Random Sequences', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(conditions)
ax.legend(fontsize=10)
ax.set_ylim([0.25, 0.60])
ax.grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.savefig('miller_sanjurjo_bias.png', dpi=300, bbox_inches='tight')
plt.show()

print("\nConclusion: The bias is small but systematic. In the original hot hand")
print("studies, this bias may have hidden a real (but small) hot hand effect.")

R Code for Statistical Testing

Example 1: Hot Hand Analysis in R

library(dplyr)
library(ggplot2)

# Simulate shooting sequence
simulate_shots <- function(n_shots = 100, base_prob = 0.45, hot_effect = 0.0) {
  shots <- numeric(n_shots)
  shots[1] <- rbinom(1, 1, base_prob)

  for (i in 2:n_shots) {
    if (shots[i-1] == 1) {
      prob <- min(base_prob + hot_effect, 1.0)
    } else {
      prob <- base_prob
    }
    shots[i] <- rbinom(1, 1, prob)
  }

  return(shots)
}

# Analyze hot hand
analyze_hot_hand <- function(shots) {
  n <- length(shots)

  # Overall percentage
  overall_pct <- mean(shots)

  # After makes and misses
  after_make <- shots[2:n][shots[1:(n-1)] == 1]
  after_miss <- shots[2:n][shots[1:(n-1)] == 0]

  pct_after_make <- mean(after_make)
  pct_after_miss <- mean(after_miss)

  # Statistical test
  test_result <- prop.test(
    x = c(sum(after_make), sum(after_miss)),
    n = c(length(after_make), length(after_miss))
  )

  # Autocorrelation
  autocorr <- cor(shots[1:(n-1)], shots[2:n])

  list(
    overall = overall_pct,
    after_make = pct_after_make,
    after_miss = pct_after_miss,
    difference = pct_after_make - pct_after_miss,
    p_value = test_result$p.value,
    autocorr = autocorr
  )
}

# Run simulation
set.seed(123)

# No hot hand
cat("Simulation 1: No Hot Hand\n")
cat(rep("=", 50), "\n", sep="")
random_shots <- simulate_shots(n_shots = 1000, base_prob = 0.45, hot_effect = 0.0)
random_result <- analyze_hot_hand(random_shots)

cat(sprintf("Overall FG%%: %.3f\n", random_result$overall))
cat(sprintf("FG%% after make: %.3f\n", random_result$after_make))
cat(sprintf("FG%% after miss: %.3f\n", random_result$after_miss))
cat(sprintf("Difference: %.3f\n", random_result$difference))
cat(sprintf("P-value: %.3f\n", random_result$p_value))
cat(sprintf("Autocorrelation: %.3f\n\n", random_result$autocorr))

# With hot hand
cat("Simulation 2: With Hot Hand (+5%%)\n")
cat(rep("=", 50), "\n", sep="")
hot_shots <- simulate_shots(n_shots = 1000, base_prob = 0.45, hot_effect = 0.05)
hot_result <- analyze_hot_hand(hot_shots)

cat(sprintf("Overall FG%%: %.3f\n", hot_result$overall))
cat(sprintf("FG%% after make: %.3f\n", hot_result$after_make))
cat(sprintf("FG%% after miss: %.3f\n", hot_result$after_miss))
cat(sprintf("Difference: %.3f\n", hot_result$difference))
cat(sprintf("P-value: %.3f\n", hot_result$p_value))
cat(sprintf("Autocorrelation: %.3f\n", hot_result$autocorr))

Example 2: Bayesian Analysis with RStan

library(rstan)
library(bayesplot)

# Stan model for hot hand effect
stan_model_code <- "
data {
  int<lower=1> N;                  // Number of shots
  int<lower=0, upper=1> y[N];      // Shot outcomes (1 = make, 0 = miss)
  int<lower=0, upper=1> prev[N];   // Previous shot outcome
}

parameters {
  real<lower=0, upper=1> base_prob;  // Baseline shooting probability
  real hot_hand_effect;              // Hot hand effect (can be positive or negative)
}

transformed parameters {
  vector[N] prob;

  for (i in 1:N) {
    if (i == 1) {
      prob[i] = base_prob;
    } else {
      prob[i] = base_prob + hot_hand_effect * prev[i];
      prob[i] = fmin(fmax(prob[i], 0.0), 1.0);  // Keep in [0,1]
    }
  }
}

model {
  // Priors
  base_prob ~ beta(10, 10);           // Centered around 0.5
  hot_hand_effect ~ normal(0, 0.1);   // Small effect expected

  // Likelihood
  for (i in 1:N) {
    y[i] ~ bernoulli(prob[i]);
  }
}

generated quantities {
  real prob_after_make;
  real prob_after_miss;

  prob_after_make = base_prob + hot_hand_effect;
  prob_after_miss = base_prob;
}
"

# Simulate data
set.seed(456)
n_shots <- 500
true_base <- 0.45
true_effect <- 0.03

shots <- simulate_shots(n_shots, true_base, true_effect)
prev_shots <- c(0, shots[1:(n_shots-1)])

# Prepare data for Stan
stan_data <- list(
  N = n_shots,
  y = shots,
  prev = prev_shots
)

# Fit model
fit <- stan(
  model_code = stan_model_code,
  data = stan_data,
  iter = 2000,
  chains = 4,
  cores = 4
)

# Print results
print(fit, pars = c("base_prob", "hot_hand_effect", "prob_after_make", "prob_after_miss"))

# Visualize posterior distributions
posterior <- as.matrix(fit)

mcmc_areas(posterior, pars = c("hot_hand_effect"),
           prob = 0.95) +
  ggtitle("Posterior Distribution of Hot Hand Effect") +
  theme_minimal()

# Probability that hot hand effect is positive
hot_hand_samples <- posterior[, "hot_hand_effect"]
prob_positive <- mean(hot_hand_samples > 0)

cat(sprintf("\nProbability that hot hand effect is positive: %.2f%%\n", prob_positive * 100))
cat(sprintf("Median hot hand effect: %.3f\n", median(hot_hand_samples)))
cat(sprintf("95%% Credible Interval: [%.3f, %.3f]\n",
            quantile(hot_hand_samples, 0.025),
            quantile(hot_hand_samples, 0.975)))

Example 3: Runs Test in R

library(dplyr)
library(ggplot2)

# Runs test for randomness
runs_test <- function(shots) {
  n <- length(shots)
  n_makes <- sum(shots)
  n_misses <- n - n_makes

  # Count runs
  runs <- 1
  for (i in 2:n) {
    if (shots[i] != shots[i-1]) {
      runs <- runs + 1
    }
  }

  # Expected runs under randomness
  expected_runs <- (2 * n_makes * n_misses) / n + 1

  # Variance of runs
  var_runs <- (2 * n_makes * n_misses * (2 * n_makes * n_misses - n)) /
              (n^2 * (n - 1))

  # Z-statistic
  z_stat <- (runs - expected_runs) / sqrt(var_runs)
  p_value <- 2 * (1 - pnorm(abs(z_stat)))

  list(
    n_runs = runs,
    expected_runs = expected_runs,
    z_statistic = z_stat,
    p_value = p_value
  )
}

# Test on random and hot sequences
set.seed(789)

random_shots <- simulate_shots(500, 0.45, 0.0)
hot_shots <- simulate_shots(500, 0.45, 0.05)

cat("Runs Test Results\n")
cat(rep("=", 60), "\n\n", sep="")

cat("Random Sequence (No Hot Hand):\n")
random_runs <- runs_test(random_shots)
cat(sprintf("  Observed runs: %d\n", random_runs$n_runs))
cat(sprintf("  Expected runs: %.1f\n", random_runs$expected_runs))
cat(sprintf("  Z-statistic: %.3f\n", random_runs$z_statistic))
cat(sprintf("  P-value: %.3f\n\n", random_runs$p_value))

cat("Hot Hand Sequence (+5%%):\n")
hot_runs <- runs_test(hot_shots)
cat(sprintf("  Observed runs: %d\n", hot_runs$n_runs))
cat(sprintf("  Expected runs: %.1f\n", hot_runs$expected_runs))
cat(sprintf("  Z-statistic: %.3f\n", hot_runs$z_statistic))
cat(sprintf("  P-value: %.3f\n", hot_runs$p_value))

cat("\nInterpretation:\n")
cat("Fewer runs than expected suggests positive autocorrelation (hot hand).\n")
cat("More runs than expected suggests negative autocorrelation (cold hand).\n")

Practical Implications for Coaches and Players

The hot hand debate has moved from purely academic to highly practical, with implications for how basketball is played and coached:

For Coaches

1. Shot Selection and Play Calling

If the hot hand effect exists (even if small), coaches might consider:

  • Feed the Hot Hand: When a player makes several consecutive shots, increasing their shot attempts by 2-3 per game could yield marginal gains
  • Balanced Approach: The effect is small enough that other factors (matchups, offensive balance, defensive attention) should remain primary considerations
  • Avoid Overreaction: A player making 3-4 shots in a row doesn't justify forcing bad shots to them

2. Defensive Adjustments

Defensive strategy should account for hot hand effects:

  • Heightened Attention: Defending players more aggressively after they make shots may be justified
  • Help Defense: Rotating help defense toward "hot" players could disrupt their rhythm
  • Shot Contest Priority: Prioritizing contests on players coming off made shots makes statistical sense

3. Substitution Patterns

Playing time decisions could incorporate shooting streaks:

  • Riding Hot Streaks: Extending minutes slightly for players on hot streaks may capture small efficiency gains
  • Confidence Management: Even if the hot hand is partly psychological, leveraging player confidence has real value
  • Context Matters: In close games, small efficiency edges justify more aggressive hot-hand strategies

4. Practice Design

Training can focus on maximizing hot hand potential:

  • Rhythm Shooting: Practice drills that simulate game-speed shooting sequences
  • Confidence Building: Create practice scenarios where players can build shooting momentum
  • Shot Quality Recognition: Teach players to recognize when they're "feeling it" vs. taking bad shots

For Players

1. Shot Selection

Players should be strategic about capitalizing on hot streaks:

  • Aggression When Hot: After making shots, look for opportunities within the offense to shoot again
  • Shot Quality First: Never compromise shot quality for volume—a contested shot is still contested, hot hand or not
  • Read the Defense: If defenders are playing you tighter after makes, look for driving lanes or passing opportunities

2. Confidence and Psychology

The psychological aspect of the hot hand matters regardless of statistics:

  • Positive Reinforcement: Making shots builds confidence, which improves mechanics and decision-making
  • Avoid Pressing: After making several shots, maintain your normal approach rather than forcing things
  • Short Memory: Missing after a hot streak doesn't mean you've "cooled off"—maintain confidence

3. Communication with Teammates

Team dynamics around hot hands require good communication:

  • Call for the Ball: When you're shooting well, communicate availability without being selfish
  • Trust Teammates: Recognize when teammates are hot and find ways to get them shots
  • Offensive Balance: Even when hot, maintain ball movement and offensive flow

4. Skill Development

Training to maximize hot hand potential:

  • Consistency Work: The foundation of any hot streak is consistent shooting mechanics
  • Quick Release: Being able to get shots off quickly helps capitalize when defenses haven't adjusted
  • Off-Ball Movement: Creating space without the ball helps you get shots when you're hot

For General Managers and Analysts

1. Player Evaluation

  • Streakiness vs. Consistency: Some players may have stronger hot hand effects than others—this could factor into roster construction
  • Clutch Shooting: Hot hand effects may be stronger in high-pressure situations, making clutch performance more predictable
  • Role Definition: Players with strong hot hand tendencies might be especially valuable as spark plugs off the bench

2. Game Strategy

  • Timeout Timing: Calling timeouts to disrupt opponent hot streaks may have merit, though the effect is likely small
  • Matchup Planning: In playoff series, tracking which players exhibit hot hand patterns can inform defensive schemes
  • Shot Quality Metrics: Advanced stats should account for potential hot hand effects when evaluating shooting efficiency

The Bottom Line

The current evidence suggests:

  • The hot hand exists, but it's small: Approximately 1-3 percentage point improvement after makes
  • Context matters enormously: Shot difficulty, defensive attention, and player skill level all modify the effect
  • Psychology is real: Even if the statistical effect is small, confidence and momentum matter
  • Don't overthink it: Basic basketball principles (shot quality, offensive balance, defensive intensity) matter far more than hot hand optimization
  • Individual variation: Some players may exhibit stronger hot hand effects than others

The practical takeaway: Be aware of hot hand potential, make marginal adjustments to capitalize on it, but never compromise fundamental basketball principles in pursuit of riding hot streaks. The effect is real but subtle, and context-dependent factors (shot quality, matchups, game situation) should always take precedence over simple streak-chasing.
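To give the "real but subtle" framing some concrete scale, here is a back-of-envelope Python sketch of what a small after-make boost might be worth. Every number in it (shots per game, share of attempts that follow a make, effect size, points per make) is a hypothetical assumption chosen only for illustration, not a measured figure:

```python
# Back-of-envelope value of a small hot-hand edge.
# All inputs below are illustrative assumptions, not measured values.
shots_per_game = 15        # attempts per game (assumed)
share_after_make = 0.45    # fraction of attempts that follow a made shot (assumed)
hot_effect = 0.02          # +2 percentage points on those attempts (assumed)
points_per_make = 2.2      # rough blend of two- and three-point makes (assumed)

extra_makes = shots_per_game * share_after_make * hot_effect
extra_points = extra_makes * points_per_make

print(f"Extra makes per game:  {extra_makes:.3f}")
print(f"Extra points per game: {extra_points:.3f}")
print(f"Extra points per 82-game season: {extra_points * 82:.1f}")
```

Under these assumptions the edge is a fraction of a point per game — enough to matter over a season in a competitive league, but far too small to justify forcing bad shots.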

Unresolved Questions and Future Research

Despite decades of research, several important questions about the hot hand remain unanswered:

1. Individual Differences

Do some players have stronger hot hand effects than others? Elite shooters may exhibit different patterns than average shooters. Future research with player-level tracking data could identify which players show the strongest autocorrelation in their shooting.
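The kind of player-level analysis this would require is straightforward to prototype. The sketch below computes lag-1 shooting autocorrelation per player from a purely simulated shot log (player names and data are invented; a real study would use tracked shot sequences and a bias-aware estimator):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical shot logs (1 = make, 0 = miss); simulated random data,
# so the autocorrelations below should hover near zero.
shot_logs = {
    "Player A": rng.integers(0, 2, 500),
    "Player B": rng.integers(0, 2, 500),
}

def lag1_autocorr(shots):
    """Correlation between each shot and the one immediately before it."""
    s = np.asarray(shots, dtype=float)
    return float(np.corrcoef(s[:-1], s[1:])[0, 1])

for player, shots in shot_logs.items():
    print(f"{player}: lag-1 autocorrelation = {lag1_autocorr(shots):.3f}")
```

Note that the raw lag-1 correlation inherits the Miller-Sanjurjo finite-sample bias discussed earlier, so a serious individual-differences study would apply their correction or use permutation tests per player.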

2. Mechanism Understanding

If the hot hand exists, what causes it? Possibilities include:

  • Improved confidence leading to better shot selection and mechanics
  • Physiological factors (better neuromuscular coordination after successful movements)
  • Psychological flow states that enhance performance
  • Defensive lapses when players are perceived as hot

3. Shot Difficulty Adjustment

How much of the hot hand effect disappears when properly controlling for shot difficulty? If players attempt harder shots after makes (deeper range, tighter defense), raw after-make splits will understate any true effect. Modern tracking data enables better adjustment, but questions remain about the best methods.
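One common adjustment strategy is a logistic regression with shot difficulty as a covariate. The numpy sketch below is a minimal illustration on fully simulated data: the effect sizes, the difficulty model, and the "defense tightens after a make" mechanism are all invented assumptions, chosen so that the confounding masks a real effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated shot log: after a make, the next attempt is harder on average
# (defense tightens), which masks a real hot-hand effect of +0.30 on the
# log-odds scale in naive after-make splits. All parameters are invented.
y = np.zeros(n, dtype=int)
prev = np.zeros(n, dtype=int)
difficulty = np.zeros(n)
base_logit = np.log(0.45 / 0.55)  # 45% baseline FG on the log-odds scale
for i in range(n):
    prev[i] = y[i - 1] if i > 0 else 0
    difficulty[i] = rng.normal(0.5 * prev[i], 1.0)  # harder shots after makes
    logit = base_logit + 0.30 * prev[i] - 0.40 * difficulty[i]
    y[i] = int(rng.random() < 1.0 / (1.0 + np.exp(-logit)))

def fit_logistic(X, y, iters=25):
    """Plain Newton-Raphson logistic regression (intercept added)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        H = (X * (p * (1 - p))[:, None]).T @ X
        beta += np.linalg.solve(H, grad)
    return beta

b_naive = fit_logistic(prev[:, None].astype(float), y)
b_adj = fit_logistic(np.column_stack([prev, difficulty]), y)
print(f"prev-make coefficient, ignoring difficulty:    {b_naive[1]:.3f}")
print(f"prev-make coefficient, controlling for it:     {b_adj[1]:.3f}")
```

In this simulation the naive coefficient is biased well below the true +0.30 while the difficulty-adjusted fit recovers it — the same direction of bias that tracking-data studies of NBA shooters have argued for, though the right difficulty model for real data remains an open question.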

4. Game Situation Effects

Does the hot hand effect vary by game context? Close games, playoff situations, and high-pressure moments may amplify or suppress the effect.

5. Team-Level Hot Hand

Can entire teams get "hot"? Basketball analysts often discuss team momentum, but rigorous statistical investigation is limited.

6. Optimal Response Strategy

How should teams optimally respond to hot hands (both their own players' and opponents')? Game theory approaches could illuminate optimal defensive and offensive adjustments.

7. Other Sports

Does the hot hand exist in other sports (baseball, golf, tennis)? Cross-sport comparisons could reveal whether the effect is specific to basketball or represents a general performance phenomenon.

Conclusion: Where We Stand Today

The hot hand debate has evolved from a seemingly settled question ("it's just an illusion") to a nuanced understanding that the truth lies somewhere between pure illusion and the powerful streakiness of folk wisdom. Modern research suggests that:

  1. The hot hand exists: After correcting for statistical biases and controlling for confounding variables, there's evidence for a small but real positive autocorrelation in shooting
  2. The effect is modest: The hot hand appears to improve shooting probability by roughly 1-3 percentage points, much smaller than folk wisdom suggests
  3. Context is crucial: Shot difficulty, defensive adjustments, player skill, and game situation all modify the effect substantially
  4. People overestimate it: While the hot hand exists, humans still perceive patterns more strongly than the data support
  5. It has practical implications: Even small effects matter in competitive basketball, justifying marginal strategic adjustments

The hot hand story illustrates several important lessons for sports analytics:

  • Methodology matters: Subtle statistical biases can dramatically affect conclusions
  • Null findings require scrutiny: "No evidence of effect" doesn't always mean "evidence of no effect"
  • Domain expertise has value: Players and coaches were right to trust their observations, even when early statistics disagreed
  • Small effects can matter: In competitive sports, marginal gains accumulate
  • Psychology and statistics intertwine: Performance isn't purely mechanical—confidence and belief affect outcomes

For basketball practitioners, the resolution is pragmatic: Acknowledge that hot hands exist, make reasonable adjustments to capitalize on them, but don't abandon fundamental basketball principles in pursuit of riding streaks. The hot hand is real, but it's a subtle effect that should inform rather than dominate strategy.

The debate also demonstrates how science progresses: initial findings are challenged, refined, and improved through rigorous investigation. The hot hand may never be "fully solved"—there's always another confounding variable to control, another dataset to analyze, another methodological refinement to make. But that's exactly what makes it such a compelling case study at the intersection of psychology, statistics, and sports.

Whether you're a researcher studying human performance, a coach designing strategy, or a fan watching games, the hot hand phenomenon reminds us that basketball—like all sports—is endlessly complex, resistant to simple explanations, and far more interesting than we often give it credit for.

Discussion

Have questions or feedback? Join our community discussion on Discord or GitHub Discussions.