Case Study 1: The Fall and Rise of Legal Sports Betting in America

Executive Summary

For twenty-six years, the Professional and Amateur Sports Protection Act (PASPA) of 1992 effectively banned sports betting across most of the United States, pushing an estimated $150 billion annual market underground and offshore. This case study traces the arc of American sports betting from federal prohibition through the landmark Supreme Court decision in Murphy v. National Collegiate Athletic Association (2018), which struck down PASPA as unconstitutional. We examine the legal, economic, and cultural forces that shaped this transformation, then analyze the state-by-state rollout using synthetic revenue data modeled on publicly available figures. Through Python-based analysis, we compare pre-legalization market projections against actual outcomes, revealing how forecasters systematically underestimated demand during each state's first two years of operation, while projections for mature markets landed closer to actual results and more often overshot them. The case illustrates fundamental concepts in market sizing, regulatory economics, and the power of data-driven policy analysis.

Background

The Origins of PASPA

By the late 1980s, professional sports leagues in the United States had grown increasingly concerned about the perceived threat of legalized gambling. While Nevada had long operated legal sportsbooks under state law, several other states were exploring legislation to authorize sports wagering. Oregon ran a sports-themed lottery game called "Sports Action," Delaware had briefly offered NFL parlay betting in 1976, and Montana permitted small-scale sports pools.

The catalyst for federal action came from multiple directions. Senator Bill Bradley of New Jersey, a former NBA star with the New York Knicks, became the most prominent legislative champion for a federal ban. Bradley argued that the spread of legalized sports betting would undermine the integrity of athletic competition and send a damaging message to young people. The major professional sports leagues — the NFL, NBA, MLB, and NHL — along with the NCAA, lobbied aggressively in support of the legislation.

On October 28, 1992, President George H.W. Bush signed the Professional and Amateur Sports Protection Act into law. PASPA did not make sports betting a federal crime per se; instead, it made it unlawful for a state "to sponsor, operate, advertise, promote, license, or authorize by law or compact" sports wagering. The law included narrow exemptions: Nevada could continue its full-scale sports betting operations, while Oregon, Montana, and Delaware were permitted to maintain their existing, more limited offerings.

The Underground Market Under PASPA

PASPA's prohibition did not eliminate demand for sports betting — it merely redirected it. Over the following decades, an enormous underground economy flourished. The American Gaming Association estimated that Americans wagered approximately $150 billion per year illegally on sports during the PASPA era, dwarfing Nevada's legal handle of roughly $5 billion annually.

This underground market took several forms:

Local bookmakers: Traditional neighborhood bookies continued operating in every major American city, often with ties to organized crime networks, though many were independent small operators.
Offshore sportsbooks: The rise of the internet in the late 1990s and 2000s created a massive offshore betting industry. Operations based in Costa Rica, Antigua, Curaçao, and other Caribbean and Central American jurisdictions offered American bettors easy access through websites and, later, mobile apps. Companies like Bodog (later Bovada), BetOnline, and 5Dimes became household names among sports bettors despite operating in legal gray areas.
Office pools and social betting: Informal wagering — Super Bowl squares, March Madness brackets with entry fees, fantasy sports leagues with buy-ins — represented billions more in unregulated activity.

The underground market created significant problems. Bettors had no legal recourse if an offshore book refused to pay winnings. There were no consumer protections, no responsible gambling safeguards, and no tax revenue flowing to state or federal coffers. Criminal enterprises profited while law enforcement resources were stretched thin attempting to police an activity that millions of Americans considered harmless recreation.

New Jersey's Legal Challenge

The movement to overturn PASPA began in earnest in 2011 when New Jersey voters approved a referendum authorizing sports betting at the state's casinos and racetracks by a margin of nearly two to one. Governor Chris Christie signed legislation in 2012 to implement the voter-approved measure.

The NCAA and the four major professional sports leagues immediately sued, arguing that New Jersey's law violated PASPA. The federal courts agreed, striking down the 2012 law. But New Jersey did not surrender. In 2014, the state adopted a different strategy: rather than affirmatively authorizing sports betting, it simply repealed its existing prohibitions on the activity at casinos and racetracks. The theory was that PASPA prohibited states from "authorizing" sports betting but could not compel states to maintain their own prohibitions — a distinction rooted in the Tenth Amendment's anti-commandeering doctrine.

The leagues sued again. The Third Circuit Court of Appeals ruled against New Jersey once more, but the legal reasoning grew increasingly strained, and dissenting opinions highlighted the constitutional tensions in PASPA's structure.

The Challenge

In 2017, the Supreme Court agreed to hear the case, now styled Murphy v. National Collegiate Athletic Association (Phil Murphy had succeeded Christie as governor). The central question was whether PASPA violated the anti-commandeering principle of the Tenth Amendment by effectively forcing states to maintain laws prohibiting sports betting.

On May 14, 2018, the Supreme Court issued its decision in an opinion written by Justice Samuel Alito. Seven justices agreed that PASPA's ban on state authorization of sports betting violated the anti-commandeering doctrine by dictating what state legislatures could and could not do, and six of them voted to strike down the statute in its entirety rather than sever the offending provision. Justice Alito wrote: "The legalization of sports gambling requires an important policy choice, but the choice is not ours to make. Congress can regulate sports gambling directly, but if it elects not to do so, each State is free to act on its own."

The decision opened the floodgates. Within weeks, New Jersey and Delaware launched legal sports betting operations. Mississippi and West Virginia followed within months. By the end of 2018, eight states had some form of legal sports wagering.

Your challenge: Analyze the state-by-state rollout of legal sports betting following the Murphy decision. Using the provided synthetic dataset, compare projected market sizes against actual revenue figures, identify patterns in adoption speed, and evaluate the economic impact of legalization.

Available Data

The dataset state_betting_revenue.csv contains synthetic data modeled on publicly reported figures from state gaming commissions and the American Gaming Association.

Data Dictionary

Column	Type	Description
`state`	string	U.S. state name
`launch_year`	int	Year legal sports betting launched
`launch_month`	int	Month of launch (1-12)
`population_millions`	float	State population in millions (2020 Census estimates)
`year`	int	Calendar year of revenue observation
`total_handle_millions`	float	Total amount wagered in millions USD
`gross_revenue_millions`	float	Sportsbook gross revenue (handle minus payouts) in millions USD
`tax_revenue_millions`	float	State tax revenue from sports betting in millions USD
`tax_rate_pct`	float	State tax rate on gross gaming revenue (%)
`online_pct`	float	Percentage of handle placed via mobile/online (%)
`num_operators`	int	Number of licensed sportsbook operators
`projected_handle_millions`	float	Pre-launch industry projection of total handle for that year
`projected_revenue_millions`	float	Pre-launch industry projection of gross revenue for that year
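
Before any analysis, it can help to check a loaded file against the data dictionary. The sketch below is one possible way to do that and is not part of the original pipeline: `EXPECTED_COLUMNS`, `validate_schema`, and the 0.02 tolerance on the tax check are illustrative assumptions.

```python
import pandas as pd
from pandas.api import types as pdt

# Expected kind per column, transcribed from the data dictionary.
EXPECTED_COLUMNS = {
    "state": "string",
    "launch_year": "int",
    "launch_month": "int",
    "population_millions": "float",
    "year": "int",
    "total_handle_millions": "float",
    "gross_revenue_millions": "float",
    "tax_revenue_millions": "float",
    "tax_rate_pct": "float",
    "online_pct": "float",
    "num_operators": "int",
    "projected_handle_millions": "float",
    "projected_revenue_millions": "float",
}


def validate_schema(df: pd.DataFrame, tol: float = 0.02) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for col, kind in EXPECTED_COLUMNS.items():
        if col not in df.columns:
            problems.append(f"missing column: {col}")
        elif kind == "int" and not pdt.is_integer_dtype(df[col]):
            problems.append(f"{col}: expected integer values")
        elif kind == "float" and not pdt.is_numeric_dtype(df[col]):
            problems.append(f"{col}: expected numeric values")
        elif kind == "string" and not (
            pdt.is_string_dtype(df[col]) or pdt.is_object_dtype(df[col])
        ):
            problems.append(f"{col}: expected text values")
    if problems:
        return problems  # skip value checks until the structure is right

    if not df["launch_month"].between(1, 12).all():
        problems.append("launch_month outside 1-12")
    if not df["online_pct"].between(0, 100).all():
        problems.append("online_pct outside 0-100")
    if (df["year"] < df["launch_year"]).any():
        problems.append("observation year precedes launch_year")

    # Tax revenue should equal gross revenue times the tax rate,
    # up to rounding (tolerance in millions USD is an assumption).
    implied_tax = df["gross_revenue_millions"] * df["tax_rate_pct"] / 100.0
    if ((df["tax_revenue_millions"] - implied_tax).abs() > tol).any():
        problems.append("tax_revenue_millions inconsistent with revenue x rate")
    return problems
```

A typical call would be `validate_schema(pd.read_csv("state_betting_revenue.csv"))`, treating any non-empty result as a reason to stop and inspect the file.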

Analysis Approach

Phase 1: Data Generation and Exploration

Since we are working with synthetic data for educational purposes, we first construct a realistic dataset, then perform exploratory analysis.

"""
Phase 1: Generate synthetic state sports betting revenue data
and perform exploratory analysis.

This module creates a realistic dataset modeled on publicly available
figures from state gaming commissions, then explores key distributions
and trends in the post-PASPA sports betting landscape.
"""

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

np.random.seed(42)


def generate_state_betting_data() -> pd.DataFrame:
    """
    Generate synthetic state-level sports betting revenue data
    modeled on real-world patterns observed from 2018-2025.

    Returns:
        pd.DataFrame: DataFrame with columns matching the data dictionary,
            containing yearly observations for each state from launch
            through 2025.
    """
    states_info = [
        ("New Jersey", 2018, 6, 9.29, 13.0, 51.0),
        ("Delaware", 2018, 6, 0.99, 50.0, 20.0),
        ("Mississippi", 2018, 8, 2.96, 12.0, 8.0),
        ("West Virginia", 2018, 9, 1.79, 10.0, 15.0),
        ("Pennsylvania", 2018, 11, 13.00, 36.0, 34.0),
        ("Rhode Island", 2018, 11, 1.10, 51.0, 15.0),
        ("New York", 2019, 7, 20.20, 51.0, 10.0),
        ("Iowa", 2019, 8, 3.19, 6.75, 19.0),
        ("Indiana", 2019, 9, 6.83, 9.5, 25.0),
        ("Oregon", 2019, 10, 4.24, 0.0, 15.0),
        ("New Hampshire", 2019, 12, 1.38, 51.0, 50.0),
        ("Illinois", 2020, 3, 12.81, 15.0, 34.0),
        ("Michigan", 2020, 3, 10.08, 8.4, 32.0),
        ("Colorado", 2020, 5, 5.81, 10.0, 25.0),
        ("Tennessee", 2020, 11, 6.98, 20.0, 80.0),
        ("Virginia", 2021, 1, 8.63, 15.0, 37.0),
        ("Arizona", 2021, 9, 7.28, 10.0, 20.0),
        ("Connecticut", 2021, 10, 3.61, 13.75, 18.0),
        ("Maryland", 2021, 12, 6.18, 15.0, 22.0),
        ("Ohio", 2023, 1, 11.80, 10.0, 20.0),
        ("Massachusetts", 2023, 3, 7.03, 20.0, 15.0),
        ("Kentucky", 2023, 9, 4.51, 9.75, 14.0),
        ("North Carolina", 2024, 3, 10.55, 18.0, 12.0),
        ("Vermont", 2024, 1, 0.65, 20.0, 6.0),
    ]

    rows = []
    for state, launch_yr, launch_mo, pop, tax_rate, num_ops in states_info:
        for year in range(launch_yr, 2026):
            # Calculate months of operation in the launch year
            if year == launch_yr:
                months_active = 12 - launch_mo + 1
                activity_fraction = months_active / 12.0
            else:
                activity_fraction = 1.0

            years_since_launch = year - launch_yr

            # Base handle scales with population and maturity
            base_handle_per_capita = 350 + 180 * min(years_since_launch, 4)
            maturity_factor = 1.0 - 0.6 * np.exp(-0.8 * years_since_launch)

            # New York online launch in 2022 caused a massive spike
            ny_online_boost = 1.0
            if state == "New York" and year >= 2022:
                ny_online_boost = 4.5

            handle = (
                pop
                * base_handle_per_capita
                * maturity_factor
                * activity_fraction
                * ny_online_boost
                * np.random.uniform(0.85, 1.15)
            )

            # Hold (gross revenue as a share of handle) is drawn from 6.5-9.5%
            hold_pct = np.random.uniform(0.065, 0.095)
            revenue = handle * hold_pct

            # Tax revenue
            tax_rev = revenue * (tax_rate / 100.0)

            # Online percentage grows over time
            if years_since_launch == 0:
                online_pct = np.random.uniform(30, 55)
            elif years_since_launch == 1:
                online_pct = np.random.uniform(55, 78)
            else:
                online_pct = np.random.uniform(78, 93)

            # Some states are retail-only or had delayed online launch
            if state == "Mississippi":
                online_pct = 0.0  # Retail only
            if state == "New York" and year < 2022:
                online_pct = 0.0  # Online launched Jan 2022

            # Industry projections: conservative (below actual) during each
            # state's first two years, then centered slightly above actual
            # once the market matures
            if years_since_launch <= 1:
                projection_error = np.random.uniform(0.6, 0.85)
            else:
                projection_error = np.random.uniform(0.9, 1.3)
            projected_handle = handle * projection_error
            projected_revenue = revenue * projection_error * np.random.uniform(0.9, 1.1)

            rows.append({
                "state": state,
                "launch_year": launch_yr,
                "launch_month": launch_mo,
                "population_millions": pop,
                "year": year,
                "total_handle_millions": round(handle, 2),
                "gross_revenue_millions": round(revenue, 2),
                "tax_revenue_millions": round(tax_rev, 2),
                "tax_rate_pct": tax_rate,
                "online_pct": round(online_pct, 1),
                # Cast to int so the column matches the data dictionary
                "num_operators": int(num_ops) + min(years_since_launch * 3, 12),
                "projected_handle_millions": round(projected_handle, 2),
                "projected_revenue_millions": round(projected_revenue, 2),
            })

    return pd.DataFrame(rows)


# Generate the dataset
df = generate_state_betting_data()
print(f"Dataset shape: {df.shape}")
print(f"States covered: {df['state'].nunique()}")
print(f"Year range: {df['year'].min()} - {df['year'].max()}")
print("\nFirst few rows:")
print(df.head(10).to_string(index=False))

"""
Exploratory analysis of the generated dataset.
"""


def explore_market_growth(df: pd.DataFrame) -> None:
    """
    Visualize aggregate market growth across all legal states.

    Args:
        df: DataFrame with state-level betting revenue data.
    """
    annual = df.groupby("year").agg(
        total_handle=("total_handle_millions", "sum"),
        total_revenue=("gross_revenue_millions", "sum"),
        total_tax=("tax_revenue_millions", "sum"),
        num_states=("state", "nunique"),
    ).reset_index()

    fig, axes = plt.subplots(1, 3, figsize=(16, 5))

    # Total handle by year
    axes[0].bar(annual["year"], annual["total_handle"] / 1000, color="#2563eb")
    axes[0].set_title("Total U.S. Legal Sports Betting Handle")
    axes[0].set_ylabel("Handle ($ Billions)")
    axes[0].set_xlabel("Year")
    axes[0].yaxis.set_major_formatter(mticker.FormatStrFormatter("$%.0fB"))

    # Gross revenue by year
    axes[1].bar(annual["year"], annual["total_revenue"] / 1000, color="#16a34a")
    axes[1].set_title("Total Gross Sportsbook Revenue")
    axes[1].set_ylabel("Revenue ($ Billions)")
    axes[1].set_xlabel("Year")

    # Number of legal states
    axes[2].plot(
        annual["year"], annual["num_states"],
        marker="o", linewidth=2, color="#dc2626"
    )
    axes[2].set_title("Number of States with Legal Sports Betting")
    axes[2].set_ylabel("Number of States")
    axes[2].set_xlabel("Year")

    plt.tight_layout()
    plt.savefig("market_growth_overview.png", dpi=150, bbox_inches="tight")
    plt.show()

    print("\nAnnual Summary:")
    print(annual.to_string(index=False))


explore_market_growth(df)
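
To see what the generator encodes before noise is applied, the deterministic part of the handle formula can be pulled out on its own. The sketch below mirrors the constants used in `generate_state_betting_data`; the helper name `expected_handle` is illustrative, and it deliberately omits the uniform noise term and the New York multiplier.

```python
import math


def expected_handle(
    population_millions: float, launch_month: int, years_since_launch: int
) -> float:
    """Noise-free handle (millions USD) implied by the generator's formula."""
    # Only the launch year is prorated by months of operation.
    if years_since_launch == 0:
        activity_fraction = (12 - launch_month + 1) / 12.0
    else:
        activity_fraction = 1.0

    # Per-capita base rises by $180 per year and is capped after year 4.
    base_handle_per_capita = 350 + 180 * min(years_since_launch, 4)

    # Maturity factor starts at 0.4 and approaches 1.0 exponentially.
    maturity_factor = 1.0 - 0.6 * math.exp(-0.8 * years_since_launch)

    return (
        population_millions
        * base_handle_per_capita
        * maturity_factor
        * activity_fraction
    )


# New Jersey (population 9.29 million, June 2018 launch), years 0 through 5.
nj_curve = [expected_handle(9.29, 6, k) for k in range(6)]
```

For New Jersey's launch year this works out to 9.29 x 350 x 0.4 x 7/12, or about $759 million, which is the value the random multiplier then perturbs by up to 15% in either direction.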

Phase 2: State-Level Adoption Patterns

"""
Phase 2: Analyze state-level adoption patterns and identify
key factors driving market size variation across states.
"""


def analyze_adoption_patterns(df: pd.DataFrame) -> pd.DataFrame:
    """
    Calculate per-capita metrics for each state in the latest year.

    Args:
        df: DataFrame with state-level betting revenue data.

    Returns:
        pd.DataFrame: The 2025 rows for each state, with per-capita
            handle, per-capita revenue, and years since launch added.
    """
    # Get the most recent full year of data for each state
    latest_year = 2025
    recent = df[df["year"] == latest_year].copy()

    recent["handle_per_capita"] = (
        recent["total_handle_millions"] * 1e6 / (recent["population_millions"] * 1e6)
    )
    recent["revenue_per_capita"] = (
        recent["gross_revenue_millions"] * 1e6 / (recent["population_millions"] * 1e6)
    )
    recent["years_since_launch"] = latest_year - recent["launch_year"]

    print("=== Per-Capita Metrics by State (2025) ===\n")
    display_cols = [
        "state", "population_millions", "years_since_launch",
        "handle_per_capita", "revenue_per_capita", "online_pct"
    ]
    summary = recent[display_cols].sort_values("handle_per_capita", ascending=False)
    print(summary.to_string(index=False))

    return recent


def plot_maturity_curves(df: pd.DataFrame) -> None:
    """
    Plot handle growth trajectories for early-adopter states,
    aligned to years since launch.

    Args:
        df: DataFrame with state-level betting revenue data.
    """
    early_states = ["New Jersey", "Pennsylvania", "Indiana", "Illinois", "Colorado"]

    fig, ax = plt.subplots(figsize=(10, 6))

    for state in early_states:
        state_data = df[df["state"] == state].copy()
        state_data["years_since_launch"] = (
            state_data["year"] - state_data["launch_year"]
        )
        state_data["handle_per_capita"] = (
            state_data["total_handle_millions"] * 1e6
            / (state_data["population_millions"] * 1e6)
        )
        ax.plot(
            state_data["years_since_launch"],
            state_data["handle_per_capita"],
            marker="o", linewidth=2, label=state
        )

    ax.set_xlabel("Years Since Launch")
    ax.set_ylabel("Handle Per Capita ($)")
    ax.set_title("Sports Betting Maturity Curves: Handle Per Capita by Years Since Launch")
    ax.legend()
    ax.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.savefig("maturity_curves.png", dpi=150, bbox_inches="tight")
    plt.show()


state_summary = analyze_adoption_patterns(df)
plot_maturity_curves(df)
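
One caveat when reading the maturity curves: the Year 0 point covers only part of a calendar year, so per-capita handle at launch looks lower than the underlying run rate. A possible adjustment, sketched here under the assumption that handle accrues evenly across active months, scales launch-year handle to a 12-month equivalent. The helper `add_annualized_handle` and the columns it adds are illustrative names.

```python
import pandas as pd


def add_annualized_handle(df: pd.DataFrame) -> pd.DataFrame:
    """Add 12-month-equivalent handle columns, adjusting only launch years."""
    out = df.copy()
    is_launch_year = out["year"] == out["launch_year"]

    # Months of operation: launch month through December in the launch
    # year, a full 12 months in every later year.
    months_active = (12 - out["launch_month"] + 1).where(is_launch_year, 12)
    out["months_active"] = months_active

    # Assumption: handle is spread evenly over the active months.
    out["annualized_handle_millions"] = (
        out["total_handle_millions"] * 12 / months_active
    )
    # Millions of dollars divided by millions of people is dollars per person.
    out["annualized_handle_per_capita"] = (
        out["annualized_handle_millions"] / out["population_millions"]
    )
    return out
```

Plotting `annualized_handle_per_capita` in place of raw per-capita handle makes late-year launches such as Pennsylvania (November) comparable with mid-year launches such as New Jersey (June). The even-accrual assumption ignores seasonality in the sports calendar, so the adjusted Year 0 values are rough.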

Phase 3: Projected vs. Actual Market Size

"""
Phase 3: Compare pre-legalization market projections against
actual revenue figures to assess forecasting accuracy.
"""


def analyze_projection_accuracy(df: pd.DataFrame) -> pd.DataFrame:
    """
    Calculate projection error metrics across states and years.

    Args:
        df: DataFrame with state-level betting revenue data.

    Returns:
        pd.DataFrame: Copy of the input with signed percentage error
            columns for handle and revenue, plus years since launch.
    """
    analysis = df.copy()
    analysis["handle_error_pct"] = (
        (analysis["total_handle_millions"] - analysis["projected_handle_millions"])
        / analysis["projected_handle_millions"]
        * 100
    )
    analysis["revenue_error_pct"] = (
        (analysis["gross_revenue_millions"] - analysis["projected_revenue_millions"])
        / analysis["projected_revenue_millions"]
        * 100
    )
    analysis["years_since_launch"] = analysis["year"] - analysis["launch_year"]

    # Aggregate error by years since launch
    error_by_maturity = analysis.groupby("years_since_launch").agg(
        mean_handle_error=("handle_error_pct", "mean"),
        median_handle_error=("handle_error_pct", "median"),
        mean_revenue_error=("revenue_error_pct", "mean"),
        std_handle_error=("handle_error_pct", "std"),
        n_observations=("state", "count"),
    ).reset_index()

    print("=== Projection Error by Market Maturity ===\n")
    print(error_by_maturity.round(2).to_string(index=False))

    return analysis


def plot_projection_vs_actual(df: pd.DataFrame) -> None:
    """
    Create scatter plot comparing projected vs. actual handle,
    colored by market maturity.

    Args:
        df: DataFrame with state-level betting revenue data.
    """
    analysis = df.copy()
    analysis["years_since_launch"] = analysis["year"] - analysis["launch_year"]

    fig, axes = plt.subplots(1, 2, figsize=(14, 6))

    # Handle comparison
    scatter = axes[0].scatter(
        analysis["projected_handle_millions"],
        analysis["total_handle_millions"],
        c=analysis["years_since_launch"],
        cmap="viridis", alpha=0.7, edgecolors="black", linewidth=0.5
    )
    max_val = max(
        analysis["projected_handle_millions"].max(),
        analysis["total_handle_millions"].max()
    )
    axes[0].plot([0, max_val], [0, max_val], "r--", linewidth=1, label="Perfect prediction")
    axes[0].set_xlabel("Projected Handle ($ Millions)")
    axes[0].set_ylabel("Actual Handle ($ Millions)")
    axes[0].set_title("Projected vs. Actual Handle")
    axes[0].legend()

    plt.colorbar(scatter, ax=axes[0], label="Years Since Launch")

    # Revenue comparison
    scatter2 = axes[1].scatter(
        analysis["projected_revenue_millions"],
        analysis["gross_revenue_millions"],
        c=analysis["years_since_launch"],
        cmap="viridis", alpha=0.7, edgecolors="black", linewidth=0.5
    )
    max_val2 = max(
        analysis["projected_revenue_millions"].max(),
        analysis["gross_revenue_millions"].max()
    )
    axes[1].plot([0, max_val2], [0, max_val2], "r--", linewidth=1, label="Perfect prediction")
    axes[1].set_xlabel("Projected Revenue ($ Millions)")
    axes[1].set_ylabel("Actual Revenue ($ Millions)")
    axes[1].set_title("Projected vs. Actual Revenue")
    axes[1].legend()

    plt.colorbar(scatter2, ax=axes[1], label="Years Since Launch")

    plt.tight_layout()
    plt.savefig("projection_accuracy.png", dpi=150, bbox_inches="tight")
    plt.show()


projection_analysis = analyze_projection_accuracy(df)
plot_projection_vs_actual(df)
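
The grouped means above describe bias but not overall accuracy. A compact summary of both, split into launch-phase and mature observations, might look like the sketch below. `summarize_projection_errors` is an illustrative helper, and the error is measured relative to the projection to match `handle_error_pct`; the conventional MAPE definition divides by the actual value instead.

```python
import numpy as np
import pandas as pd


def summarize_projection_errors(df: pd.DataFrame) -> pd.DataFrame:
    """Accuracy and bias of handle projections by market phase."""
    work = df.copy()
    work["years_since_launch"] = work["year"] - work["launch_year"]
    work["phase"] = np.where(
        work["years_since_launch"] <= 1,
        "launch (years 0-1)",
        "mature (years 2+)",
    )

    # Signed error: positive means actual handle beat the projection.
    work["signed_error_pct"] = (
        (work["total_handle_millions"] - work["projected_handle_millions"])
        / work["projected_handle_millions"]
        * 100
    )
    work["abs_error_pct"] = work["signed_error_pct"].abs()
    work["underestimated"] = work["signed_error_pct"] > 0

    return work.groupby("phase").agg(
        mape=("abs_error_pct", "mean"),            # accuracy
        mean_bias=("signed_error_pct", "mean"),    # direction of error
        share_underestimated=("underestimated", "mean"),
        n_observations=("state", "count"),
    )
```

Calling `summarize_projection_errors(df)` on the generated frame returns one row per phase. A large positive `mean_bias` with `share_underestimated` near 1 indicates systematically conservative forecasts, while a `mean_bias` near zero with a sizeable `mape` indicates noisy but unbiased ones.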

Phase 4: Online vs. Retail and Tax Revenue Impact

"""
Phase 4: Analyze the impact of online/mobile betting on market size
and evaluate tax revenue generation across regulatory frameworks.
"""


def analyze_online_impact(df: pd.DataFrame) -> None:
    """
    Examine the relationship between online betting availability
    and total market size, plus tax revenue implications.

    Args:
        df: DataFrame with state-level betting revenue data.
    """
    latest = df[df["year"] == 2025].copy()

    # Correlation between online percentage and handle per capita
    latest["handle_per_capita"] = (
        latest["total_handle_millions"] * 1e6
        / (latest["population_millions"] * 1e6)
    )

    correlation = latest["online_pct"].corr(latest["handle_per_capita"])
    print(f"Correlation between online % and handle per capita: {correlation:.3f}")

    # Tax efficiency analysis
    latest["effective_tax_per_capita"] = (
        latest["tax_revenue_millions"] * 1e6
        / (latest["population_millions"] * 1e6)
    )
    latest["revenue_yield"] = (
        latest["gross_revenue_millions"] / latest["total_handle_millions"] * 100
    )

    print("\n=== Tax Revenue Efficiency ===\n")
    tax_summary = latest[[
        "state", "tax_rate_pct", "effective_tax_per_capita",
        "revenue_yield", "online_pct"
    ]].sort_values("effective_tax_per_capita", ascending=False)
    print(tax_summary.round(2).to_string(index=False))

    # Total tax revenue generated since legalization
    total_tax = df.groupby("state")["tax_revenue_millions"].sum().sort_values(ascending=False)
    print("\n=== Cumulative Tax Revenue by State ($ Millions) ===\n")
    print(total_tax.round(1).to_string())
    print(f"\nTotal across all states: ${total_tax.sum():,.1f} million")


def plot_tax_analysis(df: pd.DataFrame) -> None:
    """
    Visualize tax rate vs. tax revenue per capita to examine
    whether higher tax rates yield more revenue.

    Args:
        df: DataFrame with state-level betting revenue data.
    """
    latest = df[df["year"] == 2025].copy()
    latest["tax_per_capita"] = (
        latest["tax_revenue_millions"] * 1e6
        / (latest["population_millions"] * 1e6)
    )

    fig, axes = plt.subplots(1, 2, figsize=(14, 6))

    # Tax rate vs tax per capita
    axes[0].scatter(
        latest["tax_rate_pct"], latest["tax_per_capita"],
        s=latest["population_millions"] * 15,
        alpha=0.7, edgecolors="black", linewidth=0.5
    )
    for _, row in latest.iterrows():
        axes[0].annotate(
            row["state"][:4], (row["tax_rate_pct"], row["tax_per_capita"]),
            fontsize=7, ha="center", va="bottom"
        )
    axes[0].set_xlabel("Tax Rate (%)")
    axes[0].set_ylabel("Tax Revenue Per Capita ($)")
    axes[0].set_title("Tax Rate vs. Per-Capita Tax Revenue\n(bubble size = population)")

    # Online % vs handle per capita
    latest["handle_per_capita"] = (
        latest["total_handle_millions"] * 1e6
        / (latest["population_millions"] * 1e6)
    )
    axes[1].scatter(
        latest["online_pct"], latest["handle_per_capita"],
        s=80, alpha=0.7, color="#2563eb", edgecolors="black", linewidth=0.5
    )
    for _, row in latest.iterrows():
        axes[1].annotate(
            row["state"][:4], (row["online_pct"], row["handle_per_capita"]),
            fontsize=7, ha="center", va="bottom"
        )
    axes[1].set_xlabel("Online Betting Share (%)")
    axes[1].set_ylabel("Handle Per Capita ($)")
    axes[1].set_title("Mobile/Online Share vs. Total Handle Per Capita")

    plt.tight_layout()
    plt.savefig("tax_online_analysis.png", dpi=150, bbox_inches="tight")
    plt.show()


analyze_online_impact(df)
plot_tax_analysis(df)
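
Per-capita tax revenue can be read as the product of three drivers: how much residents wager, how much of that the sportsbooks keep, and how much of the kept amount the state takes. The sketch below makes that decomposition explicit; `decompose_tax_per_capita` and its output column names are illustrative, not part of the original analysis.

```python
import pandas as pd


def decompose_tax_per_capita(df: pd.DataFrame) -> pd.DataFrame:
    """Split per-capita tax revenue into handle, hold, and tax-rate drivers."""
    out = df[["state", "year"]].copy()

    # Dollars wagered per resident (millions USD / millions of people).
    out["handle_per_capita"] = (
        df["total_handle_millions"] / df["population_millions"]
    )
    # Hold: share of handle retained by sportsbooks as gross revenue.
    out["hold_rate"] = (
        df["gross_revenue_millions"] / df["total_handle_millions"]
    )
    # Statutory rate applied to gross revenue, as a fraction.
    out["tax_rate"] = df["tax_rate_pct"] / 100.0

    # Observed value and the product of the three drivers; the two should
    # agree up to rounding in the source columns.
    out["tax_per_capita"] = (
        df["tax_revenue_millions"] / df["population_millions"]
    )
    out["implied_tax_per_capita"] = (
        out["handle_per_capita"] * out["hold_rate"] * out["tax_rate"]
    )
    return out
```

Comparing two states column by column shows whether a gap in per-capita tax revenue comes from market size, from hold, or simply from the statutory rate.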

Results Summary

Key Finding 1: Market Growth Exceeded Most Projections

The U.S. legal sports betting market grew from approximately $6 billion in total handle in its first partial year (2018) to over $120 billion by 2025. This growth trajectory outpaced the majority of pre-legalization forecasts, particularly in the first two years of each state's market. Industry analysts and state budget offices systematically underestimated consumer demand during the launch phase: in the synthetic data, actual handle exceeded projections by roughly 20-65% in each state's first two years, or about 40% on average.
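
The launch-phase bias follows directly from how the synthetic projections are built: in each market's first two years the generator sets the projection to actual handle times a factor drawn uniformly from [0.6, 0.85]. The short calculation below converts that factor into the error metric used in Phase 3. The helper names `overshoot_pct` and `mean_overshoot_pct` are illustrative.

```python
import math


def overshoot_pct(projection_factor: float) -> float:
    """Percent by which actual exceeds projection when
    projection = actual * projection_factor."""
    return (1.0 / projection_factor - 1.0) * 100.0


def mean_overshoot_pct(low: float, high: float) -> float:
    """Expected overshoot when the factor is uniform on [low, high].

    Uses E[1/f] = ln(high / low) / (high - low) for a uniform f.
    """
    return (math.log(high / low) / (high - low) - 1.0) * 100.0


smallest = overshoot_pct(0.85)            # most accurate launch forecast
largest = overshoot_pct(0.60)             # least accurate launch forecast
average = mean_overshoot_pct(0.60, 0.85)  # expected value across draws
```

Under these generator settings the overshoot ranges from roughly 18% to 67%, with an expected value near 39%. The bias is therefore a property built into the synthetic data, and the analysis recovers it rather than discovering it.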

Key Finding 2: Online/Mobile Betting Is the Dominant Channel

States that authorized online and mobile betting saw dramatically higher per-capita handle than retail-only states. By 2025, mobile wagering accounted for roughly 78-93% of legal handle in states with mature online markets. Mississippi, the only state in our dataset that remained retail-only, consistently posted the lowest per-capita handle figures. This finding has significant implications for states still considering legalization: a retail-only framework captures only a fraction of potential market activity.

Key Finding 3: The Tax Rate Paradox

The relationship between tax rate and per-capita tax revenue is not linear. States with moderate tax rates (10-20%) tended to attract more operators and generate larger overall markets, in some cases producing more total tax revenue than states with higher rates. New York, with its 51% online tax rate, generated substantial tax revenue due to its enormous population, but the high rate compressed operator margins and limited promotional spending that might otherwise have expanded the market further. Tennessee's 20% rate on a purely online market also delivered strong per-capita tax returns.
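
One way to probe this claim is to group states into tax-rate bands and compare typical outcomes within each band. The sketch below uses the 10-20% range from the paragraph above as the "moderate" band; the function name `summarize_by_tax_band`, the band labels, and the choice of medians are illustrative assumptions.

```python
import numpy as np
import pandas as pd


def summarize_by_tax_band(df: pd.DataFrame, year: int = 2025) -> pd.DataFrame:
    """Compare per-capita outcomes across low, moderate, and high tax bands."""
    latest = df[df["year"] == year].copy()
    latest["tax_per_capita"] = (
        latest["tax_revenue_millions"] / latest["population_millions"]
    )
    latest["handle_per_capita"] = (
        latest["total_handle_millions"] / latest["population_millions"]
    )

    # Conditions are checked in order: below 10% is low, 10% to 20%
    # inclusive is moderate, everything else is high.
    rate = latest["tax_rate_pct"]
    latest["tax_band"] = np.select(
        [rate < 10, rate <= 20],
        ["low (<10%)", "moderate (10-20%)"],
        default="high (>20%)",
    )

    return latest.groupby("tax_band").agg(
        n_states=("state", "count"),
        median_tax_per_capita=("tax_per_capita", "median"),
        median_handle_per_capita=("handle_per_capita", "median"),
        mean_operators=("num_operators", "mean"),
    )
```

Medians are used so that a single outlier such as New York does not dominate its band. With only a handful of states per band, differences should be read as suggestive rather than conclusive.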

Key Finding 4: Market Maturity Follows a Predictable Curve

States typically reach market maturity (measured by per-capita handle stabilization) within three to four years of launch. The maturity curve follows a rough S-shape: rapid growth in Year 1 as pent-up demand is released, continued strong growth in Year 2 driven by marketing and promotional spending by operators, deceleration in Year 3, and relative stabilization by Year 4. This pattern was remarkably consistent across states with different population sizes and regulatory frameworks.
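
"Stabilization" can be made operational by looking for the first year in which per-capita handle grows by less than some threshold. The sketch below uses a 10% year-over-year threshold, which is an assumption rather than a figure from the analysis, and `years_to_maturity` is an illustrative helper name.

```python
import pandas as pd


def years_to_maturity(
    df: pd.DataFrame, growth_threshold: float = 0.10
) -> pd.Series:
    """First year since launch in which per-capita handle growth falls
    below the threshold, for each state that reaches that point."""
    work = df.sort_values(["state", "year"]).copy()
    work["years_since_launch"] = work["year"] - work["launch_year"]
    work["handle_per_capita"] = (
        work["total_handle_millions"] / work["population_millions"]
    )

    # Year-over-year growth within each state.
    previous = work.groupby("state")["handle_per_capita"].shift(1)
    work["yoy_growth"] = work["handle_per_capita"] / previous - 1.0

    # Ignore the first comparison: its base is a partial launch year,
    # which inflates measured growth.
    stable = work[
        (work["years_since_launch"] >= 2)
        & (work["yoy_growth"].abs() < growth_threshold)
    ]
    return stable.groupby("state")["years_since_launch"].min()
```

States that never drop below the threshold are simply absent from the result. Because the synthetic generator applies up to 15% random noise to each observation, a single quiet year can be a fluke, so a stricter version might require two consecutive years under the threshold.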

Limitations

Synthetic data: The dataset used in this analysis is synthetic, generated to approximate real patterns. Actual state revenue figures, available from individual state gaming commissions and the American Gaming Association's annual reports, should be consulted for any policy or investment analysis.
Confounding variables: Our analysis does not control for numerous factors that affect market size, including proximity to state borders (cross-border competition), the presence of professional sports teams, existing gambling culture, marketing spend by operators, and macroeconomic conditions.
Incomplete state coverage: Several states with legal sports betting are omitted from the dataset for simplicity. As of early 2026, more than 35 states plus the District of Columbia have authorized some form of sports betting.
Projection methodology opacity: The "projected" figures in our dataset are synthetic estimates of what pre-launch forecasts looked like. Actual projections varied widely depending on the source (state budget offices, industry analysts, academic researchers) and their assumptions.
Temporal effects: The COVID-19 pandemic significantly affected both the sports calendar and betting patterns in 2020-2021, introducing disruptions that our synthetic data generation process handles only roughly.

Discussion Questions

Regulatory design trade-offs: New York chose a 51% tax rate on mobile sports betting revenue, tied with New Hampshire and Rhode Island for the highest in the nation at the time. New Jersey's online rate was 13%. What are the short-term and long-term trade-offs of each approach? Which framework is likely to maximize total economic welfare (including consumer surplus, operator profitability, and tax revenue)?
Market integrity: One of the original arguments for PASPA was that legal sports betting would threaten the integrity of athletic competitions. Now that we have several years of data from legalized markets, what evidence exists regarding integrity risks? Has the presence of licensed, regulated sportsbooks made it easier or harder to detect suspicious betting activity?
Substitution vs. expansion: To what extent did legal sports betting cannibalize existing forms of gambling (casino slots, lottery tickets, poker) versus expand the total gambling market? What data would you need to answer this question rigorously?
Projection methodology: Why did pre-legalization market forecasts tend to underestimate early demand? What behavioral and economic factors might explain the systematic bias? How would you design a better forecasting model for a state that has not yet legalized?
Responsible gambling: The rapid expansion of mobile betting has raised concerns about problem gambling, particularly among young adults. What data from our analysis might inform responsible gambling policy? What additional data would you want to collect?
Federal vs. state regulation: The Murphy decision left sports betting regulation entirely to the states, creating a patchwork of different rules, tax rates, and consumer protections. Make an argument for or against a federal regulatory framework that would standardize rules across states. What would the data-driven case look like?

Your Turn: Mini-Project

Project: Build a State Legalization Impact Model

Using the code and data generation framework provided in this case study, complete the following:

Extend the dataset: Add five additional states to the synthetic data generator (choose real states that have legalized sports betting). Research their actual launch dates, populations, tax rates, and approximate number of operators. Generate realistic synthetic revenue data for them.
Build a regression model: Using scikit-learn, build a linear regression model that predicts a state's Year 2 total handle using the following features: population, tax rate, number of operators, online percentage, and whether the state launched with online betting on Day 1. Report the R-squared, coefficients, and interpret the results.
Create a "what-if" simulator: Write a function simulate_state_launch(population, tax_rate, online_enabled, num_operators) that takes state characteristics as inputs and outputs projected handle and revenue for Years 1 through 5. Use the patterns you observed in the analysis to calibrate your projections.
Visualization deliverable: Create a single summary dashboard (using matplotlib subplots) with four panels: (a) the national market growth bar chart, (b) the maturity curve comparison, (c) the projection accuracy scatter plot, and (d) your regression model's predicted vs. actual values. Save it as a single PNG file.
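
As a possible starting point for the regression task, the sketch below assembles a one-row-per-state modeling table and fits ordinary least squares with NumPy, which gives a dependency-light baseline to compare against scikit-learn's `LinearRegression`. Several choices here are assumptions: "Year 2" is taken to mean `years_since_launch == 2`, a state counts as launching with online betting if its launch-year `online_pct` is above zero, and the names `FEATURES`, `build_year2_table`, and `fit_ols` are illustrative.

```python
import numpy as np
import pandas as pd

FEATURES = [
    "population_millions",
    "tax_rate_pct",
    "num_operators",
    "online_pct",
    "online_at_launch",
]


def build_year2_table(df: pd.DataFrame) -> pd.DataFrame:
    """One row per state: Year 2 handle plus the five model features."""
    work = df.copy()
    work["years_since_launch"] = work["year"] - work["launch_year"]
    launch = work[work["years_since_launch"] == 0].set_index("state")
    year2 = work[work["years_since_launch"] == 2].set_index("state").copy()

    # Assumption: any online handle in the launch year means online
    # betting was available from the start.
    launched_online = launch["online_pct"].reindex(year2.index) > 0
    year2["online_at_launch"] = launched_online.astype(int)
    return year2[FEATURES + ["total_handle_millions"]]


def fit_ols(table: pd.DataFrame):
    """Least-squares fit with an intercept; returns (coefficients, R-squared)."""
    X = np.column_stack(
        [np.ones(len(table)), table[FEATURES].to_numpy(dtype=float)]
    )
    y = table["total_handle_millions"].to_numpy(dtype=float)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

    fitted = X @ coefs
    ss_res = float(((y - fitted) ** 2).sum())
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r_squared = 1.0 - ss_res / ss_tot
    return pd.Series(coefs, index=["intercept"] + FEATURES), r_squared
```

With roughly two dozen states and five features, the fit has few degrees of freedom, so the coefficients are best read as descriptive. States that launched after 2023 have no Year 2 observation and drop out of the table automatically.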

Stretch goal: Scrape actual revenue data from two state gaming commission websites using requests and BeautifulSoup, and compare your synthetic data against reality. How close was the synthetic generation approach?