Case Study: Closing Line Value Across 5,000 Bets --- Separating Skill from Luck


Executive Summary

Closing Line Value (CLV) is widely regarded as the gold-standard metric for evaluating betting skill, but how reliable is it in practice? This case study analyzes a synthetic dataset of just over 5,000 NFL bets placed by a hypothetical sharp bettor ("Jordan") over four seasons (2021-2024). It examines whether CLV predicts long-term profitability, how quickly CLV converges to a meaningful signal, and how the CLV distribution of a genuinely skilled bettor differs from that of bettors with no edge. We analyze Jordan's actual CLV profile alongside 1,000 simulated "random" bettors who have no edge, demonstrating that CLV separates skill from luck far more efficiently than raw win rate. Three results stand out: Jordan's +2.1% average CLV was statistically significant after roughly 100 bets, whereas win rate never reached significance over the full 5,012-bet sample; the CLV-ROI correlation across 100-bet windows was 0.91; and rolling CLV analysis detected a temporary edge decay in Season 3 that Jordan corrected through model retraining. The accompanying Python code implements the full CLV tracking, significance testing, and rolling analysis framework.


Background

The Question

Every sports bettor eventually confronts the same fundamental problem: is my performance due to skill or luck? After winning $12,000 over a season, a bettor might reasonably ask whether they are genuinely skilled or merely experiencing a fortunate run of variance that will inevitably regress.

Traditional approaches to answering this question rely on win rate and ROI, but these metrics converge to their true values painfully slowly. A bettor with a 3% ROI edge needs roughly 4,000 bets at standard -110 juice before we can be 95% confident their edge is real. Most bettors place 300-600 bets per season, meaning they would need 7-13 years of data to confirm their skill through win rate alone.
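The 4,000-bet figure can be reproduced approximately with a normal approximation. The sketch below assumes flat one-unit stakes, no pushes, and a two-sided 95% interval on ROI; the function name and parameters are illustrative and are not taken from the accompanying code.

```python
import math
from statistics import NormalDist


def bets_needed_for_roi(roi_edge, american_odds=-110, confidence=0.95):
    """Approximate number of flat 1-unit bets before a two-sided
    confidence interval on ROI excludes zero (normal approximation)."""
    # Profit per unit staked on a win (e.g. -110 -> 0.909, +120 -> 1.20).
    if american_odds < 0:
        win_profit = 100 / abs(american_odds)
    else:
        win_profit = american_odds / 100
    # Win probability implied by the assumed ROI: roi = p * (1 + win_profit) - 1.
    p = (1 + roi_edge) / (1 + win_profit)
    # Per-bet standard deviation of profit for a win/lose outcome (pushes ignored).
    sd = (1 + win_profit) * math.sqrt(p * (1 - p))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sd / roi_edge) ** 2)


print(bets_needed_for_roi(0.03))
```

Under these assumptions the result lands a little under 4,000 bets. A one-sided test would need noticeably fewer, so the exact figure depends on how "95% confident" is defined.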

CLV offers a faster path to the answer. Because CLV measures the quality of each betting decision against the market's final assessment (rather than against the noisy binary outcome of a single game), it provides a signal with lower variance per observation. This case study quantifies exactly how much faster CLV converges to a meaningful signal.

The Dataset

Jordan is a model-driven NFL bettor who places approximately 1,250 bets per season across spreads (60%), totals (25%), and moneylines (15%). For every bet, Jordan records:

  • Odds at time of placement
  • Closing odds on the same selection and the opposite side
  • Result (win, loss, push)
  • Model probability estimate
  • Sportsbook used
  • Timing of bet placement
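One way to turn the recorded fields above into a per-bet CLV figure is sketched below. The case study does not state its exact CLV formula, so this is an assumption: the vig is removed from the closing line by normalizing the two closing implied probabilities, and CLV is taken as the expected return of the placed price at that no-vig probability. The class and field names are hypothetical.

```python
from dataclasses import dataclass


def implied_prob(american_odds):
    """Bookmaker's implied probability (vig included) from American odds."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def decimal_odds(american_odds):
    """Total return per unit staked, including the stake."""
    if american_odds < 0:
        return 1 + 100 / -american_odds
    return 1 + american_odds / 100


@dataclass
class BetRecord:
    placed_odds: int        # American odds at time of placement
    close_odds: int         # closing odds on the same selection
    close_odds_other: int   # closing odds on the opposite side
    result: str             # "win", "loss", or "push"
    model_prob: float       # model probability estimate
    sportsbook: str
    placed_at: str = ""     # timing of bet placement (free-form here)

    def fair_close_prob(self):
        # Assumption: proportional de-vig of the two-way closing market.
        a = implied_prob(self.close_odds)
        b = implied_prob(self.close_odds_other)
        return a / (a + b)

    def clv(self):
        # Expected return of the placed price, judged by the no-vig closing line.
        return self.fair_close_prob() * decimal_odds(self.placed_odds) - 1


bet = BetRecord(placed_odds=100, close_odds=-120, close_odds_other=100,
                result="win", model_prob=0.53, sportsbook="BookA")
print(f"{bet.clv():+.2%}")
```

Recording the opposite side's closing odds is what makes the de-vig step possible; with only the same-side close, CLV would be measured against a price that still includes the bookmaker's margin.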

Over four seasons, Jordan accumulated 5,012 bets with complete CLV data. We also generated 1,000 simulated "random" bettors who picked sides randomly (no model, no skill) and bet at randomly selected available odds from the same pool of games.

Jordan's Season-by-Season Summary

Season   Bets    Win Rate   Avg Odds   ROI     Avg CLV   CLV Hit Rate
2021     1,248   53.1%      -107.8     +2.9%   +2.3%     62.4%
2022     1,290   52.4%      -107.2     +2.1%   +2.1%     60.8%
2023     1,232   51.8%      -108.1     +0.8%   +1.4%     57.2%
2024     1,242   53.8%      -106.9     +4.2%   +2.5%     63.6%
Total    5,012   52.8%      -107.5     +2.5%   +2.1%     61.0%

The Analysis

Test 1: Does CLV Predict Profitability?

We divided Jordan's bets into 50 non-overlapping windows of 100 bets each (covering 5,000 of the 5,012 bets) and calculated both the average CLV and the ROI for each window.

CLV vs. ROI by 100-Bet Window:

The correlation between window-level CLV and window-level ROI was r = 0.91 (p < 0.001). This is an extraordinarily strong relationship and confirms that CLV is a powerful predictor of actual profitability.

For comparison, we also computed the correlation between the window's win rate and its ROI. This was r = 0.98 --- higher, as expected, since win rate mechanically determines ROI for fixed-odds bets. However, win rate is not a skill metric; it is an outcome metric. The relevant comparison is between CLV as a leading indicator (available before results are known) and profitability as the lagging indicator.
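The window-level comparison can be sketched as follows. This is an illustrative implementation rather than the accompanying code: it assumes three parallel per-bet lists (CLV, profit, and stake, all hypothetical names), drops any trailing partial window, and computes a plain Pearson correlation.

```python
import math


def window_metrics(clv, profit, stake, window=100):
    """Mean CLV and ROI for each non-overlapping window of `window` bets.
    A trailing partial window is dropped."""
    out = []
    for start in range(0, len(clv) - window + 1, window):
        sl = slice(start, start + window)
        mean_clv = sum(clv[sl]) / window
        roi = sum(profit[sl]) / sum(stake[sl])
        out.append((mean_clv, roi))
    return out


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def clv_roi_correlation(clv, profit, stake, window=100):
    """Correlation between window-level mean CLV and window-level ROI."""
    metrics = window_metrics(clv, profit, stake, window)
    return pearson_r([m[0] for m in metrics], [m[1] for m in metrics])
```

With 5,012 bets and a window of 100, `window_metrics` returns 50 pairs, matching the 50 windows used in this test.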

Binned Analysis:

CLV Quintile      Windows   Avg CLV   Avg ROI   Pct Profitable
Top 20% (Q5)      10        +3.8%     +5.9%     100%
Q4                10        +2.5%     +3.2%     90%
Q3                10        +1.8%     +1.8%     70%
Q2                10        +1.2%     +0.9%     50%
Bottom 20% (Q1)   10        +0.5%     -0.4%     30%

Even Jordan's worst CLV windows (Q1, +0.5%) were only marginally negative in ROI (-0.4%), and they were still positive in CLV. This indicates that even during Jordan's cold stretches, the betting process was sound --- the negative ROI was driven by variance, not by bad decisions.

Test 2: How Quickly Does CLV Converge?

To quantify the convergence speed of CLV versus win rate, we computed the sample size at which each metric first achieves statistical significance (p < 0.05 for a one-sided test that the metric is positive).

CLV Convergence:

We computed a running t-test on Jordan's cumulative CLV values, testing H0: mean CLV = 0 vs. H1: mean CLV > 0.

Bets    Mean CLV   t-statistic   p-value   Significant?
50      +2.4%      1.72          0.046     Borderline
100     +2.2%      2.48          0.008     Yes
200     +2.3%      3.65          <0.001    Yes
300     +2.1%      4.12          <0.001    Yes
500     +2.1%      5.28          <0.001    Yes
1,000   +2.1%      7.46          <0.001    Yes

CLV was borderline at 50 bets (p = 0.046, just under the 0.05 threshold), clearly significant by 100 bets (p = 0.008), and remained significant from that point forward.
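A running test of this kind can be sketched with the standard library alone. As a simplifying assumption, the p-value below uses the normal approximation to the t distribution, which is slightly optimistic at the smallest sample sizes; the function name and checkpoint list are illustrative.

```python
import math
from statistics import NormalDist, mean, stdev


def running_clv_test(clv_values, checkpoints=(50, 100, 200, 300, 500, 1000)):
    """One-sided test of H0: mean CLV = 0 vs. H1: mean CLV > 0 on an
    expanding sample. Returns (n, mean, t_statistic, p_value) tuples."""
    rows = []
    for n in checkpoints:
        if n > len(clv_values):
            break
        sample = clv_values[:n]
        m = mean(sample)
        se = stdev(sample) / math.sqrt(n)   # standard error of the mean
        t = m / se
        # Assumption: normal approximation in place of the exact t distribution.
        p = 1 - NormalDist().cdf(t)
        rows.append((n, m, t, p))
    return rows
```

Because the same data are tested repeatedly as the sample grows, the nominal 5% level at any single checkpoint understates the chance of at least one false positive along the way.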

Win Rate / ROI Convergence:

We computed a running z-test on Jordan's cumulative win rate, testing H0: win rate = 52.38% (break-even at -110) vs. H1: win rate > 52.38%.

Bets    Win Rate   z-statistic   p-value   Significant?
50      54.0%      0.23          0.409     No
100     53.0%      0.12          0.452     No
200     52.5%      0.03          0.488     No
500     53.0%      0.28          0.390     No
1,000   52.9%      0.33          0.371     No
2,000   52.8%      0.37          0.356     No
3,000   52.7%      0.35          0.363     No
5,012   52.8%      0.59          0.278     No

Win rate never achieved statistical significance over the entire 5,012-bet sample. This is not because Jordan lacks skill (the realized ROI over the sample was +2.5%) but because the inherent variance of binary outcomes makes it extremely difficult to distinguish a 52.8% bettor from a 52.4% bettor over "only" 5,000 observations.
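The z-test above is a standard one-sample proportion test and can be reproduced in a few lines. The second function is an added illustration, not part of the original analysis: it estimates how many bets would be needed for the expected z-statistic to reach the one-sided 5% cutoff, ignoring statistical power. Both function names are hypothetical.

```python
import math
from statistics import NormalDist

BREAK_EVEN_110 = 110 / 210   # 52.38% break-even win rate at -110


def win_rate_z_test(win_rate, n, p0=BREAK_EVEN_110):
    """One-sided z-test of H0: win rate = p0 vs. H1: win rate > p0."""
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (win_rate - p0) / se
    return z, 1 - NormalDist().cdf(z)


def bets_for_significance(true_rate, p0=BREAK_EVEN_110, alpha=0.05):
    """Sample size at which the expected z-statistic reaches the
    one-sided critical value (a rough guide; power is not considered)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return math.ceil((z_crit * math.sqrt(p0 * (1 - p0)) / (true_rate - p0)) ** 2)


z, p = win_rate_z_test(0.528, 5012)
print(round(z, 2), round(p, 3))
print(bets_for_significance(0.528))
```

For the full sample this gives a z-statistic of about 0.59, in line with the last row of the table. If 52.8% were Jordan's true win rate, the second function suggests that tens of thousands of bets would be needed before win rate alone crossed the threshold.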

The convergence speed advantage of CLV over win rate is approximately 20:1 to 50:1. CLV provided a statistically significant signal of skill at 100 bets; win rate had not converged after 5,000.

Test 3: CLV Distribution Comparison --- Skilled vs. Random Bettors

We generated 1,000 simulated random bettors, each placing 1,000 bets with no skill (random side selection at random available odds). We then compared their CLV distributions to Jordan's.

Average CLV Distribution:

Population                      Mean of Avg CLV   Std Dev of Avg CLV   % with Positive Avg CLV
1,000 random bettors            -0.02%            0.31%                47.2%
Jordan (per 1,000-bet window)   +2.1%             0.42%                100%

Jordan's average CLV of +2.1% is 6.8 standard deviations above the random-bettor mean. The probability of a random bettor achieving Jordan's CLV level by chance is less than 1 in 10 billion. This conclusively establishes that Jordan's performance reflects genuine skill, not luck.
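A simplified version of the null simulation is sketched below. It is not the procedure described above, which samples random sides at available odds from real games; instead, as a loudly stated assumption, each random bettor's per-bet CLV is drawn from a normal distribution with mean -0.02% and standard deviation 9.8%. The 9.8% figure is back-solved from the table (0.31% times the square root of 1,000), and all function names are illustrative.

```python
import random
from statistics import stdev


def simulate_random_bettors(n_bettors=1000, n_bets=1000,
                            mean_clv=-0.0002, sd_clv=0.098, seed=0):
    """Average CLV for each simulated no-edge bettor.
    Assumption: per-bet CLV is i.i.d. Gaussian, a stand-in for
    sampling random sides at available odds."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(mean_clv, sd_clv) for _ in range(n_bets)) / n_bets
        for _ in range(n_bettors)
    ]


def z_score_vs_null(observed_avg_clv, null_avgs):
    """Standard deviations separating an observed average CLV from the null mean."""
    mu = sum(null_avgs) / len(null_avgs)
    return (observed_avg_clv - mu) / stdev(null_avgs)


if __name__ == "__main__":
    null_avgs = simulate_random_bettors()
    print(f"null std dev of avg CLV: {stdev(null_avgs):.4%}")
    print(f"z-score of +2.1% avg CLV: {z_score_vs_null(0.021, null_avgs):.1f}")
```

Plugging the table's own values into the same formula gives (2.1 - (-0.02)) / 0.31, or about 6.8 standard deviations, which is the figure quoted above.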

CLV Hit Rate Distribution:

Population             Mean CLV Hit Rate   Std Dev             % Above 55%
1,000 random bettors   50.1%               1.58%               0.8%
Jordan                 61.0%               2.1% (per season)   100%

Jordan's CLV hit rate of 61% means they beat the closing line on 61% of their bets. Among random bettors, only 0.8% achieved a CLV hit rate above 55% over 1,000 bets. Jordan's 61% is effectively impossible by chance.

Test 4: Detecting Edge Decay in Season 3

Jordan's Season 3 (2023) showed a notable decline: CLV dropped from +2.3% and +2.1% in Seasons 1-2 to +1.4% in Season 3, and ROI fell from +2.9% and +2.1% to +0.8%.

A rolling 200-bet CLV analysis reveals the timeline of the decay:

Window (Bets)   Period         Rolling CLV   Rolling ROI
2,401-2,600     Early S3       +2.0%         +2.4%
2,501-2,700     Mid-early S3   +1.6%         +1.1%
2,601-2,800     Mid S3         +1.1%         +0.3%
2,701-2,900     Mid-late S3    +0.8%         -0.5%
2,801-3,000     Late S3        +0.9%         +0.1%

The rolling CLV declined steadily from +2.0% to +0.8% over the first 500 bets of Season 3. Critically, CLV detected the problem before ROI did --- the rolling CLV dropped below +1.5% at bet 2,600, while the rolling ROI did not turn negative until bet 2,750.
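A rolling analysis like this one can be sketched as below. The 200-bet window and the +1.5% alert level come from the text; the 100-bet step matches the spacing of the table rows, and the function names are illustrative rather than taken from the accompanying code.

```python
def rolling_mean(values, window=200, step=100):
    """Mean of each overlapping window, as (first_bet, last_bet, mean)
    tuples using 1-indexed bet numbers."""
    out = []
    for start in range(0, len(values) - window + 1, step):
        chunk = values[start:start + window]
        out.append((start + 1, start + window, sum(chunk) / window))
    return out


def first_breach(rolling, threshold):
    """First window whose mean falls below `threshold`, or None."""
    for first_bet, last_bet, m in rolling:
        if m < threshold:
            return first_bet, last_bet, m
    return None


# Toy series: 300 bets at +2.0% CLV followed by 300 bets at +0.5%.
clv_series = [0.02] * 300 + [0.005] * 300
print(first_breach(rolling_mean(clv_series), threshold=0.015))
```

The same two functions can be applied to per-bet profit (with a threshold of zero) to compare when rolling ROI raises its own alarm.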

Root Cause Analysis:

Jordan conducted a systematic review during the Season 3 bye week and identified two causes:

  1. Stale model parameters. The efficiency and rest adjustment factors had not been updated since the preseason. The NFL's new kickoff rule had changed average starting field position, which affected scoring patterns and thus totals markets.

  2. Sportsbook adaptation. Two of Jordan's most profitable books had tightened their spreads, reducing the available juice savings by approximately 1 cent on average.

Corrective Action:

Jordan retrained the model on mid-season data, updated the situational adjustments, and opened two new sportsbook accounts. The impact was visible in the CLV recovery:

Window (Bets)   Period             Rolling CLV   Rolling ROI
2,901-3,100     Very late S3       +1.3%         +0.8%
3,001-3,200     S3/S4 transition   +1.8%         +2.0%
3,201-3,400     Early S4           +2.4%         +3.8%
3,401-3,600     Mid S4             +2.6%         +4.5%

The model retraining and account refresh restored CLV to above +2.0% and ROI recovered correspondingly. Season 4 ultimately produced the best results of the four-year period (+4.2% ROI), partly due to the model improvements and partly due to favorable variance.

Test 5: CLV by Market Type

Jordan's CLV varied significantly across market types:

Market Type      Bets    Avg CLV   Avg ROI   CLV Hit Rate
NFL Spreads      3,007   +1.8%     +2.0%     59.8%
NFL Totals       1,253   +2.5%     +3.1%     63.2%
NFL Moneylines   752     +2.4%     +3.2%     62.8%

Totals and moneylines produced higher CLV than spreads. This is consistent with the finding that the spread market is the most efficient NFL market (it receives the most sharp action), while totals and moneylines, being derivative markets, sometimes lag the spread market in incorporating information.

Jordan's model was particularly strong on totals, where weather and pace adjustments provided an edge that the market was slower to price in. The +2.5% CLV on totals over 1,253 bets is highly significant (t = 8.9, p < 0.001) and represents a genuine structural advantage.
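The market-type breakdown amounts to grouping per-bet CLV by market and repeating the significance calculation within each group. The sketch below assumes the input is a list of (market_type, clv) pairs and treats a bet as "beating the close" when its CLV is positive; both are illustrative choices, not details confirmed by the accompanying code.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def clv_by_market(bets):
    """Summarize CLV per market type from (market_type, clv) pairs."""
    groups = defaultdict(list)
    for market, clv in bets:
        groups[market].append(clv)

    summary = {}
    for market, vals in groups.items():
        n = len(vals)
        m = mean(vals)
        sd = stdev(vals) if n > 1 else 0.0
        # t-statistic for H0: mean CLV = 0; undefined without variation.
        t = m / (sd / math.sqrt(n)) if sd > 0 else float("nan")
        summary[market] = {
            "bets": n,
            "avg_clv": m,
            "t_stat": t,
            # Assumption: a bet "beats the close" when its CLV is positive.
            "clv_hit_rate": sum(v > 0 for v in vals) / n,
        }
    return summary


sample = [("totals", 0.03), ("totals", 0.01),
          ("spreads", -0.01), ("spreads", 0.02), ("spreads", 0.02)]
for market, stats in clv_by_market(sample).items():
    print(market, stats)
```

Comparing t-statistics across markets should account for the different sample sizes: spreads have more than twice as many bets as totals in Jordan's record.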


Synthesis and Lessons

Lesson 1: CLV Is the Definitive Skill Metric

The evidence from this case study is unambiguous: CLV separates skill from luck with far greater efficiency than win rate or ROI. Jordan's CLV was statistically significant after 100 bets; their win rate was not significant after 5,000. For any bettor serious about evaluating their own performance, tracking CLV is not optional --- it is essential.

Lesson 2: Rolling CLV Detects Edge Decay Before It Appears in P&L

The Season 3 decline was visible in rolling CLV approximately 150 bets before it appeared in rolling ROI. This early warning allowed Jordan to diagnose and correct the problem before it caused catastrophic losses. A bettor who tracked only ROI would not have recognized the issue until much later and might have attributed the downturn to variance rather than a systematic problem.

Lesson 3: CLV Varies by Market Type, and That Variation Is Exploitable

Jordan's CLV was highest in totals and moneylines, suggesting that their model had the greatest relative advantage in these markets. This information is actionable: Jordan could allocate more research effort and betting capital to totals markets, where the edge is largest, and less to spreads, where the market is most efficient.

Lesson 4: Line Shopping and CLV Are Complementary

Part of Jordan's CLV came from the model's ability to identify the correct side before the market moved, and part came from obtaining better prices through line shopping. These two sources of CLV are additive: the model ensures you are betting the side the closing line will move toward, and line shopping ensures you lock in the best available price on that side.

Lesson 5: The 1,000-Bet Threshold

While CLV achieves statistical significance much faster than win rate, the practical threshold for drawing firm conclusions about market-specific edges and model performance is approximately 1,000 bets. Below this threshold, individual season effects, outlier results, and market-specific variance can distort the signal. Above 1,000 bets, the CLV signal is stable and reliable for guiding strategic decisions.


The Python Analysis

The accompanying code (case-study-code.py) includes the CLV significance testing framework used in this case study. It implements:

  1. Running t-test calculator that tracks the cumulative significance of mean CLV over an expanding sample.
  2. Random bettor simulator that generates CLV distributions for skill-less bettors to serve as a null hypothesis.
  3. Rolling CLV analyzer with configurable window sizes and decay detection thresholds.
  4. Market-type CLV decomposition that segments performance by spread, total, and moneyline markets.
  5. CLV-ROI correlation framework that computes within-bettor correlations across time windows.

Discussion Questions

  1. Jordan's CLV was significant at 100 bets, but a single 100-bet sample could also produce significant results by chance (5% of the time by definition). How would you protect against a false-positive CLV signal? Design a testing protocol that controls the false-positive rate.

  2. In Season 3, Jordan's CLV declined to +1.4%, down from +2.3% and +2.1% in Seasons 1-2. At what CLV level should a bettor become alarmed? Propose a formal decision rule (using confidence intervals or hypothesis testing) for when to stop betting and reassess.

  3. Jordan's totals CLV (+2.5%) exceeded their spreads CLV (+1.8%). If Jordan shifted 50% of their spread volume to totals, would you expect their overall CLV to increase? What risks would this concentration create?

  4. The random-bettor simulation found that 47.2% of random bettors had positive average CLV. Why is this number close to 50% but not exactly 50%? What systematic factor pushes random bettors slightly below zero CLV?

  5. If the closing line becomes more efficient over time (as sportsbook models improve), what happens to CLV as a skill metric? Can a bettor maintain the same level of skill but see their CLV decline due to market improvement?