Case Study: Closing Line Value Across 5,000 Bets --- Separating Skill from Luck
Executive Summary
Closing Line Value (CLV) is widely regarded as the gold-standard metric for evaluating betting skill, but how reliable is it in practice? This case study analyzes a synthetic dataset of roughly 5,000 NFL bets placed by a hypothetical sharp bettor ("Jordan") over four seasons (2021-2024). It examines whether CLV predicts long-term profitability, how quickly CLV converges to a meaningful signal, and how the CLV distribution of a genuinely skilled bettor differs from that of bettors with no edge. We analyze Jordan's actual CLV profile alongside 1,000 simulated "random" bettors who have no edge, demonstrating that CLV separates skill from luck far more efficiently than raw win rate. The analysis shows that Jordan's +2.1% average CLV was statistically significant within about 100 bets (whereas win rate never reached significance over the full 5,012-bet sample), that the CLV-ROI correlation across 100-bet windows was 0.91, and that rolling CLV analysis detected a temporary edge decay in Season 3 that Jordan corrected through model retraining. The accompanying Python code implements the full CLV tracking, significance testing, and rolling analysis framework.
Background
The Question
Every sports bettor eventually confronts the same fundamental problem: is my performance due to skill or luck? After winning $12,000 over a season, a bettor might reasonably ask whether they are genuinely skilled or merely experiencing a fortunate run of variance that will inevitably regress.
Traditional approaches to answering this question rely on win rate and ROI, but these metrics converge to their true values painfully slowly. A bettor with a 3% ROI edge needs roughly 4,000 bets at standard -110 juice before we can be 95% confident their edge is real. Most bettors place 300-600 bets per season, meaning they would need 7-13 years of data to confirm their skill through win rate alone.
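The rough 4,000-bet figure can be sanity-checked with a normal approximation. The sketch below is an illustration, not the case study's own code: the helper names are hypothetical, and it assumes flat stakes, a fixed -110 price, and a two-sided 95% test, solving for the sample size at which the expected z-statistic just reaches the critical value.

```python
from statistics import NormalDist


def american_to_decimal(odds: int) -> float:
    """Convert American odds to decimal odds (total return per unit staked)."""
    return 1 + (100 / -odds if odds < 0 else odds / 100)


def bets_needed(roi: float, odds: int = -110, confidence: float = 0.95) -> int:
    """Approximate number of bets before a given ROI edge shows up in win rate.

    ASSUMPTIONS (not from the case study): flat stakes, every bet at the
    same price, two-sided z-test, expected z equal to the critical value.
    """
    dec = american_to_decimal(odds)
    p_break_even = 1 / dec                 # 52.38% at -110
    p_true = (1 + roi) / dec               # win rate that produces the target ROI
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    sd = (p_break_even * (1 - p_break_even)) ** 0.5   # per-bet SD under H0
    return round((z_crit * sd / (p_true - p_break_even)) ** 2)


n = bets_needed(0.03)
print(f"Bets needed for a 3% ROI edge at -110: about {n}")
print(f"Seasons at 300-600 bets per season: {n / 600:.1f} to {n / 300:.1f}")
```

Under these assumptions the answer comes out a little under 4,000 bets, in line with the rough figure quoted above.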
CLV offers a faster path to the answer. Because CLV measures the quality of each betting decision against the market's final assessment (rather than against the noisy binary outcome of a single game), it provides a signal with lower variance per observation. This case study quantifies exactly how much faster CLV converges to a meaningful signal.
The Dataset
Jordan is a model-driven NFL bettor who places approximately 1,250 bets per season across spreads (60%), totals (25%), and moneylines (15%). For every bet, Jordan records:
- Odds at time of placement
- Closing odds on the same selection and the opposite side
- Result (win, loss, push)
- Model probability estimate
- Sportsbook used
- Timing of bet placement
Over four seasons, Jordan accumulated 5,012 bets with complete CLV data. We also generated 1,000 simulated "random" bettors who picked sides randomly (no model, no skill) and bet at randomly selected available odds from the same pool of games.
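The case study does not spell out the exact formula behind each bet's CLV, so the following is a minimal sketch of two common conventions rather than Jordan's actual calculation. All function names are hypothetical. The first compares the price taken with the closing price on the same selection; the second uses the closing odds on both sides to strip out the vig (proportional normalization is assumed) and measures the price taken against that fair closing probability.

```python
def american_to_decimal(odds: float) -> float:
    """Convert American odds to decimal odds (total return per unit staked)."""
    return 1 + (100 / -odds if odds < 0 else odds / 100)


def clv_vs_close(placed: float, close_side: float) -> float:
    """CLV as the relative price improvement over the closing price (same side)."""
    return american_to_decimal(placed) / american_to_decimal(close_side) - 1


def fair_close_prob(close_side: float, close_other: float) -> float:
    """No-vig closing probability for our side (proportional vig removal assumed)."""
    p_side = 1 / american_to_decimal(close_side)
    p_other = 1 / american_to_decimal(close_other)
    return p_side / (p_side + p_other)


def clv_vs_fair_close(placed: float, close_side: float, close_other: float) -> float:
    """CLV as the expected return of the price taken if the no-vig close is correct."""
    return american_to_decimal(placed) * fair_close_prob(close_side, close_other) - 1


# Hypothetical bet: taken at -105, closed at -115 with -105 on the other side.
print(f"vs closing price: {clv_vs_close(-105, -115):+.2%}")
print(f"vs no-vig close:  {clv_vs_fair_close(-105, -115, -105):+.2%}")
```

The two conventions can disagree in sign on the same bet, because the second one charges the bettor for the vig embedded in the closing line. Whichever convention is used, it needs to be applied consistently to both Jordan and any comparison group.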
Jordan's Season-by-Season Summary
| Season | Bets | Win Rate | Avg Odds | ROI | Avg CLV | CLV Hit Rate |
|---|---|---|---|---|---|---|
| 2021 | 1,248 | 53.1% | -107.8 | +2.9% | +2.3% | 62.4% |
| 2022 | 1,290 | 52.4% | -107.2 | +2.1% | +2.1% | 60.8% |
| 2023 | 1,232 | 51.8% | -108.1 | +0.8% | +1.4% | 57.2% |
| 2024 | 1,242 | 53.8% | -106.9 | +4.2% | +2.5% | 63.6% |
| Total | 5,012 | 52.8% | -107.5 | +2.5% | +2.1% | 61.0% |
The Analysis
Test 1: Does CLV Predict Profitability?
We divided Jordan's bets into 50 non-overlapping windows of 100 bets each (5,000 bets in total; the 12 left over from the 5,012 do not fill a window) and calculated both the average CLV and the ROI for each window.
CLV vs. ROI by 100-Bet Window:
The correlation between window-level CLV and window-level ROI was r = 0.91 (p < 0.001). This is an extraordinarily strong relationship and confirms that CLV is a powerful predictor of actual profitability.
For comparison, we also computed the correlation between the window's win rate and its ROI. This was r = 0.98 --- higher, as expected, since win rate mechanically determines ROI for fixed-odds bets. However, win rate is not a skill metric; it is an outcome metric. The relevant comparison is between CLV as a leading indicator (available before results are known) and profitability as the lagging indicator.
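The window-level comparison can be sketched as follows. This is an illustrative outline rather than the contents of the accompanying script: the function names and the flat-list inputs (`clv`, `profit`, `stake`, one entry per bet in chronological order) are assumptions, and ROI is taken to be total profit divided by total stake within each window.

```python
from statistics import mean


def window_stats(clv, profit, stake, size=100):
    """Return [(avg_clv, roi), ...] for non-overlapping windows of `size` bets.

    Bets that do not fill a final complete window are dropped.
    """
    out = []
    for start in range(0, len(clv) - size + 1, size):
        s = slice(start, start + size)
        out.append((mean(clv[s]), sum(profit[s]) / sum(stake[s])))
    return out


def pearson_r(pairs):
    """Pearson correlation between the two coordinates of `pairs`."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5
```

Calling `pearson_r(window_stats(clv, profit, stake))` on a 5,012-bet log yields a correlation over 50 windows. With only 50 points, the estimate itself carries sampling error, which is worth keeping in mind when reading any single correlation figure.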
Binned Analysis:
| CLV Quintile | Windows | Avg CLV | Avg ROI | Pct Profitable |
|---|---|---|---|---|
| Top 20% (Q5) | 10 | +3.8% | +5.9% | 100% |
| Q4 | 10 | +2.5% | +3.2% | 90% |
| Q3 | 10 | +1.8% | +1.8% | 70% |
| Q2 | 10 | +1.2% | +0.9% | 50% |
| Bottom 20% (Q1) | 10 | +0.5% | -0.4% | 30% |
Even Jordan's worst CLV windows (Q1, +0.5%) were only marginally negative in ROI (-0.4%), and they were still positive in CLV. This indicates that even during Jordan's cold stretches, the betting process was sound --- the negative ROI was driven by variance, not by bad decisions.
Test 2: How Quickly Does CLV Converge?
To quantify the convergence speed of CLV versus win rate, we computed the sample size at which each metric first achieves statistical significance (p < 0.05 for a one-sided test that the metric is positive).
CLV Convergence:
We computed a running t-test on Jordan's cumulative CLV values, testing H0: mean CLV = 0 vs. H1: mean CLV > 0.
| Bets | Mean CLV | t-statistic | p-value | Significant? |
|---|---|---|---|---|
| 50 | +2.4% | 1.72 | 0.046 | Borderline |
| 100 | +2.2% | 2.48 | 0.008 | Yes |
| 200 | +2.3% | 3.65 | <0.001 | Yes |
| 300 | +2.1% | 4.12 | <0.001 | Yes |
| 500 | +2.1% | 5.28 | <0.001 | Yes |
| 1,000 | +2.1% | 7.46 | <0.001 | Yes |
CLV was borderline significant at 50 bets (p = 0.046), clearly significant by 100 bets (p = 0.008), and remained significant from that point forward.
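A running test of this kind can be sketched in a few lines. The version below is an assumption-laden illustration, not the accompanying script: the function name is hypothetical, and the p-value uses a normal approximation to the t distribution (standard library only), which slightly understates p at small sample sizes such as 50 bets.

```python
from math import sqrt
from statistics import NormalDist


def running_t(clv_values, checkpoints):
    """Return [(n, mean, t, approx_p), ...] for H0: mean CLV = 0 vs H1: mean > 0.

    Statistics are computed on the first n bets for each n in `checkpoints`.
    ASSUMPTION: p is a one-sided normal approximation, not an exact t-test.
    Requires some variation in the data (a zero variance would divide by zero).
    """
    marks = set(checkpoints)
    results = []
    total = total_sq = 0.0
    for n, x in enumerate(clv_values, start=1):
        total += x
        total_sq += x * x
        if n in marks and n > 1:
            m = total / n
            var = (total_sq - n * m * m) / (n - 1)   # sample variance
            t = m / sqrt(var / n)
            p = NormalDist().cdf(-t)                 # upper-tail probability
            results.append((n, m, t, p))
    return results
```

Because the test is evaluated repeatedly on a growing sample, each checkpoint is another chance to cross the 5% line by luck, so a single early crossing is weaker evidence than a statistic that stays significant as bets accumulate.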
Win Rate / ROI Convergence:
We computed a running z-test on Jordan's cumulative win rate, testing H0: win rate = 52.38% (break-even at -110) vs. H1: win rate > 52.38%.
| Bets | Win Rate | z-statistic | p-value | Significant? |
|---|---|---|---|---|
| 50 | 54.0% | 0.23 | 0.409 | No |
| 100 | 53.0% | 0.12 | 0.452 | No |
| 200 | 52.5% | 0.03 | 0.488 | No |
| 500 | 53.0% | 0.28 | 0.390 | No |
| 1,000 | 52.9% | 0.33 | 0.371 | No |
| 2,000 | 52.8% | 0.37 | 0.356 | No |
| 3,000 | 52.7% | 0.35 | 0.363 | No |
| 5,012 | 52.8% | 0.59 | 0.278 | No |
Win rate never achieved statistical significance over the entire 5,012-bet sample. This is not because Jordan lacks skill (the 52.8% win rate came with a realized +2.5% ROI at average odds of -107.5) but because the inherent variance of binary outcomes makes it extremely difficult to distinguish a 52.8% bettor from a break-even 52.4% bettor over "only" 5,000 observations.
In this sample, the convergence-speed advantage of CLV over win rate is at least 50:1: CLV provided a statistically significant signal of skill by 100 bets, while win rate had not reached significance after more than 5,000.
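The z-statistics in the table follow from the standard one-proportion z-test against the -110 break-even rate. The short sketch below (function name hypothetical) recomputes a few rows; the results agree with the table up to rounding.

```python
from math import sqrt
from statistics import NormalDist

BREAK_EVEN = 110 / 210   # 52.38%, the break-even win rate at -110


def win_rate_z(win_rate: float, n: int, p0: float = BREAK_EVEN):
    """One-sided z-test of H0: win rate = p0 against H1: win rate > p0."""
    se = sqrt(p0 * (1 - p0) / n)      # standard error under H0
    z = (win_rate - p0) / se
    return z, NormalDist().cdf(-z)    # (z-statistic, upper-tail p-value)


for n, wr in [(50, 0.540), (1000, 0.529), (5012, 0.528)]:
    z, p = win_rate_z(wr, n)
    print(f"{n:>5} bets  win rate {wr:.1%}  z = {z:.2f}  p = {p:.3f}")
```

The standard error shrinks only with the square root of the sample size: even at 5,012 bets it is about 0.7 percentage points, larger than the 0.4-point gap between Jordan's win rate and break-even.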
Test 3: CLV Distribution Comparison --- Skilled vs. Random Bettors
We generated 1,000 simulated random bettors, each placing 1,000 bets with no skill (random side selection at random available odds). We then compared their CLV distributions to Jordan's.
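A null distribution of this kind can be approximated with a very simple model. The sketch below is not the simulator used for the case study, which drew real prices from the game pool. It assumes instead that a skill-less bettor's per-bet CLV is zero-mean Gaussian noise, with a per-bet standard deviation of 9.8% back-solved from the 0.31% spread of average CLV reported for the random bettors (0.31% × √1000). The function name and parameters are hypothetical.

```python
import random
from statistics import mean, stdev


def simulate_random_bettors(n_bettors=1000, n_bets=1000, per_bet_sd=0.098, seed=0):
    """Return the average CLV of each simulated bettor with no edge.

    ASSUMPTION: per-bet CLV is independent N(0, per_bet_sd^2) noise.
    """
    rng = random.Random(seed)
    averages = []
    for _ in range(n_bettors):
        total = sum(rng.gauss(0.0, per_bet_sd) for _ in range(n_bets))
        averages.append(total / n_bets)
    return averages


avgs = simulate_random_bettors()
share_positive = sum(a > 0 for a in avgs) / len(avgs)
print(f"mean of avg CLV: {mean(avgs):+.3%}")
print(f"SD of avg CLV:   {stdev(avgs):.3%}")
print(f"share positive:  {share_positive:.1%}")
```

A zero-mean model puts about half of the simulated bettors above zero by construction; any systematic cost of betting at non-closing prices would have to be added as a negative mean to shift that share below 50%.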
Average CLV Distribution:
| Population | Mean of Avg CLV | Std Dev of Avg CLV | % with Positive Avg CLV |
|---|---|---|---|
| 1,000 random bettors | -0.02% | 0.31% | 47.2% |
| Jordan (per 1,000-bet window) | +2.1% | 0.42% | 100% |
Jordan's average CLV of +2.1% is 6.8 standard deviations above the random-bettor mean. The probability of a random bettor achieving Jordan's CLV level by chance is less than 1 in 10 billion. This conclusively establishes that Jordan's performance reflects genuine skill, not luck.
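The 6.8-standard-deviation figure and the tail probability can be checked directly from the table, under the assumption that average CLV among random bettors is approximately normally distributed:

```python
from statistics import NormalDist

jordan_avg_clv = 0.021      # Jordan's average CLV (from the table)
random_mean = -0.0002       # mean of average CLV across random bettors
random_sd = 0.0031          # SD of average CLV across random bettors

z = (jordan_avg_clv - random_mean) / random_sd
tail = NormalDist().cdf(-z)   # chance a random bettor lands this far above the mean

print(f"z = {z:.1f}")
print(f"upper-tail probability = {tail:.1e}")
```

The normal approximation is least reliable this far into the tail, so the exact probability matters less than its order of magnitude: far smaller than 1 in 10 billion.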
CLV Hit Rate Distribution:
| Population | Mean CLV Hit Rate | Std Dev | % Above 55% |
|---|---|---|---|
| 1,000 random bettors | 50.1% | 1.58% | 0.8% |
| Jordan | 61.0% | 2.1% (per season) | 100% |
Jordan's CLV hit rate of 61% means they beat the closing line on 61% of their bets. Among random bettors, only 0.8% achieved a CLV hit rate above 55% over 1,000 bets. Jordan's 61% is effectively impossible by chance.
Test 4: Detecting Edge Decay in Season 3
Jordan's Season 3 (2023) showed a notable decline: CLV dropped from +2.3% and +2.1% in Seasons 1-2 to +1.4% in Season 3, and ROI fell from +2.9% and +2.1% to +0.8%.
A rolling 200-bet CLV analysis reveals the timeline of the decay:
| Window (Bets) | Period | Rolling CLV | Rolling ROI |
|---|---|---|---|
| 2,401-2,600 | Early S3 | +2.0% | +2.4% |
| 2,501-2,700 | Mid-early S3 | +1.6% | +1.1% |
| 2,601-2,800 | Mid S3 | +1.1% | +0.3% |
| 2,701-2,900 | Mid-late S3 | +0.8% | -0.5% |
| 2,801-3,000 | Late S3 | +0.9% | +0.1% |
The rolling CLV declined steadily from +2.0% to +0.8% over the first 500 bets of Season 3. Critically, CLV detected the problem before ROI did --- the rolling CLV dropped below +1.5% at bet 2,600, while the rolling ROI did not turn negative until bet 2,750.
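A rolling monitor of this kind takes only a few lines. The sketch below is illustrative, not the accompanying script: the function names are hypothetical, and the +1.5% alert threshold is simply the level mentioned above, used here as a default rather than a recommended setting.

```python
from collections import deque


def rolling_clv(clv_values, window=200):
    """Return [(bet_number, rolling_mean), ...] once the window is full."""
    buf = deque()
    total = 0.0
    out = []
    for n, x in enumerate(clv_values, start=1):
        buf.append(x)
        total += x
        if len(buf) > window:
            total -= buf.popleft()      # drop the oldest bet from the window
        if len(buf) == window:
            out.append((n, total / window))
    return out


def first_breach(rolling, threshold=0.015):
    """Bet number at which the rolling mean first falls below `threshold`, or None."""
    return next((n for n, m in rolling if m < threshold), None)
```

Each rolling value is stamped with the last bet in its window, so an alert at bet `n` reflects bets `n - window + 1` through `n`. A shorter window reacts faster but produces more false alarms, since the standard error of a 200-bet average is already a sizeable fraction of the edge being monitored.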
Root Cause Analysis:
Jordan conducted a systematic review during the Season 3 bye week and identified two causes:
- Stale model parameters. The efficiency and rest adjustment factors had not been updated since the preseason. The NFL's new kickoff rule had changed average starting field position, which affected scoring patterns and thus totals markets.
- Sportsbook adaptation. Two of Jordan's most profitable books had tightened their spreads, reducing the available juice savings by approximately 1 cent on average.
Corrective Action:
Jordan retrained the model on mid-season data, updated the situational adjustments, and opened two new sportsbook accounts. The impact was visible in the CLV recovery:
| Window (Bets) | Period | Rolling CLV | Rolling ROI |
|---|---|---|---|
| 2,901-3,100 | Very late S3 | +1.3% | +0.8% |
| 3,001-3,200 | S3/S4 transition | +1.8% | +2.0% |
| 3,201-3,400 | Early S4 | +2.4% | +3.8% |
| 3,401-3,600 | Mid S4 | +2.6% | +4.5% |
The model retraining and account refresh restored CLV to above +2.0% and ROI recovered correspondingly. Season 4 ultimately produced the best results of the four-year period (+4.2% ROI), partly due to the model improvements and partly due to favorable variance.
Test 5: CLV by Market Type
Jordan's CLV varied noticeably across market types:
| Market Type | Bets | Avg CLV | Avg ROI | CLV Hit Rate |
|---|---|---|---|---|
| NFL Spreads | 3,007 | +1.8% | +2.0% | 59.8% |
| NFL Totals | 1,253 | +2.5% | +3.1% | 63.2% |
| NFL Moneylines | 752 | +2.4% | +3.2% | 62.8% |
Totals and moneylines produced higher CLV than spreads. This is consistent with the common view that the spread market is the most efficient NFL market (it receives the most sharp action), while totals and moneylines sometimes lag the spread market in incorporating information.
Jordan's model was particularly strong on totals, where weather and pace adjustments provided an edge that the market was slower to price in. The +2.5% CLV on totals over 1,253 bets is highly significant (t = 8.9, p < 0.001) and represents a genuine structural advantage.
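The per-market breakdown amounts to grouping bets by market and running the same one-sample t-statistic on each group. The sketch below assumes each bet is available as a `(market, clv)` pair; that input shape and the function name are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def clv_by_market(bets):
    """Map each market to (n_bets, mean_clv, t_statistic) for H0: mean CLV = 0.

    `bets` is an iterable of (market, clv) pairs; each market needs at least
    two bets with some variation for the t-statistic to be defined.
    """
    groups = defaultdict(list)
    for market, clv in bets:
        groups[market].append(clv)
    summary = {}
    for market, values in groups.items():
        m = mean(values)
        se = stdev(values) / sqrt(len(values))   # standard error of the mean
        summary[market] = (len(values), m, m / se)
    return summary
```

As a consistency check on the reported figure, t = 8.9 with a +2.5% mean over 1,253 bets implies a per-bet CLV standard deviation of about 2.5% × √1253 / 8.9 ≈ 9.9%.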
Synthesis and Lessons
Lesson 1: CLV Is the Definitive Skill Metric
The evidence from this case study is unambiguous: CLV separates skill from luck with far greater efficiency than win rate or ROI. Jordan's CLV was statistically significant after 100 bets; their win rate was not significant after 5,000. For any bettor serious about evaluating their own performance, tracking CLV is not optional --- it is essential.
Lesson 2: Rolling CLV Detects Edge Decay Before It Appears in P&L
The Season 3 decline was visible in rolling CLV approximately 150 bets before it appeared in rolling ROI. This early warning allowed Jordan to diagnose and correct the problem before it caused catastrophic losses. A bettor who tracked only ROI would not have recognized the issue until much later and might have attributed the downturn to variance rather than a systematic problem.
Lesson 3: CLV Varies by Market Type, and That Variation Is Exploitable
Jordan's CLV was highest in totals and moneylines, suggesting that their model had the greatest relative advantage in these markets. This information is actionable: Jordan could allocate more research effort and betting capital to totals markets, where the edge is largest, and less to spreads, where the market is most efficient.
Lesson 4: Line Shopping and CLV Are Complementary
Part of Jordan's CLV came from the model's ability to identify the correct side before the market moved, and part came from obtaining better prices through line shopping. These two sources of CLV are additive: the model ensures you are betting the side the closing line will move toward, and line shopping ensures you lock in the best available price on that side.
Lesson 5: The 1,000-Bet Threshold
While CLV achieves statistical significance much faster than win rate, the practical threshold for drawing firm conclusions about market-specific edges and model performance is approximately 1,000 bets. Below this threshold, individual season effects, outlier results, and market-specific variance can distort the signal. Above 1,000 bets, the CLV signal is stable and reliable for guiding strategic decisions.
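One way to see why sample size still matters after significance is reached is to look at the width of the confidence interval around mean CLV. The sketch below assumes a per-bet CLV standard deviation of about 8.9%, which is the value implied by the Test 2 table (for example, 2.1% × √1000 / 7.46 ≈ 8.9%), and uses a normal-approximation 95% interval. The function name is hypothetical.

```python
from math import sqrt


def ci_half_width(n: int, per_bet_sd: float = 0.089, z: float = 1.96) -> float:
    """Half-width of an approximate 95% confidence interval for mean CLV."""
    return z * per_bet_sd / sqrt(n)


for n in (100, 250, 500, 1000, 5000):
    print(f"{n:>5} bets: mean CLV known to within +/- {ci_half_width(n):.2%}")
```

At 100 bets the interval is wide enough that a +2.1% estimate is compatible with a true edge anywhere from slightly above zero to nearly 4%. By 1,000 bets the interval has narrowed to roughly half a percentage point either side, which is tight enough to compare markets or detect a change in edge.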
The Python Analysis
The accompanying code (case-study-code.py) includes the CLV significance testing framework used in this case study. It implements:
- Running t-test calculator that tracks the cumulative significance of mean CLV over an expanding sample.
- Random bettor simulator that generates CLV distributions for skill-less bettors to serve as a null hypothesis.
- Rolling CLV analyzer with configurable window sizes and decay detection thresholds.
- Market-type CLV decomposition that segments performance by spread, total, and moneyline markets.
- CLV-ROI correlation framework that computes within-bettor correlations across time windows.
Discussion Questions
- Jordan's CLV was significant at 100 bets, but a single 100-bet sample could also produce significant results by chance (5% of the time by definition). How would you protect against a false-positive CLV signal? Design a testing protocol that controls the false-positive rate.
- In Season 3, Jordan's CLV declined from +2.3% to +1.4%. At what CLV level should a bettor become alarmed? Propose a formal decision rule (using confidence intervals or hypothesis testing) for when to stop betting and reassess.
- Jordan's totals CLV (+2.5%) exceeded their spreads CLV (+1.8%). If Jordan shifted 50% of their spread volume to totals, would you expect their overall CLV to increase? What risks would this concentration create?
- The random-bettor simulation found that 47.2% of random bettors had positive average CLV. Why is this number close to 50% but not exactly 50%? What systematic factor pushes random bettors slightly below zero CLV?
- If the closing line becomes more efficient over time (as sportsbook models improve), what happens to CLV as a skill metric? Can a bettor maintain the same level of skill but see their CLV decline due to market improvement?