Appendix E: Glossary
An alphabetized glossary of key terms used throughout the textbook. Each entry includes the term, primary chapter reference(s), a concise definition, and related terms.
Accumulator (Ch. 29) -- A bet combining multiple selections where all must win for the bet to pay out. The decimal odds of the selections multiply together. Also called a parlay in North America. Related: parlay, correlation, same-game parlay.
Action (Ch. 1) -- Having a wager placed on a game or event. Also refers to the total amount of money wagered on a particular market.
Adjusted Efficiency (Ch. 22) -- A team's offensive or defensive rating after adjusting for strength of opponents and pace of play. Commonly used in basketball analytics. Related: offensive rating, defensive rating, pace.
Against the Spread (ATS) (Ch. 6) -- Betting on a result relative to the point spread rather than on the moneyline. A team "covers" ATS if it beats the spread. Related: point spread, cover, push.
Alchemy (Ch. 5) -- Colloquial term for data-mining or curve-fitting strategies that appear profitable in backtests but have no genuine predictive power. Related: overfitting, p-hacking, multiple comparisons.
Alpha (Ch. 5, 14) -- (1) In statistics, the significance level of a hypothesis test, typically 0.05. (2) In finance/betting, the excess return above what the market implies. Related: significance level, edge, Type I error.
American Odds (Ch. 1) -- Odds format used primarily in the US. Positive numbers (e.g., +150) show profit on a $100 bet; negative numbers (e.g., -110) show how much to bet to win $100. Related: decimal odds, fractional odds, implied probability.
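As a rough illustration of how the odds formats relate, the sketch below converts American odds to decimal odds and then to an implied probability. The function names are hypothetical and chosen only for this example.

```python
def american_to_decimal(american: float) -> float:
    """Convert American odds to decimal odds (total return per unit staked)."""
    if american == 0:
        raise ValueError("American odds cannot be zero")
    if american > 0:
        # +150 means $150 profit on a $100 stake
        return 1 + american / 100
    # -110 means staking $110 to win $100 profit
    return 1 + 100 / abs(american)


def implied_probability(american: float) -> float:
    """Implied probability (vig included) is the reciprocal of decimal odds."""
    return 1 / american_to_decimal(american)


print(american_to_decimal(150))              # 2.5
print(round(implied_probability(-110), 4))   # 0.5238
```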
Arbitrage (Ch. 10) -- A risk-free profit opportunity created when the best odds available across different bookmakers imply probabilities that sum to less than 100%. Also called an "arb" or "surebet." Related: overround, sharp betting, closing line value.
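A minimal sketch of the arbitrage check, assuming the inputs are the best available decimal odds for each outcome of a single market. The helper name and the equal-payout stake split are illustrative choices, not a prescribed method.

```python
def arbitrage_stakes(decimal_odds, total_stake=100.0):
    """Return (stakes, guaranteed_profit) if an arbitrage exists, else None.

    decimal_odds: best available decimal price for each mutually exclusive outcome.
    """
    inverse = [1 / d for d in decimal_odds]
    book = sum(inverse)  # combined implied probability
    if book >= 1:
        return None  # no arbitrage: implied probabilities reach or exceed 100%
    # Stake in proportion to implied probability so every outcome pays the same
    stakes = [total_stake * inv / book for inv in inverse]
    profit = total_stake / book - total_stake
    return stakes, profit


result = arbitrage_stakes([2.10, 2.05])
if result is not None:
    stakes, profit = result
    print([round(s, 2) for s in stakes], round(profit, 2))
```

In practice, limits, commissions, and line moves between placing the two bets can erase the theoretical profit.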
Area Under the Curve (AUC) (Ch. 17) -- A metric measuring how well a binary classifier distinguishes between classes. AUC of 0.5 is random; 1.0 is perfect. Related: ROC curve, discrimination, log loss.
BABIP (Ch. 23) -- Batting Average on Balls In Play. The rate at which batted balls (excluding home runs) fall for hits. A key regression indicator in baseball. Related: FIP, luck, regression to the mean.
Backdoor Cover (Ch. 6) -- When a team covers the spread due to a late, often meaningless score. Creates noise in ATS records. Related: garbage time, against the spread.
Backtest (Ch. 16) -- Evaluating a strategy on historical data. Must use walk-forward methodology to avoid look-ahead bias. Related: walk-forward, out-of-sample, overfitting.
Bankroll (Ch. 8) -- The total amount of money a bettor has set aside for wagering. Related: Kelly criterion, stake, ruin probability.
Bankroll Management (Ch. 8) -- A systematic approach to determining bet sizes relative to one's bankroll. Related: Kelly criterion, fractional Kelly, fixed-unit betting.
Base Rate (Ch. 2) -- The overall frequency of an event in the population, without conditioning on any specific factors. Ignoring base rates is a common cognitive error. Related: prior probability, base rate neglect.
Bayesian Inference (Ch. 9) -- A statistical framework that updates probability estimates as new evidence is obtained, using Bayes' theorem. Related: prior, posterior, likelihood, conjugate prior.
Betfair (Ch. 10) -- The world's largest betting exchange, where users bet against each other rather than against a bookmaker. Related: betting exchange, back, lay.
Betting Exchange (Ch. 10) -- A platform where bettors can both back (bet for) and lay (bet against) outcomes, with the exchange taking a commission on winnings. Related: Betfair, commission, liquidity.
Bias (Ch. 14) -- (1) In ML, the intercept term in a linear model. (2) In statistics, the systematic error in an estimator. (3) In betting, a cognitive tendency toward irrational decisions. Related: bias-variance tradeoff, cognitive bias, intercept.
Bias-Variance Tradeoff (Ch. 14) -- The fundamental tension in model building: simple models have high bias (underfitting) while complex models have high variance (overfitting). Related: overfitting, underfitting, regularization.
Binomial Distribution (Ch. 3) -- The distribution of the number of successes in n independent Bernoulli trials, each with success probability p. Related: Bernoulli, normal approximation, win rate.
Bonferroni Correction (Ch. 5) -- A method for adjusting significance thresholds when performing multiple hypothesis tests. Divides alpha by the number of tests. Conservative. Related: multiple comparisons, false discovery rate.
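As a small sketch of the correction, the function below compares each p-value with alpha divided by the number of tests. The function name and the example p-values are made up for illustration.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Flag which hypotheses are rejected under a Bonferroni-adjusted threshold."""
    if not p_values:
        raise ValueError("p_values must not be empty")
    threshold = alpha / len(p_values)  # divide alpha by the number of tests
    return [p <= threshold for p in p_values]


# Three betting angles tested at once; only the first survives the correction
print(bonferroni_reject([0.001, 0.02, 0.04]))  # [True, False, False]
```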
Bookmaker (Ch. 1) -- An entity that sets odds and accepts bets. Also called a sportsbook or bookie. Related: odds, vig, market maker.
Brier Score (Ch. 17) -- The mean squared error between predicted probabilities and actual binary outcomes. Lower is better. Decomposes into calibration, resolution, and uncertainty. Related: calibration, log loss, reliability diagram.
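A minimal sketch of the Brier score for binary outcomes, assuming predictions are probabilities of the event coded as 1. The function name and sample values are illustrative.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# 0.7 on a winner and 0.4 on a loser: (0.09 + 0.16) / 2
print(round(brier_score([0.7, 0.4], [1, 0]), 4))  # 0.125
```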
Buy Points (Ch. 6) -- Paying additional vig to move the point spread in one's favor. Also called an "alternate spread." Related: point spread, teaser, vig.
Calibration (Ch. 17) -- The property of a model where predicted probabilities match observed frequencies. A well-calibrated model predicting 70% should win 70% of the time. Related: Brier score, reliability diagram, Platt scaling.
Chalk (Ch. 1) -- Slang for the favorite. "Betting chalk" means consistently backing favorites. Related: favorite, dog, moneyline.
Closing Line (Ch. 10) -- The final odds offered just before an event begins. Generally considered the most efficient line. Related: closing line value, opening line, line movement.
Closing Line Value (CLV) (Ch. 10) -- The difference between the odds at which a bet was placed and the closing odds. Consistently beating the closing line is widely regarded as one of the strongest predictors of long-term profitability. Related: closing line, sharp, market efficiency.
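One common way to put a number on CLV, shown here as a sketch rather than the only convention, is the ratio of the decimal odds taken to the closing decimal odds, minus one. The function name is hypothetical, and the closing price is used as quoted, without removing the vig.

```python
def closing_line_value(bet_decimal_odds: float, closing_decimal_odds: float) -> float:
    """Relative price advantage of the odds taken over the closing odds."""
    if bet_decimal_odds <= 1 or closing_decimal_odds <= 1:
        raise ValueError("decimal odds must be greater than 1")
    return bet_decimal_odds / closing_decimal_odds - 1


# Bet placed at 2.10, market closed at 2.00: the bettor beat the close by about 5%
print(round(closing_line_value(2.10, 2.00), 4))  # 0.05
```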
Coefficient of Determination (R-squared) (Ch. 6) -- The proportion of variance in the dependent variable explained by the model. Ranges from 0 to 1 in-sample. Related: regression, adjusted R-squared.
Cognitive Bias (Ch. 11) -- Systematic patterns of deviation from rationality in judgment. Includes recency bias, confirmation bias, anchoring, and the gambler's fallacy. Related: behavioral economics, heuristics, tilt.
Conjugate Prior (Ch. 9) -- A prior distribution that, when combined with a particular likelihood function, produces a posterior in the same family. Example: Beta prior with Binomial likelihood. Related: Bayesian inference, Beta distribution, posterior.
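The Beta-Binomial example can be sketched in a few lines: with a Beta(alpha, beta) prior on a win probability, observing wins and losses simply adds to the two parameters. The function names and the prior values below are assumptions for illustration.

```python
def beta_binomial_update(alpha: float, beta: float, wins: int, losses: int):
    """Posterior Beta parameters after observing binomial data."""
    return alpha + wins, beta + losses


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Prior Beta(10, 10) centered on 50%, then a 7-3 record is observed
post_a, post_b = beta_binomial_update(10, 10, wins=7, losses=3)
print(post_a, post_b)                       # 17 13
print(round(beta_mean(post_a, post_b), 4))  # 0.5667
```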
Correlated Parlay (Ch. 29) -- A parlay where the individual legs are not independent, such as betting a team to win and the game to go over the total. Related: parlay, same-game parlay, independence.
Correlation (Ch. 6) -- A measure of the linear relationship between two variables, ranging from -1 to +1. In betting, refers to the degree to which bet outcomes are related. Related: Pearson correlation, covariance, independence.
Corsi (Ch. 24) -- An NHL analytics metric measuring shot attempt differential (shots on goal plus missed shots plus blocked shots). Proxy for puck possession. Related: Fenwick, expected goals, possession.
Cover (Ch. 6) -- When a team beats the point spread. A 7-point favorite covers if they win by more than 7. Related: point spread, ATS, push.
Cross-Entropy Loss (Ch. 14) -- The negative log-likelihood for binary or multi-class classification. Also called log loss. The standard loss function for probability prediction models. Related: log loss, Brier score, likelihood.
Cross-Validation (Ch. 16) -- A resampling technique for estimating model performance. K-fold CV splits data into k parts, training on k-1 and testing on the remaining fold. Related: backtest, time-series split, overfitting.
Data Leakage (Ch. 16) -- When information from the test set inadvertently influences model training, leading to overly optimistic performance estimates. Related: look-ahead bias, feature engineering, walk-forward.
Dead Heat (Ch. 1) -- A tie between two or more selections. Different markets handle dead heats differently, often splitting the stake. Related: push, void.
Decimal Odds (Ch. 1) -- Odds format expressing the total return per unit staked, including the stake. Odds of 2.50 mean a $1 bet returns $2.50 (profit of $1.50). Related: American odds, fractional odds, implied probability.
Defensive Rating (DRtg) (Ch. 22) -- Points allowed per 100 possessions. Lower is better. Related: offensive rating, net rating, adjusted efficiency.
Derivative (Ch. 8, 14) -- The instantaneous rate of change of a function. Used in optimization (gradient descent) and derivations (Kelly criterion). Related: gradient, calculus, optimization.
Dime Line (Ch. 1) -- A betting line with a 10-cent gap between the prices on the two sides (e.g., -105/-105, or -125/+115 on a moneyline). Represents low-vig pricing. Related: vig, juice, overround.
Dog (Ch. 1) -- Short for underdog. The team or competitor expected to lose. Related: favorite, moneyline, value.
Draw No Bet (Ch. 33) -- A soccer market where the bet is voided if the match ends in a draw. Eliminates the third outcome. Related: three-way market, Asian handicap, double chance.
DVOA (Ch. 20) -- Defense-adjusted Value Over Average. An NFL metric from Football Outsiders measuring per-play efficiency after adjusting for opponent and situation. Related: EPA, efficiency, strength of schedule.
Early Value (Ch. 10) -- Betting on a line before the market corrects it, capturing favorable odds that will later move. Related: opening line, steam move, CLV.
Edge (Ch. 1) -- The bettor's advantage over the market. Calculated as (true probability * decimal odds) - 1. Positive edge means a profitable bet in expectation. Related: expected value, CLV, vig.
Efficient Market Hypothesis (Ch. 10) -- The theory that market prices (or betting lines) fully reflect all available information. Implies that beating the market is impossible without inside information or superior models. Related: market efficiency, CLV, Pinnacle.
Elastic Net (Ch. 13) -- A regularization technique combining L1 (Lasso) and L2 (Ridge) penalties. Useful when features are correlated. Related: Ridge, Lasso, regularization.
Elo Rating (Ch. 9) -- A rating system originally designed for chess, adapted for team sports. Teams gain or lose rating points based on expected vs. actual results. Related: power rating, Glicko, strength of schedule.
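A sketch of a single Elo update, using the conventional chess form of the expected-score formula (base 10, scale 400). The default K of 20 is an arbitrary assumption, and sport-specific adjustments such as home advantage or margin of victory are left out.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score for A against B under the standard logistic Elo curve."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 20):
    """Return updated (rating_a, rating_b); score_a is 1 win, 0.5 draw, 0 loss."""
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    # Zero-sum update: what A gains, B loses
    return rating_a + delta, rating_b - delta


# Two equally rated teams; A wins and gains half of K
print(elo_update(1500, 1500, 1.0))  # (1510.0, 1490.0)
```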
Ensemble Method (Ch. 16) -- A technique combining multiple models to improve prediction. Examples: random forests (bagging), gradient boosting, stacking. Related: random forest, gradient boosting, model averaging.
EPA (Expected Points Added) (Ch. 20) -- An NFL metric measuring the value of each play in terms of expected points. Based on down, distance, yard line, and game situation. Related: WPA, success rate, DVOA.
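European Handicap (Ch. 33) -- A three-way soccer handicap expressed in whole numbers. If the handicap-adjusted result is a tie, bets on either team lose, because the handicap draw is a separate selection (unlike an Asian handicap, which refunds the stake). Related: Asian handicap, point spread.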
Expected Goals (xG) (Ch. 33) -- A soccer metric estimating the probability that a shot results in a goal, based on shot characteristics (location, angle, body part, buildup). Related: shot quality, post-shot xG, Poisson model.
Expected Value (EV) (Ch. 1) -- The average outcome of a bet if repeated infinitely. EV = (probability of winning * profit if the bet wins) - (probability of losing * stake). Positive EV (+EV) is the goal. Related: edge, ROI, Kelly criterion.
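The sketch below computes expected value for a single bet priced in decimal odds and checks it against the edge formula from the Edge entry, (true probability * decimal odds) - 1. Function names and the example inputs are hypothetical.

```python
def expected_value(p_win: float, decimal_odds: float, stake: float = 1.0) -> float:
    """EV = P(win) * net profit - P(lose) * stake."""
    net_profit = stake * (decimal_odds - 1)
    return p_win * net_profit - (1 - p_win) * stake


def edge(p_win: float, decimal_odds: float) -> float:
    """Edge per unit staked: (true probability * decimal odds) - 1."""
    return p_win * decimal_odds - 1


# A 55% true probability at even money (decimal 2.0), $100 stake
print(round(expected_value(0.55, 2.0, stake=100), 2))  # 10.0
print(round(edge(0.55, 2.0), 4))                       # 0.1
```

The two quantities agree up to scale: EV equals the stake multiplied by the edge.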
Exposure (Ch. 8) -- The total amount of money at risk on a particular outcome, team, or market. Related: bankroll, hedging, risk management.
False Discovery Rate (FDR) (Ch. 5) -- The expected proportion of rejected null hypotheses that are actually true (false positives among all positives). Controlled by the Benjamini-Hochberg procedure. Related: Bonferroni correction, multiple comparisons, p-value.
Favorite (Ch. 1) -- The team or competitor expected to win, offered at shorter (lower-paying) odds. Related: dog, chalk, moneyline.
Feature (Ch. 13) -- An input variable in a predictive model. Also called a predictor, independent variable, or covariate. Related: feature engineering, feature selection, predictor.
Feature Engineering (Ch. 15) -- The process of creating new input variables from raw data. Examples: rolling averages, rest days, Elo differences. Related: feature, domain knowledge, transformation.
Feature Importance (Ch. 16) -- A measure of how much each feature contributes to model predictions. Methods: permutation importance, SHAP values, coefficient magnitude. Related: feature selection, SHAP, interpretability.
Fenwick (Ch. 24) -- An NHL analytics metric similar to Corsi but excluding blocked shots. Related: Corsi, expected goals, possession.
FIP (Fielding Independent Pitching) (Ch. 23) -- A baseball metric estimating what a pitcher's ERA would be based solely on strikeouts, walks, hit batters, and home runs. Related: ERA, BABIP, xFIP.
Fixed-Unit Betting (Ch. 8) -- Wagering the same dollar amount or percentage of bankroll on every bet regardless of edge size. Simple but suboptimal compared to Kelly. Related: Kelly criterion, bankroll management, flat betting.
Fractional Kelly (Ch. 8) -- Betting a fraction (typically 1/4 to 1/2) of the full Kelly criterion amount. Reduces variance at the cost of slower bankroll growth. Related: Kelly criterion, variance, bankroll management.
Fractional Odds (Ch. 1) -- Odds format common in the UK, expressing profit relative to stake (e.g., 5/2 means $5 profit on a $2 bet). Related: decimal odds, American odds.
Futures (Ch. 30) -- Long-term bets on season outcomes such as championship winners, MVP awards, or win totals. Typically have higher vig but more potential for finding value. Related: outrights, season-long, win total.
Gambler's Fallacy (Ch. 11) -- The mistaken belief that past random outcomes influence future ones. Believing a team is "due" for a win after a losing streak. Related: cognitive bias, independence, hot hand.
Gambler's Ruin (Ch. 8) -- The mathematical certainty that a gambler with finite bankroll playing a negative-EV game will eventually go broke. Related: bankroll management, negative expected value, ruin probability.
Garbage Time (Ch. 6, 20) -- The final portion of a game where the outcome is no longer in doubt. Statistics and scores during garbage time can distort metrics and spread results. Related: backdoor cover, win probability, EPA.
Gradient Boosting (Ch. 16) -- An ensemble learning method that builds trees sequentially, with each tree correcting the errors of the previous ones. Implementations: XGBoost, LightGBM, CatBoost. Related: random forest, ensemble, boosting.
Gradient Descent (Ch. 14) -- An iterative optimization algorithm that updates parameters in the direction of steepest decrease of the loss function. Related: learning rate, SGD, Adam optimizer.
Half Kelly (Ch. 8) -- Betting exactly half of the full Kelly fraction. A popular practical choice balancing growth and risk. Related: Kelly criterion, fractional Kelly, bankroll management.
Handicap (Ch. 6) -- A point advantage or disadvantage applied to a team to equalize the market. The spread in American sports; Asian or European handicap in soccer. Related: point spread, Asian handicap, cover.
Handle (Ch. 1) -- The total amount of money wagered on a particular event or at a particular sportsbook. Related: action, hold, volume.
Hedging (Ch. 30) -- Placing additional bets to reduce risk on an existing position, often used with futures or parlays. Related: exposure, risk management, arbitrage.
Hold (Ch. 1) -- The percentage of total handle that the sportsbook retains as profit. Related to but distinct from the vig. Related: vig, handle, overround.
Home-Field Advantage (HFA) (Ch. 6) -- The statistical tendency for teams to perform better at home than on the road. Varies by sport and has been declining in recent years. Related: Elo, point spread, venue effects.
Hook (Ch. 6) -- A half-point in a point spread (e.g., a 3.5-point spread has a "hook"). Landing on the hook side of a key number matters significantly for ATS results. Related: key number, point spread, push.
Hot Hand (Ch. 11) -- The belief that a player or team on a winning streak is more likely to continue winning. Recent research suggests a small but real effect in some contexts. Related: gambler's fallacy, momentum, streak.
Hyperparameter (Ch. 16) -- A model setting determined before training, not learned from data. Examples: learning rate, tree depth, regularization strength. Related: parameter, tuning, grid search.
Implied Probability (Ch. 1) -- The probability derived from betting odds. For decimal odds d: implied probability = 1/d. Includes the vig, so the sum across all outcomes exceeds 100%. Related: fair probability, odds, vig, overround.
In-Play Betting (Ch. 25) -- Wagering on events after they have started, with odds updating in real time. Also called live betting. Related: pre-match, line movement, real-time models.
Independence (Ch. 2, 29) -- Two events are independent if the occurrence of one does not affect the probability of the other. Critical assumption in parlay pricing. Related: correlation, conditional probability, parlay.
Juice (Ch. 1) -- Synonym for vig or vigorish. The bookmaker's commission built into the odds. Related: vig, overround, hold.
Kelly Criterion (Ch. 8) -- A formula for determining optimal bet size to maximize long-run bankroll growth: f = (pb - q) / b, where p is win probability, q = 1-p, and b is the net payout per unit. Related: bankroll management, fractional Kelly, expected growth.
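A minimal sketch of the formula above, with an optional multiplier for fractional Kelly. Returning zero when the formula goes negative (no edge, so no bet) is an assumption of this example, as is the function name.

```python
def kelly_fraction(p: float, b: float, multiplier: float = 1.0) -> float:
    """Fraction of bankroll to stake: f = (p*b - q) / b, scaled by multiplier.

    p: win probability; b: net payout per unit staked (decimal odds minus 1).
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if b <= 0:
        raise ValueError("b must be positive")
    q = 1 - p
    full_kelly = (p * b - q) / b
    # Assumption: a negative Kelly fraction is treated as "do not bet"
    return max(0.0, full_kelly) * multiplier


print(round(kelly_fraction(0.55, 1.0), 4))                  # 0.1
print(round(kelly_fraction(0.55, 1.0, multiplier=0.5), 4))  # 0.05
```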
Key Number (Ch. 6) -- A point spread value on which a disproportionate number of games land. In NFL: 3 and 7. In NBA: commonly 5, 6, 7. Related: hook, point spread, push.
K-Factor (Ch. 9) -- The update rate in an Elo system. Higher K means faster adaptation but more volatility. Lower K means more stability but slower response to changes. Related: Elo, learning rate, adaptation.
Lasso Regression (Ch. 13) -- Linear regression with an L1 penalty (sum of absolute values of coefficients). Performs feature selection by driving some coefficients exactly to zero. Related: Ridge, Elastic Net, regularization.
Lay (Ch. 10) -- On a betting exchange, to bet against an outcome (acting as the bookmaker). Related: back, betting exchange, Betfair.
Learning Rate (Ch. 14) -- The step size in gradient descent. Too high causes oscillation; too low causes slow convergence. Related: gradient descent, Adam, hyperparameter.
Liability (Ch. 10) -- The maximum amount a bookmaker stands to lose on a particular outcome. Related: exposure, risk management, lay.
Line (Ch. 1) -- The odds or point spread set by the sportsbook. "The line on Kansas City is -7." Related: opening line, closing line, line movement.
Line Movement (Ch. 10) -- Changes in the odds or point spread between opening and closing. Driven by betting volume, sharp money, and information. Related: steam move, reverse line movement, sharp money.
Log Loss (Ch. 17) -- The negative average log-likelihood of the predicted probabilities given the actual outcomes. The primary metric for evaluating probability predictions. Related: cross-entropy, Brier score, calibration.
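As a sketch, binary log loss can be computed directly from its definition. The clipping constant eps is an assumption added here to avoid taking the log of zero, and the function name is illustrative.

```python
import math


def log_loss(probs, outcomes, eps: float = 1e-15) -> float:
    """Negative average log-likelihood of 0/1 outcomes under predicted probabilities."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)  # clip so log() stays finite
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)


# Coin-flip predictions score ln(2) regardless of the outcomes
print(round(log_loss([0.5, 0.5], [1, 0]), 4))  # 0.6931
```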
Logistic Regression (Ch. 13) -- A classification model that predicts probabilities using the logistic (sigmoid) function. The workhorse model for binary outcome prediction in sports. Related: sigmoid, odds ratio, classification.
Long Shot (Ch. 1) -- A bet on an outcome with a low probability and correspondingly high odds. Related: underdog, favorite-longshot bias, value.
Look-Ahead Bias (Ch. 16) -- Using information in model training or feature computation that would not have been available at the time of prediction. A fatal flaw in backtesting. Related: data leakage, walk-forward, time-series split.
Market Efficiency (Ch. 10) -- The degree to which betting odds reflect true probabilities. More efficient markets are harder to beat. Pinnacle closing lines are considered the most efficient. Related: EMH, CLV, sharp bookmaker.
Market Maker (Ch. 10) -- A sportsbook that sets its own lines and accepts bets from sharp bettors, rather than copying lines from other books. Pinnacle is the leading example. Related: bookmaker, Pinnacle, sharp.
Martingale (Ch. 8) -- A staking system where the bettor doubles the bet after each loss. Mathematically guaranteed to fail with finite bankroll. Related: gambler's ruin, negative EV, staking plan.
Maximum Likelihood Estimation (MLE) (Ch. 12) -- A method for estimating model parameters by finding the values that maximize the probability of observing the data. Related: likelihood, log-likelihood, optimization.
Mean Absolute Error (MAE) (Ch. 13) -- The average of absolute differences between predictions and actual values. More robust to outliers than MSE. Related: MSE, RMSE, loss function.
Moneyline (Ch. 1) -- A bet on which team will win the game outright, without a point spread. Related: point spread, favorite, underdog.
Monte Carlo Simulation (Ch. 8, 30) -- Using repeated random sampling to estimate probabilities or distributions. Used for bankroll simulation, season projections, and prop modeling. Related: simulation, random number generation, variance.
Multicollinearity (Ch. 13) -- When predictor variables are highly correlated with each other, making regression coefficient estimates unstable. Related: VIF, Ridge regression, feature selection.
Neural Network (Ch. 14) -- A machine learning model composed of layers of interconnected nodes with learnable weights and nonlinear activation functions. Related: deep learning, hidden layer, backpropagation.
No-Vig Line (Ch. 1, 10) -- Odds with the vigorish removed, representing the market's estimate of true probability. Also called "fair odds." Related: vig, implied probability, devigging.
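Several devigging methods exist. The sketch below uses the simplest one, proportional normalization, which rescales the implied probabilities so they sum to 1. The function name is hypothetical, and other methods can give different fair prices, especially for long shots.

```python
def remove_vig(decimal_odds):
    """Proportionally normalize implied probabilities so they sum to 1."""
    implied = [1 / d for d in decimal_odds]
    total = sum(implied)  # exceeds 1 when the market contains vig
    return [p / total for p in implied]


# A standard -110/-110 market (decimal 1 + 100/110 on each side)
side = 1 + 100 / 110
print(remove_vig([side, side]))  # [0.5, 0.5]
```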
Odds (Ch. 1) -- The prices offered by bookmakers reflecting the implied probability of an outcome and the bookmaker's margin. Related: American odds, decimal odds, fractional odds.
Odds Ratio (Ch. 13) -- In logistic regression, the exponentiated coefficient exp(beta). Represents the multiplicative change in odds for a one-unit increase in the predictor. Related: logistic regression, coefficient, log-odds.
Offensive Rating (ORtg) (Ch. 22) -- Points scored per 100 possessions. The standard efficiency metric in basketball. Related: defensive rating, pace, net rating.
Opening Line (Ch. 10) -- The first odds or spread released by a sportsbook for an event. Related: closing line, line movement, CLV.
Overfit (Ch. 14, 16) -- When a model learns patterns specific to the training data that do not generalize to new data. The central challenge in sports prediction. Related: bias-variance tradeoff, regularization, cross-validation.
Over/Under (Ch. 7) -- A bet on whether the total combined score will be over or under a specified number set by the bookmaker. Also called the "total." Related: total, push, Poisson model.
Overround (Ch. 1) -- The total implied probability across all outcomes in a market, minus 100%. Represents the bookmaker's built-in margin. A 105% market has a 5% overround. Related: vig, hold, fair odds.
P-Hacking (Ch. 5) -- Manipulating data analysis to find statistically significant results, such as trying many variable combinations until one reaches p < 0.05. Related: multiple comparisons, false discovery, overfitting.
P-Value (Ch. 5) -- The probability of observing results at least as extreme as the data, assuming the null hypothesis is true. Not the probability that the null is true. Related: hypothesis test, significance, alpha.
Pace (Ch. 22) -- The number of possessions per game (NBA) or plays per game (NFL). Affects raw counting statistics and total points. Related: offensive rating, tempo, total.
Parlay (Ch. 29) -- A single bet that links two or more individual wagers. All selections must win for the parlay to pay out. Related: accumulator, correlation, teaser.
Permutation Importance (Ch. 16) -- A model-agnostic method for measuring feature importance by shuffling each feature's values and measuring the decrease in model performance. Related: feature importance, SHAP, random forest.
Pinnacle (Ch. 10) -- A sharp bookmaker widely considered to have the most efficient odds in the market. The de facto standard for measuring closing line value. Related: sharp bookmaker, CLV, market efficiency.
Platt Scaling (Ch. 17) -- A post-hoc calibration method that fits a logistic regression to a model's output scores to produce calibrated probabilities. Related: calibration, isotonic regression, reliability.
Point Spread (Ch. 6) -- A handicap applied to the favored team to equalize the market. A -7 spread means the favorite must win by more than 7 to cover. Related: ATS, cover, key number, hook.
Poisson Distribution (Ch. 7) -- A discrete probability distribution modeling the number of events in a fixed interval (e.g., goals per match). Key parameter: lambda (mean rate). Related: soccer, hockey, expected goals, over/under.
Posterior Distribution (Ch. 9) -- In Bayesian inference, the updated probability distribution after combining the prior with observed data via the likelihood. Related: prior, Bayes' theorem, conjugate prior.
Power Rating (Ch. 9) -- A numerical value representing a team's strength, used to predict outcomes by comparing ratings. Elo, Sagarin, and Massey are examples. Related: Elo, ranking, strength of schedule.
Prior Distribution (Ch. 9) -- In Bayesian inference, the probability distribution representing beliefs before observing data. Related: posterior, conjugate prior, informative prior.
Prop Bet (Ch. 31) -- A bet on a specific occurrence within a game, not directly tied to the final outcome (e.g., player passing yards, first team to score). Related: player prop, game prop, derivative market.
Push (Ch. 1) -- When the final result lands exactly on the spread or total, resulting in a refund of the wager. Related: key number, hook, void.
Random Forest (Ch. 16) -- An ensemble method that builds many decision trees on bootstrapped samples with random feature subsets, then averages their predictions. Related: bagging, ensemble, gradient boosting.
RAPTOR (Ch. 22) -- Robust Algorithm using Player Tracking and On/Off Ratings. FiveThirtyEight's NBA player metric combining box score and on/off court data. Related: PER, BPM, WAR.
Recency Bias (Ch. 11) -- Overweighting recent events relative to the full body of evidence. Common among both casual bettors and sportsbooks. Related: cognitive bias, sample size, trend.
Regression to the Mean (Ch. 3) -- The tendency for extreme observations to be followed by more typical ones. Teams on extreme winning or losing streaks tend to regress. Related: mean reversion, sample size, BABIP.
Regularization (Ch. 13) -- Adding a penalty term to the loss function to prevent overfitting. L1 (Lasso) and L2 (Ridge) are the most common forms. Related: Lasso, Ridge, Elastic Net, overfitting.
Reliability Diagram (Ch. 17) -- A visual tool for assessing calibration. Plots observed frequency against predicted probability. A well-calibrated model follows the diagonal. Related: calibration, Brier score, Platt scaling.
Rest Days (Ch. 20, 22) -- The number of days between consecutive games. Fatigue and rest advantages can be significant predictors of NBA and NFL results. Related: back-to-back, scheduling, travel.
Reverse Line Movement (Ch. 10) -- When the line moves in the opposite direction from public betting percentages, suggesting sharp money on the other side. Related: line movement, sharp money, public.
Ridge Regression (Ch. 13) -- Linear regression with an L2 penalty (sum of squared coefficients). Shrinks coefficients but does not set them to zero. Related: Lasso, Elastic Net, regularization.
ROC Curve (Ch. 17) -- Receiver Operating Characteristic curve. Plots true positive rate against false positive rate at various thresholds. Related: AUC, sensitivity, specificity.
ROI (Return on Investment) (Ch. 1, 38) -- Total profit divided by total amount staked, expressed as a percentage. The standard measure of betting profitability. Related: yield, CLV, edge.
Ruin Probability (Ch. 8) -- The probability that a bettor will lose their entire bankroll given their edge, variance, and bet sizing. Related: Kelly criterion, bankroll management, gambler's ruin.
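For intuition, the classical gambler's ruin problem has a closed-form answer. The sketch below assumes a heavily simplified setting: one-unit bets at even money, a fixed win probability, and play that stops at either zero or a target bankroll. Real bet sizing and odds vary, so treat it only as an illustration; the function name is hypothetical.

```python
def ruin_probability(p: float, start_units: int, target_units: int) -> float:
    """Probability of hitting 0 before target_units with 1-unit even-money bets."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0 < start_units < target_units:
        raise ValueError("need 0 < start_units < target_units")
    if p == 0.5:
        return 1 - start_units / target_units  # fair-game special case
    r = (1 - p) / p
    return (r ** start_units - r ** target_units) / (1 - r ** target_units)


print(ruin_probability(0.5, 5, 10))             # 0.5
print(round(ruin_probability(0.48, 20, 40), 4))
```

Even a small negative edge (48% at even money in the second call) pushes the ruin probability well above the fair-game value of one half.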
Same-Game Parlay (SGP) (Ch. 29) -- A parlay where all legs come from the same game. Correlations between legs complicate pricing. Related: parlay, correlation, prop bet.
Sample Size (Ch. 3, 5) -- The number of observations in a dataset or the number of bets in a track record. Inadequate sample size is the most common analytical error in betting. Related: statistical significance, power, variance.
SHAP Values (Ch. 16) -- SHapley Additive exPlanations. A game-theory-based method for explaining individual predictions by assigning each feature a contribution value. Related: feature importance, interpretability, permutation importance.
Sharp (Ch. 10) -- A sophisticated, typically professional bettor whose opinions move markets. Also used as an adjective for bookmakers who cater to informed bettors. Related: square, wise guy, CLV.
Sigmoid Function (Ch. 13) -- The function sigma(z) = 1/(1 + e^{-z}) that maps any real number to the interval (0,1). The core of logistic regression. Related: logistic regression, softmax, activation function.
Simulation (Ch. 8, 30) -- Generating many random scenarios to estimate probabilities, distributions, or risks. Essential for season projections and bankroll analysis. Related: Monte Carlo, random number generation.
Square (Ch. 10) -- An unsophisticated or recreational bettor. Square money tends to favor favorites and overs. Related: sharp, public, recreational.
Stacking (Ch. 16) -- An ensemble technique where a meta-model combines the predictions of multiple base models. Related: ensemble, blending, model averaging.
Staking Plan (Ch. 8) -- A predetermined system for determining bet sizes. Options include fixed-unit, percentage-of-bankroll, and Kelly-based approaches. Related: Kelly criterion, bankroll management, fixed-unit.
Steam Move (Ch. 10) -- A sudden, sharp line movement caused by coordinated or heavy betting, typically from sharp bettors or syndicates. Related: line movement, sharp, CLV.
Stochastic Gradient Descent (SGD) (Ch. 14) -- A variant of gradient descent that updates parameters using a single training example (or small batch) at a time, rather than the full dataset. Related: gradient descent, batch size, learning rate.
Strength of Schedule (Ch. 9) -- A measure of the difficulty of a team's opponents. Critical for adjusting raw records when comparing teams. Related: Elo, power rating, adjusted efficiency.
Survivorship Bias (Ch. 5) -- The error of focusing on successful outcomes (e.g., profitable bettors) while ignoring failures. Leads to overestimation of system effectiveness. Related: selection bias, sample bias, backtesting.
Teaser (Ch. 29) -- A parlay where the bettor receives favorable point adjustments on each leg in exchange for reduced odds. In NFL, a standard teaser moves each spread by 6 points. Related: parlay, Wong teaser, key number.
Three-Way Market (Ch. 33) -- A market with three outcomes: home win, draw, away win. Standard in soccer. Related: moneyline, draw no bet, Asian handicap.
Tilt (Ch. 11) -- A state of emotional frustration that leads to poor decision-making and deviation from one's strategy. Related: cognitive bias, bankroll management, discipline.
Time-Series Split (Ch. 16) -- A cross-validation approach where training data always precedes validation data in time. Essential for any temporal prediction task like sports betting. Related: cross-validation, walk-forward, look-ahead bias.
Total (Ch. 7) -- The combined points/goals scored by both teams. The over/under line set by the sportsbook. Related: over/under, Poisson model.
True Probability (Ch. 1) -- The actual probability of an outcome, as opposed to the implied probability from odds. The difference between true and implied probability defines the edge. Related: implied probability, fair odds, edge.
Type I Error (Ch. 5) -- Rejecting the null hypothesis when it is actually true (false positive). Controlled by the significance level alpha. Related: alpha, p-value, false discovery.
Type II Error (Ch. 5) -- Failing to reject the null hypothesis when it is actually false (false negative). Its probability is denoted beta; statistical power is 1 - beta. Related: power, beta, sample size.
Underdog (Ch. 1) -- The team or competitor expected to lose, offered at longer (higher-paying) odds. Related: favorite, dog, moneyline.
Value Bet (Ch. 1) -- A bet where the true probability of winning is higher than the probability implied by the odds. Related: edge, expected value, positive EV.
Variance (Ch. 3) -- A measure of how spread out a distribution is. In betting, high variance means large swings in bankroll even with a positive edge. Related: standard deviation, sample size, bankroll management.
Vig (Vigorish) (Ch. 1) -- The commission charged by the bookmaker, embedded in the odds. Standard vig on a -110/-110 line is approximately 4.55%. Also called juice or the take. Related: overround, hold, juice.
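The 4.55% figure can be reproduced with a short calculation, assuming it is the bookmaker's margin expressed as a share of total implied probability. The same -110/-110 market has an overround of about 4.76% under the Overround entry's definition. Function names are illustrative.

```python
def overround(decimal_odds) -> float:
    """Total implied probability across all outcomes, minus 1."""
    return sum(1 / d for d in decimal_odds) - 1


def theoretical_hold(decimal_odds) -> float:
    """Margin as a share of total implied probability: 1 - 1 / (sum of 1/d)."""
    return 1 - 1 / sum(1 / d for d in decimal_odds)


side = 1 + 100 / 110  # decimal odds for -110
market = [side, side]
print(round(overround(market), 4))         # 0.0476
print(round(theoretical_hold(market), 4))  # 0.0455
```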
Walk-Forward Analysis (Ch. 16) -- A backtesting methodology where the model is trained only on data available up to each prediction point, then tested on subsequent data. The gold standard for temporal evaluation. Related: backtest, time-series split, look-ahead bias.
WAR (Wins Above Replacement) (Ch. 22, 23) -- A comprehensive metric estimating the total number of wins a player contributes above a freely available replacement-level player. Exists for baseball (bWAR, fWAR), basketball, and hockey (GAR). Related: RAPTOR, PER, player value.
Win Probability (Ch. 25) -- The estimated probability that a team will win the game given the current game state (score, time remaining, possession, etc.). Related: WPA, in-play, expected value.
Win Probability Added (WPA) (Ch. 25) -- The change in win probability attributable to a single play or event. Captures the in-game impact of each action. Related: EPA, win probability, leverage.
Wong Teaser (Ch. 29) -- A teaser betting strategy that targets spreads crossing key numbers (particularly 3 and 7 in NFL) where the additional points are most valuable. Named after Stanford Wong. Related: teaser, key number, parlay.
XGBoost (Ch. 16) -- An optimized gradient boosting library known for speed and performance. Widely used in sports prediction competitions and production models. Related: gradient boosting, LightGBM, CatBoost.
Yield (Ch. 38) -- Profit expressed as a percentage of total amount staked. Equivalent to ROI. A 3% yield means $3 profit per $100 wagered. Related: ROI, edge, profit.
Z-Score (Ch. 3, 5) -- The number of standard deviations a data point is from the mean: z = (x - mu) / sigma. Used for standardizing variables and hypothesis testing. Related: standard normal, p-value, confidence interval.
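As one application, a betting record can be turned into a z-score using the normal approximation to the binomial: x is the number of wins, mu = n * p0, and sigma = sqrt(n * p0 * (1 - p0)). The function name and the 50% null win rate are assumptions for this sketch.

```python
import math


def win_rate_z_score(wins: int, n_bets: int, p0: float = 0.5) -> float:
    """z = (x - mu) / sigma for wins under a Binomial(n_bets, p0) null."""
    if n_bets <= 0 or not 0 < p0 < 1:
        raise ValueError("need n_bets > 0 and 0 < p0 < 1")
    mu = n_bets * p0
    sigma = math.sqrt(n_bets * p0 * (1 - p0))
    return (wins - mu) / sigma


# 550 wins in 1,000 even-money bets against a 50% null
print(round(win_rate_z_score(550, 1000), 3))  # 3.162
```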
This glossary covers the principal terms across all 42 chapters. For mathematical notation, see Appendix F. For full definitions of statistical concepts, see Appendix A.