Appendix B: Glossary of Terms

Football Analytics Terms

A

Adjusted Net Yards per Attempt (ANY/A) A quarterback metric that accounts for touchdowns, interceptions, and sacks. Formula: (Passing Yards - Sack Yards + 20×TD - 45×INT) / (Pass Attempts + Sacks)
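The formula above can be sketched as a small function. The function name and the season line are hypothetical, chosen only to illustrate the arithmetic.

```python
def any_a(pass_yards, sack_yards, touchdowns, interceptions, attempts, sacks):
    """Adjusted Net Yards per Attempt, following the glossary formula."""
    dropbacks = attempts + sacks
    if dropbacks <= 0:
        raise ValueError("attempts + sacks must be positive")
    adjusted_yards = pass_yards - sack_yards + 20 * touchdowns - 45 * interceptions
    return adjusted_yards / dropbacks


# Hypothetical season: 4000 yards, 200 sack yards, 30 TD, 10 INT, 550 attempts, 30 sacks
example = any_a(4000, 200, 30, 10, 550, 30)  # 3950 adjusted yards over 580 dropbacks
```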

Air Yards The distance the ball travels in the air on a pass play, measured from the line of scrimmage to where the receiver catches (or would catch) the ball.

B

Backpressure In streaming systems, a mechanism to slow data producers when consumers cannot keep up with the processing rate.
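A minimal sketch of the idea, assuming a bounded in-memory buffer stands in for the channel between producer and consumer. The helper name try_produce is hypothetical; real streaming systems expose their own backpressure signals.

```python
import queue

# Bounded buffer between a producer and a consumer (capacity chosen for illustration).
buffer = queue.Queue(maxsize=2)


def try_produce(item):
    """Return True if the item was accepted; False tells the producer to slow down."""
    try:
        buffer.put_nowait(item)
        return True
    except queue.Full:
        return False


accepted = [try_produce(i) for i in range(3)]  # the third item finds the buffer full
buffer.get_nowait()                            # the consumer catches up by one item
accepted_after = try_produce(99)               # room is available again
```

Calling buffer.put(item) without the nowait variant would instead block the producer until space frees up, which is the simplest form of backpressure.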

Brier Score A metric for evaluating probabilistic predictions. Calculated as the mean squared error between predicted probabilities and actual outcomes (0 or 1). Lower is better.
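As a short illustration of the definition above, the sketch below averages the squared gaps between predicted probabilities and 0/1 outcomes. The sample predictions are made up.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical win probabilities and whether each team actually won (1) or lost (0)
predicted = [0.9, 0.7, 0.2, 0.5]
actual = [1, 1, 0, 0]
score = brier_score(predicted, actual)  # (0.01 + 0.09 + 0.04 + 0.25) / 4
```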

C

Calibration The property of a probabilistic model where predicted probabilities match observed frequencies. A well-calibrated win probability model predicting 70% should see teams win approximately 70% of the time.
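One common way to check calibration is to group predictions into probability bins and compare each bin's average prediction with its observed win rate. The sketch below assumes equal-width bins; the function name and sample data are hypothetical.

```python
from collections import defaultdict


def calibration_table(probabilities, outcomes, n_bins=10):
    """Map bin index -> (mean predicted probability, observed rate, count)."""
    bins = defaultdict(list)
    for p, o in zip(probabilities, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, o))
    table = {}
    for index in sorted(bins):
        pairs = bins[index]
        count = len(pairs)
        mean_predicted = sum(p for p, _ in pairs) / count
        observed_rate = sum(o for _, o in pairs) / count
        table[index] = (mean_predicted, observed_rate, count)
    return table


# Hypothetical predictions: five in the 70-80% bin, one in the 20-30% bin
table = calibration_table([0.72, 0.75, 0.78, 0.74, 0.71, 0.22], [1, 1, 0, 1, 1, 0])
```

For a well-calibrated model the first two numbers in each row should be close once a bin holds enough games.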

Completion Percentage Over Expected (CPOE) The difference between a quarterback's actual completion percentage and expected completion percentage based on factors like distance, coverage, and receiver separation.

Coverage Defensive scheme for defending passing plays. Common types include Cover 0 (man, no safety help), Cover 1 (man with single high safety), Cover 2 (two deep safeties), Cover 3 (three deep defenders), and Cover 4 (four deep defenders).

D

Defensive Havoc Rate Percentage of plays on which the defense records a tackle for loss (including sacks), a forced fumble, or a pass defensed (an interception or pass breakup).

Down One of up to four plays (first, second, third, and fourth down) a team gets to advance the ball 10 yards. Gaining the 10 yards earns a new first down.

Dropback When a quarterback moves backward from the line of scrimmage to throw a pass.

DVOA (Defense-adjusted Value Over Average) Football Outsiders metric measuring team and player efficiency compared to league average, adjusted for opponent strength.

E

EPA (Expected Points Added) The change in expected points resulting from a play. Positive EPA indicates a successful play for the offense.

Expected Points (EP) The average number of points a team is expected to score from a given field position, down, and distance situation.

Explosive Play A play gaining a large number of yards, typically defined as 20+ yards for passes or 10-15+ yards for runs.

F

Feature Engineering The process of creating new variables (features) from raw data for use in machine learning models.

Field Position Location on the field, typically expressed as yards from the team's own goal line or opponent's goal line.

Fourth-Down Decision The choice between attempting to convert, kicking a field goal, or punting on fourth down.

G

Game State The complete situation at any point in a game, including score, time remaining, field position, down and distance, and possession.

H

Horizontal Pod Autoscaler (HPA) Kubernetes feature that automatically adjusts the number of pod replicas based on resource utilization or custom metrics.

I

Idempotent A property where an operation produces the same result regardless of how many times it is performed.
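A minimal sketch contrasting an idempotent write with a non-idempotent one. The function names and the play record are hypothetical; the point is that a keyed write can be retried safely while an append cannot.

```python
def record_play(store, play_id, play):
    """Idempotent: writing the same play twice leaves the store unchanged."""
    store[play_id] = play
    return store


def append_play(log, play):
    """Not idempotent: every retry adds another copy."""
    log.append(play)
    return log


play = {"down": 1, "distance": 10, "yards_gained": 6}  # made-up play record

store = {}
record_play(store, "game1-play42", play)
record_play(store, "game1-play42", play)  # a retry after a timeout, for example

log = []
append_play(log, play)
append_play(log, play)  # the same retry now duplicates the play
```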

K

Key Performance Indicator (KPI) A metric used to evaluate progress toward an objective.

L

Latency The time delay between an event occurring and the system responding to it.

Leverage Index (LI) A measure of the importance of a game situation. Higher leverage means the current play has greater impact on win probability.

Line of Scrimmage (LOS) The yard line where the ball is placed to start a play.

Logistic Regression A statistical model that predicts the probability of a binary outcome.
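To make the entry concrete: a logistic regression passes a weighted sum of features through the logistic (sigmoid) function to produce a probability. The coefficients and feature choices below are invented for illustration and are not fitted to any data.

```python
import math


def predict_probability(features, weights, intercept):
    """Sigmoid of intercept + weighted sum of features."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for (score margin, minutes remaining)
weights = [0.15, -0.01]
p_home_win = predict_probability([7, 20], weights, intercept=0.0)
p_even = predict_probability([0, 0], weights, intercept=0.0)  # z = 0 gives 0.5
```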

M

Microservices An architectural style where applications are structured as a collection of loosely coupled, independently deployable services.

N

Neutral Site Game A game played at a venue that is home to neither team.

O

Odds Ratio The ratio of the odds of an event occurring in one group to the odds of it occurring in another group.
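A short worked example, assuming the two groups are described by their event probabilities. The conversion rates are hypothetical.

```python
def odds(probability):
    """Convert a probability in (0, 1) to odds."""
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    return probability / (1 - probability)


def odds_ratio(p_group_a, p_group_b):
    """Odds of the event in group A divided by the odds in group B."""
    return odds(p_group_a) / odds(p_group_b)


# Hypothetical: teams convert 75% of 4th-and-1 attempts but 50% of 4th-and-3 attempts
ratio = odds_ratio(0.75, 0.50)  # odds of 3 to 1 versus 1 to 1
```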

P

P-Value The probability of observing results at least as extreme as the actual results, assuming the null hypothesis is true.

PAR (Points Above Replacement) A metric measuring a player's value compared to a replacement-level player.

Personnel Grouping The combination of player positions on the field for a play. Example: 11 personnel = 1 RB, 1 TE, 3 WR.
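The two-digit code follows a simple rule: the first digit is the number of running backs, the second is the number of tight ends, and wide receivers fill the remaining skill positions. The sketch below assumes five eligible skill players (11 players minus the quarterback and five linemen); the function name is hypothetical.

```python
def parse_personnel(code):
    """Decode a two-digit personnel code such as '11' into position counts."""
    if len(code) != 2 or not code.isdigit():
        raise ValueError("expected a two-digit code such as '11'")
    running_backs, tight_ends = int(code[0]), int(code[1])
    wide_receivers = 5 - running_backs - tight_ends  # assumes five skill players
    if wide_receivers < 0:
        raise ValueError("more than five skill players implied")
    return {"RB": running_backs, "TE": tight_ends, "WR": wide_receivers}


eleven = parse_personnel("11")
twelve = parse_personnel("12")
```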

Play Action A play where the quarterback fakes a handoff before attempting to pass.

Points Per Drive Average points scored per offensive drive.

Pressure Rate Percentage of dropbacks where the quarterback is pressured by the defense.

Q

QBR (Total Quarterback Rating) ESPN's proprietary metric for evaluating quarterback performance, incorporating EPA and game context. Not to be confused with the NFL's traditional passer rating.

Quartile One of the three values that divide a sorted distribution into four equal-sized parts.

R

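R-Squared (R²) The proportion of variance in the dependent variable explained by the independent variables. Ranges from 0 to 1 for a least-squares model with an intercept evaluated on its training data; it can be negative when a model predicts new data worse than the mean would.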
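A minimal sketch of the calculation, using 1 minus the ratio of residual to total sum of squares. The numbers are made up. When predictions come from somewhere other than a least-squares fit to the same data, this quantity can drop below 0, as the last line shows.

```python
def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    if ss_total == 0:
        raise ValueError("actual values have no variance")
    return 1 - ss_residual / ss_total


actual = [1, 2, 3, 4]
good_fit = r_squared(actual, [1.1, 1.9, 3.2, 3.8])  # residual 0.10 against total 5.0
mean_only = r_squared(actual, [2.5, 2.5, 2.5, 2.5])  # predicting the mean explains nothing
reversed_fit = r_squared(actual, [4, 3, 2, 1])       # worse than the mean, so negative
```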

Rate Limiting Controlling the rate of requests to a service to prevent overload.

Red Zone The area between the opponent's 20-yard line and goal line.

Regression to the Mean The phenomenon where extreme observations tend to be followed by more average observations.

RMSE (Root Mean Squared Error) A measure of prediction error: the square root of the mean of squared differences between predicted and actual values. Lower is better.
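The definition translates directly into a few lines. The point totals below are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Square root of the mean squared difference between predictions and actuals."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mean_squared_error)


# Hypothetical points scored versus a model's predictions
actual_points = [7, 3, 10, 21]
predicted_points = [4, 7, 10, 21]
error = rmse(actual_points, predicted_points)  # errors of 3, -4, 0, 0
```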

S

Sack When the quarterback is tackled behind the line of scrimmage while attempting to pass.

Separation The distance between a receiver and the nearest defender, typically measured at the time of catch or throw.

SQL (Structured Query Language) A programming language for managing and querying relational databases.

Standard Deviation A measure of the amount of variation in a set of values.

Streaming Data Data that is generated continuously and processed as it arrives, rather than collected and processed in batches.

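Success Rate Percentage of plays deemed successful based on down-specific criteria. A common convention counts a play as successful if it gains at least 40% of the needed yards on 1st down, 60% on 2nd down, or 100% on 3rd or 4th down; exact thresholds vary by source.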
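A small sketch of the thresholds quoted in this entry. The play tuples are hypothetical, and integer percentages are used so that a gain landing exactly on a threshold is not lost to floating-point rounding.

```python
# Percent of the needed yards required for success, by down (from this glossary entry)
REQUIRED_PERCENT = {1: 40, 2: 60, 3: 100, 4: 100}


def is_successful(down, yards_to_go, yards_gained):
    """True if the play gained at least the required share of the needed yards."""
    return yards_gained * 100 >= REQUIRED_PERCENT[down] * yards_to_go


def success_rate(plays):
    """Share of (down, yards_to_go, yards_gained) plays that were successful."""
    if not plays:
        raise ValueError("no plays supplied")
    return sum(is_successful(*play) for play in plays) / len(plays)


# Hypothetical drive: (down, yards to go, yards gained)
drive = [(1, 10, 4), (2, 6, 3), (3, 3, 3), (1, 10, 2)]
rate = success_rate(drive)  # the first and third plays meet their thresholds
```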

T

Target When a receiver is the intended recipient of a pass.

Time Series A sequence of data points indexed in time order.

True Positive Rate The proportion of actual positives correctly identified by a classifier. Also called sensitivity or recall.
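In terms of counts, this is true positives divided by the sum of true positives and false negatives. A brief sketch with made-up labels:

```python
def true_positive_rate(actual, predicted):
    """TP / (TP + FN) for 0/1 labels."""
    true_positives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    false_negatives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    if true_positives + false_negatives == 0:
        raise ValueError("no actual positives in the data")
    return true_positives / (true_positives + false_negatives)


# Hypothetical labels: 1 = pass play, 0 = run play
actual = [1, 1, 1, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 1]
tpr = true_positive_rate(actual, predicted)  # 3 of the 4 actual passes were caught
```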

Turnover When possession changes because the offense loses a fumble or throws an interception.

V

Variance A measure of the spread of a distribution, calculated as the average of squared deviations from the mean.

W

WebSocket A protocol enabling full-duplex communication channels over a single TCP connection.

Win Probability (WP) The probability that a team will win based on the current game state.

Win Probability Added (WPA) The change in win probability resulting from a play. WPA = WP_after - WP_before.
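Applied play by play, the formula yields one WPA value per play, and the values sum to the total change in win probability over the sequence. The win probabilities below are hypothetical.

```python
def wpa_series(win_probabilities):
    """WPA for each play, given the win probability before the first play and after each play."""
    return [after - before
            for before, after in zip(win_probabilities, win_probabilities[1:])]


# Hypothetical WP before a drive and after each of its three plays
wp = [0.50, 0.55, 0.48, 0.62]
wpa = wpa_series(wp)
total_wpa = sum(wpa)  # equals the final WP minus the starting WP
```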

Y

Yards After Catch (YAC) The distance a receiver advances after catching the ball.

Yard Line Position on the field, typically measured in yards from the team's own goal line (1-50) or from the opponent's goal line (opponent's 50 to opponent's 1).
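For analysis it is often convenient to put both halves of the field on a single scale. The sketch below converts a side-and-yard-line pair to yards from the offense's own goal line; the function name and the "own"/"opp" labels are assumptions made for illustration.

```python
def yards_from_own_goal(side, yard_line):
    """Convert ('own', 35) or ('opp', 35) to a 1-99 distance from the offense's own goal line."""
    if not 1 <= yard_line <= 50:
        raise ValueError("yard_line must be between 1 and 50")
    if side == "own":
        return yard_line
    if side == "opp":
        return 100 - yard_line
    raise ValueError("side must be 'own' or 'opp'")


own_35 = yards_from_own_goal("own", 35)
opp_35 = yards_from_own_goal("opp", 35)
red_zone_edge = yards_from_own_goal("opp", 20)  # the opponent's 20-yard line
```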

Z

Z-Score The number of standard deviations an observation is from the mean.
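A minimal sketch using the standard library. It assumes the population standard deviation; using the sample standard deviation instead (statistics.stdev) is an equally common choice. The data are made up.

```python
import statistics


def z_scores(values):
    """(value - mean) / population standard deviation, for each value."""
    mean = statistics.fmean(values)
    std_dev = statistics.pstdev(values)
    if std_dev == 0:
        raise ValueError("all values are identical")
    return [(v - mean) / std_dev for v in values]


# Hypothetical yards per carry for eight backs: mean 5, population standard deviation 2
scores = z_scores([2, 4, 4, 4, 5, 5, 7, 9])
```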


Statistical Terms

Descriptive Statistics

  • Mean: Average value
  • Median: Middle value when sorted
  • Mode: Most frequent value
  • Range: Difference between maximum and minimum
  • Interquartile Range (IQR): Difference between 75th and 25th percentiles
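The measures above are all available from Python's standard library. The yardage values are hypothetical; note how a single 40-yard play pulls the mean well above the median.

```python
import statistics

# Hypothetical yards gained on eight plays
yards = [2, 3, 3, 5, 7, 8, 12, 40]

mean = statistics.mean(yards)
median = statistics.median(yards)
mode = statistics.mode(yards)
value_range = max(yards) - min(yards)

# quantiles(n=4) returns the three quartile cut points; the default method is "exclusive"
q1, q2, q3 = statistics.quantiles(yards, n=4)
iqr = q3 - q1
```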

Probability

  • Probability Distribution: Function describing the likelihood of different outcomes
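  • Normal Distribution: Symmetric, bell-shaped distribution defined by its mean and standard deviation
  • Binomial Distribution: Distribution of the number of successes in n independent trials with the same success probability
  • Poisson Distribution: Distribution of the number of events in a fixed interval when events occur independently at a constant average rate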

Inference

  • Confidence Interval: Range of values that, at a stated confidence level (e.g., 95%), is expected to contain the true parameter
  • Hypothesis Test: Procedure for testing claims about populations
  • Type I Error: Rejecting a true null hypothesis (false positive)
  • Type II Error: Failing to reject a false null hypothesis (false negative)

Machine Learning

  • Supervised Learning: Learning from labeled data
  • Unsupervised Learning: Finding patterns in unlabeled data
  • Overfitting: Model performs well on training data but poorly on new data
  • Cross-Validation: Technique for evaluating model performance on different data subsets
  • Feature: Input variable for a model
  • Label/Target: Output variable to predict

Technical Terms

Data Engineering

  • ETL: Extract, Transform, Load - process for moving data
  • Data Pipeline: Automated workflow for processing data
  • Data Lake: Repository storing raw data in native format
  • Data Warehouse: Repository storing structured, processed data

Software Engineering

  • API (Application Programming Interface): Interface for software communication
  • REST (Representational State Transfer): Architectural style for web services
  • JSON (JavaScript Object Notation): Data interchange format
  • Git: Version control system
  • Docker: Container platform for application deployment
  • Kubernetes: Container orchestration platform

Cloud Computing

  • AWS: Amazon Web Services
  • GCP: Google Cloud Platform
  • Azure: Microsoft's cloud platform
  • Serverless: Computing model where the cloud provider provisions and manages the servers

Acronyms

Acronym   Full Term
API       Application Programming Interface
AWS       Amazon Web Services
CFB       College Football
CPOE      Completion Percentage Over Expected
CPU       Central Processing Unit
CSV       Comma-Separated Values
DB        Database
DVOA      Defense-adjusted Value Over Average
EPA       Expected Points Added
ETL       Extract, Transform, Load
FBS       Football Bowl Subdivision
FCS       Football Championship Subdivision
GB        Gigabyte
GPU       Graphics Processing Unit
HTML      Hypertext Markup Language
HTTP      Hypertext Transfer Protocol
JSON      JavaScript Object Notation
LI        Leverage Index
ML        Machine Learning
MSE       Mean Squared Error
MVP       Minimum Viable Product
NFL       National Football League
PAR       Points Above Replacement
PFF       Pro Football Focus
RAM       Random Access Memory
REST      Representational State Transfer
RMSE      Root Mean Squared Error
ROC       Receiver Operating Characteristic
SQL       Structured Query Language
SVM       Support Vector Machine
TB        Terabyte
URL       Uniform Resource Locator
WP        Win Probability
WPA       Win Probability Added
YAC       Yards After Catch