Chapter 22: Quiz - Machine Learning Applications
Instructions
Choose the best answer for each question. Questions cover ML fundamentals, supervised learning, ensemble methods, clustering, neural networks, and evaluation.
Section 1: Machine Learning Fundamentals (Questions 1-8)
Question 1
Which ML task type is appropriate for predicting whether a team will win or lose?
A) Regression B) Classification C) Clustering D) Dimensionality reduction
Question 2
The main advantage of machine learning over traditional statistics for football analytics is:
A) ML requires less data B) ML automatically discovers complex patterns and interactions C) ML models are always more accurate D) ML doesn't require domain knowledge
Question 3
Data leakage in football ML occurs when:
A) The model uses too many features B) Future information is used to predict past outcomes C) The training set is too small D) The model is overfit
Question 4
For game outcome prediction, which train/test split strategy is most appropriate?
A) Random 80/20 split B) Stratified random split C) Temporal split (train on past, test on future) D) Leave-one-out cross-validation
Question 5
Which feature would likely cause data leakage for predicting game outcomes?
A) Home team's Elo rating before the game B) Season-ending team record C) Opponent's defensive ranking entering the game D) Home field advantage indicator
Question 6
The curse of dimensionality in football ML refers to:
A) Having too many games to process B) Performance degradation when there are too many features relative to the number of samples C) Difficulty in storing large datasets D) Slow model training times
Question 7
Feature engineering in football analytics:
A) Is unnecessary with modern ML algorithms B) Converts raw data into meaningful predictive signals C) Always improves model performance D) Should be avoided to prevent overfitting
Question 8
Which statement about football ML is TRUE?
A) More complex models always perform better B) Ensemble methods never outperform single models C) Domain knowledge should guide feature creation D) Neural networks are always the best choice
Section 2: Supervised Learning (Questions 9-17)
Question 9
Logistic regression is appropriate for game prediction because:
A) It outputs exact point spreads B) It naturally produces probabilities between 0 and 1 C) It requires the least amount of data D) It captures all non-linear relationships
Question 10
Random Forest classifiers work by:
A) Training a single decision tree with many features B) Training multiple trees on random subsets of data and features C) Using random weights for each prediction D) Randomly selecting which games to predict
Question 11
Gradient boosting improves on random forests by:
A) Training trees in parallel rather than sequence B) Sequentially training trees to correct previous errors C) Using larger trees D) Requiring less hyperparameter tuning
Question 12
The learning rate parameter in XGBoost controls:
A) How fast the model makes predictions B) How much each tree contributes to the final prediction C) The number of trees to train D) The depth of each tree
Question 13
Which XGBoost parameter helps prevent overfitting?
A) n_estimators B) objective C) max_depth D) random_state
Question 14
For predicting point spreads (continuous), which model type is appropriate?
A) Logistic regression B) Random forest classifier C) Gradient boosting regressor D) K-means clustering
Question 15
Early stopping in model training:
A) Stops training when validation error starts increasing B) Stops training after a fixed number of epochs C) Stops training when accuracy reaches 100% D) Is only used for neural networks
Question 16
The subsample parameter in XGBoost:
A) Determines test set size B) Controls fraction of samples used per tree C) Sets the number of features to use D) Defines output sample rate
Question 17
Which regularization approach is used by Ridge regression?
A) L1 (sum of absolute coefficients) B) L2 (sum of squared coefficients) C) Dropout D) Early stopping
Section 3: Ensemble Methods (Questions 18-24)
Question 18
Voting ensembles combine multiple models by:
A) Training one model on predictions of others B) Averaging or voting on individual model predictions C) Selecting the best-performing model D) Concatenating model outputs
Question 19
Stacking differs from voting by:
A) Using fewer base models B) Training a meta-learner on base model predictions C) Only working with neural networks D) Requiring identical base models
Question 20
For soft voting, ensemble predictions are calculated using:
A) Majority vote of predicted classes B) Average of predicted probabilities C) Maximum probability across models D) Minimum probability across models
Question 21
When building weighted ensembles, weights should be based on:
A) Model complexity B) Training time C) Validation set performance D) Random assignment
Question 22
The primary benefit of ensemble methods is:
A) Faster training time B) Better interpretability C) Reduced variance and improved generalization D) Simpler model architecture
Question 23
Which statement about ensemble diversity is TRUE?
A) All base models should use the same algorithm B) Diversity among base models improves ensemble performance C) Only two models are needed for effective ensembles D) Correlated models make better ensembles
Question 24
A stacking ensemble with passthrough=True:
A) Skips the meta-learner B) Includes original features with base model predictions C) Uses base models in parallel only D) Prevents overfitting automatically
Section 4: Unsupervised Learning (Questions 25-29)
Question 25
K-means clustering requires:
A) Labeled training data B) Pre-specified number of clusters C) Normally distributed features D) Binary outcomes
Question 26
The silhouette score measures:
A) Number of clusters B) How well samples fit their assigned cluster vs. others C) Training time D) Feature importance
Question 27
For discovering player archetypes, which technique is appropriate?
A) Logistic regression B) K-means or GMM clustering C) Linear regression D) Decision trees
Question 28
The elbow method helps determine:
A) Optimal regularization strength B) Optimal number of clusters C) Optimal learning rate D) Optimal train/test split
Question 29
Before clustering player statistics, you should:
A) Remove all outliers B) Standardize/scale features C) Use raw statistics directly D) Reduce to one feature
Section 5: Neural Networks (Questions 30-33)
Question 30
In a feed-forward neural network for classification:
A) The output layer uses linear activation B) The output layer uses sigmoid or softmax activation C) Hidden layers use sigmoid activation D) No activation functions are used
Question 31
Dropout regularization:
A) Removes features during training B) Randomly deactivates neurons during training C) Reduces the number of layers D) Only applies to recurrent networks
Question 32
Batch normalization in neural networks:
A) Reduces batch size B) Normalizes layer inputs to stabilize training C) Is only used in image models D) Replaces activation functions
Question 33
LSTM networks are appropriate for football applications involving:
A) Single game prediction B) Sequential data like play-by-play C) Player clustering D) Feature selection
Section 6: Model Evaluation (Questions 34-40)
Question 34
The Brier score for probabilistic predictions:
A) Ranges from -1 to 1 B) Is the mean squared error between predictions and outcomes C) Should be maximized D) Only applies to regression
Question 35
A model with AUC = 0.85 means:
A) It predicts 85% of games correctly B) It ranks a randomly chosen positive example above a randomly chosen negative one 85% of the time C) 85% of predictions are calibrated D) The false positive rate is 15%
Question 36
Expected Calibration Error (ECE) measures:
A) How well probabilities match observed frequencies B) Model training efficiency C) Feature correlation D) Overfitting degree
Question 37
A calibrated model predicting 70% win probability should:
A) Always be correct B) Win about 70% of the time across similar predictions C) Never be wrong D) Have exactly 70% accuracy overall
Question 38
When comparing models, you should primarily consider:
A) Training speed B) Performance on held-out test data C) Number of parameters D) Model popularity
Question 39
For football game prediction, which metric combination is most informative?
A) Accuracy only B) Accuracy, AUC, and Brier score C) Training loss only D) F1 score only
Question 40
Isotonic regression calibration:
A) Retrains the entire model B) Maps raw predictions to better-calibrated probabilities C) Always improves accuracy D) Requires labeled clusters
Answer Key
Section 1: Fundamentals
- 1. B - Classification (binary outcome)
- 2. B - Automatic pattern discovery
- 3. B - Future information used for past predictions
- 4. C - Temporal split preserves time ordering
- 5. B - Season-ending record uses future information
- 6. B - Too many features relative to samples
- 7. B - Converts raw data to predictive signals
- 8. C - Domain knowledge guides feature creation
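The temporal split from Question 4 can be sketched in a few lines. This is an illustrative example, not code from the chapter; the field names (`season`, `elo_diff`, `won`) and the sample games are made up. Training only on seasons strictly before the test season is what prevents the leakage described in Questions 3 and 5.

```python
# Sketch of a temporal train/test split for game prediction:
# train on past seasons, test on the most recent one, so no
# future information leaks into training.
games = [
    {"season": 2020, "elo_diff": 55, "won": 1},
    {"season": 2020, "elo_diff": -30, "won": 0},
    {"season": 2021, "elo_diff": 10, "won": 1},
    {"season": 2021, "elo_diff": -80, "won": 0},
    {"season": 2022, "elo_diff": 40, "won": 1},
    {"season": 2022, "elo_diff": 5, "won": 0},
]

def temporal_split(rows, test_season):
    """Train on every season before test_season, test on test_season."""
    train = [r for r in rows if r["season"] < test_season]
    test = [r for r in rows if r["season"] == test_season]
    return train, test

train, test = temporal_split(games, test_season=2022)
assert all(r["season"] < 2022 for r in train)  # no future data in training
```

A random 80/20 split over the same rows would mix 2022 games into training, quietly giving the model information from the period it is being tested on.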
Section 2: Supervised Learning
- 9. B - Outputs probabilities between 0 and 1
- 10. B - Multiple trees on random subsets
- 11. B - Sequential training to correct errors
- 12. B - Contribution of each tree
- 13. C - max_depth limits tree complexity
- 14. C - Gradient boosting regressor for continuous
- 15. A - Stops when validation error increases
- 16. B - Fraction of samples per tree
- 17. B - L2 regularization
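The early-stopping rule from Question 15 can be sketched without any ML library. This is a minimal illustration, not a framework API: given one validation error per epoch, it returns the epoch of the best model, stopping after `patience` consecutive epochs without improvement.

```python
def train_with_early_stopping(val_errors, patience=2):
    """Return the epoch of the lowest validation error seen before
    `patience` consecutive non-improving epochs trigger a stop."""
    best, best_epoch, bad = float("inf"), 0, 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, best_epoch, bad = err, epoch, 0
        else:
            bad += 1
            if bad >= patience:
                break
    return best_epoch

# Validation error falls, then rises: keep the epoch-3 model.
assert train_with_early_stopping([0.9, 0.7, 0.6, 0.55, 0.58, 0.61]) == 3
```

XGBoost exposes the same idea through its early-stopping options, halting boosting rounds once a validation metric stops improving.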
Section 3: Ensemble Methods
- 18. B - Averaging or voting predictions
- 19. B - Trains meta-learner on base predictions
- 20. B - Average of probabilities
- 21. C - Based on validation performance
- 22. C - Reduced variance, better generalization
- 23. B - Diversity improves performance
- 24. B - Includes original features with predictions
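Soft voting (Question 20) and validation-weighted ensembling (Question 21) reduce to a few array operations. The per-game probabilities and validation accuracies below are invented for illustration.

```python
import numpy as np

# Hypothetical win probabilities for three games from three base models.
p_logreg = np.array([0.70, 0.40, 0.55])
p_forest = np.array([0.60, 0.30, 0.65])
p_xgb    = np.array([0.80, 0.50, 0.60])

# Soft voting: average the predicted probabilities.
soft_vote = (p_logreg + p_forest + p_xgb) / 3

# Weighted variant: weight each model by its validation accuracy.
val_acc = np.array([0.62, 0.65, 0.68])       # hypothetical
weights = val_acc / val_acc.sum()            # normalize to sum to 1
weighted_vote = weights @ np.vstack([p_logreg, p_forest, p_xgb])
```

Hard (majority) voting would instead threshold each model's probabilities at 0.5 and take the most common class, discarding the confidence information that soft voting preserves.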
Section 4: Unsupervised Learning
- 25. B - Pre-specified number of clusters
- 26. B - Cluster fit quality measure
- 27. B - K-means or GMM clustering
- 28. B - Optimal number of clusters
- 29. B - Standardize/scale features
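Question 29's answer is easy to see numerically. In this sketch (the stat columns are illustrative), passing yards are in the thousands while yards per attempt are single digits; without scaling, the yardage column would dominate K-means' Euclidean distances.

```python
import numpy as np

# Columns: [passing yards, yards per attempt] for three players.
X = np.array([
    [4500.0, 7.8],
    [3200.0, 6.9],
    [3900.0, 7.3],
])

# Z-score standardization: each column gets mean 0, std 1,
# so both stats contribute comparably to cluster distances.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

assert np.allclose(X_scaled.mean(axis=0), 0)
assert np.allclose(X_scaled.std(axis=0), 1)
```

In practice this is what a scaler applied before the clustering step accomplishes.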
Section 5: Neural Networks
- 30. B - Sigmoid or softmax for classification
- 31. B - Randomly deactivates neurons
- 32. B - Normalizes layer inputs
- 33. B - Sequential data like play-by-play
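Dropout (Question 31) is just a random mask applied during training. This sketch shows the "inverted dropout" convention, where survivors are scaled by 1/(1-p) so the expected activation is unchanged; it is an illustration, not a deep-learning framework API.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p=0.5, training=True):
    """Zero each activation with probability p during training,
    scaling survivors by 1/(1-p); pass through at inference."""
    if not training:
        return activations
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)

h = np.ones(8)
h_train = dropout(h, p=0.5, training=True)   # entries are 0 or 2
h_eval  = dropout(h, p=0.5, training=False)  # unchanged at inference
```

Because different neurons are silenced each batch, no single neuron can be relied upon, which discourages co-adaptation and reduces overfitting.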
Section 6: Model Evaluation
- 34. B - Mean squared error of predictions
- 35. B - Ranks positive higher 85% of time
- 36. A - Probability-frequency alignment
- 37. B - Win about 70% across similar predictions
- 38. B - Performance on held-out test data
- 39. B - Accuracy, AUC, and Brier score
- 40. B - Maps to better-calibrated probabilities
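The Brier score (Question 34) and the calibration idea (Question 37) are both one-liners over predicted probabilities and 0/1 outcomes. The five predictions below are made up for illustration; lower Brier is better, and always predicting 0.5 scores 0.25.

```python
import numpy as np

probs    = np.array([0.9, 0.7, 0.8, 0.3, 0.6])  # predicted win probabilities
outcomes = np.array([1,   1,   0,   0,   1])    # actual results

# Brier score: mean squared error between probabilities and outcomes.
brier = np.mean((probs - outcomes) ** 2)

# Calibration check: among games where the model said roughly 70%,
# the predicted team should win about 70% of the time. With only five
# games this bucket is tiny; with real data, its mean should be near 0.70.
bucket = outcomes[(probs >= 0.65) & (probs <= 0.75)]
```

Expected Calibration Error (Question 36) generalizes this check: bin all predictions, compare each bin's average probability to its observed win rate, and average the gaps.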
Scoring Guide
- 35-40 correct: Excellent! Ready for production ML systems
- 28-34 correct: Good understanding, review specific areas
- 21-27 correct: Solid foundation, more practice needed
- Below 21: Review chapter material before proceeding