Chapter 29 Further Reading: Neural Networks for Sports Prediction

The following annotated bibliography provides resources for deeper exploration of the neural network concepts introduced in Chapter 29. Entries are organized by category and chosen for their relevance to building deep learning models for sports prediction.


Books: Deep Learning Foundations

1. Goodfellow, Ian, Bengio, Yoshua, and Courville, Aaron. Deep Learning. MIT Press, 2016. The definitive textbook on deep learning theory. The chapters on feedforward networks (Chapter 6), regularization (Chapter 7), optimization (Chapter 8), and recurrent networks (Chapter 10) provide rigorous mathematical foundations for every concept in Chapter 29. Freely available at deeplearningbook.org. Essential for understanding why architectures and training procedures work the way they do.

2. Zhang, Aston, Lipton, Zachary C., Li, Mu, and Smola, Alexander J. Dive into Deep Learning. Cambridge University Press, 2023. A practical, code-first deep learning textbook with implementations in PyTorch, TensorFlow, and JAX. The LSTM and sequence model chapters include interactive notebooks that let you experiment with architecture variations. Particularly useful for the hands-on exercises in Chapter 29.

3. Howard, Jeremy and Gugger, Sylvain. Deep Learning for Coders with fastai and PyTorch. O'Reilly Media, 2020. A practitioner-oriented introduction to deep learning that emphasizes transfer learning and entity embeddings. Chapter 9 on tabular data and entity embeddings directly inspired the SportsEmbeddingNet architecture in this chapter. The fastai library's tabular learner demonstrates many of the patterns discussed here.
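
To make the pattern concrete, the sketch below shows the entity-embedding idea in PyTorch. The class name, feature layout, and layer sizes are illustrative stand-ins, not the chapter's actual SportsEmbeddingNet.

    import torch
    import torch.nn as nn

    class EmbeddingTabularNet(nn.Module):
        """Entity-embedding network for tabular matchup data (illustrative)."""

        def __init__(self, n_teams=30, emb_dim=8, n_numeric=10, hidden=64):
            super().__init__()
            # One learned vector per team, replacing one-hot encoding.
            self.team_emb = nn.Embedding(n_teams, emb_dim)
            self.mlp = nn.Sequential(
                nn.Linear(2 * emb_dim + n_numeric, hidden),
                nn.ReLU(),
                nn.Linear(hidden, 1),  # logit for the home-win probability
            )

        def forward(self, home_idx, away_idx, numeric):
            # Concatenate both teams' embeddings with the numeric features.
            x = torch.cat(
                [self.team_emb(home_idx), self.team_emb(away_idx), numeric], dim=1
            )
            return self.mlp(x)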

4. Stevens, Eli, Antiga, Luca, and Viehmann, Thomas. Deep Learning with PyTorch. Manning, 2020. A comprehensive guide to PyTorch implementation, covering tensors, datasets, training loops, and deployment. The chapters on custom datasets and training pipelines provide templates for the SportsModelTrainer class. Excellent for building the implementation skills needed for the code exercises.


Books and Surveys: Neural Networks in Sports

5. Hubáček, Ondřej, Šourek, Gustav, and Železný, Filip. "Exploiting Sports-betting Market Using Machine Learning." International Journal of Forecasting, 35(2), 2019, pp. 783-796. One of the most rigorous studies of machine learning for sports betting, comparing neural networks, random forests, and logistic regression for soccer match prediction. The paper finds that neural networks provide modest improvements when combined with proper feature engineering and calibration, consistent with the chapter's recommendation to use neural networks as complements to, not replacements for, strong baselines.

6. Guo, Cheng and Berkhahn, Felix. "Entity Embeddings of Categorical Variables." arXiv preprint arXiv:1604.06737, 2016. The seminal paper on entity embeddings for tabular data, which won 3rd place in the Kaggle Rossmann Store Sales competition. This paper demonstrates that learned embeddings capture meaningful similarity structures for categorical variables and outperform one-hot encoding. The approach was directly adopted for the SportsEmbeddingNet in this chapter.
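
The paper's central claim is easy to inspect once a model is trained: nearby rows of the embedding matrix correspond to categories the model treats as similar. A self-contained illustration, with an untrained embedding table standing in for a trained one:

    import torch
    import torch.nn.functional as F

    emb = torch.nn.Embedding(30, 8)       # stands in for a trained team-embedding table
    a, b = emb.weight[0], emb.weight[1]   # vectors for two (hypothetical) teams
    print(F.cosine_similarity(a, b, dim=0).item())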

7. Horvat, Tomislav and Job, Josip. "The Use of Machine Learning in Sport Outcome Prediction: A Review." Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 10(5), 2020. A comprehensive review of machine learning applications in sports prediction, covering neural networks, ensemble methods, and deep learning architectures. The survey identifies that neural networks underperform tree-based models on most sports prediction tasks unless embeddings or sequential models are used --- a finding that supports the chapter's decision framework.


Academic Papers: Architecture and Training

8. Hochreiter, Sepp and Schmidhuber, Jürgen. "Long Short-Term Memory." Neural Computation, 9(8), 1997, pp. 1735-1780. The original LSTM paper that introduced the gated cell architecture. While the notation differs from modern presentations, the core insight --- that additive cell state updates preserve gradients over long sequences --- is explained clearly. Essential background for understanding why LSTMs work for sequential sports data.
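
For orientation, a minimal LSTM sequence classifier in modern PyTorch has the following shape; the class name and sizes are illustrative, not the chapter's exact SportsLSTM.

    import torch
    import torch.nn as nn

    class GameSequenceLSTM(nn.Module):
        """Predict an outcome from a window of per-game feature vectors."""

        def __init__(self, n_features=12, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, x):                 # x: (batch, seq_len, n_features)
            # The cell state is updated additively at each step, which is
            # what lets gradients survive across long sequences.
            _, (h_n, _) = self.lstm(x)
            return self.head(h_n[-1])         # logit from the final hidden state

    logits = GameSequenceLSTM()(torch.randn(4, 10, 12))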

9. Bergstra, James and Bengio, Yoshua. "Random Search for Hyper-Parameter Optimization." Journal of Machine Learning Research, 13, 2012, pp. 281-305. Demonstrates that random search is more efficient than grid search for hyperparameter optimization, motivating the move beyond exhaustive grids in modern tuners such as Optuna. The key insight --- that not all hyperparameters are equally important, and random search explores the important dimensions more efficiently --- directly informs the hyperparameter tuning strategy in Section 29.5.
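
A small sketch of the contrast, with assumed hyperparameter ranges: every random trial draws a fresh value along each dimension, whereas a grid of the same budget revisits only a few distinct values per dimension.

    import numpy as np

    rng = np.random.default_rng(0)

    # Twenty random trials probe twenty distinct values of *each*
    # hyperparameter; a 20-point grid would test only a few per dimension.
    trials = [
        {
            "lr": 10 ** rng.uniform(-4, -2),       # log-uniform learning rate
            "dropout": rng.uniform(0.2, 0.5),
            "hidden": int(rng.choice([32, 64, 128])),
        }
        for _ in range(20)
    ]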

10. Akiba, Takuya, Sano, Shotaro, Yanase, Toshihiko, Ohta, Takeru, and Koyama, Masanori. "Optuna: A Next-generation Hyperparameter Optimization Framework." Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019, pp. 2623-2631. The paper introducing Optuna, the hyperparameter optimization framework used in Section 29.5. Describes the Tree-structured Parzen Estimator (TPE) algorithm and the pruning mechanism that enables efficient search. Essential reading for understanding how Optuna makes intelligent exploration decisions.
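
A minimal objective with pruning might look like the following sketch; the train_one_epoch stub is a hypothetical stand-in for a real training epoch, not the chapter's create_optuna_objective.

    import optuna

    def train_one_epoch(lr, dropout, epoch):
        # Stub standing in for a real training epoch; returns a fake loss.
        return (lr - 1e-3) ** 2 + (dropout - 0.3) ** 2 + 1.0 / (epoch + 1)

    def objective(trial):
        lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
        dropout = trial.suggest_float("dropout", 0.2, 0.5)
        for epoch in range(10):
            val_loss = train_one_epoch(lr, dropout, epoch)
            trial.report(val_loss, step=epoch)   # feed the pruner
            if trial.should_prune():             # abandon unpromising trials
                raise optuna.TrialPruned()
        return val_loss

    study = optuna.create_study(direction="minimize")
    study.optimize(objective, n_trials=50)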

11. Ioffe, Sergey and Szegedy, Christian. "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift." Proceedings of the 32nd International Conference on Machine Learning, 2015, pp. 448-456. Introduced batch normalization, a technique used in every feedforward architecture in this chapter. The paper explains how normalizing layer inputs stabilizes training dynamics and acts as an implicit regularizer. While the original "internal covariate shift" explanation has been debated, the practical benefits of batch normalization for tabular neural networks are well-established.
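
In PyTorch the technique is a single layer per hidden block; the sizes below are illustrative.

    import torch.nn as nn

    # Linear -> BatchNorm1d -> ReLU, the standard hidden block for
    # tabular networks; layer sizes here are illustrative.
    block = nn.Sequential(
        nn.Linear(20, 64),
        nn.BatchNorm1d(64),   # normalizes each feature across the batch
        nn.ReLU(),
    )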

12. Srivastava, Nitish, Hinton, Geoffrey, Krizhevsky, Alex, Sutskever, Ilya, and Salakhutdinov, Ruslan. "Dropout: A Simple Way to Prevent Neural Networks from Overfitting." Journal of Machine Learning Research, 15, 2014, pp. 1929-1958. The foundational paper on dropout regularization. Explains the theoretical connection between dropout and model averaging (each training step uses a different subnetwork), and provides guidelines for setting dropout rates. The recommendation of 0.2-0.5 dropout for sports prediction models in this chapter follows directly from this paper's analysis.
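
PyTorch's nn.Dropout makes the subnetwork interpretation visible: in training mode it zeroes a random subset of activations and rescales the survivors by 1/(1-p), while in evaluation mode it is the identity, approximating the average over the sampled subnetworks.

    import torch
    import torch.nn as nn

    drop = nn.Dropout(p=0.3)   # within the paper's 0.2-0.5 range
    x = torch.ones(2, 8)

    drop.train()               # training mode: a random subnetwork each pass,
    print(drop(x))             # with surviving activations scaled by 1/(1-p)

    drop.eval()                # eval mode: identity, approximating the
    print(drop(x))             # average over sampled subnetworks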


Technical Resources and Tutorials

13. PyTorch Official Tutorials (pytorch.org/tutorials). The official PyTorch tutorial collection, including tutorials on custom datasets, training loops, transfer learning, and LSTM text classification. The "Learning PyTorch with Examples" tutorial is the ideal starting point for readers new to PyTorch. The "Sequence Models and Long Short-Term Memory Networks" tutorial maps directly to the SportsLSTM class.

14. PyTorch Lightning Documentation (lightning.ai/docs). PyTorch Lightning provides a structured framework for training loops that reduces boilerplate code. For production sports prediction systems, Lightning's built-in support for checkpointing, logging, early stopping, and distributed training can simplify the SportsModelTrainer class significantly. The documentation includes migration guides for converting raw PyTorch code.
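
For orientation, a minimal LightningModule has the shape sketched below, assuming the pytorch_lightning package; the network and names are illustrative.

    import torch
    import torch.nn as nn
    import pytorch_lightning as pl

    class LitSportsModel(pl.LightningModule):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
            self.loss_fn = nn.BCEWithLogitsLoss()

        def training_step(self, batch, batch_idx):
            x, y = batch
            loss = self.loss_fn(self.net(x), y)
            self.log("train_loss", loss)   # logging handled by Lightning
            return loss

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=1e-3)

    # trainer = pl.Trainer(max_epochs=20)           # add callbacks for checkpointing,
    # trainer.fit(LitSportsModel(), train_loader)   # early stopping, etc.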

15. Optuna Documentation and Examples (optuna.readthedocs.io) The official Optuna documentation with examples covering PyTorch integration, pruning strategies, multi-objective optimization, and visualization of search results. The "PyTorch Simple" example provides a template for the create_optuna_objective function in Section 29.5.

16. Weights and Biases Experiment Tracking (wandb.ai). W&B provides experiment tracking, hyperparameter sweep management, and a model registry. For teams running multiple neural network experiments for sports prediction, W&B's integrations with PyTorch and Optuna enable systematic comparison of architectures and training configurations. The free tier is sufficient for individual practitioners.
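
Basic tracking takes only a few calls; the project name and metrics below are hypothetical.

    import wandb

    run = wandb.init(project="sports-prediction",        # hypothetical names
                     config={"lr": 1e-3, "dropout": 0.3})
    for epoch in range(10):
        val_loss = 1.0 / (epoch + 1)                     # stand-in metric
        wandb.log({"epoch": epoch, "val_loss": val_loss})
    run.finish()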


Data Sources and Tools

17. NBA API and Basketball Reference (basketball-reference.com). Basketball Reference provides comprehensive historical statistics, advanced metrics, and game logs for all NBA teams and players. Combined with the nba_api Python package, these sources provide the raw data needed to construct the training datasets for the embedding and LSTM models described in this chapter. The "Advanced Stats" and "Game Log" pages are particularly relevant.
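
A typical starting point for pulling game logs is sketched below, assuming the nba_api package's leaguegamelog endpoint; verify endpoint and parameter names against the package documentation.

    from nba_api.stats.endpoints import leaguegamelog

    # Pull one season of team game logs; check the nba_api documentation
    # for current endpoint names and season-string formats.
    games = leaguegamelog.LeagueGameLog(season="2023-24").get_data_frames()[0]
    print(games.columns.tolist())   # one row per team-game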

18. Hugging Face Model Hub for Sports Models (huggingface.co). While primarily associated with NLP and vision models, the Hugging Face Hub increasingly hosts custom models for tabular prediction, including some sports prediction models. The Hub's model card format provides a useful template for documenting sports prediction models, and the Spaces feature enables interactive demo deployment.


How to Use This Reading List

For readers working through this textbook sequentially, the following prioritization is suggested:

  • Start with: Goodfellow et al. (entry 1) Chapters 6-8 for theory, and PyTorch tutorials (entry 13) for implementation. These provide the foundations for all code in Chapter 29.
  • Go deeper on embeddings: Guo and Berkhahn (entry 6) and Howard and Gugger (entry 3) Chapter 9 for entity embedding theory and practice.
  • Go deeper on LSTMs: Hochreiter and Schmidhuber (entry 8) for the original theory, and Zhang et al. (entry 2) for modern PyTorch implementations.
  • Go deeper on hyperparameter tuning: Akiba et al. (entry 10) and Optuna documentation (entry 15) for the tuning framework.
  • For production systems: PyTorch Lightning (entry 14) and W&B (entry 16) for scaling from notebook experiments to production pipelines.

These resources will be referenced again in later chapters as neural network concepts are applied to model evaluation, ensemble methods, and deployment.