Chapter 16 Further Reading: Time Series Forecasting

Foundational Texts

1. Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice (3rd ed.). OTexts. The single best resource for learning time series forecasting. Freely available at otexts.com/fpp3, this textbook covers everything from decomposition through ARIMA, exponential smoothing, regression with time series, and modern forecasting methods — all with clear explanations, R code examples, and business applications. If you read only one supplementary resource for this chapter, make it this one. The third edition adds coverage of neural network methods and ensemble techniques.

2. Box, G. E. P., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time Series Analysis: Forecasting and Control (5th ed.). Wiley. The definitive technical reference for ARIMA and Box-Jenkins methodology. Originally published in 1970, this book established the framework that dominated time series analysis for decades. The mathematical treatment is rigorous — more demanding than this textbook requires — but the conceptual foundations are worth understanding. Read the first four chapters for the intuition; consult the remainder as a technical reference.

3. Makridakis, S., Wheelwright, S. C., & Hyndman, R. J. (1998). Forecasting: Methods and Applications (3rd ed.). Wiley. A practical, business-oriented guide to forecasting methods with extensive coverage of exponential smoothing, judgment-based forecasting, and the organizational aspects of forecasting, topics that more technical texts often neglect. Particularly strong on the behavioral and political challenges of forecasting in organizations, including the forecast-vs-target distinction discussed in this chapter.

Prophet and Modern Business Forecasting

4. Taylor, S. J., & Letham, B. (2018). "Forecasting at Scale." The American Statistician, 72(1), 37-45. The original paper introducing Facebook Prophet. Taylor and Letham explain Prophet's design philosophy — treating forecasting as a curve-fitting problem rather than a time series problem — and describe the practical considerations that drove their design choices. Essential reading for understanding why Prophet works the way it does and what its assumptions are. The paper is accessible to readers with basic statistics background.

5. Prophet Documentation. (2024). facebook.github.io/prophet. The official Prophet documentation is unusually good — clear, well-organized, and full of practical examples. The "Quick Start" guide, the section on seasonality, holidays, and additional regressors, and the cross-validation tutorial are directly relevant to this chapter's code examples. The documentation also covers advanced topics like multiplicative seasonality and logistic growth that extend beyond the chapter's scope.

6. Januschowski, T., Gasthaus, J., Wang, Y., et al. (2020). "Criteria for Classifying Forecasting Methods." International Journal of Forecasting, 36(1), 167-177. A thoughtful framework for comparing and selecting forecasting methods based on their assumptions, data requirements, and practical characteristics. Useful for making principled choices between ARIMA, exponential smoothing, Prophet, and neural network approaches — exactly the decision practitioners face when designing production forecasting systems.
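
The cross-validation idea behind the Prophet tutorial mentioned in entry 5 is rolling-origin evaluation: fit on an initial window, forecast a fixed horizon, move the cutoff forward, and repeat. The following is a minimal sketch of that splitting logic in plain Python. It is not Prophet's API; the function name and the parameters `initial`, `horizon`, and `step` are illustrative assumptions, and positions are integer indices rather than dates.

```python
def rolling_origin_splits(n_obs, initial, horizon, step):
    """Return (train_indices, test_indices) pairs for rolling-origin evaluation.

    Hypothetical helper for illustration: every training window starts at
    index 0 and ends just before the cutoff; the test window covers the
    next `horizon` observations after the cutoff.
    """
    splits = []
    cutoff = initial
    # Stop once a full test window no longer fits inside the series.
    while cutoff + horizon <= n_obs:
        train = list(range(0, cutoff))
        test = list(range(cutoff, cutoff + horizon))
        splits.append((train, test))
        cutoff += step
    return splits


# Ten observations, six used for the first fit, two-step-ahead forecasts.
splits = rolling_origin_splits(n_obs=10, initial=6, horizon=2, step=2)
for train, test in splits:
    print(f"train up to index {train[-1]}, test on {test}")
```

Because each test window lies strictly after its training window, no future information leaks into the fit, which is the property that ordinary shuffled cross-validation would break for time series.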

Forecasting Competitions and Empirical Evidence

7. Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2018). "The M4 Competition: Results, Findings, Conclusion and Way Forward." International Journal of Forecasting, 34(4), 802-808. Results from the M4 competition, which evaluated forecasting methods on 100,000 time series. The headline finding — that simple statistical methods outperformed most machine learning methods, and that the best results came from ensembles — directly informs the chapter's guidance on model selection and ensemble techniques. The winning method (a hybrid of exponential smoothing and a neural network) demonstrated that combining classical and modern approaches outperforms either alone.

8. Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2022). "M5 Accuracy Competition: Results, Findings, and Conclusions." International Journal of Forecasting, 38(4), 1346-1364. The M5 competition focused on hierarchical retail sales forecasting at Walmart — directly relevant to Athena's challenge. Unlike M4, the M5 results showed that machine learning methods (particularly gradient boosting) outperformed statistical methods, likely because the competition provided rich feature sets (prices, promotions, events) that ML methods could exploit. The contrast with M4 underscores the chapter's message: method selection depends on the problem.

9. Petropoulos, F., Apiletti, D., Assimakopoulos, V., et al. (2022). "Forecasting: Theory and Practice." International Journal of Forecasting, 38(3), 705-871. A monumental 167-page survey covering virtually every aspect of forecasting, written by 82 contributing authors. Topics include judgmental forecasting, machine learning methods, intermittent demand, hierarchical forecasting, forecast evaluation, and forecasting in specific domains (energy, healthcare, supply chain, finance). Encyclopedic in scope; use it as a reference rather than reading cover to cover.

Deep Learning for Time Series

10. Lim, B., & Zohren, S. (2021). "Time-Series Forecasting With Deep Learning: A Survey." Philosophical Transactions of the Royal Society A, 379(2194). A comprehensive survey of deep learning approaches for time series, including RNNs, LSTMs, Temporal Convolutional Networks (TCNs), and Transformer-based methods. The authors provide an honest assessment of when deep learning helps (many related series, complex nonlinear patterns) and when it does not (short series, limited data, strong seasonal patterns that simpler methods handle well). Valuable context for the chapter's discussion of Tom's LSTM experience.

11. Salinas, D., Flunkert, V., Gasthaus, J., & Januschowski, T. (2020). "DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks." International Journal of Forecasting, 36(3), 1181-1191. Amazon's DeepAR model demonstrates how deep learning can excel at time series forecasting when applied to large collections of related series (exactly the "global model" approach discussed in the chapter). DeepAR trains a single model across thousands of series, learning shared patterns while producing probabilistic forecasts for each individual series. Relevant for understanding when LSTMs and similar architectures are worth the complexity.

12. Nie, Y., Nguyen, N. H., Sinthong, P., & Kalagnanam, J. (2023). "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." Proceedings of the International Conference on Learning Representations (ICLR). A provocative paper showing that a simple Transformer-based approach, PatchTST, achieves state-of-the-art results on long-term forecasting benchmarks. The finding that breaking time series into "patches" (analogous to word tokens in NLP) enables Transformers to capture temporal patterns effectively has implications for the future of deep learning in forecasting. Somewhat technical but the core intuition is accessible.
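
The "patching" idea in entry 12 can be illustrated without any deep learning library. The sketch below only shows how a series might be cut into overlapping fixed-length windows that play the role of tokens. It omits any padding, normalization, embedding, and the Transformer itself, and the function name and the `patch_len` and `stride` values are illustrative assumptions rather than the paper's settings.

```python
def make_patches(series, patch_len, stride):
    """Split a series into overlapping fixed-length patches.

    Hypothetical helper for illustration: trailing observations that do
    not fill a whole patch are simply dropped.
    """
    patches = []
    start = 0
    while start + patch_len <= len(series):
        patches.append(series[start:start + patch_len])
        start += stride
    return patches


series = list(range(10))  # stand-in for ten consecutive observations
patches = make_patches(series, patch_len=4, stride=2)
for patch in patches:
    print(patch)
```

Each patch summarizes a short local stretch of the series, so a model sees a handful of patch "tokens" instead of one token per time step.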

Supply Chain and Demand Planning

13. Chase, C. W. (2013). Demand-Driven Forecasting: A Structured Approach to Forecasting (2nd ed.). Wiley. A practitioner-focused guide to demand forecasting in the supply chain context. Chase, a former P&G demand planner, covers the organizational, process, and technology dimensions of demand planning — not just the algorithms. Particularly relevant for understanding how forecasting fits into the broader Sales & Operations Planning (S&OP) process described in Case Study 1.

14. Gilliland, M. (2010). The Business Forecasting Deal: Exposing Myths, Eliminating Bad Practices, Providing Practical Solutions. Wiley. A sharp, contrarian guide to the politics and pathologies of business forecasting. Gilliland tackles forecast accuracy theater, the value-added question (do planner adjustments actually improve statistical forecasts?), and the chronic overconfidence that plagues most forecasting organizations. His "Forecast Value Added" framework — which measures whether each step in the forecasting process improves or degrades accuracy — is directly relevant to the chapter's discussion of common pitfalls.

15. Syntetos, A. A., Babai, Z., Boylan, J. E., Kolassa, S., & Nikolopoulos, K. (2016). "Supply Chain Forecasting: Theory, Practice, Their Gap and the Future." European Journal of Operational Research, 252(1), 1-26. An academic review of the gap between forecasting theory and supply chain practice. The authors survey the state of demand forecasting in industry and identify persistent challenges: intermittent demand, promotional effects, new product introductions, and the integration of forecasting with inventory management. Useful for understanding why many of the theoretically optimal methods described in textbooks are not used in practice.
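
The Forecast Value Added idea from entry 14 reduces to simple arithmetic: score each stage of the forecasting process with the same error metric and look at the change from the stage before it. The sketch below assumes mean absolute error as the metric and uses made-up numbers; the function names and the three stages are hypothetical, not taken from Gilliland's book.

```python
def mean_absolute_error(actuals, forecasts):
    """Average absolute gap between actuals and forecasts."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def forecast_value_added(actuals, steps):
    """Score each process step and its improvement over the previous step.

    `steps` is an ordered list of (name, forecasts). Returns a list of
    (name, mae, fva), where fva = previous step's MAE minus this step's MAE.
    Positive fva means the step reduced error; negative means it added error.
    """
    results = []
    previous_mae = None
    for name, forecasts in steps:
        mae = mean_absolute_error(actuals, forecasts)
        fva = None if previous_mae is None else previous_mae - mae
        results.append((name, mae, fva))
        previous_mae = mae
    return results


# Illustrative data only: four periods of demand and three process stages.
actuals = [100, 120, 90, 110]
steps = [
    ("naive", [95, 100, 120, 90]),
    ("statistical", [102, 115, 95, 108]),
    ("planner override", [110, 130, 100, 120]),
]
report = forecast_value_added(actuals, steps)
for name, mae, fva in report:
    print(name, mae, fva)
```

In this toy data the statistical model improves on the naive forecast, while the planner override makes the forecast worse than the model it adjusted, which is the pattern the framework is designed to expose.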

Forecast Uncertainty and Communication

16. Silver, N. (2012). The Signal and the Noise: Why So Many Predictions Fail — But Some Don't. Penguin. While not specifically about business forecasting, Silver's book on the art and science of prediction is essential reading for anyone who produces or consumes forecasts. His analysis of weather forecasting, election prediction, and economic forecasting provides vivid illustrations of the themes in this chapter: calibration, overfitting, the role of uncertainty, and the psychological resistance to probabilistic thinking. Accessible and engaging.

17. Tetlock, P. E., & Gardner, D. (2015). Superforecasting: The Art and Science of Prediction. Crown. Tetlock's research on the Good Judgment Project demonstrates that some people are systematically better at forecasting than others — and that the skill can be learned. Key traits of "superforecasters" include probabilistic thinking, intellectual humility, willingness to update beliefs, and comfort with uncertainty. Directly relevant to the chapter's emphasis on communicating uncertainty and resisting false precision.

The 2020 Forecasting Crisis

18. Nikolopoulos, K., Punia, S., Schäfers, A., Tsinopoulos, C., & Vasilakis, C. (2021). "Forecasting and Planning During a Pandemic: COVID-19 Growth Rates, Supply Chain Disruptions, and Governmental Decisions." European Journal of Operational Research, 290(1), 99-115. An early academic analysis of how COVID-19 disrupted forecasting and supply chain planning. The authors examine the limitations of traditional time series methods during structural breaks and propose a framework for adaptive forecasting during crises. Directly relevant to Case Study 2.

19. Syntetos, A. A., Nikolopoulos, K., Boylan, J. E., Fildes, R., & Goodwin, P. (2009). "The Effects of Integrating Management Judgement into Intermittent Demand Forecasts." International Journal of Production Economics, 118(1), 72-81. While published well before the pandemic, this paper's findings on integrating human judgment into statistical forecasts became acutely relevant during 2020. The research shows that structured judgment, guided by specific protocols and combined with statistical methods, outperforms both pure statistical methods and unstructured human judgment. The implications for the "human override" response described in Case Study 2 are direct.

Hierarchical and Intermittent Demand Forecasting

20. Wickramasuriya, S. L., Athanasopoulos, G., & Hyndman, R. J. (2019). "Optimal Forecast Reconciliation for Hierarchical and Grouped Time Series Through Trace Minimization." Journal of the American Statistical Association, 114(526), 804-819. The definitive technical paper on hierarchical forecast reconciliation — the mathematical framework for ensuring that forecasts at different aggregation levels are consistent with each other. Somewhat technical, but the core concept is accessible and directly relevant to the chapter's discussion of Athena's hierarchical forecasting approach.

21. Syntetos, A. A., & Boylan, J. E. (2005). "The Accuracy of Intermittent Demand Estimates." International Journal of Forecasting, 21(2), 303-314. Intermittent demand — series with many zero values and occasional non-zero spikes — is the reality for most SKUs in a large retail catalog. This paper evaluates forecasting methods for intermittent demand and finds that specialized methods (Croston's method and its variants) significantly outperform standard approaches. Essential context for understanding why Athena's team chose hierarchical forecasting over direct SKU-level modeling.
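
For readers who have not met Croston's method (entry 21), the core idea is to smooth the non-zero demand sizes and the intervals between them separately, then forecast demand per period as size divided by interval. The following is a minimal sketch under stated assumptions: both estimates are initialized from the first non-zero demand, a single smoothing constant `alpha` is used for both, and the optional `sba` flag applies the (1 - alpha/2) correction of the Syntetos-Boylan approximation. The function name and defaults are illustrative.

```python
def croston(demand, alpha=0.1, sba=False):
    """Per-period demand forecast for an intermittent series.

    Assumptions for this sketch: estimates are initialized at the first
    non-zero demand, and the same alpha smooths both sizes and intervals.
    """
    size = None       # smoothed non-zero demand size
    interval = None   # smoothed number of periods between demands
    periods_since = 1
    for d in demand:
        if d > 0:
            if size is None:
                size, interval = float(d), float(periods_since)
            else:
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            # Estimates are only updated when a demand actually occurs.
            periods_since += 1
    if size is None:
        return 0.0  # no demand observed at all
    forecast = size / interval
    if sba:
        forecast *= 1 - alpha / 2  # bias correction of the SBA variant
    return forecast


# Demand of 5 units every third period.
history = [0, 0, 5, 0, 0, 5]
print(croston(history))
print(croston(history, sba=True))
```

Note that the forecast is a demand rate per period, not a prediction of when the next spike will occur, which is why these methods suit inventory decisions better than period-by-period sales targets.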

Industry Applications

22. Hong, T., Pinson, P., & Fan, S. (2014). "Global Energy Forecasting Competition 2012." International Journal of Forecasting, 30(2), 357-363. A forecasting competition focused on energy load and wind power forecasting, a domain where forecast accuracy directly determines operational costs (overprediction wastes fuel; underprediction causes blackouts). The competition results and winning methodologies offer perspectives on time series forecasting in a domain with different characteristics than retail demand.

23. Seaman, B. (2018). "Considerations of a Retail Forecasting Practitioner." International Journal of Forecasting, 34(4), 822-829. A rare practitioner perspective from a Walmart forecasting team member, written for the M4 special issue of the International Journal of Forecasting. Seaman discusses the practical realities of forecasting at retail scale: computational constraints, data quality issues, organizational politics, and the constant tension between model sophistication and operational simplicity. Every MBA student building a forecasting project should read this short paper.

For Python implementation resources, refer to the official documentation for Prophet (facebook.github.io/prophet), statsmodels (statsmodels.org), and scikit-learn's time series utilities. For hands-on practice with real retail data, the Kaggle "Store Sales — Time Series Forecasting" competition provides an excellent starting dataset with promotional calendars, holiday effects, and hierarchical structure.