Chapter 16 Exercises: Time Series Forecasting

DataField.Dev

Section A: Recall and Comprehension

Exercise 16.1 Define the four components of a time series (trend, seasonality, cyclicality, noise) in your own words. For each component, provide one business example not mentioned in the chapter.

Exercise 16.2 Explain the difference between additive and multiplicative decomposition. A startup's monthly revenue has grown from $100,000 to $2,000,000 over three years, and the December spike has grown from $15,000 above average to $300,000 above average. Which decomposition model is appropriate? Justify your answer.

Exercise 16.3 Define stationarity in non-technical language. Why does stationarity matter for ARIMA modeling, and what is the most common technique for achieving it?

Exercise 16.4 Explain each component of ARIMA(p, d, q) using a business analogy rather than mathematical notation. What does each parameter control?

Exercise 16.5 Describe the three variants of exponential smoothing (simple, double/Holt, triple/Holt-Winters). For each, state when it is and is not appropriate.

Exercise 16.6 List five reasons Prophet became the most popular forecasting tool in industry. Which of these reasons are statistical and which are operational?

Exercise 16.7 Explain the difference between MAE, RMSE, and MAPE. Under what circumstances is MAPE misleading, and what alternative should be used?
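
For reference while working on Exercise 16.7, the three metrics can be written in a few lines of plain Python. This is a minimal sketch: the function names and the sample values are illustrative and are not taken from the chapter.

```python
import math


def mae(actual, forecast):
    """Mean absolute error, in the same units as the series."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error; squaring weights large misses more heavily."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error; raises ZeroDivisionError if an actual is 0."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100 * sum(ratios) / len(actual)


# Made-up values for illustration only.
actual = [100, 200, 400]
forecast = [110, 180, 400]

print(f"MAE:  {mae(actual, forecast):.2f}")
print(f"RMSE: {rmse(actual, forecast):.2f}")
print(f"MAPE: {mape(actual, forecast):.2f}%")
```

Trying the functions on a series that contains very small or zero actual values is a quick way to see how each metric behaves at the extremes.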

Exercise 16.8 What is walk-forward validation, and why is random train-test splitting invalid for time series data?

Section B: Application

Exercise 16.9: Decomposition Analysis. You are given monthly sales data for a regional coffee chain over four years:

Month	Year 1	Year 2	Year 3	Year 4
Jan	42,000	48,000	55,000	62,000
Feb	38,000	44,000	51,000	58,000
Mar	45,000	52,000	59,000	66,000
Apr	50,000	57,000	65,000	73,000
May	55,000	63,000	72,000	81,000
Jun	62,000	71,000	80,000	90,000
Jul	65,000	74,000	84,000	95,000
Aug	60,000	69,000	78,000	88,000
Sep	52,000	60,000	68,000	77,000
Oct	55,000	63,000	72,000	81,000
Nov	58,000	66,000	75,000	85,000
Dec	60,000	68,000	77,000	87,000

(a) Identify the trend. Is it linear or nonlinear? Estimate the approximate annual growth rate.
(b) Identify the seasonal pattern. Which months are peaks and troughs? Is the seasonality additive or multiplicative?
(c) Estimate the sales for January of Year 5 using your decomposition analysis.
(d) What additional information would improve your forecast?
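
One way to start Exercise 16.9 is to load the table and compute annual totals, year-over-year growth, and a rough seasonal index for each month. The sketch below uses only the figures from the table. The variable names and the ratio-to-annual-average index are illustrative choices, not necessarily the chapter's method.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Monthly sales from the exercise table, one list per year (Jan..Dec).
sales = {
    1: [42000, 38000, 45000, 50000, 55000, 62000,
        65000, 60000, 52000, 55000, 58000, 60000],
    2: [48000, 44000, 52000, 57000, 63000, 71000,
        74000, 69000, 60000, 63000, 66000, 68000],
    3: [55000, 51000, 59000, 65000, 72000, 80000,
        84000, 78000, 68000, 72000, 75000, 77000],
    4: [62000, 58000, 66000, 73000, 81000, 90000,
        95000, 88000, 77000, 81000, 85000, 87000],
}

annual_total = {year: sum(values) for year, values in sales.items()}

# Year-over-year growth in annual totals (defined from Year 2 onward).
yoy_growth = {
    year: annual_total[year] / annual_total[year - 1] - 1
    for year in sales if year - 1 in sales
}

# Seasonal index: each month relative to that year's monthly average.
seasonal_index = {
    year: [value / (annual_total[year] / 12) for value in values]
    for year, values in sales.items()
}

for year, total in annual_total.items():
    growth = yoy_growth.get(year)
    label = "n/a" if growth is None else f"{growth:.1%}"
    print(f"Year {year}: total={total:,} growth={label}")

# One row per month, one column per year.
for month, *indices in zip(MONTHS, *seasonal_index.values()):
    print(month, " ".join(f"{i:.2f}" for i in indices))
```

Comparing how the index (a ratio) and the raw deviation from the annual average behave across the four years is one way to approach part (b).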

Exercise 16.10: Method Selection. For each of the following business forecasting scenarios, recommend the most appropriate method (naive, moving average, exponential smoothing, ARIMA, Prophet, LSTM, or ensemble). Justify your choice in two to three sentences.

(a) A stable, mature business forecasting monthly revenue that shows no trend and no seasonality — just small random fluctuations around a constant mean.
(b) A fast-growing e-commerce company forecasting daily order volume with strong day-of-week effects, annual seasonality, frequent promotions, and occasional viral traffic spikes.
(c) A utility company forecasting hourly electricity demand for a regional grid, using five years of historical data plus weather forecasts.
(d) A pharmaceutical company forecasting quarterly sales for a drug that has been on the market for 18 months (only 6 data points).
(e) A large retailer forecasting weekly demand for 100,000 SKUs across 500 stores.
(f) An airline forecasting daily passenger counts for a new route that launched three months ago.

Exercise 16.11: Prophet Configuration. You are building a Prophet model for a restaurant chain's daily revenue. The chain:

- Has strong weekend patterns (Friday/Saturday are peak days)
- Sees annual seasonality (summer is busiest, January is slowest)
- Runs monthly promotions on the first weekend of each month
- Closes on Thanksgiving and Christmas (zero revenue)
- Recently opened 5 new locations, causing a trend acceleration

Write the Python code to configure a Prophet model that accounts for all of these factors. Include holiday specification, custom seasonality if needed, and external regressors. You do not need to fit or evaluate the model — just show the configuration.

Exercise 16.12: Interpreting Forecast Output. A Prophet model produces the following 4-week forecast for a product category:

Week	Point Forecast	Lower 80%	Upper 80%
1	8,200	7,400	9,000
2	8,500	7,300	9,700
3	8,100	6,800	9,400
4	8,800	7,100	10,500

(a) Calculate the total 4-week point forecast and the total 4-week interval (note: you cannot simply sum the weekly intervals — explain why not, but provide a reasonable approximation).
(b) The supply chain team says they need to order enough inventory to cover a "reasonable worst case." What quantity would you recommend and why?
(c) If the holding cost is $5 per unit per week and the stockout cost is $30 per unit, should the team order closer to the point forecast or the upper bound? Show your reasoning.
(d) The VP of Supply Chain asks: "Why is the Week 4 interval so much wider than Week 1?" Provide an explanation suitable for a non-technical executive.
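
Part (a) turns on how uncertainty combines across periods. The sketch below contrasts two approximations on made-up numbers (not the table above). Summing the bounds directly implicitly treats the period errors as perfectly correlated. Combining the half-widths in quadrature assumes independent, roughly symmetric errors. Both are simplifying assumptions, and real forecast errors usually fall somewhere in between.

```python
import math


def combine_intervals(points, lowers, uppers):
    """Return (total, summed_bounds, quadrature_bounds) for a multi-period forecast."""
    total = sum(points)

    # Approximation 1: add the bounds (perfectly correlated errors).
    summed = (sum(lowers), sum(uppers))

    # Approximation 2: root-sum-of-squares of half-widths (independent errors),
    # centered on the total point forecast.
    half_widths = [(u - l) / 2 for l, u in zip(lowers, uppers)]
    combined = math.sqrt(sum(h ** 2 for h in half_widths))
    quadrature = (total - combined, total + combined)

    return total, summed, quadrature


# Hypothetical two-period forecast.
total, summed, quadrature = combine_intervals(
    points=[100, 100], lowers=[70, 60], uppers=[130, 140]
)
print(total)       # 200
print(summed)      # (130, 270)
print(quadrature)  # (150.0, 250.0)
```

The quadrature interval is narrower because independent errors partly cancel when added.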

Exercise 16.13: Walk-Forward Validation Design. You have 3 years of daily sales data and need to validate a forecasting model that will be used for 14-day-ahead predictions.

(a) Design a walk-forward validation scheme. Specify: the minimum training period, the forecast horizon, the step size between cutoffs, and the expected number of evaluation folds.
(b) The marketing team wants to know the model's accuracy specifically during promotional periods. How would you modify the evaluation to provide this?
(c) The model shows a MAPE of 8% on the first year of test folds but 15% on the third year. What might explain this degradation? What would you do about it?
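
The mechanics of part (a) can be sketched as a generator of expanding-window folds. Only the 14-day horizon and the roughly three years of daily data (taken here as 1,095 observations) come from the exercise. The 730-day minimum training period and the 30-day step are placeholders for illustration, not recommended answers.

```python
def walk_forward_folds(n_obs, min_train, horizon, step):
    """Yield (train_end, test_end) index pairs for expanding-window validation.

    Training uses observations [0, train_end); testing uses [train_end, test_end),
    so every test observation comes strictly after all training observations.
    """
    train_end = min_train
    while train_end + horizon <= n_obs:
        yield train_end, train_end + horizon
        train_end += step


# Placeholder design values; see the note above.
folds = list(walk_forward_folds(n_obs=1095, min_train=730, horizon=14, step=30))

print(len(folds))  # 12
print(folds[0])    # (730, 744)
print(folds[-1])   # (1060, 1074)
```

Changing `step` trades off the number of folds against the compute time needed to refit the model at each cutoff.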

Exercise 16.14: External Regressor Analysis. A retail forecasting team is considering adding the following external regressors to their Prophet model:

Daily temperature (from a weather API)
The S&P 500 closing price
A binary indicator for whether a TV advertising campaign is running
Local unemployment rate (published monthly)
Competitor pricing (scraped weekly from their website)
A binary indicator for school holidays in the store's state

For each regressor, evaluate it on the three criteria from the chapter (causal plausibility, predictive power, future availability). State whether you would include it, exclude it, or test it empirically.

Section C: Analysis and Evaluation

Exercise 16.15: The Accuracy-Complexity Tradeoff. A data science team presents two models to the supply chain leadership:

Model A: Prophet with holiday effects. WMAPE: 11.2%. Training time: 2 minutes. Fully automated, runs nightly without intervention.
Model B: Custom LSTM ensemble with attention mechanism. WMAPE: 10.4%. Training time: 6 hours on GPU. Requires weekly manual hyperparameter review by a data scientist.

(a) Calculate the percentage improvement of Model B over Model A.
(b) The company has 50,000 SKUs to forecast. Estimate the computational cost difference (in GPU-hours per week) between the two approaches.
(c) If a senior data scientist costs $85/hour and GPU compute costs $3/hour, what is the weekly cost of each approach?
(d) Write a recommendation to the VP of Supply Chain on which model to deploy. Consider accuracy, cost, maintainability, and risk.

Exercise 16.16: Detecting Forecast Accuracy Theater. A forecasting team reports the following to senior leadership: "Our forecast accuracy improved from 82% to 89% this year."

(a) List at least five questions you would ask to assess whether this claim is meaningful.
(b) Describe a scenario where this claim could be technically true but operationally meaningless.
(c) Propose a reporting framework that would prevent forecast accuracy theater. What metrics should be reported, at what levels of aggregation, and how often?
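
One reason a single accuracy figure can mislead is that errors cancel when forecasts are aggregated. The sketch below scores the same made-up forecasts at the item level and at the total level using WMAPE (sum of absolute errors divided by sum of actuals). Treating "accuracy" as 1 minus WMAPE is an assumption here, since the team's claim does not say how accuracy was defined.

```python
def wmape(actual, forecast):
    """Weighted MAPE: total absolute error as a fraction of total actuals."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / sum(actual)


# Hypothetical demand for four items in one period.
actual = [100, 100, 100, 100]
forecast = [150, 50, 140, 60]

# Item level: every item is badly missed.
item_level = wmape(actual, forecast)

# Total level: over- and under-forecasts cancel exactly.
total_level = wmape([sum(actual)], [sum(forecast)])

print(f"Item-level accuracy:  {1 - item_level:.0%}")   # 55%
print(f"Total-level accuracy: {1 - total_level:.0%}")  # 100%
```

The same forecasts look perfect or poor depending on the level at which they are scored, which is why a reporting framework should state the aggregation level alongside the metric.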

Exercise 16.17: Structural Breaks and Model Failure. In March 2020, virtually every demand forecasting model in the world broke simultaneously due to COVID-19.

(a) Explain why time series models are inherently vulnerable to structural breaks.
(b) For each of the following industries, describe one specific way COVID-19 invalidated forecasting assumptions: (i) grocery retail, (ii) airlines, (iii) commercial real estate, (iv) home fitness equipment, (v) business travel services.
(c) A manager argues that since structural breaks are unpredictable, there is no point planning for them. Critique this argument and propose at least three organizational practices that improve resilience to structural breaks.

Exercise 16.18: Hierarchical Forecasting Decision. A midsize retailer (200 stores, 20,000 SKUs) is implementing demand forecasting. The data team proposes forecasting at the SKU-store level. The supply chain VP proposes forecasting at the category-region level and disaggregating.

(a) Describe the advantages and disadvantages of each approach.
(b) Most SKUs sell fewer than 5 units per week at any given store. How does this fact affect your recommendation?
(c) Propose a hierarchical forecasting design for this retailer. Specify the aggregation levels you would forecast at and the disaggregation method you would use.
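
As one concrete reference point for the disaggregation step, the sketch below splits a category-level forecast across SKUs in proportion to their historical sales shares. This is only one of several possible disaggregation methods, and the SKU names and numbers are hypothetical.

```python
def disaggregate_top_down(category_forecast, historical_sales):
    """Allocate a category forecast to SKUs by historical sales share."""
    total = sum(historical_sales.values())
    if total == 0:
        raise ValueError("historical sales sum to zero; shares are undefined")
    return {
        sku: category_forecast * units / total
        for sku, units in historical_sales.items()
    }


# Hypothetical history: units sold per SKU over some past window.
history = {"sku_a": 600, "sku_b": 300, "sku_c": 100}
sku_forecasts = disaggregate_top_down(1200, history)

for sku, units in sku_forecasts.items():
    print(sku, units)
```

Because the allocations are proportional, they always add back up to the category forecast. Fixed historical shares also carry forward whatever mix shifts or stockouts distorted the history, which is worth weighing in part (a).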

Section D: Research and Application

Exercise 16.19: Industry Forecasting Case Analysis. Select an industry (not retail) where time series forecasting is critical. Research and write a 500-word analysis covering:

(a) What is being forecast and at what granularity?
(b) What data sources and external regressors are most important?
(c) What are the primary challenges and common failure modes?
(d) What would a state-of-the-art forecasting system look like in this industry?

Suggested industries: energy (load forecasting), healthcare (patient demand), finance (volatility forecasting), logistics (capacity planning), agriculture (crop yield forecasting).

Exercise 16.20: Prophet Implementation Project. Using the Prophet library and a publicly available time series dataset (e.g., Kaggle's "Store Sales - Time Series Forecasting" competition), complete the following:

(a) Perform exploratory analysis: plot the series, identify trend and seasonality, check for structural breaks.
(b) Fit a baseline Prophet model with default settings.
(c) Add at least two custom features: holiday effects, custom seasonalities, or external regressors.
(d) Perform walk-forward cross-validation with at least 5 folds.
(e) Compare Prophet against at least two baselines (naive, moving average, or exponential smoothing).
(f) Create an executive-ready forecast summary showing point forecasts, prediction intervals, and a scenario analysis.
(g) Write a one-page memo summarizing your results and recommending whether Prophet should be adopted for this use case.

Exercise 16.21: Uncertainty Communication Workshop. Your forecasting model produces a 90-day demand forecast with an 80% prediction interval. The following stakeholders each need a different version of the results:

(a) CEO (board presentation): Write a two-sentence summary of the forecast.
(b) VP of Supply Chain (planning meeting): Create a scenario table (optimistic / expected / conservative) with specific operational recommendations for each scenario.
(c) Regional store manager: Write an email explaining what the forecast means for their staffing and inventory decisions.
(d) CFO (financial planning): Translate the demand forecast into a revenue range with explicit assumptions.

For each audience, explain what information you emphasize, what you omit, and why.

Exercise 16.22: Forecasting Ethics. Consider the following scenario: A company's forecasting model consistently under-predicts demand in stores located in predominantly lower-income neighborhoods, leading to chronic understocking and poor customer experience in those locations. The model performs well on average, and the overall MAPE looks acceptable.

(a) Why might a forecasting model exhibit this bias? Identify at least three potential causes.
(b) How would you detect this bias if you were only looking at aggregate accuracy metrics?
(c) Propose a modification to the forecasting evaluation process that would surface this kind of inequity.
(d) Connect this issue to the Responsible Innovation theme introduced in Chapter 1. How does forecast bias relate to algorithmic fairness?
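
A signed error metric broken out by segment is one way to make this pattern visible. The sketch below computes relative bias (total forecast minus total actual, divided by total actual) per store group on made-up data. The group labels and values are hypothetical.

```python
from collections import defaultdict


def bias_by_group(records):
    """Return signed relative bias per group.

    records: iterable of (group, actual, forecast) tuples.
    Negative values mean the group is under-forecast on balance.
    """
    error = defaultdict(float)
    actual_total = defaultdict(float)
    for group, actual, forecast in records:
        error[group] += forecast - actual
        actual_total[group] += actual
    return {group: error[group] / actual_total[group] for group in actual_total}


# Hypothetical store-period records: (group, actual, forecast).
records = [
    ("group_1", 100, 105),
    ("group_1", 200, 205),
    ("group_2", 100, 80),
    ("group_2", 100, 90),
]

per_group = bias_by_group(records)
overall = bias_by_group(("all", a, f) for _, a, f in records)["all"]

print(per_group)  # group_1 is slightly over-forecast, group_2 under-forecast
print(overall)    # the pooled figure hides the gap between groups
```

Unlike MAPE, a signed metric distinguishes systematic under-prediction from errors that scatter in both directions.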

Section E: Athena Application

Exercise 16.23: Athena's Forecasting Roadmap. Based on the Athena Update in this chapter, create a project plan for implementing the demand forecasting system. Your plan should include:

(a) Key milestones and timeline
(b) Resource requirements (personnel, technology, data)
(c) Risk factors and mitigation strategies
(d) Success metrics and how they would be tracked
(e) A phased rollout plan (which regions/categories first, and why?)

Exercise 16.24: Athena's ROI Justification. Athena's forecasting system cost approximately $450,000 in the first year and saved $6.1 million in inventory carrying costs.

(a) Calculate the first-year ROI.
(b) Estimate the ongoing annual cost (assume personnel costs remain but initial development is amortized).
(c) What assumptions underlie the $6.1 million savings estimate? Which assumptions are strongest and weakest?
(d) The CFO asks: "How do we know the savings came from the new forecasting system and not from other supply chain improvements happening at the same time?" How would you design a measurement approach to isolate the forecasting system's contribution?
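
The sketch below assumes the common convention that ROI is net benefit divided by cost. The chapter may define it differently, so check before applying it to part (a). The example figures are made up and are not Athena's.

```python
def roi(benefit, cost):
    """Return on investment as a fraction: (benefit - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


# Hypothetical project: $200,000 spent, $500,000 saved.
example = roi(benefit=500_000, cost=200_000)
print(f"{example:.0%}")  # 150%
```

Under this convention an ROI of 0% means the project exactly paid for itself, which differs from reporting the simple ratio of benefit to cost.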

Answers to selected exercises are provided in the Answers appendix. Exercises marked with an asterisk (*) in the appendix include detailed worked solutions. Exercises in Section D require independent research and analysis; there is no single correct answer.