Quiz: Time Series Visualization

Part I: Multiple Choice (10 questions)

Q1. Which matplotlib module provides date-specific locators and formatters?

A) `matplotlib.pyplot` B) `matplotlib.dates` C) `matplotlib.ticker` D) `matplotlib.patches`

Answer

**B.** `matplotlib.dates` (typically imported as `mdates`) provides `YearLocator`, `MonthLocator`, `DateFormatter`, and related classes for datetime axis formatting.

Q2. What does `df.rolling(window=30).mean()` compute?

A) A simple moving average with a 30-row window B) An exponential moving average with span 30 C) A median of the first 30 rows D) The cumulative sum over 30 rows

Answer

**A.** `rolling(window=30).mean()` is a simple moving average. Each output value is the mean of the current observation and the 29 before it. With the default `min_periods`, the first 29 values are NaN because a full 30-row window is not yet available.
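As a quick check of the window behavior described above, here is a minimal sketch on a synthetic series (the values 0 to 59 are made up for illustration). It counts the leading NaN values and compares the first valid output with a manual mean.

```python
import numpy as np
import pandas as pd

# Synthetic daily series (illustrative values only): 0, 1, 2, ..., 59.
s = pd.Series(
    np.arange(60, dtype=float),
    index=pd.date_range("2024-01-01", periods=60, freq="D"),
)

sma = s.rolling(window=30).mean()

# Rows 0..28 do not have a full 30-row window yet, so they are NaN.
n_missing = int(sma.isna().sum())

# Row 29 is the first full window: rows 0..29, current row included.
first_valid = sma.iloc[29]
manual = s.iloc[0:30].mean()

print(n_missing, first_valid, manual)
```

Passing `min_periods=1` would return partial-window means instead of NaN, at the cost of noisier early values.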

Q3. The three components of classical seasonal decomposition are:

A) Min, median, max B) Trend, seasonal, residual C) Mean, variance, skew D) Year, month, day

Answer

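**B.** Trend (long-term direction), seasonal (repeating cycle), and residual (what remains after removing the other two). The components combine either additively (observed = trend + seasonal + residual) or multiplicatively (observed = trend × seasonal × residual).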
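To make the additive form concrete, the following is a hand-rolled sketch of a classical additive decomposition using only pandas. It is not the statsmodels implementation; the synthetic monthly series, the noise level, and all variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic monthly data (assumed): linear trend + 12-month cycle + noise.
rng = np.random.default_rng(0)
idx = pd.date_range("2015-01-01", periods=72, freq="MS")
t = np.arange(72)
true_trend = 0.5 * t
y = pd.Series(
    true_trend + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.5, 72),
    index=idx,
)

period = 12

# Trend: a centered 2x12 moving average (12-term mean, then 2-term mean,
# shifted back half a period so each value sits at the window center).
trend = y.rolling(period).mean().rolling(2).mean().shift(-(period // 2))

# Seasonal: average the detrended values per calendar month,
# then center the twelve indices so they sum to zero.
detrended = y - trend
monthly_means = detrended.groupby(detrended.index.month).mean()
monthly_means -= monthly_means.mean()
seasonal = pd.Series(monthly_means.loc[y.index.month].to_numpy(), index=y.index)

# Residual: whatever the trend and seasonal components do not explain.
resid = y - trend - seasonal
```

The trend is undefined for the first and last six months because the centered window runs off the ends of the series. A multiplicative version would divide instead of subtract at each step.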

Q4. Which statsmodels function performs seasonal decomposition using LOESS?

A) `seasonal_decompose` B) `STL` C) `ARIMA` D) `ExponentialSmoothing`

Answer

**B.** `statsmodels.tsa.seasonal.STL` implements Seasonal-Trend decomposition using LOESS. It is more robust to outliers than `seasonal_decompose` and allows the seasonal component to evolve over time.

Q5. A calendar heatmap is most useful when:

A) You want to see long-term trend B) You want to see weekly cycles, seasonal patterns, and individual day outliers C) You have only monthly data D) You need to compare two series

Answer

**B.** Calendar heatmaps compress daily data into a compact 2D layout that makes weekly and seasonal patterns visible alongside individual day anomalies. For long-term trend, a line chart is better.
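One way to build the 2D layout is to pivot daily values into a weekday-by-week grid and draw it with `imshow`. This is a sketch under assumptions: the data are random, the year 2023 is arbitrary, and the week column is a simple running week counter rather than the ISO week number.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic daily values for one year (illustrative only).
rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-01", "2023-12-31", freq="D")
values = rng.normal(size=len(dates))

# Column = running week counter; row = weekday (Monday=0 ... Sunday=6).
offset = dates[0].dayofweek
week_col = (np.arange(len(dates)) + offset) // 7

grid = pd.DataFrame(
    {"weekday": dates.dayofweek, "week": week_col, "value": values}
).pivot(index="weekday", columns="week", values="value")

# Cells outside the year stay NaN and render as blanks.
fig, ax = plt.subplots(figsize=(12, 2.5))
im = ax.imshow(grid.to_numpy(), aspect="auto", cmap="RdBu_r")
ax.set_yticks(range(7))
ax.set_yticklabels(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"])
ax.set_xlabel("Week of year")
fig.colorbar(im, ax=ax, label="Value")
```

Reading along a row shows how one weekday behaves through the year; reading down a column shows a single week.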

Q6. Sparklines were popularized by:

A) William Cleveland B) Edward Tufte C) John Tukey D) Hans Rosling

Answer

**B.** Edward Tufte introduced sparklines in his 2006 book *Beautiful Evidence*. They are very small line charts designed to sit inline with text, conveying shape without chrome.

Q7. "Banking to 45 degrees" refers to:

A) Rotating the chart 45 degrees B) The principle that line charts are most readable when average slopes are near 45 degrees C) A color scheme D) An axis label convention

Answer

**B.** William Cleveland's research showed that humans compare slopes most accurately when they are close to 45 degrees. Choose your aspect ratio so the interesting slope changes appear near 45 degrees on average.
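A simple way to apply this is median-absolute-slope banking: pick the height-to-width ratio that puts the median absolute segment slope at 45 degrees on screen. The sketch below shows that one heuristic. The function name `bank_to_45` is hypothetical, and other banking rules exist.

```python
import numpy as np


def bank_to_45(x, y):
    """Height/width ratio that puts the median absolute segment slope at 45 degrees.

    On-screen slope = data slope * (height / y_range) / (width / x_range).
    Setting the median of that to 1 and solving for height / width gives
    y_range / (x_range * median(|data slope|)).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slopes = np.abs(np.diff(y) / np.diff(x))
    x_range = x.max() - x.min()
    y_range = y.max() - y.min()
    return y_range / (x_range * np.median(slopes))


# A tent of width 2 and height 1 has slopes of +1 and -1,
# so it is banked to 45 degrees when the plot is half as tall as it is wide.
ratio = bank_to_45([0, 1, 2], [0, 1, 0])
print(ratio)
```

The result can be passed to something like `plt.subplots(figsize=(8, 8 * ratio))`, keeping in mind that the axes area is slightly smaller than the figure, so the banking is approximate.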

Q8. Which Plotly layout property adds a range slider to a time-series chart?

A) `xaxis_range` B) `xaxis_rangeslider_visible=True` C) `slider=True` D) `rangeslider=True`

Answer

**B.** `fig.update_layout(xaxis_rangeslider_visible=True)` enables the range slider. Combine with range selector buttons for preset zoom levels.

Q9. Which pandas method converts daily data to monthly averages?

A) `df.groupby(df.index.month).mean()` B) `df.resample("M").mean()` C) `df.rolling(30).mean()` D) `df.aggregate("month")`

Answer

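**B.** `resample("M").mean()` groups rows into calendar months and averages each one, labeling the result at month end. Use `"MS"` to label at month start. In pandas 2.2 and later the `"M"` alias is deprecated in favor of `"ME"`. Option A pools the same calendar month across all years, option C is a 30-row moving average, and option D is not a resampling call.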
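The difference between resampling and grouping by month number is easiest to see on two years of data. This sketch uses a synthetic frame (values 0, 1, 2, ... are made up) and the `"MS"` alias, which behaves the same across pandas versions.

```python
import numpy as np
import pandas as pd

# Two years of synthetic daily data (illustrative values only).
idx = pd.date_range("2022-01-01", "2023-12-31", freq="D")
df = pd.DataFrame({"value": np.arange(len(idx), dtype=float)}, index=idx)

# One row per calendar month, labeled at month start: 24 rows.
monthly = df.resample("MS").mean()

# One row per month number, with both Januaries pooled together: 12 rows.
by_month_of_year = df.groupby(df.index.month).mean()

print(len(monthly), len(by_month_of_year))
```

The grouped version answers "what does a typical January look like", while the resampled version keeps the time axis intact.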

Q10. Which is a common time series pitfall?

A) Dual-axis abuse (showing correlation by axis manipulation) B) Non-zero baselines for area charts C) Misleading smoothing that hides events D) All of the above

Answer

**D.** All three are pitfalls discussed in Section 25.11. Others include invisible missing data gaps, irregular sampling, and confusing x-axis zero.

Part II: Short Answer (10 questions)

Q11. Write matplotlib code to set a time axis to show major ticks at every year and minor ticks at every month.

Answer

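import matplotlib.dates as mdates
import matplotlib.pyplot as plt

fig, ax = plt.subplots()  # plot a datetime-indexed series on ax before formatting
ax.xaxis.set_major_locator(mdates.YearLocator())          # major tick at every year
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))  # label major ticks as YYYY
ax.xaxis.set_minor_locator(mdates.MonthLocator())         # minor tick at every month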

Q12. Explain the difference between a centered rolling mean and a left-aligned (trailing) rolling mean.

Answer

A **left-aligned (trailing) rolling mean** averages the current observation with the previous N-1 observations, so the smoothed curve lags the raw data by about N/2 periods. A **centered rolling mean** (set `center=True`) averages the current observation with roughly N/2 observations on each side, so there is no lag, but about N/2 values at each end are NaN: the first ones lack earlier data and the last ones need future data that does not exist. Centered windows suit historical analysis; trailing windows suit real-time monitoring.
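A small example makes the lag and the NaN placement visible. The sketch below uses a ten-point synthetic ramp and a window of 5, both chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic ramp (illustrative): 0, 1, ..., 9.
s = pd.Series(np.arange(10, dtype=float))

trailing = s.rolling(window=5).mean()               # window ends at the current row
centered = s.rolling(window=5, center=True).mean()  # window is centered on the current row

# Trailing: NaN only at the start (rows 0..3).
# Centered: NaN at both ends (rows 0..1 and rows 8..9).
print(trailing.isna().to_numpy().nonzero()[0])
print(centered.isna().to_numpy().nonzero()[0])

# The trailing mean is the centered mean delayed by (window - 1) / 2 = 2 rows.
same_after_shift = trailing.equals(centered.shift(2))
```

For an odd window of N rows the exact lag is (N - 1) / 2 rows, which is where the "about N/2" rule of thumb comes from.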

Q13. Write code to add a shaded region to a chart representing the 2020 recession.

Answer

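import pandas as pd  # assumes ax is an existing matplotlib Axes with a datetime x-axis
ax.axvspan(pd.Timestamp("2020-02-01"), pd.Timestamp("2020-04-30"),
           alpha=0.2, color="gray", label="Recession")  # shade February to April 2020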

Q14. Describe the visual design of a forecast chart with historical and predicted values.

Answer

Historical data in one color (typically black or dark). Forecast in a contrasting color (red, orange). A shaded band around the forecast line representing the confidence interval, widening as the horizon extends. A vertical line or marker at the boundary between historical and forecast data to make the transition explicit. Legend identifying each series. The result conveys both the central prediction and the uncertainty.
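The design described above could be sketched in matplotlib as follows. Everything numeric here is a placeholder: the history is synthetic, the forecast is a straight-line continuation rather than the output of a real model, and the interval width is an illustrative formula that simply grows with the horizon.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic monthly history (illustrative only).
rng = np.random.default_rng(1)
hist_idx = pd.date_range("2023-01-01", periods=24, freq="MS")
history = pd.Series(100 + np.arange(24) + rng.normal(0, 2, 24), index=hist_idx)

# Placeholder forecast starting at the last observed point (not a real model).
steps = np.arange(13)
fcst_idx = pd.date_range(hist_idx[-1], periods=13, freq="MS")
forecast = pd.Series(history.iloc[-1] + steps, index=fcst_idx)
half_width = 1.96 * 2 * np.sqrt(steps)  # widens with the horizon (assumed form)

fig, ax = plt.subplots(figsize=(9, 4))
ax.plot(history.index, history, color="black", label="Historical")
ax.plot(forecast.index, forecast, color="tab:orange", label="Forecast")
ax.fill_between(
    forecast.index,
    forecast - half_width,
    forecast + half_width,
    color="tab:orange",
    alpha=0.2,
    label="Interval (illustrative)",
)
# Vertical marker at the boundary between observed and predicted values.
ax.axvline(hist_idx[-1], color="gray", linestyle="--", linewidth=1)
ax.legend(loc="upper left")
```

Starting the forecast at the last observed point keeps the two lines visually connected, while the dashed marker makes the switch from data to prediction explicit.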

Q15. What is a cycle plot (seasonal subseries plot), and when is it useful?

Answer

A cycle plot shows one small panel per season (month, quarter) with each panel containing the values for that season across multiple years, plus a reference line for the season's long-term mean. It is useful when you want to see which seasons are trending up, which are trending down, and which have the most variability. The standard use case is climate data where different months are warming at different rates, a pattern that is invisible in a single line chart.
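A bare-bones cycle plot can be assembled from twelve shared-axis panels. This is a sketch on synthetic monthly data (a sine-shaped seasonal cycle plus noise, assumed for illustration), not the chapter's climate dataset.

```python
import calendar

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Twenty years of synthetic monthly values (illustrative only).
rng = np.random.default_rng(2)
idx = pd.date_range("2000-01-01", "2019-12-01", freq="MS")
cycle = 10 * np.sin(2 * np.pi * (idx.month.to_numpy() - 1) / 12)
s = pd.Series(cycle + rng.normal(0, 1, len(idx)), index=idx)

# One panel per calendar month, sharing the y-axis so levels are comparable.
fig, axes = plt.subplots(1, 12, figsize=(14, 3), sharey=True)
for month, ax in zip(range(1, 13), axes):
    sub = s[s.index.month == month]
    ax.plot(sub.index.year, sub.to_numpy(), linewidth=1)               # that month across years
    ax.axhline(sub.mean(), color="gray", linestyle="--", linewidth=1)  # long-term mean
    ax.set_title(calendar.month_abbr[month], fontsize=9)
    ax.set_xticks([])
```

The shared y-axis is the important design choice: it lets the reader compare both the level and the slope of each month against its neighbors.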

Q16. Why is a 365-day rolling mean useful for climate data but potentially misleading for daily business metrics?

Answer

For climate data, the 365-day window removes the annual cycle, revealing the long-term trend. This is exactly what you want because the seasonal variation is noise relative to the warming signal. For daily business metrics, the same window removes not just annual seasonality but also weekly cycles, monthly patterns, and any events shorter than a year. It obscures exactly the patterns a business analyst needs to see. The window size should match the time scale of the phenomenon you are studying.
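The effect of window size on a short event can be checked numerically. The sketch below invents a daily sales series with a weekday/weekend cycle and a one-week promotion worth 50 extra units per day, then measures how much of that lift survives a 7-day and a 365-day rolling mean. All numbers are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily metric: 100 on weekdays, 60 on weekends (assumed values).
idx = pd.date_range("2022-01-01", periods=730, freq="D")
base = pd.Series(np.where(idx.dayofweek < 5, 100.0, 60.0), index=idx)

# Add a one-week promotion (Monday to Sunday) worth +50 per day.
sales = base.copy()
sales.loc["2023-03-06":"2023-03-12"] += 50.0

# Lift = smoothed series with the promotion minus smoothed series without it.
lift_week = (sales.rolling(7).mean() - base.rolling(7).mean()).max()
lift_year = (sales.rolling(365).mean() - base.rolling(365).mean()).max()

print(round(lift_week, 3), round(lift_year, 3))
```

The 7-day window removes the weekly cycle but keeps the full promotion lift, while the 365-day window spreads the same 350 extra units over a year and leaves less than one unit of visible change.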

Q17. Describe two strategies for visualizing a time series with missing data.

Answer

The question asks for two; any two of the following three strategies answer it. (1) **Leave gaps visible**: plot NaN values as gaps in the line. Matplotlib and Plotly both do this by default. This is honest, but narrow gaps can go unnoticed. (2) **Highlight missing periods explicitly**: use `axvspan` to shade periods where data is missing, with a label like "No data." This makes gaps unmistakable. (3) **Interpolate and flag**: fill the gaps with interpolation and draw the filled section in a different color or line style. Avoid silent interpolation: the reader should always know which values are observed and which are inferred.
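For the interpolate-and-flag strategy, the key step is remembering which points were missing before filling them. A minimal sketch with pandas, using a made-up ten-day series; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic series with two gaps (illustrative values only).
idx = pd.date_range("2024-01-01", periods=10, freq="D")
s = pd.Series(
    [1.0, 2.0, 3.0, np.nan, np.nan, 6.0, 7.0, 8.0, np.nan, 10.0], index=idx
)

# Record the gaps BEFORE filling, so inferred values can be flagged later.
was_missing = s.isna()
filled = s.interpolate(method="time")

# Observed values only: NaN where data was missing, so a line plot shows gaps.
observed_only = filled.where(~was_missing)

# Inferred segments: the filled points plus their immediate neighbors,
# so a dashed line drawn from this series connects to the observed line.
near_gap = (
    was_missing
    | was_missing.shift(1, fill_value=False)
    | was_missing.shift(-1, fill_value=False)
)
inferred_only = filled.where(near_gap)
```

Plotting `observed_only` as a solid line and `inferred_only` as a dashed line in a lighter color keeps the interpolation visible to the reader.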

Q18. What is the difference between `rolling(window=30)` and `ewm(span=30)` in pandas?

Answer

`rolling(window=30)` aggregates over a fixed 30-observation window with equal weights. `ewm(span=30)` applies exponentially decaying weights, so recent observations count more. The `span` sets the smoothing factor, alpha = 2 / (span + 1), about 0.065 for `span=30`, which gives a roughly comparable effective window. `ewm` reacts faster to recent changes; `rolling` has more lag but is more stable.
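A step change shows how differently the two smoothers respond. This sketch uses a synthetic series that jumps from 0 to 1 and passes `adjust=False` so the exponential mean follows the plain recursion y[t] = (1 - alpha) * y[t-1] + alpha * x[t]; that setting is a choice made here for clarity, not the pandas default.

```python
import numpy as np
import pandas as pd

# Synthetic step: 100 zeros followed by 100 ones (illustrative only).
s = pd.Series(np.r_[np.zeros(100), np.ones(100)])

sma = s.rolling(window=30).mean()
ema = s.ewm(span=30, adjust=False).mean()
alpha = 2 / (30 + 1)  # smoothing factor implied by span=30

# Five observations after the step (index 104):
#   rolling mean = 5 / 30, exponential mean = 1 - (1 - alpha) ** 5
early = (sma.iloc[104], ema.iloc[104])

# Thirty observations after the step (index 129):
#   the rolling window is now all ones; the exponential mean is still catching up.
late = (sma.iloc[129], ema.iloc[129])

print(early, late)
```

The exponential mean moves first, but the rolling mean reaches the new level exactly once the window has passed the step, whereas the exponential mean only approaches it.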

Q19. Write a complete Plotly Express call for a time series with range slider, unified hover, and template "simple_white".

Answer

import plotly.express as px

fig = px.line(df, x=df.index, y="value", template="simple_white", title="Time Series")
fig.update_layout(xaxis_rangeslider_visible=True, hovermode="x unified")
fig.show()

Q20. The chapter's climate project uses a 4-panel figure. Name the four panels and what each one reveals.

Answer

(1) **Full series with rolling mean**: long-term trend with short-term context. (2) **STL decomposition**: trend, seasonal, and residual components separated explicitly. (3) **Calendar heatmap**: daily anomalies over recent years, revealing individual day events and seasonal patterns. (4) **Cycle plot**: monthly means across decades, revealing which months are warming fastest. Together they give a complete picture that no single panel provides.

Scoring Rubric

| Score | Level | Meaning |
|---|---|---|
| 18–20 | Mastery | You understand datetime axes, decomposition, smoothing, and the specialized chart types for time series. |
| 14–17 | Proficient | You know the main APIs; review decomposition and calendar heatmap sections. |
| 10–13 | Developing | You grasp the basics; re-read Sections 25.4–25.8 and work all Part B exercises. |
| < 10 | Review | Re-read the full chapter and complete all Part A and Part B exercises. |

After this quiz, move on to Chapter 26 (Text and NLP Visualization).