Exercises: Time Series Visualization
These exercises assume `import pandas as pd`, `import matplotlib.pyplot as plt`, `import matplotlib.dates as mdates`, and `import numpy as np`, and use pandas's `DatetimeIndex` throughout.
Part A: Conceptual (6 problems)
A.1 ★☆☆ | Recall
Name three matplotlib.dates locators and describe when each is appropriate.
Guidance
`YearLocator`: one tick per year (optionally every N years via `base=N`); best for multi-year or multi-decade charts. `MonthLocator`: one tick per month (or specific months via `bymonth`); best for charts spanning months to a few years. `DayLocator`: one tick per day (or specific days via `bymonthday`); best for charts spanning weeks. Other options include `HourLocator`, `MinuteLocator`, `WeekdayLocator`, and `AutoDateLocator`.
A.2 ★☆☆ | Recall
What are the three components of a classical seasonal decomposition?
Guidance
**Trend** (the smooth long-term direction), **seasonal** (the repeating cycle within each period), and **residual** (the noise left after removing trend and seasonal). The additive model: `observed = trend + seasonal + residual`. The multiplicative model: `observed = trend × seasonal × residual`.
A.3 ★★☆ | Understand
Explain the difference between a simple moving average and an exponential moving average.
Guidance
A **simple moving average (SMA)** gives equal weight to every observation in the window. A 30-day SMA is the mean of the last 30 days, each weighted 1/30. An **exponential moving average (EMA)** gives more weight to recent observations than to older ones, with the weight decaying exponentially with age. As a result, the EMA starts responding to a recent change sooner than the SMA, although its long tail of weights means it also takes longer to settle fully. The two look similar on most data. `df.rolling(window=30).mean()` is SMA; `df.ewm(span=30).mean()` is EMA.
A.4 ★★☆ | Understand
When should you use a calendar heatmap instead of a line chart?
Guidance
Use a calendar heatmap when you want to see **weekly cycles**, **seasonal patterns**, or **individual day outliers** in daily data. A line chart of daily data across multiple years is often cluttered; the calendar heatmap compresses the same information into a compact scannable layout. Line charts are better when the question is about **trend** or **magnitude** rather than the calendar pattern.
A.5 ★★☆ | Analyze
Describe what "banking to 45 degrees" means and why it matters for time series charts.
Guidance
Banking to 45 degrees is the perceptual principle (from William Cleveland's 1988 research) that line charts are most readable when the average absolute slope of the line segments is close to 45 degrees. At 45 degrees, the reader can compare the slopes of adjacent segments most accurately. When the line is too flat or too steep, slope comparisons become harder. For time series, this means choosing the aspect ratio so that the slopes of interest appear at roughly 45 degrees on average: usually a wide chart for long time series, and a more square one for volatile short ones.
A.6 ★★★ | Evaluate
A colleague sends you a chart of "daily website traffic over 3 years" with a 365-day rolling mean as the only line. What do you suggest?
Guidance
Several issues. (1) A 365-day rolling mean smooths out annual seasonality entirely, which may be part of the story. Suggest also showing the raw data or a 7- or 30-day rolling mean to preserve finer patterns. (2) Without the raw data visible, specific events (outages, viral posts) disappear. Consider a two-layer chart with raw data in light gray and the smoothed line on top. (3) With pandas's default `min_periods`, a 365-day window leaves the first 364 days empty (`NaN`) because there is not enough history, so roughly the first year of a 3-year chart shows nothing. Disclose this or use a shorter window for early data.
Part B: Applied (10 problems)
B.1 ★☆☆ | Apply
Create a time series DataFrame with a DatetimeIndex and plot it with a formatted year axis.
Guidance
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
dates = pd.date_range("2015-01-01", "2024-12-31", freq="D")
df = pd.DataFrame({"value": np.random.randn(len(dates)).cumsum()}, index=dates)
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(df.index, df["value"])
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
plt.show()
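The same pattern adapts to shorter spans by swapping the locator. The following is a minimal sketch, assuming a synthetic 90-day series; the names `short` and `ax_short` are illustrative and not part of the exercise. It uses `MonthLocator` for major ticks and `WeekdayLocator` for minor ticks.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Synthetic 90-day random walk (illustrative data, not from the chapter)
rng = np.random.default_rng(0)
short_dates = pd.date_range("2024-01-01", periods=90, freq="D")
short = pd.Series(rng.standard_normal(len(short_dates)).cumsum(), index=short_dates)

fig, ax_short = plt.subplots(figsize=(10, 3))
ax_short.plot(short.index, short.values)

# Major ticks at the start of each month, labelled like "Jan 2024"
ax_short.xaxis.set_major_locator(mdates.MonthLocator())
ax_short.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))
# Minor (unlabelled) ticks every Monday
ax_short.xaxis.set_minor_locator(mdates.WeekdayLocator(byweekday=mdates.MO))
```

For a ten-year span, monthly major ticks would crowd the axis, which is why the B.1 answer uses `YearLocator`.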
B.2 ★☆☆ | Apply
Add a 30-day rolling mean to the chart from B.1, along with the raw data shown in light gray.
Guidance
df["ma30"] = df["value"].rolling(window=30).mean()
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(df.index, df["value"], color="lightgray", linewidth=0.5, label="Daily")
ax.plot(df.index, df["ma30"], color="steelblue", linewidth=2, label="30-day MA")
ax.legend()
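To see how the rolling mean differs from an exponential moving average, it can help to feed both a step change. The sketch below assumes synthetic data and illustrative names (`series`, `step`, `lag_day`); it is one way to compare the two, not the chapter's own code.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
idx = pd.date_range("2024-01-01", periods=200, freq="D")
series = pd.Series(rng.standard_normal(len(idx)).cumsum(), index=idx)

sma = series.rolling(window=30).mean()  # equal weights; NaN for the first 29 rows
ema = series.ewm(span=30).mean()        # exponentially decaying weights; defined from row 0

# Step input: 0 for 100 days, then 1 for 100 days
step = pd.Series([0.0] * 100 + [1.0] * 100, index=idx)
sma_step = step.rolling(window=30).mean()
ema_step = step.ewm(span=30).mean()
lag_day = idx[104]  # five days after the jump

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(step.index, step, color="lightgray", label="Step input")
ax.plot(sma_step.index, sma_step, color="steelblue", label="30-day SMA")
ax.plot(ema_step.index, ema_step, color="darkorange", label="EMA (span=30)")
ax.legend()
```

Five days after the jump the EMA has moved further than the SMA, but the SMA reaches 1.0 exactly 30 days after the jump while the EMA is still approaching it.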
B.3 ★★☆ | Apply
Perform a seasonal decomposition using statsmodels and plot the four panels (observed, trend, seasonal, residual).
Guidance
from statsmodels.tsa.seasonal import seasonal_decompose
# Create a series with clear seasonality
dates = pd.date_range("2015-01-01", "2024-12-31", freq="D")
season = 5 * np.sin(2 * np.pi * np.arange(len(dates)) / 365)
trend = np.arange(len(dates)) * 0.01
noise = np.random.randn(len(dates))
df = pd.DataFrame({"value": trend + season + noise}, index=dates)
result = seasonal_decompose(df["value"], model="additive", period=365)
fig, axes = plt.subplots(4, 1, figsize=(12, 10), sharex=True)
axes[0].plot(result.observed); axes[0].set_ylabel("Observed")
axes[1].plot(result.trend); axes[1].set_ylabel("Trend")
axes[2].plot(result.seasonal); axes[2].set_ylabel("Seasonal")
axes[3].plot(result.resid); axes[3].set_ylabel("Residual")
plt.tight_layout()
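To make the additive model concrete, the decomposition can also be approximated by hand with pandas alone. This is a simplified sketch of the classical procedure under stated assumptions (a fixed 365-day period, synthetic data, illustrative names such as `trend_est`); it is not statsmodels' implementation and its results may differ in detail.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
dates = pd.date_range("2015-01-01", "2024-12-31", freq="D")
t = np.arange(len(dates))
observed = pd.Series(
    0.01 * t + 5 * np.sin(2 * np.pi * t / 365) + rng.standard_normal(len(t)),
    index=dates,
)

period = 365

# 1. Trend: centered moving average over one full period
trend_est = observed.rolling(window=period, center=True).mean()

# 2. Seasonal: mean of the detrended values at each position in the cycle
detrended = observed - trend_est
position = pd.Series(t % period, index=dates)
seasonal_means = detrended.groupby(position).mean()
seasonal_means = seasonal_means - seasonal_means.mean()  # center on zero
seasonal_est = position.map(seasonal_means)

# 3. Residual: whatever trend and seasonal do not explain
resid_est = observed - trend_est - seasonal_est
```

The centered window leaves the trend (and therefore the residual) undefined for the first and last 182 days, which is the same edge effect visible in the trend panel of a classical decomposition.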
B.4 ★★☆ | Apply
Add a vertical line with an annotation at a specific date ("Event X on 2020-03-15") to a time series chart.
Guidance
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(df.index, df["value"])
ax.axvline(pd.Timestamp("2020-03-15"), color="red", linestyle="--", alpha=0.7)
ax.annotate("Event X", xy=(pd.Timestamp("2020-03-15"), ax.get_ylim()[1]),
            xytext=(10, -15), textcoords="offset points", color="red", fontsize=9)
B.5 ★★☆ | Apply
Highlight anomalies (points more than 2 standard deviations from the mean) as red scatter markers on top of the line chart.
Guidance
mean = df["value"].mean()
std = df["value"].std()
anomalies = df[abs(df["value"] - mean) > 2 * std]
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(df.index, df["value"], color="steelblue")
ax.scatter(anomalies.index, anomalies["value"], color="red", s=40, zorder=5,
           label="Anomaly")
ax.legend()
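On a series with a trend, a global mean and standard deviation tend to flag whole stretches that sit far from the overall mean rather than sudden jumps. One alternative is a rolling z-score that compares each point with its recent past. The sketch below is an illustration under assumptions: the 30-day window, the threshold of 3, and the injected spike are arbitrary choices, not values from the chapter.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
idx = pd.date_range("2022-01-01", periods=365, freq="D")
values = pd.Series(0.05 * np.arange(365) + rng.standard_normal(365), index=idx)
values.iloc[200] += 8  # inject one artificial spike to detect

window = 30
# Baseline from the PRECEDING window only: shift(1) keeps each point
# out of its own mean and standard deviation
roll_mean = values.shift(1).rolling(window).mean()
roll_std = values.shift(1).rolling(window).std()
z = (values - roll_mean) / roll_std
local_anomalies = values[z.abs() > 3]

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(values.index, values, color="steelblue")
ax.scatter(local_anomalies.index, local_anomalies, color="red", s=40, zorder=5,
           label="Local anomaly")
ax.legend()
```

The first 30 points have no baseline and are never flagged, so a chart built this way should say so.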
B.6 ★★☆ | Apply
Use pd.DataFrame.resample to convert daily data to monthly averages and plot the result.
Guidance
df_monthly = df["value"].resample("M").mean()
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(df_monthly.index, df_monthly, marker="o")
ax.set_title("Monthly Mean")
For start-of-month anchoring, use `"MS"` instead of `"M"`. In pandas 2.2 and later, `"M"` is deprecated in favor of `"ME"` (month end).
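Resampling is not limited to a single statistic. As a hedged sketch on synthetic data (the names `daily` and `monthly_stats` are illustrative), the following aggregates each month to its mean, minimum, and maximum and draws the range as a band behind the mean line.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
days = pd.date_range("2022-01-01", "2023-12-31", freq="D")
daily = pd.Series(rng.standard_normal(len(days)).cumsum(), index=days)

# One row per month (labelled by month start), one column per statistic
monthly_stats = daily.resample("MS").agg(["mean", "min", "max"])

fig, ax = plt.subplots(figsize=(12, 4))
ax.fill_between(monthly_stats.index, monthly_stats["min"], monthly_stats["max"],
                color="steelblue", alpha=0.2, label="Monthly min to max")
ax.plot(monthly_stats.index, monthly_stats["mean"], color="steelblue",
        marker="o", label="Monthly mean")
ax.set_title("Monthly Mean with Range")
ax.legend()
```

Showing the range alongside the mean keeps some of the within-month variability that averaging alone discards.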
B.7 ★★★ | Apply
Build an interactive Plotly time series with a range slider and unified hover mode.
Guidance
import plotly.express as px
fig = px.line(df, x=df.index, y="value", title="Time Series")
fig.update_layout(
    xaxis_rangeslider_visible=True,
    hovermode="x unified",
)
fig.show()
B.8 ★★☆ | Apply
Create a small-multiples chart of monthly mean values across several years, one panel per year.
Guidance
import seaborn as sns
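# Name the index explicitly so the column produced by reset_index() is predictable
df_monthly = df[["value"]].resample("MS").mean().rename_axis("date").reset_index()
df_monthly["year"] = df_monthly["date"].dt.year
df_monthly["month"] = df_monthly["date"].dt.month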
g = sns.FacetGrid(df_monthly, col="year", col_wrap=3, height=2.5, aspect=1.2)
g.map(plt.plot, "month", "value")
B.9 ★★★ | Apply
Build a calendar heatmap of daily data using calplot (or manually with matplotlib if calplot is unavailable).
Guidance
# With calplot:
import calplot
calplot.calplot(df["value"], cmap="YlOrRd")
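# Manual matplotlib version (sketch), one year at a time so weeks from different years are not averaged together:
cal = df.loc["2023", ["value"]].copy()
# ISO weeks can assign the first or last days of a year to week 52/53 or week 1
cal["week"] = cal.index.isocalendar().week.to_numpy()
cal["day"] = cal.index.dayofweek
pivot = cal.pivot_table(values="value", index="day", columns="week")
fig, ax = plt.subplots(figsize=(20, 3))
ax.imshow(pivot, cmap="YlOrRd", aspect="auto")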
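The manual sketch above relies on ISO week numbers, which can wrap around at the year boundary. A fuller version, offered here as one possible approach rather than what calplot does internally, counts weeks from the Monday on or before 1 January and fills a 7-row grid directly. The data and names (`daily`, `grid`) are illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
days = pd.date_range("2023-01-01", "2023-12-31", freq="D")
daily = pd.Series(rng.random(len(days)), index=days)

# Column = whole weeks since the Monday on or before 1 January,
# so late-December days never wrap back to column 0
first_monday = days[0] - pd.Timedelta(days=days[0].dayofweek)
week_col = ((days - first_monday).days // 7).to_numpy()
weekday_row = days.dayofweek.to_numpy()  # 0 = Monday ... 6 = Sunday

# Cells with no date in this year stay NaN and are drawn blank
grid = np.full((7, week_col.max() + 1), np.nan)
grid[weekday_row, week_col] = daily.to_numpy()

fig, ax = plt.subplots(figsize=(14, 2.5))
im = ax.imshow(grid, cmap="YlOrRd", aspect="equal")
ax.set_yticks(range(7))
ax.set_yticklabels(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"])
ax.set_xlabel("Week of year")
fig.colorbar(im, ax=ax, shrink=0.8)
```

Each day occupies exactly one cell, so a single outlier day stays visible instead of being averaged with another date.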
B.10 ★★★ | Create
Build a forecast visualization: historical line in black, forecast line in red, 80% confidence band shaded.
Guidance
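# Stand-in for a real model: treat the data from 2023 onward as the "forecast"
hist = df[df.index < "2023-01-01"]
forecast = df[df.index >= "2023-01-01"]
forecast_mean = forecast["value"]
# Placeholder band of fixed half-width 2; a real 80% interval comes from the forecasting model
forecast_lower = forecast_mean - 2
forecast_upper = forecast_mean + 2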
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(hist.index, hist["value"], color="black", label="Historical")
ax.plot(forecast.index, forecast_mean, color="red", label="Forecast")
ax.fill_between(forecast.index, forecast_lower, forecast_upper,
                color="red", alpha=0.2, label="80% CI")
ax.axvline(pd.Timestamp("2023-01-01"), color="gray", linestyle="--")
ax.legend()
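When a real forecast is available, the band should come from the model's own uncertainty. As a minimal sketch, assuming a naive "last value carried forward" forecast on a random walk (an illustrative choice, not a method the chapter prescribes), an 80% interval can be built from the standard deviation of daily changes and widened with the square root of the horizon.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(6)
hist_idx = pd.date_range("2022-01-01", "2022-12-31", freq="D")
history = pd.Series(rng.standard_normal(len(hist_idx)).cumsum(), index=hist_idx)

horizon = 90
future_idx = pd.date_range(hist_idx[-1] + pd.Timedelta(days=1), periods=horizon, freq="D")

# Naive forecast: carry the last observed value forward
point = pd.Series(history.iloc[-1], index=future_idx)

# Step-size standard deviation estimated from historical day-to-day changes
sigma = history.diff().std()
z80 = 1.2816  # standard normal quantile for a two-sided 80% interval
steps = np.arange(1, horizon + 1)
half_width = z80 * sigma * np.sqrt(steps)  # uncertainty grows with sqrt(horizon)
lower = point - half_width
upper = point + half_width

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(history.index, history, color="black", label="Historical")
ax.plot(point.index, point, color="red", label="Forecast")
ax.fill_between(future_idx, lower, upper, color="red", alpha=0.2, label="80% interval")
ax.axvline(future_idx[0], color="gray", linestyle="--")
ax.legend()
```

The widening band communicates that uncertainty increases with distance from the last observation, which a fixed-width band cannot show.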
Part C: Synthesis (4 problems)
C.1 ★★★ | Analyze
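Take the climate temperature dataset (150 years of data). Panels (b) through (d) need sub-annual resolution, so this exercise assumes monthly or daily observations are available rather than annual values only. Build a 4-panel figure: (a) full series with 10-year rolling mean, (b) STL decomposition, (c) cycle plot of monthly means, (d) calendar heatmap of the last 20 years. Describe what each panel reveals that the others do not.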
Guidance
Panel (a) shows the long-term trend clearly with the rolling mean; the raw data gives context. Panel (b) separates trend from seasonal from residual, making the magnitude of seasonal variation explicit. Panel (c) reveals which months have warmed fastest; in climate data, winter months typically warm faster than summer months. Panel (d) shows specific years and days that were unusually hot or cold, revealing individual events the other panels hide. Together they give a complete time series analysis; no single panel alone would answer all four questions.
C.2 ★★★ | Evaluate
You are asked to visualize the number of deaths per day from COVID-19 in a country, across 2020-2023. Which techniques from this chapter apply, and why?
Guidance
(1) **Rolling mean** to smooth reporting noise (weekends had low reporting, producing a weekly cycle). (2) **Annotations** for lockdowns, vaccine rollouts, variant emergence. (3) **Log scale** for the exponential growth phases. (4) **Range slider** in Plotly if the chart is interactive, so readers can zoom into specific waves. (5) **Faceting by region** if the country has sub-national variation. (6) **Forecast visualization** if the chart includes projection. Avoid dual-axis (deaths + cases), non-zero baselines for area charts, and aggressive smoothing that hides individual wave peaks.
C.3 ★★★ | Create
Build a sparkline-style chart inline with a short text summary: "Revenue ↗ $1.2M (up 15%) [sparkline]".
Guidance
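One way to build such a sparkline is sketched below. The helper name `sparkline`, its signature, and the revenue figures are hypothetical and chosen for illustration; this is not the function from Section 25.8.

```python
import numpy as np
import matplotlib.pyplot as plt

def sparkline(values, ax=None, color="steelblue"):
    """Draw a minimal axis-free line and mark the last point (hypothetical helper)."""
    if ax is None:
        _, ax = plt.subplots(figsize=(1.5, 0.3))
    ax.plot(values, color=color, linewidth=1)
    ax.plot(len(values) - 1, values[-1], "o", color=color, markersize=2)
    ax.axis("off")  # hides ticks, labels, and spines
    return ax

# Illustrative monthly revenue in $M (made-up numbers)
revenue = np.array([1.04, 1.06, 1.05, 1.10, 1.12, 1.15, 1.20])

# A wide, short figure with the text on the left and the sparkline on the right
fig = plt.figure(figsize=(4, 0.5))
fig.text(0.02, 0.5, "Revenue ↗ $1.2M (up 15%)", va="center", fontsize=10)
spark_ax = fig.add_axes([0.72, 0.2, 0.25, 0.6])  # [left, bottom, width, height]
sparkline(revenue, ax=spark_ax)
```
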
Use matplotlib's `figsize=(1.5, 0.3)`, remove axes and spines, and place the chart next to text in a larger figure or an HTML document. The sparkline function from Section 25.8 is reusable: call it with your data and embed the result.
C.4 ★★★ | Evaluate
The chapter argues that time series charts often need multiple visualizations at different scales. When is this overkill? Can you think of scenarios where one chart is enough?
Guidance
One chart is enough when: (1) the audience has a single specific question ("how did sales do this quarter?"), (2) the time span is short (days or weeks, not years), (3) there is no seasonality to disentangle, (4) there are no anomalies worth highlighting. In these cases, a single well-designed line chart does the job. The multi-panel approach is for exploratory or comprehensive analysis where the analyst wants to understand the series fully. For a simple operational dashboard, a single chart is usually better than four.
These exercises cover the main time series visualization techniques. Chapter 26 introduces text and NLP visualization.