Exercises: Capstone — The Complete Data Story
These exercises are the capstone project deliverables. Each is substantial, not a quick problem. Allow 2-4 weeks for independent completion.
Part A: Climate Capstone Deliverables (10 items)
A.1 ★★★ | Create
Write a one-page project brief: the question, the audience, the data, and the planned deliverables. This is Step 1 of the workflow from Chapter 33.
Guidance
One page, covering: (1) the specific question ("How has global warming accelerated, and what does the multi-variable evidence show?"), (2) the audience (a general science-literate reader), (3) the data (climate dataset: temperature, CO2, sea level, year, era), (4) the deliverables (6 static figures, 1 dashboard, 1 PDF report, 1 slide deck). This document guides every subsequent step.
A.2 ★★★ | Create
Produce an exploratory Jupyter notebook that loads the climate data, assesses quality, and generates 10+ quick charts. Document what you found.
Guidance
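For this exercise, the quality-assessment pass might be sketched as below. The data is a synthetic stand-in, and the column names (`temperature`, `co2`, `sea_level`, `era`) and era labels are assumptions about how your climate file is laid out, so adjust them to match. `assess_quality` is a hypothetical helper, not something from the book.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the climate dataset; column names are assumptions.
rng = np.random.default_rng(0)
years = np.arange(1900, 2020)
df = pd.DataFrame({
    "year": years,
    "temperature": 0.01 * (years - 1900) + rng.normal(0, 0.1, years.size),
    "co2": 295 + 0.9 * (years - 1900),
    "sea_level": 1.5 * (years - 1900),
    "era": np.where(years < 1980, "historical", "modern"),  # hypothetical labels
})
df.loc[5, "sea_level"] = np.nan  # plant one missing value so the check finds something


def assess_quality(frame: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing count, and unique count."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "missing": frame.isnull().sum(),
        "unique": frame.nunique(),
    })


report = assess_quality(df)
print(report)
print(df.describe())
```

Collecting the checks into one table makes it easier to paste the result into the notebook next to your written findings.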
Use pandas for assessment (`df.info()`, `df.describe()`, `df.isnull().sum()`). Quick charts with matplotlib/seaborn: histograms of each variable, time series of each variable, scatter of CO2 vs. temperature, correlation heatmap, box plots by era. Annotate findings: "temperature distribution is right-skewed in the modern era", "CO2 has a strong linear relationship with temperature", etc.
A.3 ★★★ | Create
Produce 6 publication-quality static figures, each answering a different question about the climate data. Apply the brand from Chapter 32.
Guidance
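As a starting point for a time series figure with a 10-year rolling mean, here is a minimal sketch. The anomalies are synthetic, the colors are matplotlib defaults rather than your brand palette, and the title and source strings are placeholders you would replace.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic annual anomalies, for illustration only
rng = np.random.default_rng(1)
years = np.arange(1900, 2021)
anomaly = pd.Series(
    0.008 * (years - 1900) + rng.normal(0, 0.12, years.size), index=years
)

# Centered 10-year rolling mean; years near both ends are NaN by design
smooth = anomaly.rolling(window=10, center=True).mean()

fig, ax = plt.subplots(figsize=(8, 4.5))
ax.plot(anomaly.index, anomaly.values, color="0.7", linewidth=1,
        label="Annual anomaly")
ax.plot(smooth.index, smooth.values, color="C3", linewidth=2,
        label="10-year rolling mean")
ax.set_title("Placeholder action title: state the finding, not the topic",
             loc="left")
ax.set_xlabel("Year")
ax.set_ylabel("Temperature anomaly (°C)")
ax.legend(frameon=False)
# Source attribution in the bottom-left corner of the figure
fig.text(0.01, 0.01, "Source: synthetic data for illustration",
         fontsize=8, color="0.4")
```

To write the file, something like `fig.savefig("figures/figure_01.png", dpi=300, bbox_inches="tight")` would do; the path simply follows the archive layout suggested in A.10.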
Suggested 6 figures: (1) Full temperature time series with 10-year rolling mean and annotations. (2) CO2 vs. temperature scatter with regression. (3) Monthly heatmap of temperature anomalies. (4) Multi-panel small multiples (temperature, CO2, sea level). (5) Distributional comparison (violin plots by era). (6) Pair plot of all variables colored by era. Each figure must have an action title, source attribution, brand colors, and a caption.
A.4 ★★★ | Create
Build a Streamlit dashboard for the climate dataset with sidebar filters, multiple chart tabs, KPI metrics, and a download button.
Guidance
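It can help to keep the dashboard's filtering and KPI logic in plain functions that you can test without launching the app. The sketch below assumes columns named `year`, `temperature`, and `co2`, treats "current CO2" as the value for the latest year in the filtered range, and uses made-up numbers. `filter_years` and `compute_kpis` are hypothetical helper names.

```python
import pandas as pd


def filter_years(df: pd.DataFrame, start: int, end: int) -> pd.DataFrame:
    """Keep rows whose year falls in the inclusive range [start, end]."""
    return df[(df["year"] >= start) & (df["year"] <= end)]


def compute_kpis(df: pd.DataFrame) -> dict:
    """Compute the three KPI values from an already-filtered frame."""
    latest = df.sort_values("year").iloc[-1]  # row for the most recent year
    return {
        "mean_anomaly": float(df["temperature"].mean()),
        "max_anomaly": float(df["temperature"].max()),
        "current_co2": float(latest["co2"]),
    }


# Made-up values, for illustration only
demo = pd.DataFrame({
    "year": [2000, 2001, 2002, 2003],
    "temperature": [0.1, 0.2, 0.4, 0.3],
    "co2": [370.0, 372.0, 374.0, 376.0],
})

subset = filter_years(demo, 2001, 2003)
kpis = compute_kpis(subset)

# Bytes payload of the filtered data, suitable for a download button
csv_bytes = subset.to_csv(index=False).encode("utf-8")
print(kpis)
```

In a Streamlit app, each value could be shown with `st.metric` and `csv_bytes` handed to `st.download_button`; both calls are left out here so the sketch runs without Streamlit installed.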
Structure: sidebar with date range slider, variable selectbox, and smoothing slider. Main area with tabs: "Time Series" (Plotly line chart), "Relationships" (scatter), "Distributions" (violin/box), "Raw Data" (st.dataframe). KPI cards at the top: mean anomaly, max anomaly, current CO2. Download button for filtered data.
A.5 ★★★ | Create
Build an automated PDF report pipeline that generates a monthly climate report with 4 charts, a summary, and a data table.
Guidance
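One way to produce a summary paragraph that is computed rather than hand-written is to build it from a small metrics object, independent of the PDF library. This is a sketch only: `ReportMetrics` and `build_summary` are hypothetical names, the wording of the sentence is an assumption, and the numbers are placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ReportMetrics:
    """Values computed from the filtered data for one report period."""
    start: date
    end: date
    mean_anomaly: float
    max_anomaly: float
    n_months: int


def build_summary(m: ReportMetrics) -> str:
    """Format the metrics as a one-sentence summary for the report body."""
    return (
        f"From {m.start:%Y-%m} to {m.end:%Y-%m} ({m.n_months} months), "
        f"the mean temperature anomaly was {m.mean_anomaly:+.2f} °C, "
        f"with a peak of {m.max_anomaly:+.2f} °C."
    )


# Placeholder values, for illustration only
metrics = ReportMetrics(
    start=date(2020, 1, 1),
    end=date(2020, 12, 1),
    mean_anomaly=0.31,
    max_anomaly=0.52,
    n_months=12,
)
summary = build_summary(metrics)
print(summary)
```

Because the text depends only on the metrics, rerunning the pipeline for a different date range updates the paragraph automatically, whichever PDF library ends up rendering it.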
Use FPDF2 or ReportLab. Parameterize by date range. Include: title page, summary paragraph (computed from metrics), 4 charts (time series, scatter, heatmap, bar by era), data table of monthly means, source attribution, page numbers. Save as PDF. Verify by opening the PDF.
A.6 ★★★ | Create
Create a 10-slide presentation (python-pptx) telling the climate data story following Chapter 9's narrative structure.
Guidance
Suggested slides: (1) Title. (2) The question. (3) Context: what the data covers. (4) The trend: temperature over time. (5) The driver: CO2 relationship. (6) The evidence: multi-variable correlations. (7) The regional view: map or regional comparison. (8) The seasonal view: heatmap or cycle plot. (9) The dashboard: screenshot or summary of the Streamlit app. (10) Conclusion and next steps. Each slide has one chart and one key message. Use python-pptx with the brand template.
A.7 ★★★ | Evaluate
Apply the Master Critique Rubric from Chapter 33 to each of your 6 static figures. Document your scores and any issues found.
Guidance
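If you want to record scores consistently across all six figures, a small scoring helper can enforce the category maxima given in this exercise's guidance. The sketch below is one possible layout: `score_figure` is a hypothetical helper, the 75% flagging threshold is an assumption standing in for "below expectations", and the example scores are invented.

```python
# Category maxima from the rubric breakdown in this exercise (sum is 25)
RUBRIC_MAX = {
    "data integrity": 4,
    "encoding": 4,
    "design": 4,
    "accessibility": 3,
    "ethics": 3,
    "narrative": 4,
    "brand": 3,
}
assert sum(RUBRIC_MAX.values()) == 25


def score_figure(scores: dict, flag_ratio: float = 0.75) -> tuple:
    """Validate scores against the rubric; return (total, flagged categories).

    A category is flagged when its score is below flag_ratio * maximum.
    The default threshold is an assumption, not part of the rubric.
    """
    if set(scores) != set(RUBRIC_MAX):
        raise ValueError("scores must cover exactly the rubric categories")
    flagged = []
    for category, max_points in RUBRIC_MAX.items():
        points = scores[category]
        if not 0 <= points <= max_points:
            raise ValueError(f"{category}: {points} is outside 0..{max_points}")
        if points < flag_ratio * max_points:
            flagged.append(category)
    return sum(scores.values()), flagged


# Invented scores for one figure
example_scores = {
    "data integrity": 4, "encoding": 3, "design": 4, "accessibility": 1,
    "ethics": 3, "narrative": 3, "brand": 3,
}
total, flagged = score_figure(example_scores)
print(f"{total}/25, review: {flagged}")
```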
Use the 25-point rubric: data integrity (4), encoding (4), design (4), accessibility (3), ethics (3), narrative (4), brand (3). Score each figure. Note items that score below expectations. Fix critical issues; document non-critical ones.
A.8 ★★★ | Apply
Apply consistent branding across all outputs: the 6 figures, the dashboard, the PDF report, and the slide deck should all use the same color palette, fonts, and title style.
Guidance
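A shared brand module can be quite small. The sketch below is an assumed shape for a `brand.py`, not the module from Chapter 32: the palette hex codes and font are placeholders, `apply_brand` is a hypothetical helper, and the Plotly settings are kept as a plain dict so the example runs without Plotly installed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
from cycler import cycler

# Placeholder brand values; substitute your own palette and font
PALETTE = ["#1b4965", "#ca6702", "#5fa8d3", "#9b2226"]
FONT_FAMILY = "DejaVu Sans"  # bundled with matplotlib


def apply_brand() -> None:
    """Apply brand defaults to every matplotlib figure created afterwards."""
    plt.rcParams.update({
        "axes.prop_cycle": cycler(color=PALETTE),
        "font.family": FONT_FAMILY,
        "axes.titlelocation": "left",
        "axes.titleweight": "bold",
        "axes.spines.top": False,
        "axes.spines.right": False,
    })


# The same choices expressed as Plotly layout settings; registering this
# as a named Plotly template is left to your own module.
PLOTLY_LAYOUT = {
    "colorway": PALETTE,
    "font": {"family": FONT_FAMILY},
    "title": {"x": 0.0, "xanchor": "left"},
}

apply_brand()
fig, ax = plt.subplots()
(line,) = ax.plot([0, 1], [0, 1])  # first line picks up the first brand color
```

Defining the palette once and deriving both the matplotlib and Plotly settings from it is what keeps the static figures and the dashboard from drifting apart.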
Use the brand module from Chapter 32: a shared `.mplstyle` file, a Plotly template, and helper functions. Import at the top of every script. Verify consistency by placing a figure from each output side by side.
A.9 ★★★ | Evaluate
Write a 1-page reflection on the capstone process: what worked, what was hardest, what you would change, what surprised you.
Guidance
Honest self-assessment. Common themes: "data assessment was more valuable than I expected", "the critique step caught issues I would have shipped", "branding made everything look more professional with minimal effort", "the hardest part was the PDF report because of font issues." The reflection is as important as the deliverables.
A.10 ★★★ | Create
Archive the complete project: all source code in a git repository, with a README documenting the structure, requirements, and how to reproduce each output.
Guidance
climate-capstone/
    README.md
    requirements.txt
    data/climate.csv
    notebooks/exploration.ipynb
    figures/figure_01.png ... figure_06.png
    dashboard/app.py
    report/generate_report.py
    slides/generate_slides.py
    brand/brand.py, climate_observatory.mplstyle
The README should explain how to reproduce every output. `pip install -r requirements.txt`, then run each script.
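To make "run each script" a single step, you could add a small driver at the repository root. The sketch below is a hypothetical `reproduce.py`: it covers only the two generator scripts from the layout above, since the dashboard and the notebook are interactive and are started separately. It defaults to a dry run that prints the commands without executing them.

```python
import subprocess
import sys
from pathlib import Path

# Batch scripts from the suggested layout; the dashboard and notebook are
# interactive, so they are not included here.
SCRIPTS = ["report/generate_report.py", "slides/generate_slides.py"]


def build_commands(root: Path) -> list:
    """Build one command per script, using the current Python interpreter."""
    return [[sys.executable, str(root / script)] for script in SCRIPTS]


def reproduce(root: Path, dry_run: bool = True) -> list:
    """Run (or, in a dry run, just print) every generator script in order."""
    commands = build_commands(root)
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True stops at the first script that fails
            subprocess.run(cmd, check=True, cwd=root)
    return commands


planned = reproduce(Path("climate-capstone"), dry_run=True)
```

Calling `reproduce(Path("."), dry_run=False)` from inside the repository would execute the scripts for real; documenting that one command in the README gives a reader a single entry point.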
Part B: Independent Capstone (6 items)
B.1 ★★★ | Create
Choose a dataset from a different domain. Suggested options: (1) NYC taxi trips (subset), (2) World Bank development indicators, (3) US election results by county, (4) Spotify top tracks, (5) Stack Overflow developer survey, (6) your own dataset from work or research.
Guidance
Choose a dataset you find interesting and that has enough variables for multi-chart exploration. The dataset should be freely available and have at least 1,000 rows.
B.2 ★★★ | Create
Write a project brief for the independent capstone: question, audience, data, deliverables.
Guidance
Same structure as A.1 but for the new dataset. The question should be specific and testable.
B.3 ★★★ | Create
Produce an exploratory notebook for the independent dataset.
Guidance
Same structure as A.2. Load, assess, explore with quick charts, document findings.
B.4 ★★★ | Create
Produce 4+ publication-quality static figures for the independent dataset, with branding.
Guidance
Fewer than the climate capstone (4 instead of 6) because this is independent work. Still requires action titles, source attribution, brand application, and captions.
B.5 ★★★ | Create
Build either a Streamlit dashboard or an automated PDF report for the independent dataset (your choice).
Guidance
Choose the output format that best fits the audience. A dashboard is better for exploration; a report is better for delivery. Whichever you choose, apply the brand.
B.6 ★★★ | Evaluate
Apply the critique rubric to the independent capstone, write a reflection, and archive the project.
Guidance
Same rubric and reflection as A.7 and A.9. Archive the code in a repository with a README.
Part C: Meta-Reflection (4 items)
C.1 ★★★ | Evaluate
Compare your first matplotlib chart (from Chapter 10 or 11) to your capstone figure. What has changed? What skills did you develop?
Guidance
Pull up your earliest chart from the book's exercises and place it next to a capstone figure. The difference should be dramatic: default styling vs. branded, generic title vs. action title, no annotations vs. rich annotations. List the specific improvements and trace each to the chapter that taught it.
C.2 ★★★ | Evaluate
Which chapter was the most valuable to you personally, and why?
Guidance
This is subjective. Common answers: Chapter 7 (action titles), Chapter 12 (customization), Chapter 20 (Plotly for interactivity), Chapter 33 (workflow). The answer depends on your background and goals. The exercise is to identify what you value most so you can invest in that area going forward.
C.3 ★★★ | Evaluate
What skill gap do you still have after finishing this book? What is your plan to address it?
Guidance
Honest self-assessment. Common gaps: D3/JavaScript for custom web viz, advanced statistical methods, specific domain knowledge, design skills beyond what a style sheet can provide. Plan: read a specific book, take a specific course, build a specific project. The exercise closes the loop from "what I learned" to "what I will learn next."
C.4 ★★★ | Create
Build a personal visualization portfolio: 5-10 of your best charts from this book, with a brief description of each and what it demonstrates.
Guidance
Pick your best work across the book. For each chart: a thumbnail, the question it answers, the technique it demonstrates, and the chapter it came from. Publish as a GitHub Pages site, a personal website section, or a PDF portfolio. This is a professional asset for job interviews and client pitches.
With the capstone complete, you have a full portfolio of visualization work spanning every technique in the book. Chapter 35 is the Visualization Gallery, a permanent reference of 50 chart types with code for each.