Case Study 2: Writing for Different Audiences — Technical Report vs. Blog Post vs. Slide Deck
Tier 2 — Attributed Narrative: This case study uses a fictional data scientist and a constructed analysis scenario to illustrate real communication challenges. The transit data patterns are inspired by published urban mobility research, but specific numbers and characters are fictional. The three communication formats shown are representative examples, not excerpts from real documents.
One Analysis, Three Documents, Three Audiences
James Chen had just finished an analysis he was genuinely excited about. Working with three years of public transit ridership data from the Metro City Transit Authority, he had discovered something unexpected: ridership recovery after the pandemic was not uniform across routes. Some routes had recovered to pre-pandemic levels; others were at 40% of their former ridership. And the pattern was not random — it correlated strongly with how many remote-work-eligible jobs were located near each route.
Routes serving downtown office districts were still struggling. Routes serving hospitals, warehouses, universities, and retail areas had mostly recovered. The implication was significant: the transit authority was running the same bus schedules as 2019, allocating service based on historical ridership that no longer reflected reality. Millions of dollars in service hours were being spent on near-empty buses running downtown routes, while recovered routes were overcrowded.
James needed to communicate this finding to three different audiences in the same week:
- Monday: His data science team, who needed to review his methodology and extend the analysis
- Wednesday: The transit authority's board of directors, who needed to decide whether to restructure routes
- Friday: The public, via the transit authority's blog, because the community would be affected by any route changes
Same data. Same finding. Three completely different documents.
Document 1: The Technical Report (For the Data Science Team)
James's technical report was shared as a Jupyter notebook in the team's GitHub repository. Here is how it was structured:
Opening Section
```markdown
# Transit Ridership Recovery Analysis: Route-Level Patterns and Workforce Composition Correlations

**Author:** James Chen
**Date:** March 2026
**Data:** MCTA ridership data (Jan 2019 – Dec 2025),
Census ACS 5-year estimates (2020), BLS employment data
**Repository:** github.com/mcta-analytics/ridership-recovery

## Summary

This analysis examines route-level ridership recovery patterns across
MCTA's 47 bus routes, comparing current ridership to 2019 baselines.
We find that recovery rates correlate strongly with the proportion of
remote-work-eligible employment along each route corridor (Pearson r = -0.78,
p < 0.001). Routes serving areas with >60% remote-eligible jobs are at
52% of 2019 ridership on average, while routes serving areas with <20%
remote-eligible jobs have recovered to 94%.

We use Census ACS employment data to classify jobs by remote-work
eligibility following the Dingel & Neiman (2020) taxonomy and match
employment centers to transit routes using a 400-meter buffer zone.
```
Methodology
The notebook included detailed code showing every step:
```python
import pandas as pd
import geopandas as gpd  # used later in the notebook for route-corridor buffers
from scipy import stats

# Load ridership data
ridership = pd.read_csv('data/mcta_ridership_2019_2025.csv',
                        parse_dates=['date'])

# Derive the calendar year from the parsed date column
ridership['year'] = ridership['date'].dt.year

# Recovery ratio for each route: mean 2025 daily riders
# as a percentage of the 2019 baseline
baseline_2019 = (ridership[ridership['year'] == 2019]
                 .groupby('route_id')['daily_riders']
                 .mean())
current = (ridership[ridership['year'] == 2025]
           .groupby('route_id')['daily_riders']
           .mean())
recovery = (current / baseline_2019 * 100).rename('recovery_pct')
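The headline correlation in the summary follows directly from this recovery series. A minimal sketch of that step, using `scipy.stats.pearsonr` on toy values (the `remote_share` column is hypothetical here; the real notebook derives it from ACS employment data joined to route corridors):

```python
import pandas as pd
from scipy import stats

# Illustrative route-level inputs only; values are invented for this sketch,
# not taken from the actual MCTA analysis.
routes = pd.DataFrame({
    'route_id':     ['R7', 'R12', 'R22', 'R31', 'R40'],
    'recovery_pct': [96.0, 91.0, 48.0, 55.0, 88.0],
    'remote_share': [0.12, 0.18, 0.71, 0.64, 0.22],
})

# Pearson correlation between remote-eligible job share and recovery
r, p = stats.pearsonr(routes['remote_share'], routes['recovery_pct'])
print(f"Pearson r = {r:.2f}, p = {p:.4f}")  # strongly negative for these values
```

The same call, applied to all 47 routes, yields the r = -0.78 reported in the summary.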
The notebook contained 34 code cells, 18 Markdown cells, and produced 12 visualizations. It documented every data cleaning decision, explained why James chose a 400-meter buffer zone (citing pedestrian access research), discussed the limitations of the Dingel-Neiman remote work classification, and included sensitivity analyses showing that the core finding held under different buffer sizes and classification thresholds.
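The buffer-size sensitivity check described above can be sketched without the full geopandas machinery. Assuming projected coordinates in meters, a job counts as "on the corridor" if it lies within the buffer distance of any stop; the notebook's actual version uses geopandas buffers against ACS geographies, but the logic is the same. All names and values here are illustrative:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical projected coordinates (meters): one route's stops and a
# synthetic cloud of job locations with a ~40% remote-eligible rate.
stops = np.array([[0.0, 0.0], [500.0, 0.0], [1000.0, 0.0]])
jobs = pd.DataFrame({
    'x': rng.uniform(-200, 1200, 300),
    'y': rng.uniform(-600, 600, 300),
    'remote_eligible': rng.random(300) < 0.4,
})

def remote_share_within(buffer_m):
    """Share of remote-eligible jobs within buffer_m of any stop."""
    pts = jobs[['x', 'y']].to_numpy()
    # distance from each job to its nearest stop
    d = np.min(np.linalg.norm(pts[:, None, :] - stops[None, :, :], axis=2), axis=1)
    nearby = jobs[d <= buffer_m]
    return float(nearby['remote_eligible'].mean()) if len(nearby) else float('nan')

# Sensitivity check: does the corridor classification move much as the
# buffer varies around the chosen 400 m?
for buf in (200, 400, 600, 800):
    print(buf, round(remote_share_within(buf), 3))
```

If the computed share (and the downstream correlation) is stable across buffer sizes, the 400-meter choice is not driving the finding, which is what James's sensitivity section demonstrated.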
What the Technical Report Prioritized
- Reproducibility: Every step was in code. A teammate could re-run the notebook and get identical results.
- Methodology transparency: Decisions were justified with references. Alternative approaches were discussed.
- Statistical rigor: Correlations included confidence intervals. Regression diagnostics were shown. Potential confounders (route length, neighborhood income, service frequency) were tested.
- Limitations: A full section discussed what the analysis could and could not conclude.
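Testing confounders, as the notebook did, means checking whether the remote-share effect survives once other route characteristics enter the regression. A minimal sketch with plain `numpy.linalg.lstsq` on synthetic data (the column names and the data-generating values are invented for illustration, not James's actual model):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 47  # MCTA's 47 routes

# Synthetic route-level table; in the synthetic truth, recovery depends on
# remote_share (coefficient -65) and only weakly on the other columns.
df = pd.DataFrame({
    'remote_share':  rng.uniform(0.05, 0.80, n),
    'route_length':  rng.uniform(5.0, 25.0, n),    # km
    'median_income': rng.uniform(30.0, 90.0, n),   # $k
})
df['recovery_pct'] = (100 - 65 * df['remote_share']
                      + 0.1 * df['route_length']
                      + rng.normal(0, 4, n))

# Multiple regression: intercept + remote_share + candidate confounders
X = np.column_stack([np.ones(n),
                     df['remote_share'],
                     df['route_length'],
                     df['median_income']])
beta, *_ = np.linalg.lstsq(X, df['recovery_pct'].to_numpy(), rcond=None)
print(f"coefficient on remote_share: {beta[1]:.1f}")  # remains large and negative
```

If the remote-share coefficient stayed large and negative with the controls included, as it does here by construction, the confounders were not explaining away the pattern.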
What the Technical Report Did NOT Include
- Recommendations. James's team does not make policy decisions — they provide analysis.
- Simplified language. Terms like "Pearson correlation," "buffer zone," and "sensitivity analysis" were used without explanation because the audience knew them.
- Emotional framing. No stories about individual riders. No photographs. The tone was neutral and methodological.
Document 2: The Slide Deck (For the Board of Directors)
On Wednesday, James presented to the transit authority's board — seven appointed directors, most of whom had backgrounds in business, law, or city planning. They were not data scientists. They had 15 minutes of agenda time for this topic.
James built a 10-slide deck. Here is what it contained:
Slide 1: Title "Our Bus Routes Were Designed for 2019. Our Riders Live in 2026."
No methodology. No data source. Just a provocative statement that framed the problem in terms the board would understand: the system was out of date.
Slide 2: The overall picture "Total ridership has recovered to 78% of pre-pandemic levels — but the average hides a dramatic split."
A single bar showing 78% overall recovery, with an annotation: "But this number masks a critical pattern."
Slide 3: The split "Routes serving office districts are at 52%. Routes serving hospitals, campuses, and retail are at 94%."
Two stacked horizontal bars, color-coded. The visual gap was immediately obvious. No statistics, no correlation coefficients — just the key comparison.
Slide 4: Why "The difference is remote work. Office workers can work from home. Nurses, warehouse workers, and retail staff cannot."
A simple icon-based graphic showing which job types returned to in-person work and which did not. No regression output — just the explanation in plain language.
Slide 5: The consequence "We are running 2019 schedules for a 2026 city. This means empty buses downtown and overcrowded buses in medical and retail corridors."
Two photographs side by side: a near-empty bus (representative of a downtown route) and a standing-room-only bus (representative of a healthcare corridor route). Below: "This costs $4.2 million per year in misallocated service hours."
Slide 6: Who is affected "The riders we are underserving are disproportionately lower-income essential workers."
A demographic breakdown showing that riders on recovered routes — the overcrowded ones — had lower average incomes and were more likely to be people of color. James was careful with this slide: he did not editorialize, but he presented the equity dimension clearly.
Slide 7: The opportunity "Reallocating 15% of downtown service hours to high-recovery routes would eliminate overcrowding and save $1.1 million annually."
A before-and-after comparison showing the proposed reallocation. Simple, visual, specific.
Slide 8: What other cities are doing "Three comparable transit agencies have already made similar adjustments."
Brief case references to transit agencies in other cities that had restructured post-pandemic routes. This provided social proof — the board was not being asked to do something radical.
Slide 9: The recommendation "We recommend a phased reallocation beginning in Q3, with community input sessions in affected neighborhoods."
Specific. Actionable. Included the community input element because James knew the board would worry about public backlash.
Slide 10: Key takeaway "The question is not whether to change routes. It is how long we continue spending $4.2 million per year serving a ridership pattern that no longer exists."
Reframing the decision: inaction is also a choice, and it has a cost.
What the Slide Deck Prioritized
- Decision-relevance: Every slide connected to the board's decision.
- Plain language: No statistical terminology. No code. No methodology.
- Visual impact: Charts were simple and annotated. Numbers were rounded.
- Specific recommendation: Not "further study" but a concrete plan.
- Equity framing: The board needed to understand who was affected.
What the Slide Deck Did NOT Include
- Code or methodology details (available in the technical report if requested)
- Confidence intervals or p-values
- Sensitivity analyses
- Caveats beyond what was necessary for honest communication
Document 3: The Blog Post (For the Public)
On Friday, James published a blog post on the transit authority's website. The audience was the general public — riders, community members, journalists, and local activists. Here is an abbreviated version:
Why Your Bus Might Be Packed While Another Route Runs Empty
By James Chen, Data Analyst, Metro City Transit Authority
If you ride the Route 7 to Memorial Hospital, you have probably noticed it is crowded. Really crowded. Standing-room-only at 7:30 AM, every weekday.
And if you have ever glanced out the window and seen a nearly empty Route 22 heading downtown, you might have wondered: why is that bus running half-empty while mine is packed?
We wondered the same thing. So we dug into the data.
The short answer: the pandemic changed where people work — but our bus schedules have not caught up yet.
Here is what we found: before 2020, our busiest routes were the ones serving the downtown office district. That made sense — tens of thousands of people commuted to offices every weekday.
Then remote work happened. Many of those office workers now work from home two or three days a week. They still ride the bus, but not every day — and some have stopped riding entirely.
Meanwhile, the people who have to be somewhere in person — nurses, warehouse workers, teachers, retail staff — are riding just as much as before. In some cases, more.
But our bus schedules are still based on 2019 ridership patterns. That means we are running lots of service on routes where demand has dropped, and not enough service on routes where demand has stayed strong or grown.
What the numbers show:
- Routes serving downtown offices are carrying about half the riders they did in 2019
- Routes serving hospitals, schools, and retail areas are back to about 95% of pre-pandemic ridership
- On the most recovered routes, buses are overcrowded during peak hours
What we are proposing:
We are recommending a phased adjustment to our routes, starting later this year. The goal is to match our service to where riders actually are today — not where they were five years ago.
This does not mean eliminating any routes. It means running fewer buses during off-peak hours on low-ridership downtown routes and adding service on the routes that are overcrowded.
Before any changes happen, we will hold community input sessions in every affected neighborhood. Your voice matters — these are your buses, and we want to get this right.
We want to hear from you:
How has your commute changed since 2020? Are you experiencing overcrowded buses? Or are you riding a route that seems emptier than it used to be? Send us your feedback at feedback@mcta.gov or attend one of our upcoming community sessions.
What the Blog Post Prioritized
- Accessibility: Short paragraphs, everyday language, no jargon.
- Relatability: It opened with an experience the reader had likely had — a crowded bus.
- Transparency: It explained what was happening and why, treating the public as intelligent adults.
- Community voice: It invited feedback and emphasized public input.
- Reassurance: "This does not mean eliminating any routes."
What the Blog Post Did NOT Include
- Any statistics beyond the most basic percentages
- Any mention of methodology, regression, or data sources
- Policy recommendations (just a description of the proposal)
- Technical visualizations (the few numbers it cited appeared in the text, not in charts)
Comparing the Three Documents
| Element | Technical Report | Slide Deck | Blog Post |
|---|---|---|---|
| Audience | Data scientists | Board of directors | General public |
| Length | ~15 pages + code | 10 slides | ~600 words |
| Opening | Summary of methods and key finding | Provocative framing statement | Relatable personal experience |
| Language | Technical (Pearson r, buffer zones, sensitivity analysis) | Business (cost, reallocation, ROI) | Conversational (your bus, your commute) |
| Charts | 12 detailed plots with statistical annotations | 5 simple, annotated charts | None |
| Numbers | Precise with confidence intervals | Rounded with context | Minimal, rounded |
| Recommendation | None (analysis only) | Specific and actionable | Described, with community input emphasis |
| Uncertainty | Extensive (sensitivity analyses, limitations section) | Brief caveats | Not discussed |
| Emotional content | None | Light (photographs, equity framing) | Moderate (empathy with crowded riders) |
| Call to action | "Extend this analysis to..." | "Approve the reallocation plan" | "Share your experience" |
The Lesson: Same Truth, Different Translations
All three documents conveyed the same fundamental truth: bus routes designed for 2019 commuting patterns are misaligned with 2026 ridership, and the mismatch wastes money and harms riders.
But each document translated that truth into a different language:
- The technical report spoke the language of evidence and method. Its purpose was to establish that the finding was real, reproducible, and robust.
- The slide deck spoke the language of decisions and dollars. Its purpose was to enable the board to act.
- The blog post spoke the language of experience and community. Its purpose was to explain, build trust, and invite participation.
None of these was the "right" way to communicate the finding. All three were necessary. The technical report without the slide deck would have been ignored by decision-makers. The slide deck without the technical report would have lacked a credible foundation. The blog post without either would have been an announcement without evidence.
Together, they formed a communication ecosystem: rigor underneath, decisions in the middle, and public understanding on top.
What James Learned
After the week was over, James reflected on what he had learned:
Writing for others is harder than writing for yourself. The technical report was the easiest to write — it was organized the way James naturally thought. The blog post was the hardest. Simplifying without distorting, being conversational without being condescending, and explaining without jargon required more revision than any regression analysis.
Different audiences have different trust mechanisms. The data team trusted methodology. The board trusted cost comparisons and peer examples. The public trusted transparency and the invitation to participate. Credibility is audience-specific.
Omission is not dishonesty. The blog post did not mention p-values. The slide deck did not include sensitivity analyses. This was not deception — it was appropriate simplification. The full analysis was available to anyone who wanted it. But forcing every audience to wade through methodology they did not need would have been a different kind of dishonesty — the kind that pretends communication happened when it did not.
The real skill is not analysis — it is translation. James was the same analyst across all three documents. What changed was his ability to translate between the language of data and the language of each audience. That translation skill, he realized, was at least as valuable as his ability to run a regression.
Discussion Questions
- James chose not to include confidence intervals in the blog post or the slide deck. A colleague argued that this was "dumbing down" the analysis and that uncertainty should always be communicated. Who do you agree with, and why?
- The slide deck included photographs of crowded and empty buses. Are these photographs "data"? Are they appropriate in a data-driven presentation? Could they be manipulative?
- The blog post invited community feedback. From a data science perspective, what risks does this create? (Hint: think about selection bias in who responds.) How could James address this?
- James included an equity dimension in his slide deck (Slide 6). Should data analysts raise equity issues, or should they stick to "neutral" analysis? Is neutral analysis possible when it comes to public services?
- If James had only enough time to create one document, which should he have created? Why?
- Think about a finding from your own work (school or professional). How would you communicate it differently to (a) a classmate, (b) a professor, and (c) a family member who knows nothing about data science?