Exercises — Chapter 27: Writing About Data

DataField.Dev

Exercises — Chapter 27: Writing About Data

Writing is learned by writing. Most of these ask you to produce or revise real text. Where a task is open-ended, a self-assessment rubric stands in for an answer key. Selected solutions: appendices/answers-to-selected.md.

Part A — Analyze This ⭐

Read each piece of data writing and identify what works and what's broken. Name the specific principle.

A1. A memo opens: "To assess our Q3 retention performance, I queried the subscriptions table, joined it against the events log, and aggregated cancellations by signup cohort using a 90-day window." What has the writer led with, who is this opening right for, and who is it wrong for?

A2. A finding states: "Page-load time on the checkout page is 3.2 seconds." Which level of the observation–interpretation–recommendation ladder is this, and what's missing?

A3. A figure caption reads: "Figure 4. Conversion rate by traffic source, Q1–Q3." Diagnose it against Chapter 9's caption levels, and say what a scanner who reads only captions takes away.

A4. A dashboard tile shows: Churn: 6.1%—nothing else. The company's target is under 4%. What is a viewer scanning for two seconds unable to determine, and why?

A5. An executive summary ends its first paragraph: "This report presents our key findings and recommendations below." What's wrong with ending the summary's opening paragraph this way?

A6. A recommendation reads: "Because mobile users abandon carts more often, we should rebuild the entire mobile app." The analysis measured abandonment rates but did not investigate why mobile users abandon. What's the flaw, and how would you reword it honestly?

A7. Two subject lines for the same churn analysis: (a) "Q3 Churn Analysis" and (b) "Recommend moving retention budget into onboarding." Which serves a scanning VP better, and what does each one promise the reader?

A8. A memo reports: "The intervention produced a statistically significant lift in retention (p = 0.012)." For a non-technical decision-maker, what's wrong with this as the headline, and what should lead instead?

Discussion notes for Part A

- **A1:** Led with **method** (the queries and joins). Right for an analyst auditing the work; wrong for the decision-maker, who needs the finding first. The opening spends the most valuable sentence on apparatus. - **A2:** **Level 1 (observation).** Missing the "so what?"—is 3.2s good or bad, what does it cost, what should we do? (e.g., "3.2s is well above the 2s threshold where conversion drops; recommend we…") - **A3:** A **Level-1 label**—names the topic, not the finding. A caption-only scanner takes away a topic ("there's a chart about conversion by source") and none of the insight. - **A4:** Whether 6.1% is **good or bad**. No comparison (vs. target, vs. last period, vs. trend). A bare number is an observation; the viewer can't interpret it without context the tile fails to carry. - **A5:** It ends with a *promise of findings* instead of the findings. A reader who reads only that paragraph learns a study happened and nothing about what it found. The summary must *be* the findings, not preview them ([Chapter 20](../../part-04-professional-workplace-writing/chapter-20-proposals-business-cases/index.md)). - **A6:** The recommendation outruns the evidence—the data shows *that* mobile abandons more, not *why*, so "rebuild the entire app" is a hunch in a recommendation's clothing. Honest version: "Mobile abandonment is our largest leak; the drop-off concentrates at checkout, so the likely fix is the checkout flow—recommend we confirm with session recordings before committing to a rebuild." - **A7:** (b) serves the scanning VP—it states the recommendation, so she knows the decision in the subject line. (a) promises a topic and makes her open the memo to learn whether it matters. - **A8:** A p-value is the wrong headline for a non-technical reader—it states statistical significance, not consequence. Lead with the *effect and its meaning*: "Retention rose X points (from Y% to Z%)—worth roughly $N—strong enough to roll out." The p-value belongs in a "how confident" line.

Part B — Revise This ⭐⭐

Rewrite each weak passage. Give your reasoning in one line.

B1. Reorder this finding to lead with the consequence, in the reader's units:

"After applying a logistic regression to the trailing-twelve-month transaction data and controlling for seasonality, we observed that the coefficient on the 'mobile' variable was positive and significant, indicating elevated abandonment among mobile sessions."

B2. Push this observation to Level 3 (recommendation). Invent a plausible action and quantify the stakes:

"Email open rates have declined from 24% to 16% over the past six months."

B3. Turn this method-first executive-summary opening into an insight-first one:

"This analysis was conducted using six weeks of A/B test data comparing the new and old onboarding flows across a randomized sample of new users, measuring 30-day retention as the primary outcome and using a two-proportion z-test for significance." (The result: the new flow lifted 30-day retention from 58% to 70%.)

B4. Rewrite these four topic-style dashboard titles as question/takeaway titles:

"Revenue" · "Active Users" · "Support Tickets" · "Server Errors"

B5. Add the missing context to each bare dashboard label so a viewer can interpret it at a glance (invent reasonable targets/comparisons):

"Conversion: 2.8%" · "Avg. order value: $54" · "Uptime: 99.2%"

B6. This caption labels; rewrite it to interpret (Level 3). The chart shows weekly active users flat for three months after a year of steady growth:

"Figure 1. Weekly active users, trailing 15 months."

B7. Trim the method out of the lead and demote it. Rewrite so the finding comes first and the method becomes a single "how confident" sentence:

"We collected 18 months of churn data across all account tiers, built a survival model with time-varying covariates, validated it with a concordance index of 0.74, and found that accounts without an assigned customer-success manager churned at roughly double the rate of those with one."

B8. This recommendation overclaims. Rewrite it to flag the gap between what the data shows and what it can't:

"Customers who use our mobile app have higher lifetime value, so we should push every customer to install the app."

Self-check for Part B

For each: Did you (1) lead with the finding or recommendation in the reader's units, (2) demote or remove the method, (3) reach Level 3 where an action was warranted, (4) keep the claim honest about what the data supports? B8 especially: usage may *correlate* with high LTV without *causing* it (high-value customers may simply be more engaged), so the honest version flags causation as unconfirmed before recommending a push.

Part C — Write This ⭐⭐–⭐⭐⭐

Produce the document the scenario asks for.

C1. (Recommendation-first memo.) You analyzed support tickets and found that 58% concern one feature—password reset—and that resolving them takes an average of 12 minutes each, consuming roughly 30 hours of agent time per week. Write a one-paragraph, recommendation-first memo to the Head of Support. Include a subject line, the recommendation with a next step, the finding as support, and a one-line "how confident."

C2. (Interpretive caption.) You have a line chart showing API error rates flat near 0.2% until a deploy on the 14th, after which they jump to 3.1% and stay elevated. Write a Level-3 interpretive caption and describe one annotation you'd add to the chart itself (with its alt-text).

C3. (Dashboard text.) Design the text for a single dashboard tile reporting "monthly recurring revenue (MRR)." Write the title, the labeled value (with comparison), and a one-line tooltip. Assume MRR is $420K, up 3% from last month, with a target of $450K by quarter-end.

C4. (Executive summary for data.) You ran an analysis of customer-acquisition cost (CAC) by channel and found that paid social costs $180 per customer versus $40 for referrals, and that referral customers also retain longer. Write a four-to-five-sentence executive summary that leads with the insight, states a recommendation, and quantifies the stakes (invent reasonable volumes).

C5. (The "so what?" chain.) Take this raw finding and write three sentences, one at each level (observation, interpretation, recommendation): "Customers in the enterprise tier file 4× more support tickets per account than self-serve customers, but churn at one-third the rate."

Rubrics for Part C

- **C1:** Recommendation and next step in the first two lines? Finding in the reader's units (hours saved, not "ticket category distribution")? Method demoted to one line? Subject line states the ask or finding, not "ticket analysis"? - **C2:** Caption states what the chart *means* (the deploy broke something), gives the decisive comparison (0.2% → 3.1%), and implies an action (investigate/roll back). Annotation sits at the jump on the 14th with a few words; alt-text describes the pattern for a non-visual reader. - **C3:** Title states a takeaway or question, not "MRR"; value carries the comparison ("▲ 3% vs. last month · target $450K"); tooltip explains or contextualizes ("…on pace to reach 93% of target; gap driven by slower enterprise renewals"), not just repeats the number. - **C4:** Opens with the insight (referrals cost less *and* retain better—paid social is overpriced acquisition), states a recommendation (shift budget toward referrals), quantifies (e.g., "reallocating $X would acquire N more customers at equal spend"), names confidence, and points to (not recites) method. - **C5:** Level 1: states the ticket and churn numbers. Level 2: interprets ("enterprise customers are high-touch but high-loyalty—the support cost buys retention"). Level 3: recommends ("the support load is an investment, not a cost to cut—recommend we protect enterprise support staffing rather than trim it").

Part D — Synthesis & Critical Thinking ⭐⭐⭐

D1. (Rewrite this methodology-first memo as recommendation-first.) Here is a full memo. Diagnose its specific failures, then rewrite it as a recommendation-first memo to a non-technical VP of Product. Keep the finding intact; demote the method; supply the missing recommendation.

To: VP Product Subject: Feature-usage analysis

I pulled product analytics events for the last quarter (about 2.4M events across 18,000 active accounts) and ran a clustering analysis to segment users by feature-usage patterns. After standardizing the feature-frequency vectors and applying k-means (k selected via the elbow method at k=4), I identified four distinct usage segments. The largest segment (41% of accounts) uses only the core reporting feature and never touches the collaboration tools. This segment also has a 90-day retention rate of 52%, versus 81% for accounts that use at least one collaboration feature. The correlation between collaboration-feature adoption and retention is strong (and holds after controlling for account size and tier). Charts attached. Let me know your thoughts.

D2. (Apply the "so what?" test.) Below are four findings. For each, state which level it stops at, then push it to Level 3 with a supportable recommendation. Where the finding doesn't support a confident action, say so and recommend the next investigative step instead. - (a) "Accounts that contact support in their first week renew at higher rates than those that don't." - (b) "Our Tuesday email sends get 30% higher open rates than other days." - (c) "The new pricing page has a 12% lower conversion rate than the old one." - (d) "Users who connect a second integration have 3× the lifetime value of users who connect only one."

D3. (Translate for three audiences.) Take one finding—"Customers who reach first value within 7 days churn at 3%; those who take longer than 21 days churn at 22%"—and write the lead sentence three ways: (i) for the analytics peer who'll audit your model, (ii) for the VP who decides budget, (iii) for the general public reading a blog post. Note what leads in each and why (ties Chapter 2 and previews Chapter 28).

D4. (Find the flaw.) A colleague's dashboard has a tile permanently titled "Customer health: strong." The underlying metric is a composite health score that updates hourly. Explain why this title is a problem and propose two fixes (one using the title text, one using a status indicator).

D5. (The honest caveat.) You're asked to recommend, from a single quarter of data, whether to double down on a new acquisition channel that's showing strong early numbers. The sample is small and the channel is only six weeks old. Write the recommendation and the caveat so the reader can act without being misled about your confidence.

Guidance for Part D

- **D1:** Failures: method-first ordering (clustering, k-means, elbow method lead); jargon that subtracts credibility for this reader; the finding (52% vs. 81% retention) buried in paragraph four; **no recommendation**; "let me know your thoughts" hands the VP the interpretive step. The rewrite should open with something like: *"Recommend we drive collaboration-feature adoption among single-feature accounts—it's our biggest retention lever. Why: 41% of accounts use only core reporting and retain at 52%, versus 81% for accounts using any collaboration feature—a 29-point gap that holds after accounting for size and tier. I'd like to scope an in-product nudge to get reporting-only accounts into one collaboration feature. Method below."* Note the recommendation must hedge causation slightly (adoption *correlates* with retention; a nudge experiment would confirm it causes retention). - **D2:** (a) Likely **correlation, not cause**—early support contact may signal engaged users, not that support *causes* retention; recommend testing a proactive-outreach experiment rather than asserting causation. (b) Level 1; Level 3: shift more sends to Tuesday and test, quantify the open-rate gain. (c) Level 1, and it's a *problem*; Level 3: investigate what changed on the new page (the data shows the drop, not the cause), recommend rolling back or A/B testing fixes. (d) Strong, actionable: recommend prompting single-integration users toward a second integration—but note adoption may correlate with engagement, so an experiment confirms causation. - **D3:** (i) leads with method/mechanism ("TTFV is the dominant churn predictor; partial-dependence steepens past day 14"); (ii) leads with the consequence and recommendation ("we lose 7× as many customers who don't get value in week one—recommend moving budget upstream"); (iii) leads with a hook/analogy ("customers quit in the first week for the same reason gym members do—they never got to the good part"). Same fact, three leads, ordered by what each reader does with it. - **D4:** The title is a *fixed claim* on *changing data*—it becomes false the moment health drops, and nobody will notice because the words don't move. Fix 1: retitle as a question ("Customer health: on track?") or strip the claim ("Customer health score"). Fix 2: pair the score with a red/amber/green status dot that updates with the data, so the *indicator* carries the interpretation truthfully. - **D5:** Recommendation can be directional, caveat must be explicit: *"Early signs are strong enough to justify a measured increase in spend—recommend we double the budget for one more quarter and re-evaluate. Caveat: this is six weeks of data on a new channel, so the numbers are promising but not yet stable; I'd treat this as a funded experiment, not a permanent reallocation, until we have a full quarter."*

Part M — Mixed Practice (Interleaved) ⭐⭐–⭐⭐⭐

These mix this chapter with earlier ones, so you must choose the right approach.

M1. (Ch 27 + Ch 2.) The same churn finding goes to two readers this week: the analytics team lead (who will reuse your model) and the CFO (who controls budget). Decide for each whether to lead with finding or recommendation, what to do with the method, and write the opening sentence for each. Then state, in one line, the Chapter 2 question that drove both choices.

M2. (Ch 27 + Ch 9.) You have a 12-row table of monthly churn rates that a stakeholder finds hard to read. Decide—figure, table, or prose—based on the reader's task (they need to see whether churn is trending, not look up exact months). Then write the interpretive caption for whichever you chose.

M3. (Ch 27 + Ch 20.) Your data memo's recommendation is "invest $200K in an onboarding redesign." A reader from Chapter 20's world (a budget committee) is involved. What does the data memo need to borrow from a business case to get the spend approved, and what one number from Chapter 20 must appear?

M4. (Ch 27 + Ch 4.) A stakeholder will read your two-page analysis on a phone, scanning. Using Chapter 4's inverted pyramid plus this chapter's recommendation-first rule, sketch the order of the document's first half-page (what goes in the first three lines, what goes next).

M5. (Ch 27 + Ch 12.) You've drafted a recommendation-first memo. Apply Chapter 12's editing hierarchy: which level (content → structure → paragraph → sentence → word → proofread) do you check first to confirm the memo leads with the recommendation, and why is it wrong to start by fixing word choice?

M6. (Ch 27 + Ch 7.) Your "how confident" line currently reads: "The results unambiguously prove that onboarding causes churn." Using Chapter 7's hedging-vs-certainty calibration, rewrite it to match what a correlational analysis can actually claim.

Notes for Part M

- **M1:** Team lead → **findings-first**, full method (they audit and reuse). CFO → **recommendation-first**, method as one "how confident" line, stakes in dollars. The [Chapter 2](../../part-01-writing-is-thinking/chapter-02-audience/index.md) question: *what does this reader have to produce—an informed judgment, or a decision?* - **M2:** A **figure** (line chart)—the task is seeing a trend, which is a shape, not a lookup. Caption at Level 3: "Churn has climbed steadily all year, from 2.1% to 4.0% (Figure X)—nearly doubling—so the deterioration is a trend, not a one-month blip." - **M3:** Borrow the **cost-of-inaction** and **ROI/payback**—the memo must show what the $200K returns and what *not* acting costs (the churned revenue). The number that must appear: a **return figure** (e.g., recoverable ARR) against the $200K, so the committee sees the investment pays back. - **M4:** First three lines: the recommendation + the one decisive number (inverted-pyramid "conclusion first"). Next: the finding as support, then "how confident," then "details below." Headings let the scanner navigate; the method lives past the fold. - **M5:** Check **structure first**—does the recommendation lead? Word-level fixes are wasted if the whole document is in the wrong order; [Chapter 12](../../part-02-building-blocks/chapter-12-editing-and-revision/index.md)'s rule is top-down (never proofread before you've fixed the order). Polishing sentences in a method-first memo is rearranging deck chairs. - **M6:** Correlational data can't claim "proof" or "causes." Rewrite: "The pattern is strong and consistent and holds after controlling for account size, so onboarding speed is very likely a real driver—though to confirm it *causes* churn we'd run an experiment." ([Chapter 7](../../part-02-building-blocks/chapter-07-word-choice-tone-voice/index.md): "proves" → "strongly suggests"; match the claim to the evidence.)

Part E — Extension ⭐⭐⭐⭐ (optional)

E1. Find a real public dashboard (a government open-data dashboard, a status page, a published analytics report). Audit its text: list three titles or labels that name topics rather than deliver takeaways, and rewrite each. Then find one number with no comparison and add the context that would let a viewer interpret it.

E2. Take a finding from your own work or study and write it as a complete data-analysis memo in both structures—findings-first and recommendation-first. Then write two sentences arguing which is right for your specific reader, using Chapter 2's purpose dial. The point is to feel the difference, not to declare one universally better.

E3. Write a short "anti-pattern gallery": collect three real (anonymized) examples of data writing that stopped at Level 1 (an observation with no "so what?"). For each, write the Level-3 version that should have been sent. Use only material you have permission to share.

Selected solutions and rubrics: appendices/answers-to-selected.md. For open-ended tasks, the self-assessment notes above are your rubric.