Case Study 1: Tufte's Redesign of the New York Hospital Mortality Chart

Edward Tufte built his reputation by taking bad charts and fixing them. Nothing illustrates the declutter procedure better than a worked example from his own hand — the chart he redesigned to teach a generation of designers what "less is more" means in practice.

The Situation

In 1983, Edward Tufte published The Visual Display of Quantitative Information, the book that would define modern data visualization and introduce most of the principles that structure Part II of this textbook. One of Tufte's favorite techniques for teaching the data-ink ratio was the before/after redesign: take a real published chart, identify its chart-junk, apply the declutter procedure, and present the improved version next to the original. The contrast was the lesson. Over the course of Visual Display, Tufte redesigned dozens of charts from government reports, academic publications, and corporate annual reports, each time showing how much signal the visual noise had been hiding.

One of the most cited examples in Tufte's work is his redesign of a chart showing hospital mortality rates in New York. The original appeared in a government report on healthcare quality; the chart plotted death rates for various New York hospitals alongside expected rates, so that readers could compare actual performance to what statistical models predicted. The point of the chart was to let viewers identify hospitals that were performing better or worse than expected — a classic comparison question with clear policy implications.

The original chart, as published, was a minor disaster of chart-junk. It had heavy black borders, thick gridlines in both directions, three-dimensional shading on the bars, a dense legend in a boxed frame, tick marks crowded closely along both axes, and a background shaded in a decorative tone. The data was present (you could read the bars, find the hospitals, and compare actual to expected rates), but it took effort. The chart was correct and hard to read at the same time.

Tufte looked at the chart and saw an opportunity to teach. He created a side-by-side comparison: the original on the left, his redesigned version on the right. The redesigned version used the same data, the same chart type (bar chart with paired comparison), and the same set of hospitals. What changed was everything that was not data. The redesign applied the declutter procedure with extreme discipline: removing the figure border, deleting the top and right spines and the vertical gridlines, lightening the remaining spines and horizontal gridlines, eliminating the decorative background, replacing the legend with direct labels, and reducing the tick marks to the minimum necessary for quantitative reading. The before/after comparison has been reproduced in hundreds of visualization courses and design books in the forty years since.

The hospital mortality redesign is worth studying in detail for two reasons. First, it shows the declutter procedure applied to a realistic published chart (not a toy example). Second, it reveals exactly which categories of chart-junk were present in the original and how each category was treated by the redesign. Every design choice has a theoretical justification in the framework this chapter has been building.

The Data

The underlying data for the chart was straightforward. For each New York hospital in the sample, the report provided:

Observed mortality rate — the actual number of deaths as a percentage of relevant patient admissions over the study period
Expected mortality rate — the statistically predicted rate given the hospital's case mix, patient demographics, and severity of admissions

The question the chart was trying to answer was: which hospitals have observed mortality rates substantially different from their expected rates? A hospital performing better than expected would have a low observed rate relative to its expected rate. A hospital performing worse than expected would have a high observed rate relative to its expected rate. The comparison that mattered was the pairwise difference between two values for each hospital — and the ranking of hospitals by how far above or below their expected performance they were.
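The pairwise comparison can be made concrete in a few lines of Python. This is a minimal sketch, not the report's method: the hospital names and rates are hypothetical, and the function names excess_mortality and rank_by_excess are invented here for illustration.

```python
# Hypothetical hospitals and rates (percent); NOT data from the original report.
rates = {
    "Hospital A": {"observed": 4.1, "expected": 3.2},
    "Hospital B": {"observed": 2.4, "expected": 3.0},
    "Hospital C": {"observed": 3.5, "expected": 3.4},
    "Hospital D": {"observed": 1.9, "expected": 2.8},
}


def excess_mortality(rates):
    """Return {hospital: observed - expected}, in percentage points."""
    return {
        name: round(r["observed"] - r["expected"], 2)
        for name, r in rates.items()
    }


def rank_by_excess(rates):
    """Sort hospitals from furthest below expected to furthest above expected."""
    excess = excess_mortality(rates)
    return sorted(excess, key=excess.get)


excess = excess_mortality(rates)
for name in rank_by_excess(rates):
    # A negative value means fewer deaths than the model predicted.
    print(f"{name}: {excess[name]:+.1f} percentage points vs. expected")
```

A negative difference marks a hospital doing better than expected, a positive one a hospital doing worse. The ranking is the second half of the question: it orders hospitals by how far they sit from their own expected rate rather than by raw mortality.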

This is a comparison question, in the Chapter 5 sense. It can be answered by several chart types — paired bar charts, slope charts, dot plots with reference lines — but the original publication chose a bar chart with paired bars (observed and expected side by side for each hospital). The chart type was reasonable. The execution was the problem.

The Visualization: Original Version

The original chart, as it appeared in the government publication, had the following elements (reconstructed from Tufte's reproduction in Visual Display):

Heavy black figure border around the entire chart.
Thick black top and right spines enclosing the plotting area.
Thick black bottom and left spines (the functional axes).
Horizontal gridlines drawn in dark gray at every major tick.
Vertical gridlines between hospital groups, drawn in the same dark gray.
3D effect on the bars: each bar was drawn as a rectangular solid with a darker top face and a lighter front face, creating a simulated depth that distorted the height comparison.
Background shading: the plotting area was filled with a subtle cross-hatch pattern or colored background tone.
A legend box in the upper right of the plotting area, with a thick black border, identifying which bar was "observed" and which was "expected."
Dense tick marks: short black tick marks on both axes, spaced so closely together that they formed near-continuous lines.
All-caps bold title in a prominent serif font at the top.
Subtitle and source attribution in smaller font at the bottom, but framed in their own box with a decorative line underneath.

The data ink — the parts that actually encoded data — was limited to the bars themselves and the hospital labels on the x-axis. Everything else was non-data ink. By rough estimate, the data-ink ratio was around 0.15 to 0.20. Most of the ink on the page was structure, decoration, or framing.

The effect on the reader was predictable. The visual system had to sort through the structural noise to find the data. The 3D shading on the bars introduced a subtle distortion that made bars of identical actual height appear to differ, because the perspective effect shifted the apparent top of each bar. The heavy gridlines competed with the bars for visual weight. The legend box pulled the eye away from the data. The overall impression was of a chart that was trying to look professional and serious and, in doing so, was actively interfering with its own readability.

The Visualization: Tufte's Redesign

Tufte's redesigned version applied the declutter procedure in its most aggressive form. The result was dramatically cleaner than the original and dramatically easier to read. The design choices, decomposed by category:

Removal (Step 1 of the declutter procedure):

Figure border — deleted. No rectangle around the chart; the chart is its own container.
Top spine — deleted. The plotting area no longer needs to be visually enclosed.
Right spine — deleted. Same reason.
3D effect on the bars — deleted. Bars became flat rectangles, removing the perspective distortion and restoring the length-based encoding that bar charts depend on.
Background shading — deleted. The plotting area became plain white.
Legend box — deleted. In its place, Tufte used direct labels on a single pair of bars at the left of the chart, identifying which was "observed" and which was "expected" once. The labels served as a legend for the rest of the chart.
Vertical gridlines — deleted. For a bar chart comparing categorical groups, vertical gridlines serve no reading function (the bars themselves separate the categories).
Most tick marks — deleted. Tufte reduced the tick density dramatically, keeping only a few tick marks at meaningful intervals (for example, every 5 percentage points rather than every half point).

Lightening (Step 2):

Bottom spine — lightened. The remaining x-axis spine was drawn in a thin medium gray, present but receding.
Left spine — lightened. Same treatment for the y-axis.
Horizontal gridlines — lightened dramatically. Tufte used very pale gray gridlines at major intervals, visible enough to help the reader estimate values but faint enough to recede behind the bars.
Tick marks — lightened. The few remaining tick marks were thin and short.
Title and attribution text — lightened. The title became smaller and less prominently weighted; the source attribution became a simple line of italicized text without decorative framing.

Simplification (Step 3):

Bar colors — simplified. The original had used multiple colors for visual effect. The redesign used two tones — one for observed, one for expected — differentiated only by shade. No palette of colors, no gradients.
Font choices — simplified. One font family throughout, in a small number of weights and sizes. No all-caps except where semantically appropriate.
Axis labels — simplified. Units inline with the axis labels, no separate unit annotations.
Arrangement — regrouped. The same hospitals were kept for fair comparison, but Tufte arranged them in a subtle visual grouping: hospitals with observed mortality clearly above expected were placed together, hospitals clearly below expected were placed together, and hospitals near expected sat in the middle. This small organizational change made the finding visible without adding any ink.

The result was a chart that looked, at first glance, almost empty compared to the original. There were bars, labels, faint gridlines, a few axis marks, and a title. That was it. The data-ink ratio had moved from perhaps 0.15 in the original to perhaps 0.60 or 0.70 in the redesign, a roughly fourfold increase. Every drop of ink on the page was now either data or an essential comprehension aid. Nothing was decoration. Nothing was structure for its own sake.
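The arithmetic behind that shift is worth seeing once. The sketch below assumes a hypothetical ink budget in arbitrary units, chosen only so that the ratios match the rough estimates in the text. Nobody measured the ink on either chart, and the helper names are invented for this example.

```python
def data_ink_ratio(data_ink, non_data_ink):
    """Data-ink ratio: ink that encodes data divided by total ink."""
    total = data_ink + non_data_ink
    if total <= 0:
        raise ValueError("a chart must contain some ink")
    return data_ink / total


def non_data_ink_allowed(data_ink, target_ratio):
    """Non-data ink that may remain if data ink is fixed and the ratio must reach target_ratio."""
    return data_ink / target_ratio - data_ink


# Hypothetical budget: the bars and labels (data ink) are the same in both versions.
DATA_INK = 15
original = data_ink_ratio(DATA_INK, non_data_ink=85)  # 15 / 100 = 0.15
redesign = data_ink_ratio(DATA_INK, non_data_ink=10)  # 15 / 25  = 0.60

print(f"original ratio: {original:.2f}")
print(f"redesign ratio: {redesign:.2f}")
print(f"increase: {redesign / original:.1f}x")
print(f"non-data ink left at 0.70: {non_data_ink_allowed(DATA_INK, 0.70):.1f} units")
```

Under these assumed numbers the ratio quadruples without a single unit of data ink being added. The whole gain comes from removing 75 of the 85 units of non-data ink, which is why the removal step carries most of the weight.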

The Impact

Tufte's redesign was not a real-world intervention in the sense of the Fox News or Challenger case studies. The original chart had been published in a government report; Tufte's version appeared only in his book as a teaching example. No policy decisions were changed by the redesign, no hospitals altered their practices, and the original continued to exist in the archival record.

But the impact of the redesign — and of similar redesigns throughout Visual Display — was enormous in the field of data visualization. Tufte's before/after comparisons became the canonical teaching examples for the declutter procedure. Visualization courses in statistics, journalism, and design have used them for forty years. Design guidelines at major newspapers and data journalism teams cite Tufte's redesigns as the reference standard for clean chart design. The New York Times graphics desk, the Financial Times, and the Washington Post have all produced work that is recognizably in the Tufte lineage — sparse charts, pale gridlines, direct labeling, action titles, no decorative borders.

More importantly, the redesigns taught a generation of designers a specific mental move: look at a chart and see what can be removed. Before Tufte, the default creative instinct was to add — more colors, more labels, more decoration, more polish. After Tufte, the default creative instinct for trained designers shifted to subtracting. This shift is the main reason Chapter 6 even exists: the declutter mindset is a learned discipline, and Tufte taught it to the field.

The hospital mortality chart is one of many redesigns in the book, but it is particularly useful as a teaching example because the original was genuinely typical of government chart production at the time — not a strawman, not a cherry-picked bad example. Every element of chart-junk that appeared in the original appears in real published charts today. The redesign's moves are the same moves you will make on your own charts, in the same order, for the same reasons.

Why It Worked: A Theoretical Analysis

The Tufte redesign succeeded because each declutter move had a theoretical justification in perceptual science and design principles. Going element by element:

Removing the figure border. The figure border adds non-data ink without helping the reader understand the data. It is structural chart-junk in the chapter's taxonomy. Removing it reduces visual weight and gives the eye one less element to process. Visual attention is a limited resource (Chapter 2), and the figure border consumes a small portion of that resource for no benefit.

Removing the top and right spines. Same category as the figure border: structural chart-junk. The top and right spines enclose the plotting area but serve no reading function — the reader does not look at the top spine to read values from the y-axis, because the y-axis values are on the left spine. Deleting the top and right spines is a pure win.

Removing the 3D effect. This is the single most important change for accuracy. The 3D effect was dimensional chart-junk that introduced distortion into every comparison the reader tried to make. Chapter 4 argued that 3D effects decrease encoding accuracy because viewers cannot mentally reverse the perspective transformation. By flattening the bars, the redesign restored the pure length encoding that bar charts depend on (Cleveland and McGill, Chapter 2) and eliminated a source of false visual differences.

Removing the background shading. Decorative chart-junk. The background shading did not encode anything; it was there for aesthetic effect. Removing it let the bars stand out against a plain white background, which maximized the luminance contrast between data and non-data elements — exactly the hierarchy that the chapter's "luminance first" heuristic (from Chapter 3, applied here) would recommend.

Replacing the legend with direct labels. Redundant chart-junk. A legend in a boxed frame is non-data ink, and if direct labeling can accomplish the same identification with less visual weight, the legend is redundant. Direct labels also reduce the eye movement required to interpret the chart — the reader does not have to shift gaze between the bars and the legend box.

Lightening the gridlines and spines. The elements that survived Step 1 became visually less prominent in Step 2. This served the hierarchy principle: data should be the most prominent visual element, and all non-data comprehension aids should recede into the background. Pale gray gridlines are still readable but do not compete with the bars. Thin spines frame the plotting area without dominating it.

Simplifying the color palette. With only two bar categories (observed and expected), two colors are enough. Using more colors would have added decoration without encoding additional information. By simplifying to a minimal palette, the redesign freed color as a channel for future use — if the report later needed to highlight specific hospitals, the highlight could be a color change that would pop out against the plain two-tone background.

Grouping by performance. This was a subtle organizational change that made the finding more visible without adding any ink. Hospitals performing better than expected were clustered together; hospitals performing worse than expected were clustered together; hospitals near expectations were in the middle. The reader's eye could pick out the performance groups at a glance, even before reading individual hospital names. This is an application of Gestalt principles (proximity and similarity) to chart organization: a free improvement that costs nothing in ink.
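A grouping like this can be computed before any drawing happens. The sketch below is one possible approach under stated assumptions: the hospitals and rates are hypothetical, the 0.25-point tolerance that defines "near expected" is an arbitrary illustrative choice, and the function names are invented here.

```python
# Hypothetical (observed, expected) mortality rates in percent; for illustration only.
rates = {
    "Hospital A": (4.1, 3.2),
    "Hospital B": (2.4, 3.0),
    "Hospital C": (3.5, 3.4),
    "Hospital D": (1.9, 2.8),
    "Hospital E": (3.0, 3.1),
    "Hospital F": (5.0, 4.2),
}

GROUP_ORDER = ["better than expected", "near expected", "worse than expected"]


def performance_group(observed, expected, tolerance=0.25):
    """Classify a hospital by the gap between observed and expected mortality."""
    diff = observed - expected
    if diff > tolerance:
        return "worse than expected"   # more deaths than predicted
    if diff < -tolerance:
        return "better than expected"  # fewer deaths than predicted
    return "near expected"


def grouped_order(rates, tolerance=0.25):
    """Order hospitals so each performance group is contiguous on the axis."""
    def sort_key(name):
        observed, expected = rates[name]
        group = performance_group(observed, expected, tolerance)
        # Sort by group first, then by the size of the gap within the group.
        return (GROUP_ORDER.index(group), observed - expected)

    return sorted(rates, key=sort_key)


for name in grouped_order(rates):
    print(name, "->", performance_group(*rates[name]))
```

The output order becomes the category order on the x-axis. No ink is added. Only the sequence of bars changes, so neighbors on the axis are also neighbors in performance.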

Every move had a reason. Every reason tied back to principles from Chapters 2 through 5. The redesign was not an exercise in minimalism for its own sake — it was the systematic application of a framework.

Complications and Limits

Tufte's redesigns are famous, but they are not universally praised, and it is worth acknowledging the limits.

Some critics argue the redesigns are too austere. The Bateman et al. 2010 paper discussed in Section 6.5 provides empirical evidence that thematic embellishment can improve memorability of charts. Tufte's redesigns tend to strip every embellishment, which may hurt memorability in contexts where readers need to remember the finding weeks later. For a government report on hospital mortality — where the reader may read the report once and then recall the main finding in a meeting — a little memorable embellishment might help. The redesign optimized for reading accuracy and reading speed, which are not the only metrics.

Some of Tufte's choices are matters of taste, not law. The decision to delete the legend box entirely and use direct labeling is defensible but not the only choice. A lighter, smaller legend in a corner — without the heavy border — could have achieved similar decluttering without the slightly awkward inline labels. Different designers would make different choices within the same framework.

The redesign assumes a specific reading context. Tufte's chart is optimized for a reader who has time to study it, the kind of reader who would read a book like Visual Display. For a reader who will glance at the chart for five seconds in a policy memo, some of the decluttering might have gone too far (for example, the minimal title might not tell the skimmer what the chart is about). Context matters. The same redesign might be too minimal for some uses and not minimal enough for others.

The redesign did not examine the data behind the chart. Tufte focused on the visual presentation and did not substantially engage with whether the data itself was sound: whether the expected rates were correctly calculated, whether the hospital sample was representative, whether mortality is the right quality metric, or whether confounding factors were adequately controlled. These are statistical and policy questions, not chart-design questions. A redesign can make a chart more readable without making the underlying analysis more rigorous. Clean presentation does not substitute for good statistics.

Lessons for Modern Practice

The Tufte redesign is forty years old, but the lessons apply directly to chart design today.

Study real charts, not toy examples. The hospital mortality chart was a real published chart, and Tufte's redesign applied the declutter procedure to a real case. When you want to practice decluttering, find real charts (in publications, reports, or old presentations) rather than textbook demonstrations. The real-world charts have all the messiness of actual production, and the decluttering moves you make on them are the moves you will need in your own work.

The order of moves matters. Tufte's redesign follows the remove-lighten-simplify sequence. The heaviest lifting is in the removal step: deleting the 3D effect, the figure border, the top and right spines, the vertical gridlines, the background, and the legend box. The lightening step is smaller in its individual moves but important for the overall hierarchy. The simplification step polishes what remains. This is the same order the chapter recommended, and it is the order that minimizes wasted work.
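The same sequence can be sketched on a paired bar chart in matplotlib. This is an illustration under stated assumptions, not a reconstruction of Tufte's figure: the hospitals and rates are hypothetical, the gray levels and line widths are arbitrary starting points, and the function name decluttered_paired_bars is invented here.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt

# Hypothetical hospitals and mortality rates (percent); for illustration only.
hospitals = ["A", "B", "C", "D"]
observed = [4.1, 2.4, 3.5, 1.9]
expected = [3.2, 3.0, 3.4, 2.8]


def decluttered_paired_bars(hospitals, observed, expected):
    """Draw flat paired bars, then apply remove -> lighten -> simplify."""
    fig, ax = plt.subplots(figsize=(6, 3.5))
    positions = list(range(len(hospitals)))
    width = 0.38
    # Two tones differentiated only by shade; flat bars, no 3D effect.
    ax.bar([p - width / 2 for p in positions], observed, width, color="0.25")
    ax.bar([p + width / 2 for p in positions], expected, width, color="0.65")

    # Step 1, remove: top/right spines, x tick marks, background tone, legend box.
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.tick_params(axis="x", length=0)
    fig.patch.set_facecolor("white")
    ax.set_facecolor("white")
    # Direct labels on the first pair stand in for the legend.
    for offset, height, text in [(-width / 2, observed[0], "observed"),
                                 (width / 2, expected[0], "expected")]:
        ax.annotate(text, xy=(positions[0] + offset, height), xytext=(0, 3),
                    textcoords="offset points", ha="center", va="bottom",
                    fontsize=8)

    # Step 2, lighten: remaining spines, horizontal gridlines, y tick marks.
    for side in ("bottom", "left"):
        ax.spines[side].set_color("0.6")
        ax.spines[side].set_linewidth(0.6)
    ax.yaxis.grid(True, color="0.9", linewidth=0.6)
    ax.set_axisbelow(True)  # keep gridlines behind the bars
    ax.tick_params(axis="y", length=3, width=0.6, color="0.6")

    # Step 3, simplify: plain category labels, units inline with the axis label.
    ax.set_xticks(positions)
    ax.set_xticklabels(hospitals)
    ax.set_ylabel("Mortality rate (%)")
    return fig, ax


fig, ax = decluttered_paired_bars(hospitals, observed, expected)
```

Calling fig.savefig with a filename writes the result to disk. The comments mirror the three steps so that each line can be traced back to a reason, which is the habit the procedure is meant to build.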

Subtle changes can have big effects. Some of Tufte's moves — like the grouping of hospitals by performance — added almost no ink but made the finding dramatically more visible. When decluttering, look for these "free improvements": changes that cost nothing in visual weight but enable the reader to see something they could not see before. Reordering, re-sorting, and regrouping are all candidates for free improvements.

Direct labeling is often better than a legend. If you have two or three categories, direct labeling on the first instance of each usually works and lets you delete the legend entirely. The viewer's eye does not have to move to a separate legend box, and the chart becomes cleaner. This is a specific technique worth adopting by default.

Colors should be used sparingly. The original used multiple colors; the redesign used two tones differentiated by shade. Most charts need fewer colors than they have. When in doubt, use fewer. Reserve color as a tool for highlighting specific things — a selected series, an annotated point, a threshold — rather than distributing color across all categories by default.

Every design choice needs a reason. The Tufte redesign was not "make it look nicer." Every move had a theoretical justification in perceptual science and chart design principles. When you declutter your own charts, be able to state the reason for each move. If you cannot articulate why you are lightening a particular gridline or deleting a particular element, either the move is not justified or you do not yet understand why it is. Working through the justification builds the mental framework you will use on future charts.

Discussion Questions

On the limits of minimalism. Tufte's redesigns have been praised for forty years but also criticized as too austere. Where do you think the limit is? What kinds of charts benefit most from aggressive decluttering, and what kinds benefit from some retained embellishment? How do you decide?
On the role of 3D effects. The original chart used 3D bars, which the redesign flattened. 3D effects are still common in business intelligence tools and corporate slide templates. Why do you think 3D effects persist despite the evidence against them? What would it take to eliminate them from everyday practice?
On the "free improvement" of reordering. Tufte's grouping of hospitals by performance added no ink but made the finding visible. Think about a chart you have made. Could you reorder the categories (alphabetical, by value, by some other meaningful grouping) to make the finding more visible without changing the chart type or adding ink?
On direct labeling vs. legends. The redesign eliminated the legend box in favor of direct labeling. When is this the right move, and when does a legend still make sense? How many categories is too many for direct labeling?
On the difference between clean design and good analysis. Tufte's redesign made the chart more readable without making the underlying analysis more rigorous. Is there a risk that clean-looking charts give a false impression of analytical rigor? How should you, as a chart maker, balance the craft of presentation with the discipline of analysis?
On your own default charts. Think about the default charts your plotting tool produces. How many of the chart-junk categories are represented in those defaults? What would be the most valuable change to your personal defaults (rcParams, theme, style file) to make every chart you produce closer to the Tufte redesign by default?

Tufte's hospital mortality redesign is a single example of a general practice. The general practice — look at a chart, identify chart-junk, apply the declutter procedure, produce a cleaner version — is the craft this chapter has been teaching. Tufte was exceptionally disciplined about it, which is why his redesigns became canonical. The discipline is available to you. The tools are in this chapter. The only thing left is practice.