Case Study 1: The Chart That Changed Public Health — John Snow's Cholera Map
Tier 1 (Verified Historical Account): This case study describes the well-documented 1854 Broad Street cholera outbreak in London and John Snow's investigation. The events, dates, locations, and conclusions described here are based on historical records. Snow's map and analysis are real, and digitized copies are available through the John Snow Archive and Research Companion (hosted at Michigan State University) and the John Snow site maintained by UCLA's Department of Epidemiology. The primary scholarly source is Steven Johnson's The Ghost Map (Riverhead Books, 2006), supplemented by Snow's own publication, On the Mode of Communication of Cholera (2nd edition, 1855).
The Setting
London, August 1854. The city is the largest in the world — home to over 2.5 million people — and it is suffocating under its own growth. The sewers, where they exist, dump directly into the Thames. Cesspools overflow into basements. The streets of the poorest neighborhoods are open gutters. The smell, by all accounts, is indescribable.
And people are dying. Not slowly, not mysteriously — violently and rapidly. Cholera has returned to London, and in the neighborhood around Broad Street in Soho, it is killing with terrifying speed. In the first three days of September 1854, 127 people living within a few blocks of each other will die. By the end of the outbreak, the death toll in this small neighborhood will reach over 600.
The medical establishment has an explanation: miasma. The prevailing theory holds that diseases like cholera are caused by "bad air" — the noxious fumes rising from rotting organic matter, stagnant water, and human waste. Miasma theory is not stupid; it correctly associates disease with unsanitary conditions. But it has the mechanism exactly wrong. The air isn't the problem. The water is.
One man suspects this. His name is John Snow, and he is about to create one of the most famous visualizations in the history of science.
The Doctor and His Hypothesis
John Snow was not a typical Victorian physician. Born in 1813 to a working-class family in York, he worked his way into medicine through an apprenticeship and years of study. By 1854, he had already made a name for himself as one of London's leading anesthesiologists — he had administered chloroform to Queen Victoria during the birth of Prince Leopold. He was respected, accomplished, and quietly convinced that the medical mainstream was wrong about cholera.
Snow had been developing his waterborne theory of cholera transmission since the previous London epidemic in 1848-49. He had published a pamphlet, On the Mode of Communication of Cholera, arguing that the disease was transmitted through contaminated water, not through the air. The pamphlet was largely ignored. The miasma theory had centuries of tradition behind it, and Snow's evidence, while suggestive, was not yet conclusive.
The Broad Street outbreak gave him the data he needed.
Gathering the Data
Snow did something that now seems obvious but was revolutionary for his time: he went door to door. In the middle of a deadly epidemic, while most people with the means to leave were fleeing the neighborhood, Snow walked into the afflicted area and began collecting data.
He recorded the address of every cholera death he could identify. He interviewed surviving family members. He asked where they got their water. He visited the local water pumps and inspected the water. He compiled a dataset — though he would not have used that word — of deaths and their spatial locations.
This is the data science lifecycle in action, more than a century before the term existed:
- Question: Is cholera transmitted through contaminated water, specifically from a particular source?
- Data collection: Door-to-door interviews, death records from the General Register Office, addresses mapped to locations.
- Cleaning: Snow had to verify addresses, correct records, and account for people who had died in hospitals rather than at home (and thus were recorded at the hospital address rather than where they were actually exposed).
- Analysis: Spatial analysis — plotting deaths on a map to reveal geographic patterns.
- Communication: The map itself, presented to local authorities as evidence for action.
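The cleaning step is the easiest one to make concrete. The sketch below is a modern illustration, not a reconstruction of Snow's working method: the records, field names (`recorded_at`, `home`), and the helper `exposure_address` are all hypothetical. It shows the idea of moving a death registered at a hospital back to the address where the person lived, then counting deaths per address to get the height of each stack of bars on the map.

```python
from collections import Counter

# Hypothetical records invented for illustration; these are NOT Snow's data.
# "recorded_at" is where the death was registered. "home" is filled in only
# when follow-up (for example, a family interview) established a different
# home address.
deaths = [
    {"recorded_at": "40 Broad Street", "home": None},
    {"recorded_at": "40 Broad Street", "home": None},
    {"recorded_at": "Middlesex Hospital", "home": "12 Poland Street"},
    {"recorded_at": "Middlesex Hospital", "home": "40 Broad Street"},
    {"recorded_at": "7 Marshall Street", "home": None},
]


def exposure_address(record):
    """Prefer the verified home address; fall back to the recorded one."""
    return record["home"] or record["recorded_at"]


# One count per address: the number of bars to stack at that building.
bars_per_address = Counter(exposure_address(r) for r in deaths)

for address, count in bars_per_address.most_common():
    print(f"{address}: {count}")
```

Without the correction, the hospital would appear as a cluster of its own and the home addresses would be undercounted, which is exactly the distortion the cleaning step guards against.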
The Map
Snow's map — which he published in the second edition of On the Mode of Communication of Cholera in 1855 — is a masterpiece of data visualization, all the more remarkable because "data visualization" was not yet a recognized discipline.
The map shows a section of London's streets. Small black bars represent individual cholera deaths, stacked along the street in front of the building where each death occurred. The map also shows the locations of the neighborhood's water pumps, marked with dots.
The pattern is unmistakable. The deaths cluster — overwhelmingly, densely, undeniably — around one pump: the Broad Street pump. As you move away from that pump, the death counts drop sharply. Blocks only a few hundred meters away have almost no deaths.
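A reader today might check the same pattern numerically by assigning each death to its nearest pump and counting. The following is a minimal sketch under stated assumptions: the coordinates are made up (a flat grid in meters), the pump labels "Pump B" and "Pump C" are placeholders, and straight-line distance stands in for walking distance along streets, which is what actually determines which pump a household would use.

```python
import math
from collections import Counter

# Made-up grid coordinates in meters, for illustration only.
# Only "Broad Street" corresponds to a pump named in the text.
pumps = {
    "Broad Street": (0, 0),
    "Pump B": (300, 50),
    "Pump C": (-250, 200),
}

# Made-up death locations: most sit close to the Broad Street pump.
death_locations = [
    (10, 5), (-20, 15), (30, -10), (5, 40), (-15, -25),
    (280, 60), (-200, 180),
]


def nearest_pump(point):
    """Return the name of the pump with the smallest straight-line distance."""
    return min(pumps, key=lambda name: math.dist(point, pumps[name]))


deaths_by_nearest_pump = Counter(nearest_pump(p) for p in death_locations)

for name, count in deaths_by_nearest_pump.most_common():
    print(f"{name}: {count}")
```

The tally is the numeric counterpart of what the eye does on the map: one pump collects most of the deaths, and the others collect very few.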
Let's analyze this map through the grammar of graphics lens from this chapter:
| Component | Snow's Map |
|---|---|
| Data | Individual cholera deaths, geocoded to street addresses |
| Aesthetics | Position (latitude/longitude of each death), with marks at each address |
| Geom | Small rectangular bars, stacked at each address (a kind of spatial bar chart) |
| Scale | Geographic — one death per bar |
| Coordinates | Geographic (a street map of Soho) |
| Faceting | None |
The aesthetic mapping is simple: each bar represents one death, and its position on the map represents where that person lived. But the simplicity is the genius. No statistical test is needed. No regression. No p-value. The spatial pattern screams the answer: something about the Broad Street pump is killing people.
The Clinching Evidence
The map alone was powerful, but Snow went further. He identified anomalies in his own data — cases that seemed to contradict his theory — and investigated them.
One anomaly: a workhouse (a large poorhouse) located very close to the Broad Street pump had remarkably few cholera cases. If proximity to the pump was the problem, the workhouse should have been devastated. Snow investigated and discovered that the workhouse had its own private water well. The residents never used the Broad Street pump.
Another anomaly: a woman living in Hampstead, miles away from Broad Street, died of cholera during the outbreak. How could she have been exposed to the Broad Street pump? Snow tracked down her family and learned that she had formerly lived in Broad Street, loved the taste of the water from the pump there, and had a large bottle of it brought to her Hampstead home by a cart that made the trip daily. Her niece, who visited her and drank the same water, also died of cholera after returning home to Islington. Neither Hampstead nor the niece's neighborhood had any other cholera cases at the time.
These anomalies — the workhouse that should have had cases but didn't, and the Hampstead woman who shouldn't have had a case but did — provided exactly the kind of evidence that a simple spatial cluster could not. They demonstrated that it was not proximity to the pump that mattered, but actual use of the pump's water.
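The distinction between proximity and use can be shown with a small cross-tabulation. The households below are invented for illustration and are not Snow's figures; the `Household` class and its fields are likewise hypothetical. The sketch compares how well each variable separates households with a cholera death from those without.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Household:
    near_pump: bool          # lives close to the Broad Street pump
    drank_pump_water: bool   # actually used water from that pump
    cholera_death: bool      # at least one cholera death in the household


# Invented counts for illustration only; NOT historical data.
households = (
    [Household(True, True, True)] * 6
    + [Household(True, True, False)] * 2
    + [Household(True, False, False)] * 4    # workhouse-like: near, own well
    + [Household(False, True, True)] * 1     # Hampstead-like: far, water delivered
    + [Household(False, False, False)] * 7
)


def death_rate(rows):
    """Share of the given households with a cholera death."""
    rows = list(rows)
    return sum(h.cholera_death for h in rows) / len(rows)


rate_near = death_rate(h for h in households if h.near_pump)
rate_far = death_rate(h for h in households if not h.near_pump)
rate_drinkers = death_rate(h for h in households if h.drank_pump_water)
rate_nondrinkers = death_rate(h for h in households if not h.drank_pump_water)

print(f"near pump: {rate_near:.1%}   far from pump: {rate_far:.1%}")
print(f"drank pump water: {rate_drinkers:.1%}   did not: {rate_nondrinkers:.1%}")
```

In this toy data, proximity is associated with deaths (50% versus 12.5%), but use of the pump's water separates the groups far more cleanly (about 78% versus 0%). The two anomalous groups are precisely the rows where proximity and use disagree, which is why they carry so much evidential weight.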
The Outcome
On September 7, 1854, Snow presented his evidence to the Board of Guardians of St James's Parish. The board agreed to remove the handle of the Broad Street pump, making it inoperable. The epidemic was already waning by this point — many residents had fled the neighborhood — but the symbolic and scientific significance of the act was enormous. Snow had demonstrated, through careful data collection and visual analysis, that cholera was waterborne.
The full acceptance of Snow's theory took years. The miasma theory did not die immediately. But Snow's map became an iconic argument — a visualization that made the evidence visible, tangible, and impossible to dismiss. It is now considered a foundational work in both epidemiology and data visualization.
What This Case Study Teaches About Visualization
Snow's cholera map illustrates several principles from this chapter:
1. Visualization is analysis, not just presentation. Snow didn't create his map after he had already figured out the answer. The map was his analytical tool. Plotting the deaths on the map revealed the spatial pattern that pointed to the pump. This is exploratory visualization at its most powerful — visualization as thinking.
2. Simple is powerful. The map uses one of the simplest possible encodings: position on a geographic coordinate system, with small marks representing individual observations. No color encoding. No fancy statistics. No three-dimensional effects. The simplicity is why it works — the pattern is immediately, viscerally obvious.
3. Anomalies are as important as patterns. Snow didn't just point to the cluster and declare victory. He investigated the exceptions — the workhouse, the Hampstead woman — and showed that they actually strengthened rather than weakened his theory. Good data visualization raises questions as well as answering them.
4. The chart served a specific audience and purpose. Snow wasn't making the map for an academic journal. He was presenting evidence to local officials who needed to make a decision: should we disable this pump? The map was an explanatory visualization with a clear persuasive goal — and it worked.
5. Data collection matters as much as data display. The map is famous, but the real heroism was in the data collection. Snow walked door to door during a deadly epidemic, interviewed grieving families, tracked down anomalies, and verified addresses. The visualization was powerful only because the data underneath it was painstakingly gathered and carefully validated.
The Legacy
Snow's map has been called "the founding document of epidemiology." It demonstrated that spatial analysis of disease could reveal transmission mechanisms. It pioneered the idea that a visualization of data could constitute a scientific argument. And it showed that data science — the combination of a clear question, careful data collection, and insightful analysis — could save lives.
Today, geographic information systems (GIS) and spatial data analysis are core tools in public health. Disease mapping is routine. Contact tracing uses the same logic Snow applied. And every time a data scientist plots data on a map and discovers a spatial pattern, they are following in the footsteps of a Victorian physician who grabbed a pencil, drew some bars on a street map, and changed the course of medicine.
The pump handle was removed on September 8, 1854. A replica of the pump — without a handle — still stands at the corner of Broadwick Street and Lexington Street in London's Soho neighborhood. It's a monument to what one chart can do.
Discussion Questions
- Snow's map was an exploratory visualization (he used it to discover the pattern) that also served as an explanatory visualization when presented to the Board of Guardians. Is it common for the same chart to serve both purposes? What, if anything, might Snow have changed about the map when presenting it to officials versus using it for his own analysis?
- The miasma theory wasn't unreasonable: disease really did correlate with bad smells, because bad smells and contaminated water often co-occurred in the same unsanitary conditions. How does this example illustrate the difference between correlation and causation? (We'll explore this distinction formally in Chapter 24.)
- Snow's investigation of anomalies (the workhouse, the Hampstead woman) was crucial to his argument. In your own project data, what would an "anomaly" look like, and how would you investigate it?
Sources
- Snow, John. On the Mode of Communication of Cholera. 2nd edition. London: John Churchill, 1855. (Original publication containing the map.)
- Johnson, Steven. The Ghost Map: The Story of London's Most Terrifying Epidemic — and How It Changed Science, Cities, and the Modern World. New York: Riverhead Books, 2006.
- Tufte, Edward R. Visual Explanations: Images and Quantities, Evidence and Narrative. Cheshire, CT: Graphics Press, 1997. (Chapter on Snow's map and the analysis of spatial evidence.)
- Brody, Howard, et al. "Map-making and myth-making in Broad Street: the London cholera epidemic, 1854." The Lancet 356, no. 9223 (2000): 64-68.
End of Case Study 1