
Learning Objectives

  • Define the streetlight effect (observational bias) and recount the classic parable that gives it its name, identifying the structural error at its core
  • Analyze how the streetlight effect shapes research agendas in psychology (the WEIRD problem), medicine (neglected tropical diseases), archaeology (excavation bias), and economics (GDP as proxy for wellbeing)
  • Evaluate the consequences of the McNamara Fallacy -- measuring what is easy to count rather than what counts -- across military strategy, education, policing, and organizational management
  • Explain the data availability bias in big data and data science, recognizing that algorithmic conclusions are bounded by the representativeness of the training data
  • Identify countermeasures for the streetlight effect -- dark data awareness, triangulation, mixed methods, and the deliberate study of the hard-to-measure -- and evaluate their effectiveness
  • Apply the threshold concept -- Measurement Creates Its Own Reality -- to recognize that the choice of what to measure shapes what counts as knowledge, what receives attention, and ultimately what gets done

Chapter 35: The Streetlight Effect -- How Every Field Searches Where the Light Is Good

Research Methodology, Policing, Data Science, Archaeology, Medicine, Economics

"Not everything that can be counted counts, and not everything that counts can be counted." -- Often attributed to Albert Einstein, though likely originating with sociologist William Bruce Cameron, 1963


35.1 The Joke That Explains Everything

A police officer finds a drunk man crawling on his hands and knees under a streetlight. "What are you doing?" asks the officer. "Looking for my keys," the drunk replies. "Where did you lose them?" "Over there, in the park." "Then why are you looking here?" "Because the light is better here."

The joke is old -- versions of it appear in Sufi teaching stories attributed to Nasruddin Hodja as early as the thirteenth century, and it circulates in virtually every culture with a tradition of wisdom humor. The reason it persists is not that it is funny, though it is. The reason it persists is that it describes, with painful accuracy, a pattern so pervasive in human inquiry that it shapes the structure of knowledge itself.

The drunk's error seems absurd. No rational person would search where the keys are not, simply because the searching is easier there. And yet -- this chapter will argue -- virtually every field of human inquiry commits precisely this error, systematically and repeatedly. Psychologists study Western undergraduates because they are convenient, not because they are representative of humanity. Economists measure GDP because it is quantifiable, not because it captures wellbeing. Police patrol where the data is, not where the crime is. Archaeologists excavate where the ground is accessible, not where the settlements were. Medical researchers study diseases that attract funding, not diseases that kill the most people. Data scientists analyze what is in the dataset, not what should be in it.

The pattern has a name: the streetlight effect, also known as the drunkard's search. And its formal structure is this: when searching for something, humans and institutions systematically bias their search toward areas where observation is easy, data is available, or methods are well-developed -- regardless of whether those areas are likely to contain what they are looking for. The bias is not random. It is structural. It operates at every level of inquiry, from the individual researcher choosing a topic to entire civilizations deciding what counts as knowledge.

This chapter traces the streetlight effect across six domains: research methodology, policing, data science, archaeology, medicine, and economics. In each domain, the same pattern appears: the light of available data, established methods, or institutional convenience illuminates certain questions and leaves others in darkness. The illuminated questions get studied. The dark questions do not. And over time, the accumulated body of knowledge -- the sum total of what "we know" -- is shaped not by what is important but by what is visible.

The deeper pattern, the one that connects all six domains, is this: methodological convenience masquerades as methodological rigor. We study what we can study and quietly pretend it is what we should study. The streetlight does not just reveal. It defines.

Fast Track: The streetlight effect is a universal pattern of observational bias -- searching where it is easy to look rather than where the answer is. If you already grasp this core idea, skip to Section 35.5 (Data Science) for the contemporary big data version, then read Section 35.8 (The Deeper Pattern) for the formal structure, Section 35.9 (Countermeasures) for practical remedies, and Section 35.10 for the threshold concept synthesis. The threshold concept is Measurement Creates Its Own Reality: by choosing what to measure, we do not just describe reality -- we shape what counts as knowledge, what gets attention, and ultimately what gets done.

Deep Dive: The full chapter develops each domain's streetlight effect in concrete detail, extracts the shared deep structure, connects it to signal and noise (Ch. 6), the McNamara Fallacy (Ch. 15), and survivorship bias (Ch. 37), and examines countermeasures that have proven effective in specific fields. Read everything, including both case studies. Section 35.8 on the deeper pattern is where the chapter's most ambitious synthesis occurs.


35.2 Research Methodology -- The McNamara Fallacy and the WEIRD Problem

Counting What Doesn't Count

In the early 1960s, Robert McNamara -- the former Ford Motor Company president whom John F. Kennedy appointed as Secretary of Defense -- brought a revolution in management thinking to the Pentagon. McNamara was a quantitative thinker, a devotee of systems analysis, operations research, and statistical control. He believed that if you could measure something, you could manage it. And if you could not measure it, it probably did not matter.

Applied to the Vietnam War, this philosophy produced the most famous streetlight effect in modern military history. McNamara and his team of analysts -- the "Whiz Kids" -- needed a metric for whether the war was being won. The problem was that the war in Vietnam did not lend itself to the kind of metrics that worked in manufacturing or even in conventional warfare. There was no front line advancing or retreating. There was no territory being permanently captured. There was no clear division between combatants and civilians.

What could be measured was bodies. Specifically, enemy bodies. The "body count" became the primary metric of success in Vietnam -- the number of enemy combatants killed per engagement, per week, per month, per quarter. The metric was quantifiable, aggregable, and reportable. It could be graphed. It could be compared across units, commanders, time periods. It satisfied every requirement of rigorous data analysis -- except the one that mattered: it did not measure whether the war was being won.

The body count created perverse incentives throughout the military chain of command. Officers inflated their numbers because promotions depended on them. Civilians were reclassified as combatants after being killed. Units prioritized engagements that would produce high body counts over engagements that would achieve strategic objectives. The metric became the mission. And because the metric kept going up -- more bodies every quarter -- the analysts in Washington could report, with genuine statistical confidence, that the war was going well. It was not going well. The Tet Offensive of 1968 demonstrated what the body count had obscured: the enemy's capacity and will to fight were undiminished.

The management philosopher Charles Handy later described this pattern as the McNamara Fallacy, and it has four steps:

  1. Measure whatever can be easily measured.
  2. Disregard what cannot be easily measured, or give it an arbitrary quantitative value.
  3. Presume that what cannot be easily measured is not important.
  4. Presume that what cannot be easily measured does not exist.

This is the streetlight effect formalized as a decision-making methodology. And if it seems like a relic of 1960s technocratic hubris, consider how many modern institutions operate by exactly the same logic. Schools measure standardized test scores because they are quantifiable, then design curricula to maximize those scores, then declare that the scores represent educational quality. Hospitals measure readmission rates, patient throughput, and procedure counts, then optimize for those metrics, then treat the metrics as proxies for patient health. Companies measure quarterly earnings, stock price, and market share, then manage to those numbers, then confuse the numbers with actual value creation. The connection to Chapter 15's discussion of how metrics become targets -- and how targets corrupt the behavior they were designed to measure -- should be immediately apparent. Goodhart's Law (when a measure becomes a target, it ceases to be a good measure) is the McNamara Fallacy applied to governance.

Connection to Chapter 15 (Goodhart's Law): The McNamara Fallacy is the streetlight effect applied to management. Goodhart's Law is the McNamara Fallacy applied to incentive design. The streetlight effect is the deepest version: the systematic tendency to let what is measurable define what is important. The three concepts form a nested hierarchy of the same structural error, operating at different scales.
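To see the four steps in motion, here is a minimal simulation sketch. It is not a model of any real organization: the payoff numbers, the two-activity setup, and the ten-percent reallocation rule are invented purely to illustrate how a proxy metric can rise while the thing it was supposed to track falls.

```python
# A minimal, hypothetical sketch of the McNamara Fallacy in motion: an
# organization splits effort between a measured activity and an unmeasured one,
# and each period shifts effort toward whatever raises the reported metric.
# All numbers are illustrative assumptions, not data from any real case.

def reported_metric(measured_effort: float) -> float:
    """The metric only 'sees' the measured activity (body counts, test scores)."""
    return 10.0 * measured_effort

def true_value(measured_effort: float, unmeasured_effort: float) -> float:
    """The real objective depends mostly on the activity the metric ignores."""
    return 2.0 * measured_effort + 8.0 * unmeasured_effort

measured, unmeasured = 0.5, 0.5          # effort starts evenly split
for period in range(1, 6):
    shift = 0.1 * unmeasured             # incentives pull effort toward the metric
    measured, unmeasured = measured + shift, unmeasured - shift
    print(f"period {period}: reported={reported_metric(measured):5.2f}  "
          f"true value={true_value(measured, unmeasured):4.2f}")

# The reported metric rises every period while the true value falls:
# the measure has become the mission.
```

Nothing in the printed report signals the divergence, because the unmeasured activity never appears in it -- which is exactly the fallacy's fourth step.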

The WEIRD Problem

If the McNamara Fallacy describes the streetlight effect in military strategy, the WEIRD problem describes it in the behavioral sciences. In 2010, psychologists Joseph Henrich, Steven Heine, and Ara Norenzayan published a paper with a title that sent a tremor through their discipline: "The Weirdest People in the World?" Their argument was devastating in its simplicity.

The vast majority of published research in psychology -- the studies from which textbooks derive their claims about human nature, cognition, emotion, perception, and social behavior -- is based on a vanishingly narrow sample of humanity. The subjects are overwhelmingly Western, Educated, Industrialized, Rich, and Democratic -- WEIRD. More specifically, they are American college undergraduates, who participate in studies for course credit. WEIRD societies constitute roughly twelve percent of the world's population, yet they supply the overwhelming majority of subjects in published psychological research.

The problem is not merely that the sample is narrow. The problem is that WEIRD populations are, on many psychological dimensions, outliers. Henrich and colleagues documented that WEIRD subjects score differently from non-WEIRD populations on measures of visual perception (the Müller-Lyer illusion, to which many non-Western populations are immune), spatial reasoning, categorization, moral reasoning, concepts of fairness (the Ultimatum Game, in which WEIRD subjects behave unlike most of the world's cultures), self-concept (the independent vs. interdependent self), and cooperation. The differences are not marginal. On some measures, WEIRD subjects are at the extreme tail of the global distribution.

Yet for decades, researchers had been studying WEIRD undergraduates and publishing their findings as claims about "human" psychology -- as if the behavior of American twenty-year-olds in a university laboratory were representative of the species. The streetlight effect was operating with textbook precision: WEIRD subjects were studied because they were available (they were literally down the hall from the researcher's office), cooperative (they needed the course credit), and cheap (no travel, no translators, no cross-cultural protocols). The research was conducted under the streetlight of convenience, and the conclusions were generalized to the entire park.

The WEIRD problem remains largely unresolved. Cross-cultural research is expensive, logistically difficult, and methodologically complex. It requires language skills, cultural competence, institutional partnerships, and funding mechanisms that are not well-developed in most psychology departments. The streetlight is still on, and most researchers are still searching under it -- not because they are lazy or ignorant, but because the institutional incentives (publish frequently, publish quickly, use established methods) align perfectly with the convenience bias.


🔄 Check Your Understanding

  1. Explain the McNamara Fallacy in your own words and identify an example outside of military strategy where the same four-step logic operates.
  2. Why are WEIRD subjects outliers rather than representative of humanity? What does this imply about the generalizability of findings in published psychology research?
  3. How does the streetlight effect differ from simple laziness? Why does the chapter argue that the bias is structural rather than personal?

35.3 Policing -- Hot Spots and Dark Figures

The streetlight effect in policing is not a metaphor. It is, in a sense, literal.

In the 1990s, a revolution in policing took place under the banner of "data-driven" or "evidence-based" law enforcement. The most influential innovation was hot-spot policing -- the practice of concentrating police resources in the specific geographic locations where crime data indicates crime is most concentrated. The logic seems impeccable: crime data shows that a disproportionate share of reported offenses occurs in a small number of locations. Concentrate patrols and interventions in those hot spots, and you will reduce crime more efficiently than if you distribute resources uniformly.

Hot-spot policing has produced genuinely positive results in numerous controlled trials. The approach is not fraudulent. The data are real. Concentrating resources in high-crime areas does reduce certain types of crime in those areas. But the streetlight effect operates at a deeper level than the strategy's advocates typically acknowledge.

The "crime data" that identifies hot spots is not a census of all crimes committed. It is a record of crimes reported to the police, observed by the police, or discovered through police investigation. This distinction is crucial. Criminologists have long recognized the existence of the dark figure of crime -- the gap between crimes committed and crimes recorded. For many categories of offense -- domestic violence, sexual assault, wage theft, white-collar fraud, environmental violations, police misconduct -- the dark figure is enormous. Estimates suggest that the majority of crimes in these categories are never reported, never investigated, and never enter the dataset.

Hot-spot policing, therefore, does not concentrate resources where crime is. It concentrates resources where recorded crime is. And recorded crime is shaped by a cascade of observational biases: where police are already patrolling (more patrol means more observed crime, which means more data, which means more patrol -- a feedback loop), which communities trust the police enough to report crime (wealthy neighborhoods report property crime; marginalized communities may not report anything), which types of crime are visible to patrol officers (street-level drug dealing is visible; corporate fraud is not), and which offenses police prioritize (drug possession may be aggressively enforced; wage theft may not be enforced at all).

The result is a data-driven system that perpetuates and amplifies existing patterns of enforcement rather than responding to the actual distribution of crime. Neighborhoods that are already heavily policed generate more data, which attracts more police, which generates more data. Neighborhoods that are under-policed generate less data, which attracts less police, which generates less data. The system is not finding crime. It is finding the crime it is already looking for, in the places it is already looking.

This is the streetlight effect operating through a feedback loop -- a connection to Chapter 2 that should be immediately apparent. The streetlight is not just illuminating the search. It is expanding its own circle of light while deepening the darkness elsewhere.

Connection to Chapter 2 (Feedback Loops): Hot-spot policing creates a positive feedback loop: policing generates data, data directs policing, more policing generates more data. The loop is self-reinforcing and self-confirming. The areas that receive the most policing appear to have the most crime, which justifies the policing, which generates the data that identifies the crime. This is the streetlight building its own lamp post.
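A toy simulation makes the loop's behavior concrete. Everything in it is an illustrative assumption -- two neighborhoods with identical underlying crime, a recording rate that depends on patrol presence, and a hot-spot rule that concentrates patrol more than proportionally wherever recorded crime is higher:

```python
# A minimal, hypothetical simulation of the feedback loop described above.
# Two neighborhoods have IDENTICAL true crime. Recorded crime depends on how
# much patrol is present to observe it, and the department then concentrates
# patrol (more than proportionally -- a squared weighting, purely for
# illustration) wherever recorded crime is higher. All numbers are invented.

TRUE_CRIME = (100.0, 100.0)       # actual offenses per year, equal in both areas
patrol = [0.55, 0.45]             # small initial imbalance in patrol coverage

for year in range(1, 7):
    # Only crime that patrol is present to observe enters the dataset.
    recorded = [TRUE_CRIME[i] * patrol[i] for i in range(2)]
    # Hot-spot allocation: concentrate next year's patrol where the records are.
    weights = [r ** 2 for r in recorded]
    patrol = [w / sum(weights) for w in weights]
    print(f"year {year}: recorded={[round(r) for r in recorded]}  "
          f"patrol share={[round(p, 2) for p in patrol]}")

# Despite identical underlying crime, the initially over-patrolled neighborhood
# ends up with nearly all the patrol -- and nearly all the recorded crime.
```

The superlinear allocation rule is the hypothetical ingredient here; with strictly proportional allocation the initial imbalance would merely persist rather than grow. The point is that any policy of concentrating resources on measured hot spots converts an arbitrary initial difference in observation into an apparent difference in crime.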

The implications are not merely statistical. They are human. Predictive policing algorithms trained on historically biased crime data systematically direct police attention toward communities that are already over-policed -- disproportionately low-income communities and communities of color. The algorithm does not intend bias. It inherits bias from its training data, which inherits bias from decades of racially disparate enforcement patterns, which themselves were shaped by earlier biases. The streetlight effect, operating through algorithmic amplification, becomes a mechanism for perpetuating structural inequality.

Retrieval Prompt: Pause before continuing. Can you articulate the difference between "crime data" and "crime"? Why is this distinction critical for evaluating data-driven policing? How does the dark figure of crime relate to the streetlight effect? And can you identify a feedback loop similar to the one described in hot-spot policing operating in another domain -- education, medicine, or social media?


35.4 Archaeology -- Digging Where It's Easy

Archaeology might seem like an odd domain for a discussion of observational bias. Archaeologists are, after all, professionals trained specifically to uncover the hidden and the buried. But the discipline has been shaped by one of the most persistent streetlight effects in all of science: the tendency to excavate where the digging is easy.

Consider the problem of site selection. Where should an archaeologist dig? In principle, the answer is: where ancient people lived, worked, worshipped, and died. But the distribution of ancient human activity does not map neatly onto the distribution of modern accessibility. Ancient settlements were everywhere -- on mountaintops, in dense forests, on now-submerged coastlines, beneath modern cities, in regions that are politically unstable or physically dangerous. The sites that get excavated are, overwhelmingly, the ones that are accessible, affordable, and politically permissible.

Valley bias is perhaps the best documented of these distortions. The great archaeological discoveries of the nineteenth and twentieth centuries were concentrated in river valleys and coastal plains: Mesopotamia, the Nile Valley, the Indus Valley, coastal Peru, the Mediterranean basin. This is partly because river valleys supported the largest ancient populations. But it is also because river valleys are flat, their soil is soft and easy to excavate, their climates are often temperate, and -- critically -- they are where modern populations live, which means they are where universities, museums, and national governments are located. The infrastructure of archaeological research -- funding, labor, equipment, institutional support -- clusters in the same places as the accessible sites.

Mountain archaeology, by contrast, is expensive, physically demanding, and logistically difficult. Forest archaeology is constrained by vegetation that conceals surface evidence and complicates excavation. Underwater archaeology -- which would be necessary to study the vast stretches of coastline that were exposed during Ice Age sea-level lows and are now submerged -- requires specialized equipment and expertise that most archaeology departments do not possess. And archaeology in politically unstable regions is often simply impossible.

The result is a systematic distortion of our understanding of the human past. We know a great deal about civilizations that arose in accessible river valleys and relatively little about human communities in mountains, forests, deserts, and now-submerged coastlines. The archaeological record is not a representative sample of human history. It is a biased sample, shaped by the streetlight of modern accessibility.

This bias interacts with another: preservation bias. The materials that survive for millennia -- stone, fired ceramics, metal -- are the materials that archaeologists can find and study. The materials that decay -- wood, textiles, leather, food, paper -- are largely invisible in the archaeological record. Civilizations that built in stone (Egypt, Rome, the Maya) leave monumental evidence that is easy to find and interpret. Civilizations that built in wood, thatch, or perishable materials leave evidence that is subtle, fragmentary, or absent entirely. The result is that the archaeological record is systematically biased toward cultures that used durable materials in accessible locations -- and systematically silent about cultures that used perishable materials in difficult terrain.

The consequences for historical understanding are profound. For decades, the standard narrative of human civilization was built almost entirely on evidence from a handful of river valley civilizations. The complexity, sophistication, and extent of human societies in tropical forests, arctic environments, mountain ranges, and coastal zones were systematically underestimated -- not because evidence of their achievements did not exist, but because it had not been searched for in places that were hard to reach, or it had not survived in materials that were easy to find.

Recent work using remote sensing technologies -- LIDAR in particular, which can penetrate forest canopies to reveal structures invisible from the ground -- has begun to reveal the extent of the distortion. LIDAR surveys in Central America, Southeast Asia, and the Amazon basin have uncovered vast urban complexes, road networks, and agricultural systems that were completely invisible to traditional archaeological methods. The Maya civilization, already known to be impressive, turns out to have been far larger and more urbanized than anyone had suspected. The Amazon, long thought to have been a near-empty wilderness before European contact, supported large, complex societies that modified the landscape on a massive scale.

These discoveries did not reveal new information. They revealed old information that had been hidden -- not by nature, but by the streetlight effect. The evidence was always there. We simply had not been looking in the right places, with the right tools.

Spaced Review (Ch. 31): Recall the senescence patterns from Chapter 31 -- the accumulation of individually rational compromises that collectively degrade a system's capacity. The streetlight effect in archaeology is a form of institutional senescence: the field's methods, funding structures, and training pipelines were optimized for accessible sites over decades, creating a rigid methodological infrastructure that perpetuated valley bias long after the bias was recognized. The new technologies (LIDAR, remote sensing) represent a kind of rejuvenation -- breaking through the accumulated methodological rigidity to reveal what had been invisible.


35.5 Data Science -- The Availability Bias in Big Data

The contemporary incarnation of the streetlight effect is, paradoxically, most acute in the field that most prides itself on empirical rigor: data science and machine learning.

The promise of big data is that the sheer volume of information will overcome the biases and limitations of small samples. With millions of data points, the argument goes, the patterns that emerge are robust, representative, and reliable. This promise is seductive -- and it is often false, for a reason that is structurally identical to the drunk looking under the streetlight.

Big data is not all data. It is data that has been collected, digitized, stored, and made accessible. And the processes of collection, digitization, storage, and access are shaped by the same convenience biases that operate in every other domain. The data that exists in big datasets is data that someone had a reason and the means to collect. The data that does not exist in big datasets is data that no one collected -- because it was too expensive, too difficult, too politically sensitive, or simply because no one thought to collect it.

This is the data availability bias: the systematic tendency to draw conclusions from the data that exists rather than the data that should exist. The bias is particularly dangerous in machine learning, where algorithms are trained on existing datasets and learn to reproduce whatever patterns those datasets contain -- including the biases, gaps, and distortions that shaped the data's collection.

Consider a few examples.

Facial recognition systems were trained predominantly on datasets containing faces of light-skinned individuals. The systems achieved high accuracy on light-skinned faces and dramatically lower accuracy on dark-skinned faces -- not because the algorithm was designed to be biased, but because the training data was collected under the streetlight of convenience. The researchers who assembled the datasets used images that were readily available, which in the context of Western internet infrastructure meant primarily images of white faces. The algorithm faithfully learned the patterns in the data it was given. It had no way of knowing what was not in the data.

Medical algorithms trained on data from academic medical centers -- which serve disproportionately insured, urban, demographically non-representative populations -- may perform poorly when applied to rural, uninsured, or demographically different populations. The algorithm works well under the streetlight. It fails in the dark.

Natural language processing systems trained on text from the internet learn to reproduce the biases embedded in that text -- gender stereotypes, racial associations, cultural assumptions -- because the internet is not a representative sample of human thought. It is a streetlight: bright, accessible, voluminous, and profoundly non-representative.

Satellite-based environmental monitoring provides abundant data about conditions observable from above -- deforestation, surface temperature, atmospheric composition. It provides little data about conditions that are invisible from above -- groundwater depletion, soil microbiology, the health of organisms living under forest canopies. Environmental policy shaped by satellite data is shaped by the satellite's streetlight.

The core problem is not that big data is useless. It is that the volume of data creates an illusion of completeness. A dataset with a billion records feels comprehensive. But if the billion records are all drawn from the same biased source, the conclusions drawn from them will be biased in the same direction, with very high statistical confidence. Big data does not eliminate the streetlight effect. It amplifies it -- because the sheer volume of data suppresses the uncertainty that would otherwise signal that the conclusions might be wrong.
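A short sketch shows how volume manufactures false confidence. The populations and numbers below are invented: a "connected" group that is easy to sample and an "offline" group that is not, differing on the outcome we care about.

```python
# A minimal, hypothetical sketch of the "illusion of completeness": a huge
# dataset drawn from a convenient subpopulation yields a very precise estimate
# of the wrong quantity, while a small representative sample is noisier but
# centered on the truth. The populations and numbers are invented.

import random
import statistics

random.seed(0)

def person(group: str) -> float:
    """Outcome we want to estimate; the two groups genuinely differ."""
    mean = 60.0 if group == "online" else 40.0
    return random.gauss(mean, 10.0)

# The real population is half "online", half "offline" -> true mean is 50.
# The big dataset only sees what was easy to collect: 95% online users.
big_data = [person("online" if random.random() < 0.95 else "offline")
            for _ in range(200_000)]

# A small but representative sample covers both groups in true proportion.
small_sample = [person("online" if random.random() < 0.50 else "offline")
                for _ in range(200)]

for name, data in [("big (biased)", big_data), ("small (representative)", small_sample)]:
    mean = statistics.fmean(data)
    stderr = statistics.stdev(data) / len(data) ** 0.5
    print(f"{name:24s} n={len(data):7d}  estimate={mean:5.1f} +/- {1.96 * stderr:.2f}")

# Typical output: the biased dataset reports roughly 59 +/- 0.05 -- wrong, with
# great confidence -- while the representative sample reports roughly 50 +/- 2.
```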

Connection to Chapter 6 (Signal and Noise): Chapter 6 examined the challenge of separating signal from noise in data. The streetlight effect adds a dimension that Chapter 6 did not fully explore: the data itself may be systematically biased. The most sophisticated signal-processing technique in the world cannot extract a signal that is not in the data. If the signal lives in the dark -- in the data that was not collected, the population that was not sampled, the phenomenon that was not measured -- no amount of analysis will find it. The streetlight effect is not a noise problem. It is a missing-signal problem.


🔄 Check Your Understanding

  1. Why does the volume of big data create an illusion of completeness? How can a dataset with a billion records still be systematically biased?
  2. Explain how the training data problem in facial recognition systems is a streetlight effect. What was the "streetlight" in this case, and what was the "park"?
  3. The chapter claims that the data availability bias is "particularly dangerous in machine learning." Why is the combination of algorithmic learning and biased training data worse than traditional statistical analysis with biased data?

35.6 Medicine -- Funding Follows the Money

The global burden of disease is distributed roughly inversely to the global distribution of medical research funding. This sentence contains one of the most consequential streetlight effects in human affairs.

Neglected tropical diseases (NTDs) -- a group of parasitic, bacterial, and viral infections including schistosomiasis, lymphatic filariasis, onchocerciasis, Chagas disease, leishmaniasis, and trachoma -- collectively affect more than one billion people, predominantly in low-income countries in tropical and subtropical regions. They cause immense suffering: blindness, disfigurement, chronic pain, developmental delays, organ damage. They trap communities in cycles of poverty by reducing labor productivity, school attendance, and cognitive development.

The funding devoted to NTD research is a fraction of the funding devoted to diseases that primarily affect wealthy nations. The reasons are straightforward and structural: pharmaceutical companies invest in drugs that will be purchased by customers who can pay. Governments fund research that addresses their own populations' health concerns. Universities pursue grants from agencies that prioritize their own nations' disease burdens. The research infrastructure -- the laboratories, clinical trial networks, regulatory frameworks, and manufacturing capacity -- is concentrated in wealthy countries, optimized for the diseases of wealthy countries.

This is the streetlight effect operating through the economics of knowledge production. The "light" is not data availability (as in data science) or methodological convenience (as in psychology). It is money. The diseases that get studied are the diseases that someone will pay to cure. The diseases that do not get studied are the diseases of people who cannot pay. The resulting distribution of medical knowledge -- what we know how to treat, what drugs exist, what clinical protocols have been developed -- reflects not the distribution of human suffering but the distribution of purchasing power.

The numbers are stark. Global health research spending has historically devoted roughly ninety percent of its resources to diseases that account for roughly ten percent of the global disease burden -- the diseases of wealthy nations. This is sometimes called the "10/90 gap," and while the exact ratio has improved since the term was coined in the 1990s (partly due to the Gates Foundation and other philanthropic interventions), the structural disparity persists.

The streetlight effect in medicine extends beyond the global level. Within wealthy countries, the same pattern operates at smaller scales. Rare diseases that affect small populations receive less research funding than common diseases, not because they cause less suffering per patient but because the market for their treatment is small. Diseases that disproportionately affect marginalized populations -- sickle cell disease, for instance, which primarily affects people of African descent -- have historically received less funding per affected patient than diseases that affect demographically powerful populations. Mental health research has been chronically underfunded relative to its disease burden, partly because mental illness is harder to measure and treat with the methods that dominate biomedical research (drug trials, imaging studies, molecular biology) and partly because mental illness carries stigma that reduces political advocacy.

In each case, the pattern is the same: the light shines where the funding is, and the research follows the light. What gets studied is determined not by what needs to be known but by what someone is willing to pay to know.

Retrieval Prompt: Pause before continuing. You have now seen the streetlight effect in five domains: research methodology, policing, archaeology, data science, and medicine. Before reading the next section, try to articulate the common structural pattern. What is the "streetlight" in each case? What is the "park" where the keys are actually lost? And what is the force that keeps the search under the light?


35.7 Economics -- The GDP Illusion

In 1934, the economist Simon Kuznets presented to the United States Congress a report that would shape the twentieth century's understanding of prosperity. Kuznets had developed a system for measuring the total economic output of a nation -- what would become Gross Domestic Product (GDP). The system was revolutionary: for the first time, governments had a single number that summarized the productive activity of an entire economy. GDP could be calculated quarterly, compared across nations, tracked over time, and used to evaluate the effectiveness of economic policies.

Kuznets himself warned, explicitly and repeatedly, that GDP was not a measure of national wellbeing. "The welfare of a nation," he wrote, "can scarcely be inferred from a measurement of national income." GDP counts the monetary value of goods and services produced. It does not count, and was never designed to count, the things that make life worth living: health, leisure, security, equality, environmental quality, community, purpose, or happiness.

The warning was ignored. GDP became the dominant measure of national success. Political leaders run on promises to grow GDP. International organizations rank countries by GDP. Economists evaluate policies by their impact on GDP. The streetlight that Kuznets built -- useful, focused, quantifiable -- became the only light that mattered.

The distortions are well-documented and profound.

Unpaid labor is invisible. GDP counts goods and services exchanged in markets. It does not count unpaid domestic labor (cooking, cleaning, childcare), unpaid care work (eldercare, disability support), or volunteer work. By some estimates, the value of unpaid household labor in wealthy countries equals twenty to forty percent of measured GDP. But because it is not exchanged in a market, it does not exist in the metric. Societies that shift care work from the household to the market -- from a parent raising a child to a paid daycare provider raising the same child -- register GDP growth. The actual amount of care provided may not have changed at all.
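A tiny worked example, with invented figures and a deliberately simplified "GDP" that sums only market transactions, shows how the household-to-market shift registers as growth:

```python
# A minimal, hypothetical illustration of the point above: the same amount of
# childcare, provided first inside the household and then through a paid
# provider, registers as "growth" in a GDP-style metric that only counts
# market transactions. All figures are invented.

def gdp(transactions):
    """A GDP-style measure: sum of market transactions only."""
    return sum(amount for amount, in_market in transactions if in_market)

# Year 1: a parent provides 1,000 hours of childcare, unpaid.
year_1 = [(25_000, True),          # parent's part-time market income
          (0, False)]              # 1,000 hours of unpaid care: invisible to the metric

# Year 2: the parent works full time and pays a daycare for the same 1,000 hours.
year_2 = [(50_000, True),          # full-time market income
          (15_000, True)]          # daycare fees: now a market transaction

print("measured output, year 1:", gdp(year_1))   # 25000
print("measured output, year 2:", gdp(year_2))   # 65000
# Measured output jumps even though the hours of care provided are unchanged.
```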

Environmental degradation counts as production. When a factory pollutes a river, the goods it produces are counted in GDP. When the government spends money to clean up the river, that spending is also counted in GDP. When residents develop health problems from the pollution and spend money on medical treatment, that spending is counted in GDP too. Every stage of the destruction-and-remediation cycle adds to GDP. The loss of the clean river -- its recreational value, its ecological value, its aesthetic value, its value as a source of clean water -- is not subtracted.

Wellbeing and GDP diverge. In the United States, GDP per capita has roughly tripled since 1970. Over the same period, self-reported happiness has been essentially flat. Rates of anxiety and depression have increased. Social trust has declined. Life expectancy gains have slowed and, in some demographics, reversed. If GDP were a measure of national wellbeing, these trends would be impossible. They are not impossible; they are predictable, because GDP was never measuring wellbeing.

The streetlight shapes policy. Because GDP is the primary metric of national success, policies are designed to maximize GDP growth. This means that policies whose benefits are not captured by GDP -- investments in environmental protection, community health, social cohesion, leisure, or equity -- are systematically undervalued in policy analysis. And policies whose costs are not captured by GDP -- environmental destruction, labor exploitation, community disruption -- are systematically underpenalized. The metric defines the agenda. The streetlight determines the search.

The economics profession has long recognized these limitations. Alternative measures have been proposed: the Genuine Progress Indicator (GPI), the Human Development Index (HDI), Bhutan's Gross National Happiness (GNH), the OECD Better Life Index, and others. But none has displaced GDP as the primary metric of national success, for a reason that the streetlight effect predicts: GDP is quantifiable, standardized, internationally comparable, and computationally straightforward. The alternatives are messier, more subjective, harder to calculate, and less amenable to the kind of clean, numerical comparison that policymakers and journalists prefer. GDP persists not because it is good but because it is easy.

Spaced Review (Ch. 33): Recall the lifecycle S-curve from Chapter 33 -- the universal pattern of slow start, explosive growth, saturation, and plateau or decline. GDP as a metric followed its own S-curve: introduced in the 1930s (slow start), adopted worldwide after World War II (explosive growth), becoming the universal standard by the 1970s (saturation). The metric is now arguably in its plateau phase -- widely recognized as inadequate but deeply entrenched. Its carrying capacity as a descriptor of national success has been reached. The question is whether a successor metric will emerge on a new S-curve, or whether GDP will persist indefinitely as a senescent standard, too embedded in institutional infrastructure to be replaced.


35.8 The Deeper Pattern -- Methodological Convenience as Methodological Rigor

Six domains. Six streetlights. The same structural pattern in each:

| Domain | The Streetlight | What's in the Dark |
| --- | --- | --- |
| Research methodology | Measurable variables, WEIRD subjects | Important-but-unmeasurable variables, global human diversity |
| Policing | Reported and observed crime data | Unreported crime, white-collar crime, systemic harm |
| Archaeology | Accessible sites, durable materials | Mountain sites, forest sites, submerged coastlines, perishable materials |
| Data science | Digitized, collected, accessible data | Uncollected data, undigitized experience, the lives of the disconnected |
| Medicine | Diseases of wealthy nations | Neglected tropical diseases, mental health, diseases of the poor |
| Economics | Market-exchanged goods and services (GDP) | Unpaid labor, environmental value, wellbeing, community |

The deeper pattern is not simply that each field has blind spots. Every competent practitioner knows that. The deeper pattern is that the blind spots are systematic -- they are shaped by the same structural forces across every domain, and they produce the same types of distortion everywhere.

Three structural forces create the streetlight effect:

1. Measurement availability. Things that can be easily measured get measured. Things that are hard to measure do not. Over time, the measured things accumulate a body of evidence, a literature, a set of established findings. The unmeasured things remain in what the statistician David Hand calls dark data -- the data that does not exist, not because the phenomena do not exist, but because no one has measured them. The asymmetry between the measured and the unmeasured grows over time, because each round of measurement builds infrastructure (methods, datasets, expertise, institutional knowledge) that makes the next round of measurement of the same things easier, while doing nothing to make measurement of the unmeasured things easier.

2. Institutional incentives. Researchers, institutions, and funders operate within incentive structures that reward productivity, speed, and demonstrable results. Studying what is convenient -- the WEIRD undergrad, the accessible archaeological site, the disease with pharmaceutical market potential -- is faster, cheaper, and more likely to produce publishable results than studying what is important but difficult. The incentive structures are not designed to produce the streetlight effect. They produce it as an emergent consequence of locally rational decisions. Each researcher's choice to study the convenient topic is individually sensible. Collectively, the choices create a systematic bias in what the field knows.

3. Path dependence. Once a field has invested heavily in studying certain topics with certain methods, switching to different topics or methods is costly. The investment in existing approaches creates what economists call "switching costs" -- the expense and risk of abandoning established methods, retraining personnel, rebuilding infrastructure, and forfeiting the accumulated advantage of the current approach. Path dependence means that the streetlight effect, once established, is self-reinforcing. The field's methods are optimized for what has already been studied. Its training pipelines produce researchers skilled in the established approaches. Its journals and peer-review systems are calibrated to evaluate work using established methods. Breaking out of the streetlight requires not just individual initiative but institutional transformation.

These three forces -- measurement availability, institutional incentives, and path dependence -- combine to produce a phenomenon that the chapter has been circling around: the data availability bias. This is the formal version of the streetlight effect, and it can be stated as a principle:

The Data Availability Principle: The conclusions of any inquiry are bounded by the representativeness of the data on which they are based. When the data is systematically non-representative -- collected under conditions of convenience rather than conditions of validity -- the conclusions will systematically misrepresent the reality they claim to describe. The degree of misrepresentation is proportional to the gap between what was measured and what should have been measured.

This principle seems obvious. But its implications are radical. It means that the entire body of published knowledge in many fields -- the sum total of "what we know" -- is systematically distorted by the streetlight effect. Not wrong, necessarily. But incomplete in ways that are not random. The incompleteness has a shape, and the shape is determined by what was easy to study rather than what was important to know.
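One simple way to make the principle concrete -- under the assumption that the inquiry is estimating a population average -- is to note that if a fraction of the target population is never measured, the estimate misses the truth by that fraction times the difference between the measured and unmeasured groups. A short sketch, with invented numbers:

```python
# A minimal, hypothetical formalization of the Data Availability Principle for
# a population average: if a fraction d of the target population is never
# measured ("dark"), and the dark portion differs from the measured portion by
# delta on the quantity of interest, the estimate misses the truth by d * delta.
# The numbers below are purely illustrative.

measured_mean = 70.0   # average outcome among the people we did observe
dark_mean = 40.0       # average among those never observed (unknowable in practice)
delta = dark_mean - measured_mean

for dark_fraction in (0.0, 0.1, 0.3, 0.5, 0.7):
    true_mean = (1 - dark_fraction) * measured_mean + dark_fraction * dark_mean
    bias = measured_mean - true_mean          # what staying under the streetlight costs
    assert abs(bias - (-dark_fraction * delta)) < 1e-9
    print(f"dark fraction={dark_fraction:.1f}  true mean={true_mean:5.1f}  "
          f"estimate={measured_mean:.1f}  bias={bias:+5.1f}")

# The bias grows in proportion to the share of the population left in the dark --
# and nothing computed from the measured data alone can reveal its size.
```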

Connection to Chapter 37 (Survivorship Bias, Forward Reference): The streetlight effect and survivorship bias (which Chapter 37 will examine in detail) are complementary distortions. The streetlight effect describes what we search for: we look where the light is good. Survivorship bias describes what we see in the results: we see only what survived, succeeded, or was recorded. Together, they form a double filter on human knowledge. First, we search in the wrong places (streetlight effect). Then, from the wrong places where we searched, we notice only the salient results (survivorship bias). The cumulative distortion is profound.


🔄 Check Your Understanding

  1. Name the three structural forces that create the streetlight effect. For each, give a concrete example from a domain discussed in this chapter.
  2. State the Data Availability Principle in your own words. Why does the chapter claim its implications are "radical"?
  3. How does path dependence make the streetlight effect self-reinforcing? What would breaking out of path dependence look like in a specific field?

35.9 Countermeasures -- Looking for the Dark

The streetlight effect is structural, but it is not inescapable. Across the domains examined in this chapter, practitioners have developed countermeasures -- strategies for deliberately searching outside the streetlight. None is a complete solution. But collectively, they represent a toolkit for resisting the pull of convenience.

Countermeasure 1: Dark Data Awareness

The first step is simply recognizing that the data you have is not all the data there is. David Hand's concept of dark data -- the data you do not have, which may be more important than the data you do have -- provides a framework for this recognition. Hand identifies fifteen types of dark data, ranging from data that you know is missing (you know that certain populations were not sampled) to data whose absence you do not suspect (you do not even know that a relevant variable exists).

The practical discipline is to ask, before drawing any conclusion: What data would I need to have confidence in this conclusion? What data do I actually have? What is the gap? And could the missing data change the conclusion? This is negative-space thinking applied to evidence -- looking at what is absent rather than what is present.

Countermeasure 2: Triangulation

Triangulation is the practice of approaching a question from multiple independent methodological directions. The term is borrowed from surveying, where a point's location is determined by measuring its angle from two or more known positions. In research, triangulation means studying the same phenomenon using different methods, different data sources, different populations, or different theoretical frameworks -- and checking whether the conclusions converge.

If a conclusion holds up under triangulation -- if different methods, applied to different data, in different contexts, all point in the same direction -- the conclusion is more likely to reflect reality than the streetlight. If the conclusions diverge, the divergence itself is informative: it reveals the boundaries of the streetlight and suggests where the dark areas are.
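A minimal sketch of the idea, with an invented population and two invented "methods" that each reach a different slice of it:

```python
# A minimal, hypothetical sketch of triangulation: estimate the same quantity
# from two independent "methods", each sampling a different slice of the
# population. Agreement raises confidence; divergence flags the streetlight.
# The population, methods, and numbers are all invented for illustration.

import random
import statistics

random.seed(1)

# Ground truth we pretend not to know: outcomes differ across three groups.
population = ([random.gauss(40, 8) for _ in range(3000)] +    # group A
              [random.gauss(55, 8) for _ in range(3000)] +    # group B
              [random.gauss(70, 8) for _ in range(3000)])     # group C
true_mean = statistics.fmean(population)

# Method 1 ("survey") only reaches groups A and B; method 2 ("admin records")
# only covers groups B and C. Each is a different streetlight.
survey = random.sample(population[:6000], 300)
records = random.sample(population[3000:], 300)

est_survey = statistics.fmean(survey)
est_records = statistics.fmean(records)

print(f"true mean        : {true_mean:5.1f}")
print(f"survey estimate  : {est_survey:5.1f}")
print(f"records estimate : {est_records:5.1f}")
print(f"divergence       : {abs(est_survey - est_records):5.1f}  "
      "(a large gap means the methods are lighting different parts of the park)")
```

Neither method reveals its own blind spot on its own; only the comparison does.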

Countermeasure 3: Mixed Methods

Mixed methods research deliberately combines quantitative approaches (which are strong on measurement but weak on context) with qualitative approaches (which are strong on context but weak on measurement). The quantitative component provides the precision that the streetlight illuminates well. The qualitative component provides the contextual understanding that the streetlight cannot illuminate -- the lived experiences, the cultural meanings, the institutional dynamics, the things that are real but not easily quantifiable.

Mixed methods is not a compromise between rigor and softness. It is a recognition that rigor applied to the wrong question -- the question that happens to be under the streetlight -- is less valuable than rougher evidence applied to the right question. The challenge is methodological humility: accepting that different questions require different methods, and that no single method illuminates everything.

Countermeasure 4: Deliberately Studying the Hard-to-Measure

The most direct countermeasure is the most difficult: choosing to study what is hard to study, precisely because it is hard. This means cross-cultural psychology that invests in non-WEIRD populations. It means archaeology that ventures into mountains, forests, and underwater sites. It means medical research that prioritizes neglected diseases. It means data science that audits its datasets for representativeness before training its algorithms. It means economics that develops and deploys wellbeing measures alongside GDP.

The obstacle is always the same: studying the hard-to-study is expensive, slow, methodologically uncertain, and career-risky. It requires institutional support, protected funding, and a promotion culture that rewards importance over convenience. In short, it requires changing the incentive structures that create the streetlight effect in the first place.

Countermeasure 5: Searching for Absence

Perhaps the most powerful countermeasure is training yourself to notice what is not there. In every dataset, every research literature, every policy analysis, there are absences -- topics not studied, populations not sampled, variables not measured, questions not asked. The absences are invisible by definition: you cannot see what is not there. But you can train yourself to look for them.

The discipline is to ask, after reviewing any body of evidence: Who is not represented? What was not measured? What question was not asked? What would the picture look like if the missing information were included? This is the intellectual equivalent of deliberately walking into the dark park instead of staying under the streetlight. It is uncomfortable. It may not yield immediate results. But it is where the keys are.

Connection to Chapter 14 (Absence of Evidence): Chapter 14 examined how the absence of evidence is often mistaken for the evidence of absence -- how the fact that something has not been observed is treated as proof that it does not exist. The streetlight effect explains why evidence is absent in many cases: not because the phenomenon does not exist, but because no one has looked in the right place. The two chapters are companion pieces: Chapter 14 describes the logical error of treating non-observation as non-existence; Chapter 35 describes the structural force that makes non-observation systematically likely for certain phenomena.


35.10 The Threshold Concept -- Measurement Creates Its Own Reality

Every chapter in this book contains a threshold concept -- an idea that, once grasped, permanently changes how you see the world. The threshold concept for the streetlight effect is this: Measurement Creates Its Own Reality.

The insight is deeper than "we measure the wrong things." It is that the act of measuring -- the choice of what to count, what to track, what to report -- does not merely describe reality. It shapes reality. It shapes what counts as knowledge. It shapes what receives attention. It shapes what gets funded, what gets published, what gets taught, what gets rewarded, and ultimately what gets done.

When Robert McNamara chose to measure body counts, he did not just describe the war. He changed the war. Officers changed their behavior to produce the metric. Strategies were redesigned to maximize the metric. Resources were allocated to activities that moved the metric. The metric did not reflect the reality of the war; the reality of the war reshaped itself to reflect the metric.

When psychologists chose to study WEIRD undergraduates, they did not just describe a narrow slice of human psychology. They defined what "human psychology" meant. Textbooks were written around WEIRD findings. Curricula were designed around WEIRD norms. Clinical interventions were developed for WEIRD populations. The measurement created a reality in which WEIRD psychology was psychology.

When economists chose to measure GDP, they did not just track economic activity. They defined what "the economy" meant. Policies were designed to maximize GDP. Political campaigns were fought over GDP growth rates. National identities were constructed around GDP rankings. Countries that were rich in community, ecological health, or cultural vitality but poor in GDP were classified as "developing" -- as if their task were to develop into GDP-producing machines.

In each case, the measurement did not passively record a pre-existing reality. It actively constructed a reality -- a reality in which the measured things were important and the unmeasured things were invisible. The streetlight does not just reveal what is beneath it. It defines the territory that matters.

This is the sense in which measurement creates its own reality. And it is the sense in which the streetlight effect is not a minor methodological nuisance but a fundamental force shaping the structure of human knowledge. What we know is a function of what we have looked at. What we have looked at is a function of what was easy to see. What was easy to see has become, through decades of accumulated study, what we believe is real. The streetlight has become the world.

Before grasping this threshold concept, you see the streetlight effect as a correctable bias -- a nuisance that better sampling, better methods, or better data could fix. You see measurement as neutral: a lens that reveals reality without distorting it. You assume that what we know reflects what is true, subject to the usual caveats about sample size and methodology.

After grasping this concept, you see measurement as constitutive: the act of measurement creates categories, directs attention, allocates resources, and shapes institutions in ways that remake the reality being measured. You see the streetlight effect not as a bias to be corrected but as a structural force to be navigated -- one that is always operating, in every field, in every institution, in every inquiry. You begin to ask, automatically, of every claim: What was measured to produce this claim? What was not measured? And how might the claim look different if the measurement had been different?

How to know you have grasped this concept: When you encounter a statistic, a research finding, a policy metric, or a data visualization, your first thought is not "What does this show?" but "What was measured to produce this, and what was left in the dark?" When you design a study, a metric, or an evaluation system, you are aware that the choice of what to measure will shape what people do, not just what you observe. When you hear someone say "the data shows..." you mentally append "...the data that was collected, from the population that was sampled, using the methods that were available, under the institutional constraints that were operating."


35.11 The Pattern Library Checkpoint

Add the streetlight effect to your Pattern Library. Here is the entry:

Pattern: The Streetlight Effect (Observational Bias)
Structure: Inquiry is systematically biased toward areas where observation is easy, data is available, or methods are well-developed -- regardless of whether those areas contain what the inquiry seeks. The bias is structural (driven by measurement availability, institutional incentives, and path dependence), self-reinforcing (studying under the streetlight builds infrastructure that makes the next study under the streetlight easier), and constitutive (the accumulated body of knowledge shaped by the bias comes to define what is considered real and important).
Signature: Look for the gap between what is studied and what matters. If a field's knowledge base is concentrated in areas that are methodologically convenient, the streetlight effect is operating.
Countermeasures: Dark data awareness, triangulation, mixed methods, deliberately studying the hard-to-measure, searching for absence.
Adjacent patterns: McNamara Fallacy (Ch. 15), signal and noise (Ch. 6), survivorship bias (Ch. 37), feedback loops (Ch. 2), senescence (Ch. 31).

Spaced Review Connection: Look back at your Pattern Library entries for senescence (Ch. 31) and the lifecycle S-curve (Ch. 33). The streetlight effect has its own lifecycle dynamics: a field's streetlight bias starts small (early researchers happen to study convenient topics), grows through institutional reinforcement (the topics build infrastructure and attract followers), reaches saturation (the convenient topics are thoroughly studied), and eventually faces challenge (someone notices what has been missed). Can you identify where the streetlight effect in each domain discussed in this chapter is on its own S-curve?


35.12 Looking Into the Dark -- What Comes Next

The streetlight effect prepares us for the remaining chapters of Part VI. Each subsequent chapter examines a different systematic failure in human reasoning -- and each can be understood as a variation on the same theme: the tendency to see what is visible rather than what is real.

Chapter 36 will examine narrative capture -- how compelling stories hijack reasoning by making the visible, dramatic, and emotionally resonant seem more true than the invisible, statistical, and emotionally neutral. Narrative capture is the streetlight effect operating through story: the narrative illuminates certain causal pathways and leaves others in darkness, and we mistake the illuminated pathways for the complete picture.

Chapter 37 will examine survivorship bias -- the tendency to draw conclusions from what survived, succeeded, or was recorded, while ignoring what failed, was destroyed, or was never recorded. Survivorship bias is the streetlight effect operating through selection: the survivors are visible; the non-survivors are in the dark.

Chapter 38 will examine Chesterton's fence -- the failure to ask "why does this exist?" before removing a structure, rule, or tradition. Chesterton's fence is the streetlight effect operating through understanding: the fence's costs are visible (it is in the way); its benefits are in the dark (they protect against threats that are not currently apparent).

In each case, the structural diagnosis is the same: we see what is illuminated and miss what is in darkness. The streetlight is everywhere. The keys are always in the park.

The question this chapter leaves you with is not whether the streetlight effect is operating in your own field, your own institution, your own research, your own thinking. It is. The question is: what are you going to do about it? Will you stay under the light? Or will you walk into the dark?

Retrieval Prompt: Final check. Without looking back, can you (1) define the streetlight effect and name the three structural forces that produce it, (2) give an example from at least four of the six domains discussed, (3) state the Data Availability Principle, (4) name at least three countermeasures, and (5) articulate the threshold concept -- Measurement Creates Its Own Reality -- in your own words? If you can do all five, you have grasped this chapter's core architecture. If not, revisit the sections where the gaps are.


Summary

The streetlight effect -- the systematic tendency to search where observation is easy rather than where the answer is -- operates identically across research methodology (the McNamara Fallacy, the WEIRD problem), policing (hot-spot policing and the dark figure of crime), archaeology (valley bias and preservation bias), data science (the data availability bias in big data and machine learning), medicine (neglected tropical diseases and the funding-follows-money distortion), and economics (GDP as a proxy for wellbeing). The deeper pattern is that methodological convenience masquerades as methodological rigor: we study what we can study and pretend it is what we should study. Three structural forces -- measurement availability, institutional incentives, and path dependence -- make the streetlight effect self-reinforcing across every domain. Countermeasures exist (dark data awareness, triangulation, mixed methods, deliberately studying the hard-to-measure, searching for absence) but require institutional transformation, not just individual awareness. The threshold concept -- Measurement Creates Its Own Reality -- reveals that the choice of what to measure does not passively describe reality but actively shapes what counts as knowledge, what receives attention, and ultimately what gets done. The streetlight does not just reveal. It defines.