Chapter 35: Further Reading

This reading list is organized by the three-tier citation system introduced in Section 1.7. Tier 1 sources are verified and are either cited directly in the chapter or relevant to its core arguments. Tier 2 sources are attributed to specific authors and widely discussed in the relevant literature but have not been independently verified at the citation level for this text. Tier 3 sources are synthesized from general knowledge and multiple unspecified origins. All annotations reflect our honest assessment of each work's relevance and quality.

Tier 1: Verified Sources

These works directly inform the arguments and examples in Chapter 35. They are well-established publications whose claims have been independently confirmed.

Joseph Henrich, Steven J. Heine, and Ara Norenzayan, "The Weirdest People in the World?" (Behavioral and Brain Sciences, 2010)

The paper that named and documented the WEIRD problem in psychology. Henrich and colleagues showed that the vast majority of published psychological research is based on subjects who are Western, Educated, Industrialized, Rich, and Democratic -- a population that constitutes roughly twelve percent of humanity but accounts for the majority of research subjects. The paper systematically documented how WEIRD populations are outliers, not representative, on measures of visual perception, moral reasoning, fairness, self-concept, and other fundamental psychological dimensions. The paper triggered a discipline-wide reckoning that is still ongoing.

Relevance to Chapter 35: Henrich et al. provide the primary evidence for the WEIRD problem discussed in Section 35.2. Their work is the most thoroughly documented example of the streetlight effect operating through convenience sampling in an entire academic discipline.

Best for: Readers interested in research methodology, cross-cultural psychology, and the structural biases of academic knowledge production. The paper is technical but accessible, and the core argument is devastating.

David Hand, Dark Data: Why What You Don't Know Matters (2020)

Hand, a professor of mathematics at Imperial College London and former president of the Royal Statistical Society, provides the most comprehensive treatment of the data we do not have -- and why it matters more than the data we do. He identifies fifteen types of dark data, from the mundane (data lost through clerical error) to the profound (data that was never collected because no one knew it should be). The book connects dark data to decision-making failures across medicine, finance, government, and everyday life.

Relevance to Chapter 35: Hand's concept of dark data provides the theoretical framework for the chapter's discussion of countermeasures (Section 35.9) and the deeper pattern (Section 35.8). His taxonomy of missing data types is the most systematic treatment available of the "park" where the keys are.

Best for: Readers who want a rigorous, mathematically informed treatment of what is missing from our data and why it matters. Hand writes clearly and uses excellent examples. The book is the definitive treatment of the dark side of the streetlight.

Joy Buolamwini and Timnit Gebru, "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification" (Proceedings of Machine Learning Research, 2018)

The landmark study documenting skin-type and gender disparities in commercial facial analysis systems. Buolamwini and Gebru tested three leading gender classifiers and found error rates below one percent for lighter-skinned males but as high as roughly thirty-five percent for darker-skinned females. The paper also showed that widely used face benchmark datasets were composed overwhelmingly of lighter-skinned subjects, which points to non-representative data as the likely source of the disparity -- a streetlight effect with concrete, measurable consequences for the people in the dark.

Relevance to Chapter 35: Buolamwini and Gebru provide the primary evidence for the data availability bias in machine learning discussed in Section 35.5 and Case Study 2. Their work is the clearest demonstration that the volume of data does not correct for the bias of data.

Best for: Readers interested in algorithmic fairness, machine learning ethics, and the real-world consequences of biased training data. The paper is concise, empirically rigorous, and profoundly important.
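The disaggregated evaluation at the heart of the Gender Shades argument can be illustrated with a minimal Python sketch. The function name, the group labels, and all counts below are hypothetical, invented for illustration; they are not the paper's data or code. The sketch only shows how an aggregate error rate can hide a large subgroup error rate when one group dominates the benchmark.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return (overall_error_rate, {group: error_rate}).

    `records` is an iterable of (group_label, is_correct) pairs.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        if not is_correct:
            errors[group] += 1
    overall = sum(errors.values()) / sum(totals.values())
    per_group = {group: errors[group] / totals[group] for group in totals}
    return overall, per_group


# Hypothetical benchmark (invented numbers, NOT figures from the paper):
# 800 faces from one group with 4 errors, 200 from another with 60 errors.
records = (
    [("lighter_male", True)] * 796
    + [("lighter_male", False)] * 4
    + [("darker_female", True)] * 140
    + [("darker_female", False)] * 60
)

overall, per_group = error_rates_by_group(records)
print(f"overall error rate: {overall:.1%}")
for group, rate in per_group.items():
    print(f"{group}: {rate:.1%}")
```

With these invented counts the aggregate error rate is 6.4 percent, while the smaller group's error rate is 30 percent. Reporting only the aggregate would keep that group in the dark.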

Robert S. McNamara with Brian VanDeMark, In Retrospect: The Tragedy and Lessons of Vietnam (1995)

McNamara's own accounting of the errors that led to the Vietnam War's disastrous prosecution. Written three decades after the events, the book is remarkable for its combination of personal responsibility and structural analysis. McNamara identifies eleven major errors, several of which map directly onto the streetlight effect: the reliance on quantitative metrics that missed the war's political dimensions, the failure to understand Vietnamese motivations, and the systematic undervaluation of factors that could not be reduced to numbers.

Relevance to Chapter 35: McNamara provides the primary source for the body count and McNamara Fallacy discussion in Section 35.2 and Case Study 1. His retrospective analysis is unusually candid about the mechanisms of the streetlight effect, even though he does not use that term.

Best for: Readers interested in military history, decision-making under uncertainty, and the psychology of institutional error. The book is essential reading for anyone interested in how intelligent, well-intentioned people can systematically deceive themselves through the wrong metrics.

Charles Handy, The Empty Raincoat: Making Sense of the Future (1994)

The book in which Handy articulated the McNamara Fallacy as a four-step progression from measuring the easy to presuming the unmeasurable does not exist. Handy, a British management thinker, situated the fallacy within a broader argument about the limitations of quantitative management in an era of increasing complexity.

Relevance to Chapter 35: Handy provides the formal articulation of the McNamara Fallacy's four steps discussed in Section 35.2. His framing has become the standard formulation cited across management, policy, and education literature.

Best for: Readers interested in management philosophy, organizational behavior, and the limits of quantification. Handy writes with elegance and humanity.

Tier 2: Attributed Claims

These works are widely cited in the literature on observational bias, data quality, and methodological critique. The specific claims attributed to them here are consistent with how they are discussed by other scholars.

Cathy O'Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (2016)

O'Neil, a mathematician and former Wall Street quant, examines how mathematical models and algorithms -- designed to be objective and fair -- perpetuate and amplify existing inequalities. Her analysis covers predictive policing, college rankings, credit scores, hiring algorithms, and insurance pricing. Each case illustrates the data availability bias: the algorithms learn from biased historical data and reproduce those biases at scale, under the guise of mathematical objectivity.

Relevance to Chapter 35: O'Neil provides extensive case material for the data science streetlight effect discussed in Section 35.5 and the predictive policing case in Case Study 1. Her concept of "weapons of math destruction" -- models that are opaque, widespread, and destructive -- captures the algorithmic amplification of the streetlight effect.

Best for: Readers who want a passionate, accessible, example-rich account of how algorithms perpetuate bias. O'Neil writes with moral clarity and technical competence.

Simon Kuznets, "National Income, 1929-1932" (U.S. Senate Document, 1934)

Kuznets's original report to Congress introducing the national income accounting system that would evolve into GDP. The report includes Kuznets's explicit warning that the metric should not be used as a measure of national welfare -- a warning that was systematically ignored for decades. The document is a primary source for understanding both the promise and the limitations of GDP.

Relevance to Chapter 35: Kuznets provides the origin story for the GDP discussion in Section 35.7. His explicit warning against using GDP as a welfare measure -- and the subsequent ignoring of that warning -- is a textbook case of the streetlight effect overwhelming the intentions of the metric's creator.

Best for: Historically inclined readers who want to see the streetlight being built, complete with the builder's warning label.

Joseph Stiglitz, Amartya Sen, and Jean-Paul Fitoussi, Mismeasuring Our Lives: Why GDP Doesn't Add Up (2010)

The report of the Commission on the Measurement of Economic Performance and Social Progress, convened by French President Nicolas Sarkozy. Nobel laureates Joseph Stiglitz and Amartya Sen, together with the French economist Jean-Paul Fitoussi, argue that GDP is a fundamentally inadequate measure of societal progress and propose alternatives that capture wellbeing, sustainability, and equity. The report is the most authoritative critique of GDP-centrism available.

Relevance to Chapter 35: Stiglitz, Sen, and Fitoussi provide the economic critique underlying Section 35.7. Their proposal for multidimensional measurement is a direct countermeasure to the GDP streetlight.

Best for: Readers interested in alternative economic indicators, the measurement of wellbeing, and the policy implications of choosing different metrics.

Anna Roosevelt, Moundbuilders of the Amazon: Geophysical Archaeology on Marajo Island, Brazil (1991)

Roosevelt's groundbreaking work challenged the prevailing "empty Amazon" narrative by documenting complex pre-Columbian societies on Marajo Island at the mouth of the Amazon. Her archaeological evidence of large settlements, sophisticated ceramics, and intensive agriculture contradicted decades of received wisdom about the limits of tropical forest civilization.

Relevance to Chapter 35: Roosevelt provides early archaeological evidence for the Amazonian complexity discussed in Case Study 2. Her work was initially controversial precisely because it challenged the streetlight-shaped consensus.

Best for: Readers interested in Amazonian archaeology, the history of archaeological debate, and the process by which new evidence overturns established narratives.

Andrew Gelman and Eric Loken, "The Statistical Crisis in Science" (American Scientist, 2014)

Gelman and Loken describe the "garden of forking paths" -- the many small, seemingly innocuous decisions researchers make during data analysis that collectively bias their results toward statistical significance. While focused on the replication crisis, their analysis illuminates a form of the streetlight effect at the micro level: researchers analyze their data in the ways that are most likely to produce publishable results, not in the ways that are most likely to reveal the truth.

Relevance to Chapter 35: Gelman and Loken extend the streetlight effect from the macro level (what fields study) to the micro level (how individual studies analyze data). Their work connects the streetlight effect to the replication crisis in science.

Best for: Readers with statistical training who want to understand how the streetlight effect operates within individual research projects, not just across fields.
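For readers who think in code, the inflation of false positives behind the "garden of forking paths" can be sketched with a small simulation. This is a simplified illustration under stated assumptions, not Gelman and Loken's own analysis. The outcome is pure noise, the variance is assumed known (so a z-test suffices), and the hypothetical analyst explicitly tries the full sample plus four subgroups and keeps the best p-value. Gelman and Loken's point is broader: the same inflation can arise when only one analysis is run, as long as the choice of analysis depended on the data. All function names and parameters are assumptions made for this sketch.

```python
import math
import random


def two_sided_p(group_a, group_b):
    """Two-sided z-test p-value for a difference in means.

    Assumes equal group sizes and a known standard deviation of 1.
    """
    n = len(group_a)
    diff = sum(group_a) / n - sum(group_b) / n
    z = diff / math.sqrt(2.0 / n)
    return math.erfc(abs(z) / math.sqrt(2.0))


def false_positive_rates(n_studies=2000, n_per_group=40, n_subgroups=4, seed=0):
    """Simulate studies with NO true effect.

    Returns (single_path_rate, forking_rate): the share of studies with
    p < 0.05 when only the pre-planned full-sample test is run, versus
    when the smallest p-value across all available paths is reported.
    """
    rng = random.Random(seed)
    size = n_per_group // n_subgroups
    single_hits = 0
    forking_hits = 0
    for _ in range(n_studies):
        treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        # Path 0: the pre-planned comparison on the full sample.
        p_values = [two_sided_p(treated, control)]
        # Paths 1..k: the same comparison within each subgroup.
        for k in range(n_subgroups):
            lo, hi = k * size, (k + 1) * size
            p_values.append(two_sided_p(treated[lo:hi], control[lo:hi]))
        single_hits += p_values[0] < 0.05
        forking_hits += min(p_values) < 0.05
    return single_hits / n_studies, forking_hits / n_studies


single, forking = false_positive_rates()
print(f"pre-planned test only: {single:.3f}")
print(f"best of all paths:     {forking:.3f}")
```

The pre-planned test should produce false positives at close to the nominal five percent rate, while reporting the best of five paths should produce them several times as often, even though no effect exists in the simulated data.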

Kristian Lum and William Isaac, "To Predict and Serve?" (Significance, 2016)

Lum and Isaac applied a published version of PredPol's predictive policing algorithm to drug arrest records from Oakland, California, and showed that it would have directed police disproportionately to low-income neighborhoods with large nonwhite populations, reproducing historical enforcement patterns rather than predicting the actual distribution of crime. Their simulation indicated that the algorithm would produce racially biased predictions even where the underlying rate of drug use was estimated to be broadly similar across neighborhoods, simply because the training data reflected racially disparate enforcement history.

Relevance to Chapter 35: Lum and Isaac provide the primary technical evidence for the predictive policing discussion in Section 35.3 and Case Study 1. Their work demonstrates the feedback loop through which the streetlight effect is amplified by algorithmic systems.

Best for: Readers interested in the technical details of algorithmic bias in criminal justice. The paper is accessible and methodologically transparent.
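The feedback loop described in this entry can be sketched as a deliberately crude toy model. This is not PredPol's algorithm and not Lum and Isaac's simulation. It is a hypothetical illustration in which two neighborhoods have identical true incident rates, patrols are sent each day to whichever neighborhood has more recorded incidents, and incidents are recorded only where patrols are present. The function name, parameters, and starting counts are all assumptions made for the sketch.

```python
def simulate_hotspot_feedback(initial_records, true_rates, days, patrols_per_day=10.0):
    """Toy model of a prediction-and-patrol feedback loop.

    Each day every patrol goes to the neighborhood with the most recorded
    incidents so far. Incidents are only recorded where patrols are sent,
    in proportion to that neighborhood's true rate per patrol.
    """
    records = [float(r) for r in initial_records]
    for _ in range(days):
        # "Prediction": the hotspot is wherever the records are largest.
        hotspot = max(range(len(records)), key=records.__getitem__)
        # Observation happens only under the streetlight.
        records[hotspot] += patrols_per_day * true_rates[hotspot]
    return records


# Hypothetical starting point: a modest historical skew (60 vs 40 records)
# and IDENTICAL true rates in both neighborhoods.
records = simulate_hotspot_feedback([60, 40], true_rates=[0.5, 0.5], days=100)
share = records[0] / sum(records)
print(records, f"share of records in neighborhood 0: {share:.1%}")
```

In this toy model a 60/40 skew in the historical records grows to roughly 93/7 after 100 days, although the two neighborhoods are identical by construction. The records end up measuring where the patrols went, not where the incidents were.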

Tier 3: Synthesized and General Sources

These recommendations draw on general knowledge and multiple sources rather than specific texts.

The history of LIDAR in archaeology

The LIDAR revolution in Mesoamerican and Amazonian archaeology draws on numerous sources. Key works include the Pacunam LIDAR Initiative's surveys of the Maya lowlands (published in Science, 2018), which revealed that the Maya civilization was far larger than previously estimated; the work of Heiko Pruemers, Carla Jaimes Betancourt, and colleagues documenting geometric earthworks in the Bolivian Amazon; and the ongoing research by Jonas Gregorio de Souza and collaborators mapping pre-Columbian settlement networks in the upper Tapajos Basin. For an accessible overview, see Albert Lin and Sarah Parcak's work on satellite and LIDAR archaeology, and the National Geographic coverage of the Maya LIDAR discoveries.

Relevance to Chapter 35: LIDAR archaeology provides the primary evidence for the archaeological streetlight effect discussed in Section 35.4 and Case Study 2. The technology's ability to see through forest canopies is a literal example of extending the circle of light into the dark.

Neglected tropical diseases and the 10/90 gap

The literature on neglected tropical diseases and global health research disparities is vast. Key institutional sources include the World Health Organization's reports on NTDs, the Drugs for Neglected Diseases initiative (DNDi) publications, and the Global Forum for Health Research's reports on the 10/90 gap (the finding that roughly ten percent of global health research spending addresses diseases that account for ninety percent of the global disease burden). For an accessible entry point, see Peter Hotez's Forgotten People, Forgotten Diseases (2013) and the Lancet Commission reports on NTDs.

Relevance to Chapter 35: The NTD literature provides the evidence for the medical streetlight effect discussed in Section 35.6. The 10/90 gap is one of the most quantifiable manifestations of the streetlight effect in any domain.

GDP critiques and alternative indicators

The literature on GDP's limitations and alternative indicators is extensive and multidisciplinary. Beyond the Stiglitz-Sen-Fitoussi report cited in Tier 2, key works include Robert Kennedy's famous 1968 speech on the limitations of GNP, the OECD Better Life Index project, the New Economics Foundation's Happy Planet Index, and Bhutan's Gross National Happiness framework. For economic analyses, see Diane Coyle's GDP: A Brief but Affectionate History (2014) and Kate Raworth's Doughnut Economics (2017). For a philosophical perspective, see Martha Nussbaum's capabilities approach and Amartya Sen's Development as Freedom (1999).

Relevance to Chapter 35: The GDP critique literature provides the evidence for the economics streetlight effect discussed in Section 35.7 and underlies the chapter's argument that measurement creates its own reality.

Suggested Reading Order