Case Study 2: Data Science and Archaeology -- The Invisible Civilizations and the Biased Algorithm

"The most important things are the hardest to see -- not because they are small, but because we are not looking for them." -- Adapted from a principle in archaeological survey methodology


Two Fields, One Blindness

This case study examines the streetlight effect in two domains where the bias has recently been exposed by new technologies and new methodologies: data science, where biased training data has produced algorithms that systematically misrepresent the populations they claim to serve, and archaeology, where remote sensing has revealed entire civilizations that were invisible to traditional methods. In both cases, the streetlight effect did not merely limit what was known -- it constructed a false picture that was confidently mistaken for the truth.


Part I: The Biased Algorithm -- How Training Data Creates Training Blindness

The ImageNet Story

In 2009, a computer scientist named Fei-Fei Li and her collaborators at Stanford University released ImageNet -- a database of over fourteen million images, each labeled with a description of what it depicted. ImageNet became the foundational dataset for the deep learning revolution in computer vision. Nearly every major advance in image recognition since 2009 -- the systems that identify faces, classify medical images, interpret satellite photographs, and power autonomous vehicles -- traces its lineage to algorithms trained on ImageNet or its descendants.

ImageNet was a monumental achievement. It was also a streetlight.

The images in ImageNet were scraped from the internet -- specifically, from the English-language internet. The labels were assigned by workers on Amazon's Mechanical Turk platform, who were disproportionately English-speaking, American, and young. The result was a dataset that reflected the visual world as it appeared on the English-language internet in the late 2000s -- a world that was predominantly Western, urban, technologically mediated, and culturally specific.

Objects common in Western households -- power outlets, mailboxes, traffic signs, kitchen appliances -- were abundantly represented. Objects common in other cultural contexts -- different styles of cooking implements, agricultural tools, architectural features, clothing -- were underrepresented or absent. Faces were overwhelmingly light-skinned. Activities were predominantly those visible in photographs shared on English-language websites.

The algorithms trained on ImageNet learned to see the world through this streetlight. They became very good at recognizing what was in the data: Western objects, light-skinned faces, urban settings. They became correspondingly poor at recognizing what was not in the data: non-Western objects, dark-skinned faces, rural settings in developing countries.

The Consequences of the Streetlight

The consequences emerged gradually, then all at once.

In 2015, Google's photo-recognition algorithm automatically labeled photographs of Black individuals as "gorillas." The error was not the result of deliberate racism in the algorithm. It was the result of a training dataset in which dark-skinned faces were underrepresented and in which the algorithm had not been given enough examples to learn the distinction that any human would find obvious. Google's response was telling: rather than retraining the algorithm on a more representative dataset, the company simply removed "gorilla" as a label. The streetlight's boundary was adjusted, but the streetlight itself was not moved.

In 2018, Joy Buolamwini and Timnit Gebru published a landmark study examining the accuracy of commercial facial recognition systems across different demographic groups. They found that the systems achieved error rates below one percent for light-skinned males but error rates above thirty percent for dark-skinned females. The disparity was not a subtle statistical artifact. It was an order-of-magnitude difference in performance that directly reflected the composition of the training data.
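The technique behind such findings, a disaggregated audit, is simple to sketch: instead of reporting one aggregate accuracy number, compute the error rate separately for each demographic subgroup. The sketch below uses made-up group labels and counts purely for illustration; the numbers are not Buolamwini and Gebru's data.

```python
from collections import defaultdict

def disaggregated_error_rates(records):
    """Compute the error rate per subgroup instead of one aggregate figure.

    records: iterable of (group, predicted_label, true_label) tuples.
    Returns {group: error_rate}.
    """
    errors = defaultdict(int)
    totals = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}

# Hypothetical predictions: aggregate accuracy looks respectable,
# but one subgroup carries nearly all of the errors.
records = (
    [("lighter_male", "match", "match")] * 99
    + [("lighter_male", "no_match", "match")] * 1
    + [("darker_female", "match", "match")] * 65
    + [("darker_female", "no_match", "match")] * 35
)
rates = disaggregated_error_rates(records)
print(rates)  # {'lighter_male': 0.01, 'darker_female': 0.35}
```

Averaged together, these two groups show a seemingly tolerable 18 percent error rate; only the per-group breakdown exposes the order-of-magnitude gap.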

In medical imaging, similar patterns emerged. Algorithms trained predominantly on images from patients at academic medical centers in wealthy countries performed poorly when applied to patients in different demographic or clinical contexts. A skin cancer detection algorithm trained on images of light-skinned patients failed to identify lesions on dark skin -- not because the algorithm was incapable of learning, but because it had never been shown what skin cancer looks like on dark skin.

In natural language processing, algorithms trained on internet text learned to associate "doctor" with "he" and "nurse" with "she," to rate African American names as less pleasant than European American names, and to produce text reflecting cultural stereotypes embedded in the training corpus. The algorithms were not inventing bias. They were faithfully reproducing the biases present in the data they had been given -- data drawn from the streetlight of the English-language internet.

The Volume Illusion

The most insidious aspect of the data availability bias in machine learning is what the chapter calls the "volume illusion" -- the sense that a dataset with millions or billions of data points must be representative simply because it is large. This illusion is statistically naive but psychologically compelling. If you have a billion images, surely you have seen everything? If you have a trillion words of text, surely you have captured every perspective?

The answer, in both cases, is no. A billion images drawn from the same biased source are a billion data points illuminated by the same streetlight. A trillion words from the English-language internet are a trillion words from the same culturally specific corner of human expression. Volume does not correct for bias. It amplifies bias, by increasing the statistical confidence of conclusions drawn from biased data. The algorithm becomes more and more certain that the world looks like what it has seen -- and what it has seen is the world under the streetlight.

This is the data science equivalent of McNamara's body counts: a metric that grows more confident as it grows more wrong. The more data the algorithm processes, the more precisely it learns the biases in the data, and the more authoritatively it presents those biases as facts about the world.
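A toy simulation makes the volume illusion concrete. Suppose the full population splits evenly between two subgroups, but the "streetlight" only ever samples one of them. As the sample grows, the standard error collapses, while the estimate stays a fixed distance from the truth. (All population parameters below are invented for illustration.)

```python
import random
import statistics

random.seed(0)

def biased_sample(n):
    """Draw n points, all from group A (mean 10), never from group B (mean 20)."""
    return [random.gauss(10, 2) for _ in range(n)]

TRUE_MEAN = 15.0  # whole population: half group A (mean 10), half group B (mean 20)

for n in (100, 10_000, 1_000_000):
    sample = biased_sample(n)
    est = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / n ** 0.5
    # More data -> tighter error bars around a number that is still wrong by ~5.
    print(f"n={n:>9}: estimate={est:6.3f} +/- {stderr:.4f}, bias={est - TRUE_MEAN:+.3f}")
```

The error bars shrink by a factor of a hundred; the bias does not shrink at all. That is volume amplifying confidence in a conclusion the data could never support.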

The Structural Diagnosis

The training data problem in machine learning is a streetlight effect driven primarily by measurement availability and path dependence.

Measurement availability: The data that exists in large, digitized, accessible datasets is disproportionately Western, urban, English-language, and technologically mediated -- because those are the populations and contexts that generate digital data at scale. The experiences, faces, objects, and language of billions of people who are less connected to the digital infrastructure of wealthy nations are underrepresented or absent.

Path dependence: Once ImageNet became the standard training dataset, and once the algorithms trained on it achieved impressive results, the incentive to create more representative datasets was reduced. Why invest in expensive, labor-intensive data collection from underrepresented populations when the existing dataset already works well enough -- well enough, that is, for the populations who are already represented? The path was set. The streetlight was fixed. And the algorithms that followed walked the same path, under the same light.


Part II: The Invisible Civilizations -- How Archaeology Found What It Wasn't Looking For

The Amazon Before LIDAR

For much of the twentieth century, the conventional wisdom about the pre-Columbian Amazon was clear and confident: the rainforest was a pristine, largely uninhabited wilderness. The few indigenous peoples who lived there existed in small, mobile bands, limited by the poor soils and harsh conditions of the tropical forest. The Amazon was nature's domain, not civilization's. The great civilizations of the Americas -- the Maya, the Aztec, the Inca -- arose in highlands, river valleys, and coastal regions. The lowland tropical forest was too challenging, too nutrient-poor, too hostile to support anything resembling urban complexity.

This picture was wrong. And it was wrong because of the streetlight effect.

The evidence for Amazonian complexity was, in retrospect, abundant. The sixteenth-century Spanish explorer Francisco de Orellana, who traveled the length of the Amazon in 1542, described vast settlements, wide roads, intensive agriculture, and dense populations along the riverbanks. His account was dismissed by later scholars as exaggeration or fantasy -- partly because subsequent European visitors to the same areas found no such settlements (the populations had been devastated by the diseases the Europeans brought), and partly because the archaeological evidence that would have confirmed Orellana's account was invisible to the archaeological methods then available.

The invisibility was a direct product of the streetlight effect. Amazonian archaeology was, for most of the twentieth century, nearly impossible. The forest canopy blocked aerial survey. The soil was acidic, destroying organic remains quickly. The vegetation was dense, concealing surface features. Access was difficult and expensive. Mosquito-borne diseases threatened researchers. Funding agencies, comparing the cost and difficulty of Amazonian fieldwork to the relative ease of excavating in arid climates (where preservation is excellent and surface features are visible), directed their money elsewhere.

The streetlight shone on Mesoamerica, on the Andes, on the arid Southwest, on the Mediterranean -- places where archaeology was methodologically straightforward, preservation was good, and institutional infrastructure was established. The Amazon was in the dark.

The LIDAR Revolution

Light Detection and Ranging -- LIDAR -- changed everything. LIDAR works by firing millions of laser pulses from an aircraft and measuring how long each pulse takes to return. Pulses that hit the forest canopy return quickly. Pulses that penetrate gaps in the canopy and reach the ground return more slowly. By filtering out the canopy returns, researchers can create a detailed map of the ground surface beneath the forest -- in effect, seeing through the trees.
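The canopy-filtering step can be sketched in a few lines: bin the returns into a horizontal grid and keep the lowest elevation in each cell as the estimated ground surface. Real processing pipelines use far more sophisticated ground-classification algorithms (progressive TIN densification, cloth simulation, and the like); this minimum-per-cell version only illustrates the idea.

```python
def bare_earth(points, cell_size=1.0):
    """Estimate a bare-earth surface from LIDAR returns.

    points: iterable of (x, y, z) returns; canopy hits have high z,
    ground hits (through gaps in the canopy) have low z.
    Returns {(col, row): lowest z seen in that grid cell}.
    """
    ground = {}
    for x, y, z in points:
        cell = (int(x // cell_size), int(y // cell_size))
        # Keep the lowest return per cell: the best candidate for ground.
        if cell not in ground or z < ground[cell]:
            ground[cell] = z
    return ground

# Simulated returns over one cell: mostly canopy (~30 m), one ground hit.
returns = [(0.2, 0.3, 31.0), (0.5, 0.1, 29.5), (0.7, 0.8, 2.1), (0.9, 0.4, 30.2)]
print(bare_earth(returns))  # {(0, 0): 2.1}
```

Subtracting this ground surface from the canopy surface is what lets researchers see embankments, ditches, and causeways beneath the trees.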

LIDAR was first applied to archaeology in Central America, where it revealed the Maya civilization to be far more extensive than previously known. But the Amazonian applications were even more startling.

In the 2010s and 2020s, LIDAR surveys in the southern Amazon revealed what the streetlight had hidden. In the state of Acre in western Brazil, researchers found a network of geometric earthworks -- precise circles, squares, and interconnected forms -- covering an area larger than England. These geoglyphs, as they are called, had been built by clearing forest, excavating ditches, and constructing raised embankments. They dated to roughly 2,000 years ago -- centuries before European contact. Their purpose is still debated (ceremonial, agricultural, defensive), but their scale and precision indicate organized societies with substantial populations, sophisticated engineering, and coordinated labor.

In the Llanos de Mojos region of Bolivia, LIDAR revealed a landscape of raised agricultural fields, canals, causeways, and artificial forest islands covering thousands of square kilometers. The engineering was designed to manage seasonal flooding -- diverting water during the wet season and retaining it during the dry season -- creating a productive agricultural landscape that supported large, permanent settlements. The system was as sophisticated as any in the pre-Columbian Americas, comparable in engineering ambition to the Maya's water management systems or the Inca's terraced agriculture.

In the upper Xingu region of central Brazil, archaeologists found evidence of a network of interconnected settlements -- "garden cities" -- covering an area of roughly 20,000 square kilometers. The settlements were linked by wide, straight roads, surrounded by managed agricultural land, and organized in a pattern that suggested deliberate urban planning. The population of the network, at its peak, may have numbered in the tens of thousands.

In the upper Tapajós Basin, LIDAR combined with ground survey revealed over eighty previously unknown settlements dating to 1250-1500 CE, with defensive ditches, plazas, and road systems. Some sites extended over hundreds of hectares.

Each of these discoveries was hidden not by nature but by method. The evidence was always there -- beneath the canopy, under the soil, encoded in the landscape. Traditional archaeological methods could not see it because those methods were developed for open landscapes, durable materials, and surface-visible features. The streetlight of traditional archaeology illuminated one kind of evidence and left another kind in darkness.

What Was Lost

The consequences of the streetlight effect in Amazonian archaeology are not merely academic. For decades, the "empty Amazon" narrative served as a justification for development policies that treated the rainforest as an unused resource available for exploitation. If the Amazon had never supported complex societies, the argument went, then it was not a cultural landscape but a natural one -- and natural landscapes could be cleared for agriculture, logging, and mining without cultural loss.

The LIDAR evidence destroyed this argument. The Amazon was not empty. It was a cultural landscape, managed and shaped by human societies for millennia. The forest itself -- its composition, structure, and ecology -- was partly a human creation: indigenous peoples had selectively cultivated useful species, enriched soils (creating the "terra preta" or "dark earth" that remains some of the most fertile soil in the Amazon), and managed fire to shape the forest's composition. The "pristine wilderness" was a myth -- a myth sustained by the streetlight effect's concealment of the evidence to the contrary.

The cultural cost was borne by indigenous communities whose historical achievements were erased by the streetlight's blindness. For generations, indigenous Amazonian peoples were characterized as "primitive" or "undeveloped" -- categories that reflected the absence of archaeological evidence for their ancestors' complexity, not the absence of the complexity itself. The streetlight's darkness fell on their history, and the darkness was mistaken for the reality.


Cross-Domain Analysis

The Same Structure in Silicon and Soil

The parallel between algorithmic bias and archaeological invisibility is structural:

The streetlight
    Algorithmic bias: data from digitally connected, Western, English-speaking populations.
    Archaeological invisibility: sites in accessible, arid, open landscapes with durable materials.

What it illuminates
    Algorithmic bias: the visual world and language of the digital mainstream.
    Archaeological invisibility: civilizations that built in stone, in valleys and plains.

What it hides
    Algorithmic bias: the experiences of the digitally disconnected and the visually underrepresented.
    Archaeological invisibility: civilizations that built in organic materials, in forests and wetlands.

The volume illusion
    Algorithmic bias: billions of images feel comprehensive but reflect one cultural perspective.
    Archaeological invisibility: thousands of excavations feel thorough but reflect one type of landscape.

The false confidence
    Algorithmic bias: high accuracy on represented populations masks low accuracy on others.
    Archaeological invisibility: detailed knowledge of some civilizations masks ignorance of others.

The technology that breaks through
    Algorithmic bias: representative datasets, bias audits, fairness-aware algorithms.
    Archaeological invisibility: LIDAR, remote sensing, ground-penetrating radar, environmental DNA.

What the breakthrough reveals
    Algorithmic bias: the world looks different from different perspectives.
    Archaeological invisibility: history is more complex, more diverse, and more populated than the streetlight suggested.

The Constitutive Effect

In both domains, the streetlight effect did not merely limit knowledge -- it constructed a false reality that was confidently mistaken for the truth.

In data science, the false reality was that the world looks like the internet: predominantly Western, light-skinned, English-speaking, urban. The algorithms did not know they were seeing a streetlight. They processed the data they were given and inferred, with high statistical confidence, that the world was as the data described it.

In archaeology, the false reality was that the Amazon was an empty wilderness, that lowland tropical forests could not support complex societies, that civilization was a highland and river-valley phenomenon. The archaeologists did not know they were searching under a streetlight. They excavated the sites they could reach and inferred, with the confidence of decades of fieldwork, that the pattern of their findings reflected the pattern of human history.

In both cases, the false reality had material consequences. Algorithms that misidentified dark-skinned faces led to false arrests. The "empty Amazon" narrative facilitated deforestation. The streetlight effect does not merely distort knowledge. It distorts the world.

The Recovery Pattern

Both domains also illustrate the recovery pattern described in the chapter's countermeasures section:

  1. Recognition of the bias. Someone notices that the evidence base is non-representative. In data science, researchers like Buolamwini and Gebru audited the datasets and documented the disparities. In archaeology, scholars like Anna Roosevelt and Michael Heckenberger challenged the "empty Amazon" orthodoxy based on soil analysis, ethnographic evidence, and early colonial accounts.

  2. New technology. A new observational method breaks through the streetlight's boundaries. In data science, bias audits, fairness-aware machine learning, and diverse dataset creation projects. In archaeology, LIDAR, satellite imagery, and environmental DNA analysis.

  3. Revelation of scale. The new methods reveal that what was hidden was not marginal but massive. Facial recognition bias was not a minor edge case but an order-of-magnitude performance gap. Amazonian civilizations were not small scattered bands but continent-scale cultural networks.

  4. Institutional reckoning. The field confronts the implications of having been wrong -- and wrong not randomly, but systematically, in a direction predicted by the streetlight effect. Data science begins developing fairness standards and representative dataset requirements. Archaeology begins revising its narratives of the human past.

  5. Incomplete correction. The correction is ongoing, partial, and contested. New streetlights are being built, because the structural forces (measurement availability, institutional incentives, path dependence) that created the original bias have not been fully transformed. The work is not finished. It may never be finished. But the direction -- toward the dark, away from the streetlight -- is established.


Lessons for the Reader

  1. Absence of evidence is not evidence of absence. In both domains, the lack of data was mistaken for the lack of reality. Algorithms that had not been shown dark-skinned faces concluded that dark-skinned faces were hard to classify. Archaeologists who had not excavated in the Amazon concluded that the Amazon had nothing to excavate. The missing evidence was an artifact of the search, not an artifact of the world.

  2. Confidence scales with data volume, not data representativeness. The more data the algorithm processes, the more confident it becomes -- even if the data is systematically biased. The more sites the archaeologist excavates in river valleys, the more confidently she concludes that civilization is a river-valley phenomenon. High confidence and deep bias are not contradictory. They are companions.

  3. New technologies can move the streetlight. LIDAR for archaeology, bias audits for data science -- these are tools that extend the circle of light into areas that were previously dark. The lesson is not that technology solves the problem automatically. It is that technological innovation is one of the few forces powerful enough to overcome the institutional inertia of the streetlight effect.

  4. The people in the dark have always known what is there. Indigenous Amazonian communities knew that their ancestors had built complex societies. Communities of color knew that facial recognition systems treated them differently. The streetlight effect is not just an epistemological problem. It is a power problem: the communities whose realities are in the dark often lack the institutional power to move the streetlight. The technologies and methods that eventually reveal what is in the dark often confirm what the people in the dark have been saying all along.

  5. Every dataset is a streetlight. The most important lesson of both cases is the most general: no dataset, however large, is a neutral window onto reality. Every dataset is a streetlight -- shaped by who collected the data, what they chose to measure, who they chose to study, what methods they used, and what institutional constraints they operated under. The discipline of dark data awareness -- always asking "What is not in this data?" -- is not a specialized technical skill. It is a fundamental intellectual virtue.