Case Study 1: Research Methodology and Policing -- The McNamara Fallacy from Vietnam to the Algorithm
"We have the data. We are making progress." -- Robert McNamara, 1966, in a press conference where every metric pointed toward victory in a war that was being lost
Two Streetlights, One Structure
This case study examines the streetlight effect operating in two domains that appear unrelated -- military strategy and law enforcement -- and shows that they share the same structural anatomy: a measurable proxy is mistaken for the thing it claims to represent; the proxy generates a feedback loop that distorts behavior and obscures reality; and the distortion persists because institutional incentive structures reward performance on the proxy rather than progress on the underlying goal.
The first case is Robert McNamara's management of the Vietnam War through quantitative metrics. The second is the deployment of predictive policing algorithms in American cities beginning in the 2010s. They are separated by half a century, but the structural error is identical.
Part I: The Body Count -- McNamara's War by the Numbers
The Promise of Measurement
When Robert Strange McNamara arrived at the Pentagon in January 1961, he brought with him a worldview forged at Harvard Business School and refined through fifteen years at the Ford Motor Company. McNamara believed in data. He believed that rational analysis, rigorously applied, could solve any problem -- including the problem of winning a war in a small country in Southeast Asia that most Americans could not locate on a map.
McNamara's approach to Vietnam was, by the standards of management science, impeccable. He assembled a team of analysts -- many drawn from the RAND Corporation, the Air Force think tank that had pioneered systems analysis -- and tasked them with developing quantitative metrics for every aspect of the war effort. The goal was to replace the vague, subjective assessments of military commanders ("We're making progress" or "The situation is difficult") with precise, numerical indicators that could be tracked, compared, and optimized.
The problem was that the Vietnam War resisted quantification in its most important dimensions. The war's outcome depended on factors that were inherently difficult to measure: the morale of South Vietnamese villagers, the legitimacy of the Saigon government, the effectiveness of the Viet Cong's political organization, the willingness of the North Vietnamese to sustain casualties indefinitely, the erosion of American public support. These factors were real. They were decisive. And they were, by any practical standard, unmeasurable.
What was measurable was kinetic activity: bombs dropped, missions flown, territory swept, supplies interdicted, and -- above all -- enemy combatants killed. The body count became the war's primary metric of progress. It was quantifiable. It was timely (commanders could report it daily). It was aggregable (headquarters could compile it across the entire theater). It was comparable (units could be ranked by their body counts, creating implicit competition). It satisfied every criterion of good data -- except the criterion of validity.
The Distortion
The body count was not simply a bad metric. It was a metric that actively distorted the war it claimed to measure. The distortion operated through three mechanisms.
First, the metric created perverse incentives. Promotion, recognition, and unit reputation depended on body count numbers. Officers who reported high body counts were rewarded. Officers who reported low body counts -- even if their operations had achieved strategic objectives that did not involve killing -- were disadvantaged. The result was predictable: the entire military chain of command oriented itself toward producing the metric rather than winning the war. Patrols were designed to maximize enemy contact rather than secure territory. "Search and destroy" missions replaced the population-security operations that counterinsurgency doctrine recommended. Resources were allocated to units that produced high body counts, regardless of their strategic contribution.
Second, the metric was systematically inflated. Because careers depended on body counts, officers at every level had incentives to inflate their numbers. Civilians killed in crossfire were reclassified as enemy combatants. Estimates of enemy casualties from air strikes were rounded upward. Double-counting was common. A comprehensive postwar analysis by the Army's own historians concluded that body count figures were, on average, inflated by a factor of two to three -- and in some units, by much more. The data that reached Washington was not a measurement of reality. It was a measurement of what officers wanted their superiors to believe.
Third, the metric obscured what it could not capture. Every briefing, every report, every analysis centered on the body count diverted attention from the factors that were actually determining the war's outcome. While McNamara's analysts tracked rising body counts with satisfaction, the Viet Cong's political organization was strengthening in the countryside. The Saigon government was losing legitimacy among its own population. The North Vietnamese leadership, which measured success by different criteria entirely (territorial control, political loyalty, supply line integrity), was achieving its objectives. The body count streetlight was so bright that the darkness surrounding it was invisible.
The Reckoning
The Tet Offensive of January 1968 was the moment when the streetlight's limits became undeniable. The Viet Cong and North Vietnamese Army launched coordinated attacks on virtually every major city and military installation in South Vietnam -- including a dramatic assault on the American embassy in Saigon. By body count metrics, Tet was an American victory: the attackers suffered catastrophic casualties, perhaps 45,000 killed, and failed to hold any major objective permanently.
But by every metric that the body count had excluded -- political impact, psychological effect, erosion of American public confidence, demonstration of enemy capability and will -- Tet was a devastating defeat. The American public, which had been told for years that the war was being won (the body counts proved it), watched on television as enemy forces operated freely in the capital city. The credibility gap between McNamara's metrics and visible reality became unbridgeable. Within months, Lyndon Johnson announced he would not seek reelection. Within seven years, Saigon fell.
McNamara himself, years later, recognized the error. In his 1995 memoir In Retrospect, he wrote: "We failed then -- as we have since -- to recognize the limitations of modern, high-technology military equipment, forces, and doctrine in confronting unconventional, highly motivated people's movements." He did not use the term "streetlight effect." But that is what he was describing: the systematic failure to measure what mattered, compensated for by the obsessive measurement of what did not.
Part II: Predictive Policing -- The Algorithm's Streetlight
The Promise of Prediction
In the early 2010s, police departments in several American cities began deploying predictive policing systems -- algorithms that analyzed historical crime data to forecast where future crimes were most likely to occur. The best-known system, PredPol (later renamed Geolitica, and eventually shut down amid controversy), used a model adapted from earthquake aftershock prediction: just as earthquakes cluster in space and time, crimes also cluster, and the algorithm could predict where the next cluster would emerge.
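The aftershock analogy can be sketched as a self-exciting point process: a location's predicted risk is a baseline rate plus a decaying boost from each recent recorded incident. The sketch below is illustrative only -- the function name and the parameters `mu`, `alpha`, and `beta` are assumptions for exposition, not PredPol's actual model or values:

```python
import math

def intensity(t, events, mu=0.5, alpha=0.8, beta=1.2):
    """Conditional intensity of a self-exciting (Hawkes-style) process:
    a baseline rate mu plus an exponentially decaying contribution from
    each past event -- the 'aftershock' structure the analogy rests on."""
    return mu + sum(alpha * beta * math.exp(-beta * (t - s))
                    for s in events if s < t)

recent = [1.0, 1.5, 1.8]              # times of recorded incidents at one location
risk_now = intensity(2.0, recent)     # shortly after a cluster: elevated risk
risk_later = intensity(10.0, recent)  # long after: decayed back toward mu
```

The crucial point for what follows is the input: `events` is a list of *recorded* incidents, so whatever biases shaped the record are inherited by every prediction the model makes.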
The promise was data-driven, evidence-based, racially neutral policing. The algorithm would identify high-risk locations based purely on historical data -- no human bias, no racial profiling, no subjective judgment. Police would be deployed where the data indicated they were most needed. Crime would be prevented before it occurred. The system would be, its advocates claimed, more fair and more effective than human-directed policing.
The system was deployed in Los Angeles, Chicago, New Orleans, Atlanta, and dozens of other cities. And the streetlight effect was built into its architecture.
The Feedback Loop
The historical crime data on which the algorithm was trained was not a census of all crimes committed. It was a record of crimes reported, detected, and recorded by police. This data inherited every bias in the history of American policing: the disproportionate surveillance of Black and Latino neighborhoods, the under-policing of white-collar crime, the under-reporting of crime in communities that distrusted police, the over-enforcement of drug laws in communities of color while similar drug use in white communities went largely undetected.
The algorithm faithfully learned these patterns. It did not learn where crime was. It learned where policing was. And because policing was historically concentrated in communities of color, the algorithm directed future policing to the same communities -- not because more crime occurred there in absolute terms, but because more crime was recorded there due to the prior concentration of police resources.
This created the feedback loop described in Section 35.3 of the chapter, but with algorithmic amplification. Police deployed to a neighborhood made more arrests, which generated more crime data from that neighborhood, which the algorithm interpreted as evidence of high crime risk, which directed more police to the neighborhood. The loop was self-reinforcing and self-confirming. The algorithm could point to its own predictions being validated -- "We predicted high crime in this neighborhood, and indeed, police made many arrests there" -- without recognizing that the arrests were a consequence of the policing, not independent evidence of the underlying crime rate.
Neighborhoods that were not heavily policed, meanwhile, generated less data. The algorithm interpreted the absence of data as evidence of low crime risk. Police resources were directed away from these neighborhoods. Crime in these neighborhoods went undetected and unrecorded. The algorithm, looking at the quiet data, confirmed its prediction: low risk. The streetlight grew brighter in some neighborhoods and dimmer in others, and the algorithm treated the distribution of light as if it were the distribution of crime.
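A toy simulation makes the loop concrete. This is a deliberately simplified model, not PredPol's algorithm: two neighborhoods with identical true crime rates, patrols sent greedily to whichever one the historical record ranks as riskier, and new records generated only where patrols are present. All numbers are illustrative:

```python
def simulate(rounds=10, true_rate=(5.0, 5.0), prior=(6.0, 4.0)):
    """Greedy deployment loop: each round, send patrols to the
    neighborhood with the higher recorded-crime total; only the
    patrolled neighborhood generates new crime records."""
    recorded = list(prior)
    for _ in range(rounds):
        target = max(range(len(recorded)), key=lambda i: recorded[i])
        recorded[target] += true_rate[target]  # crime elsewhere goes unrecorded
    return recorded

# Identical true rates, but a small historical disparity in the record
# grows without bound while the other neighborhood's record freezes.
print(simulate())  # [56.0, 4.0]
```

The divergence in the output is produced entirely by the deployment rule, not by any difference in underlying crime -- which is exactly the self-confirming pattern described above.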
The Human Cost
A 2016 study by researchers at the Human Rights Data Analysis Group applied PredPol's algorithm to drug-crime records from Oakland, California. They found that the algorithm would have disproportionately directed police to neighborhoods with high Black populations -- not because those neighborhoods had inherently higher crime rates for all offense types, but because the historical enforcement data reflected decades of racially disparate drug enforcement. The algorithm was not neutral. It was a machine for perpetuating historical bias under the guise of objective prediction.
The human consequences were concrete. Residents of algorithmically targeted neighborhoods experienced more police stops, more searches, more arrests for minor offenses, and more use of force -- not because they were more criminal than residents of non-targeted neighborhoods, but because they were more visible to a policing system that had been directed to look at them. The streetlight was pointed at them. And under the streetlight, everything is visible.
Residents of non-targeted neighborhoods, meanwhile, experienced less policing -- which meant less detection of the crimes that did occur there. Domestic violence in affluent neighborhoods went unreported at higher rates because victims had more to lose from involving the police. White-collar crime, which causes enormous aggregate harm, was essentially invisible to algorithms trained on street-level crime data. Drug use at the same rates across racial groups was detected and recorded at dramatically different rates depending on the intensity of policing -- and therefore appeared in the data as a racially disparate phenomenon when it was, in fact, a policing-disparity phenomenon.
The Structural Parallel
The structural parallel between McNamara's body count and the predictive policing algorithm is exact:
| Feature | Body Count (Vietnam) | Predictive Policing |
|---|---|---|
| The metric | Enemy combatants killed | Crimes predicted and recorded |
| What it claims to measure | Progress toward victory | The spatial distribution of crime risk |
| What it actually measures | The intensity of kinetic operations | The spatial distribution of policing intensity |
| The feedback loop | High body counts → resources and promotion → more kinetic operations → higher body counts | High recorded crime → algorithm directs police → more arrests → more recorded crime |
| What it obscures | Political legitimacy, enemy morale, population loyalty | Unreported crime, white-collar crime, bias in historical enforcement |
| The perverse incentive | Officers maximize body counts rather than strategic objectives | Departments maximize arrests in targeted areas rather than equitable safety |
| The reckoning | Tet Offensive reveals that body counts masked deteriorating strategic position | Audits reveal that algorithm perpetuates racial disparities under the guise of neutrality |
In both cases, a measurable proxy was mistaken for the reality it claimed to represent. In both cases, the proxy created a feedback loop that distorted behavior and reinforced the proxy's apparent validity. In both cases, the distortion persisted because institutional incentives rewarded performance on the proxy rather than progress on the underlying goal. And in both cases, the reckoning came when reality -- the Tet Offensive, the racial disparity audit -- revealed what the metric had been hiding.
Cross-Domain Analysis
The Proxy Problem
Both cases are instances of what social scientists call the proxy problem: the use of a measurable indicator as a substitute for an unmeasurable concept. Body counts are a proxy for military progress. Recorded crime is a proxy for actual crime. GDP (from Section 35.7) is a proxy for national wellbeing. Standardized test scores are a proxy for educational quality. The proxy is always more measurable than the concept it represents. And the streetlight effect ensures that the proxy gradually displaces the concept -- not in theory (everyone knows that body counts are not the same as winning), but in practice (all decisions are made based on the body counts).
The Feedback Amplification
The most dangerous streetlight effects are not static biases but dynamic ones -- biases that operate through feedback loops that amplify themselves over time. McNamara's body count created a feedback loop between the metric and the behavior it was supposed to measure. The predictive policing algorithm created a feedback loop between the data and the deployment decisions based on that data. In both cases, the system was not merely biased; it was increasingly biased, because each cycle of the loop reinforced the existing pattern.
The Institutional Lock-In
Both cases also illustrate how the streetlight effect becomes institutionally entrenched. McNamara's metrics were embedded in the Pentagon's reporting systems, briefing protocols, and promotion criteria. Abandoning the body count would have required redesigning the entire management architecture of the war. The predictive policing algorithm was embedded in departmental workflow, resource allocation decisions, and political narratives about data-driven governance. Abandoning the algorithm would have required admitting that the data-driven approach was flawed -- a politically costly admission for leaders who had championed it.
This is path dependence operating at the institutional level: the investment in the existing streetlight makes it progressively more costly to move the light, even when everyone knows it is shining in the wrong place.
Lessons for the Reader
- The sophistication of the method does not guarantee the validity of the conclusion. McNamara's systems analysis was methodologically rigorous. Predictive policing algorithms are computationally sophisticated. Both produced systematically wrong conclusions because their data was systematically biased. Rigor applied to biased data produces rigorous bias.
- Feedback loops are the streetlight effect's amplifier. A static bias -- a one-time search in the wrong place -- is correctable. A dynamic bias -- a feedback loop that reinforces itself with each cycle -- grows stronger over time and becomes progressively harder to correct. Always look for the feedback loop.
- Proxies drift. Over time, the proxy and the concept it represents diverge. Body counts diverged from military progress. Recorded crime diverged from actual crime. The divergence is often invisible to the people managing the proxy, because the proxy's internal consistency masks its growing disconnect from reality. Always ask: when was this proxy last validated against the reality it claims to represent?
- The cost of acknowledging the streetlight effect is institutional, not intellectual. Everyone involved in the body count system knew, on some level, that body counts were not the same as winning. Everyone involved in predictive policing knows, on some level, that recorded crime is not the same as actual crime. The problem is not ignorance. It is that acknowledging the bias requires institutional change -- changing the metrics, the incentives, the management systems, the political narrative -- and institutional change is always more costly than intellectual acknowledgment.
- The people in the dark pay the price. The ultimate cost of the streetlight effect is borne by those who are invisible to the system's measurement apparatus. In Vietnam, the cost was borne by Vietnamese civilians whose welfare was not tracked by any metric. In predictive policing, the cost is borne by communities whose over-policing is perpetuated by the algorithm, and simultaneously by communities whose under-policing is perpetuated by the same algorithm. The streetlight effect is not just an epistemological error. It is a distributive injustice.