Case Study 5.1: Predictive Policing and the PredPol Algorithm — Social Sorting in Law Enforcement
Overview
This case study examines predictive policing — specifically the PredPol (Predictive Policing) algorithm used by numerous U.S. law enforcement agencies — as a site where Lyon's social sorting, Browne's racializing surveillance, and Foucault's power/knowledge nexus intersect in a system that has demonstrable real-world consequences for the communities it targets.
Estimated Reading and Analysis Time: 75–90 minutes
Background: The Promise of Predictive Policing
Predictive policing refers to the use of algorithmic analysis of historical crime data to predict where crimes are likely to occur in the future, enabling police departments to pre-position officers in anticipated high-crime areas. Proponents argue that this approach improves efficiency by concentrating limited patrol resources where crime risk is highest, reducing crime through deterrence.
PredPol — developed by a team at UCLA and subsequently commercialized — was among the most widely adopted predictive policing tools. At its peak in the mid-2010s, PredPol was used by more than fifty police departments across the United States, including departments in Los Angeles, Santa Cruz, New Haven, and Atlanta. The company marketed its system as racially neutral because it used only crime type, crime location, and crime date — no demographic data.
How PredPol Works
PredPol uses a seismological model originally designed to predict aftershocks following earthquakes. The algorithm identifies "hotspots" — 500-by-500-foot geographic boxes — to which it assigns an elevated probability of crime during a given shift, based on patterns in historical crime data.
Officers using the system receive a map at the start of each shift showing which boxes are predicted to be high-risk. Officers are directed to patrol these boxes during downtime — time not spent responding to active calls.
The algorithm updates continuously: each time new crime data is entered, the predictions are recalculated and the risk map is redrawn.
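The structure of the model can be sketched in code. The developers' published research describes a self-exciting point process: each recorded crime temporarily raises the predicted rate in its box, with the boost decaying over time, on top of a constant background rate. The sketch below is a minimal illustration of that structure only — the parameter values, function names, and grid representation are assumptions for exposition, not PredPol's actual implementation.

```python
import math
from collections import defaultdict

# Illustrative parameters -- assumed for exposition, not PredPol's fitted values.
MU = 0.1     # background rate: expected recorded crimes per box per day
THETA = 0.5  # expected number of follow-on events each event triggers
OMEGA = 0.2  # decay rate: how fast an event's influence fades (per day)

def box_intensity(event_times, now, mu=MU, theta=THETA, omega=OMEGA):
    """Conditional intensity for one geographic box under a self-exciting
    (ETAS-style) model: a constant background rate plus an exponentially
    decaying contribution from each past recorded event in the box."""
    triggered = sum(theta * omega * math.exp(-omega * (now - t))
                    for t in event_times if t < now)
    return mu + triggered

def shift_map(events_by_box, now, top_k=20):
    """Return the top_k boxes by predicted intensity -- the hotspots
    flagged on the start-of-shift map."""
    scores = {box: box_intensity(times, now)
              for box, times in events_by_box.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Example: a box with a recent cluster of recorded crimes outranks one
# whose only recorded event is long past.
events = defaultdict(list, {
    (12, 34): [100.0, 101.5, 102.0],  # recent cluster -> elevated intensity
    (40, 7):  [10.0],                 # old event -> influence has decayed
})
print(shift_map(events, now=103.0, top_k=2))  # [(12, 34), (40, 7)]
```

Note that nothing in the model distinguishes a box where more crime occurred from a box where more crime was recorded — the distinction the next section turns on.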
The Feedback Loop Problem
The most significant analytical problem with PredPol — and the problem most clearly analyzed through the theoretical frameworks of Chapter 5 — is the feedback loop it creates.
PredPol's predictions are based on where crimes were previously reported or recorded, not where crimes actually occurred. This distinction is crucial.
Step 1: Uneven Policing Produces Uneven Data
In American cities, historical policing has been heavily concentrated in communities of color — the result of decades of enforcement policies, including stop-and-frisk, anti-gang ordinances, drug enforcement practices, and demographically patterned patrol deployment. As a result, crimes committed in these communities are more likely to be observed by police, more likely to generate arrest records, and more likely to appear in crime databases.
Crimes committed in less-policed neighborhoods — including affluent white neighborhoods where drug use, white-collar crime, and interpersonal violence also occur — are less likely to be observed by police, less likely to generate arrest records, and less likely to appear in crime databases.
Step 2: The Algorithm Learns the Deployment Pattern
PredPol learns from the historical crime data. But because the historical crime data reflects previous deployment rather than actual crime incidence, the algorithm learns where police were previously deployed, not where crime actually occurred. The algorithm's predictions reproduce the historical deployment pattern.
Step 3: Officers Are Sent to Predicted Hotspots
Officers dispatched to PredPol hotspots increase police presence in those areas. This increased presence leads to more observations of minor infractions, more arrests for low-level offenses, and more crime data recorded for those areas.
Step 4: New Data Confirms the Predictions
The new crime data generated by increased police presence is fed back into PredPol, where it confirms the existing hotspot designations. The algorithm's predictions are "validated" by the data — but the data was itself produced by following those predictions.
This is a feedback loop: the algorithm learns from police deployment, directs police deployment, and the resulting arrests confirm that its deployment was correct. The predictions are self-fulfilling.
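This dynamic is easy to demonstrate with a toy simulation. In the sketch below — an illustrative model, not PredPol's code — every neighborhood has an identical true crime rate, crime is recorded only where officers are present, and patrols follow recorded counts. A small seed bias in the historical record is enough to lock the "predictions" in place indefinitely.

```python
import random

random.seed(0)

N_HOODS = 10             # neighborhoods, all with the SAME true crime rate
TRUE_RATE = 0.3          # probability of a crime per neighborhood per shift
PATROLS_PER_SHIFT = 3    # limited patrol resources

# Seed the history with a slight bias toward neighborhood 0, standing in
# for decades of uneven enforcement (Step 1).
recorded = [3] + [1] * (N_HOODS - 1)

for shift in range(500):
    # "Prediction": patrol the neighborhoods with the most recorded crime
    # (Steps 2 and 3).
    patrolled = sorted(range(N_HOODS), key=lambda h: recorded[h],
                       reverse=True)[:PATROLS_PER_SHIFT]
    for hood in range(N_HOODS):
        crime_occurred = random.random() < TRUE_RATE
        # Crime enters the database only where police are present to
        # observe it (Step 4).
        if crime_occurred and hood in patrolled:
            recorded[hood] += 1

print(recorded)
# Typical result: the patrolled neighborhoods accumulate hundreds of
# records while the equally crime-prone others never move past their
# seed counts -- the predictions manufacture their own confirmation.
```

Crucially, the simulated neighborhoods are identical in their true crime rates; the recorded data diverges purely because observation follows prediction.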
Applying Chapter 5 Frameworks
Lyon: Social Sorting
Lyon's social sorting framework identifies the primary harm of surveillance as the classification and differential treatment of populations. PredPol's social sorting is geographic: the 500-by-500-foot boxes that receive high-risk designations are locations, not people. But locations in American cities are racially segregated, and the boxes designated as high-risk are disproportionately in communities of color.
The geographic social sorting translates directly into population social sorting: residents of predicted-hotspot areas face higher probabilities of police contact, arrest for minor infractions, and the downstream consequences of arrest records (employment barriers, housing discrimination, immigration consequences) than residents of areas not designated as hotspots.
This is differential treatment based on surveillance-generated classification — Lyon's core definition of social sorting's harm.
Browne: Racializing Surveillance
PredPol is designed to be race-neutral: it uses no demographic data. But Browne's framework helps explain how race-neutral surveillance can produce racializing effects.
The historical crime data that PredPol learns from is not race-neutral. It reflects decades of racially differential policing — a surveillance history in which Black and Latino neighborhoods were policed more intensively, generating more crime records, than comparable white neighborhoods. PredPol learns this racially differential history and reproduces it as algorithmic prediction.
Browne's concept of racializing surveillance does not require explicit racial targeting. It requires only that the surveillance technology reproduces racial hierarchy through the categories it uses and the populations its effects fall upon. PredPol qualifies: it uses geographic categories that encode racial patterns, and its effects fall disproportionately on communities of color.
Foucault: Power/Knowledge
The PredPol case illustrates Foucault's power/knowledge nexus with unusual clarity.
Power produces knowledge: The police department's power to deploy officers in certain neighborhoods, to make arrests, to record crime data — this power produces the knowledge base (the crime database) that PredPol uses.
Knowledge extends power: PredPol's predictions, derived from this power-produced knowledge, direct the deployment of police power in ways that concentrate it further in already-policed communities.
The spiral: Each cycle of deployment → arrest → data → prediction → deployment extends the power/knowledge spiral. The communities most heavily policed are those about which most crime knowledge exists; the most crime knowledge exists about the communities most heavily policed.
The productive effect: PredPol does not merely manage crime; it produces crime records — and produces the social category of the "high-crime area" — through its operation. The "crime hotspot" is not discovered by PredPol; it is partly constituted by the policing practices PredPol directs.
The Santa Cruz Case
In June 2020, Santa Cruz, California became the first U.S. city to ban predictive policing. The city council passed an ordinance prohibiting the use of algorithmic tools that identify individuals as likely offenders or locations as likely crime sites.
The ordinance followed sustained advocacy by a coalition including civil rights organizations, legal scholars, and community members who documented the feedback loop problem and the racially disparate impacts of predictive policing.
Santa Cruz's decision was both a policy achievement and an illustration of the limits of policy reform: the same algorithms prohibited in Santa Cruz were still in use in dozens of other jurisdictions. The feedback loop problem does not disappear because one jurisdiction bans the specific algorithm; it reappears wherever similar systems are deployed.
Alternatives and Critiques
Critics of the "predictive policing is racially biased" argument have raised several counterpoints:
The alternative argument: Without data-driven deployment, officers exercise informal discretion about where to patrol, which may be more — not less — racially biased than algorithmic guidance. Systematic data is preferable to unchecked human judgment.
The effectiveness argument: If predictive policing reduces crime in targeted areas (evidence is mixed), it may benefit the communities targeted, which are often communities of color that also experience higher rates of violent crime.
The technology-versus-policy argument: The feedback loop problem is not inherent to predictive policing but is a consequence of using biased training data. With better data — perhaps victimization surveys rather than arrest records — the algorithm could potentially be corrected.
Each counterpoint has analytical purchase. The theoretical frameworks from Chapter 5 help evaluate them:
- The alternative argument (better than unguided discretion) does not address the cumulative harm of surveillance concentrated in over-policed communities; it accepts the over-policing as a baseline and asks only whether algorithmic or human discretion directs it.
- The effectiveness argument conflates crime suppression with crime reduction; surveillance-intensive policing may suppress reported crime in targeted areas while displacing it or failing to address underlying conditions.
- The technology-versus-policy argument is more promising but faces the data quality problem: any historical data reflecting past policing patterns will embed those patterns to some degree.
Discussion Questions
- The Feedback Loop: The case study describes a feedback loop in which PredPol's predictions are self-fulfilling. Is this a problem with the specific algorithm, with the use of historical arrest data, or with predictive policing as an approach? What would a feedback-loop-free predictive policing system look like?
- Race-Neutrality: PredPol's developers argue that the system is race-neutral because it uses no racial data. Browne's framework suggests that race-neutral surveillance can still produce racializing effects. Who has the better of this argument? Does the intention to be race-neutral matter morally if the effect is not?
- Social Sorting and Due Process: Social sorting typically operates without individualized cause — a person is classified as high-risk based on their location or demographic characteristics, not their specific behavior. Is predictive policing compatible with due process principles that require individualized suspicion before police action? If not, what does the incompatibility imply for its use?
- The "Better Than Alternatives" Argument: If unguided officer discretion produces more biased outcomes than algorithmic guidance, does that make algorithmic guidance acceptable even if still biased? Or does this argument accept conditions that should themselves be challenged?
- Santa Cruz's Ban: Santa Cruz banned predictive policing. What are the likely consequences of the ban? Does prohibiting predictive policing algorithms address the structural conditions that make them discriminatory, or does it only remove one specific mechanism while leaving those conditions in place?
- Jordan's Connection: Jordan walks home from the bus stop through a neighborhood that, hypothetically, appears on PredPol's hotspot map. Jordan hasn't done anything wrong. How does the existence of the hotspot designation affect Jordan's relationship with police in that space? Does Jordan need to know about PredPol for it to affect them?