Case Study 39.2: ODA's Algorithm Audit — When the Model Learns the Wrong Thing
Background
OpenDemocracy Analytics has been contracted by a statewide criminal justice reform coalition to support a voter registration and mobilization drive targeting formerly incarcerated people who have had their voting rights restored under a recent state law change. This is a population that has historically had very low voter registration rates — not because of disinterest, but because of systematic barriers including lack of awareness of rights restoration, instability in housing and identification, and distrust of government institutions.
The coalition wants ODA to build a contact prioritization model: given a list of approximately 45,000 people who appear to be eligible under the new law (identified through state court records linked to voter file registration data), who should be contacted first, and how?
Adaeze assigns the project to Sam Harding, who collaborates with a junior analyst, Priya Kowalczyk, on the modeling work.
The Data
The available data includes:
- State Department of Corrections records (sentence completion dates, offense categories, county of release)
- Current voter file data (registration status, address, any voting history)
- Census tract characteristics for each individual's current address (where available)
- American Community Survey data on the tracts
The population is approximately 63% Black, 21% Hispanic, 9% white, and 7% other races — reflecting the racially disproportionate impact of incarceration in this state.
The Model
Sam and Priya build a contact prioritization model using a gradient boosting classifier. The model is trained on data from a smaller pilot registration drive conducted by the coalition the previous year — a sample of 3,200 people, of whom 890 successfully registered after contact.
The model's features include: recency of rights restoration (more recent = higher priority), census tract characteristics (income, housing stability, distance to government offices), offense category, and county of release.
The model produces a priority score for each of the 45,000 individuals. Sam runs the standard accuracy checks — the model performs reasonably well on the holdout test set overall, with an AUC of 0.71.
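The pipeline described above can be sketched as follows. The data here is entirely synthetic stand-in data, and the feature names, coding, and data-generating process are illustrative assumptions, not the coalition's actual records; the sketch only shows the shape of the workflow (train a gradient boosting classifier on the 3,200-person pilot, then check AUC on a holdout set).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 3200  # size of the pilot sample

# Hypothetical feature matrix mirroring the features named in the text:
X = np.column_stack([
    rng.integers(1, 60, n),         # months since rights restoration
    rng.normal(35_000, 10_000, n),  # tract median income (ACS, synthetic)
    rng.integers(0, 5, n),          # offense category (coded)
    rng.integers(0, 12, n),         # county of release (coded)
])

# Synthetic outcome: registration success loosely tied to recency,
# standing in for the 890-of-3,200 pilot outcome.
p_register = 1 / (1 + np.exp(0.04 * (X[:, 0] - 30)))
y = (rng.random(n) < p_register).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

# Priority score = predicted probability of successful registration
scores = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, scores)
```

Note that an overall AUC like the 0.71 Sam observed says nothing about how scores are distributed across subgroups, which is exactly the gap the next step exposes.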
The Audit That Changes Everything
Priya, on her own initiative, runs the model's predicted scores disaggregated by race. What she finds is alarming.
The model assigns significantly lower priority scores to Black individuals than to white individuals with similar characteristics — similar recency of rights restoration, similar census tract characteristics, similar offense category. The differential is not subtle: controlling for the other features, being Black is associated with a predicted score reduction equivalent to adding 18 months to the time since rights restoration.
Priya and Sam investigate. The root cause is in the training data. In the pilot registration drive, Black participants were less likely to successfully register — not because they were less interested, but because the pilot program's contact methodology had systematically performed worse in majority-Black neighborhoods. Canvassers had been deployed less thoroughly in certain zip codes; the door-knocking hours had been less compatible with employment schedules in those areas; and one coordinator had dropped two high-Black-population precincts from the pilot entirely due to what they described as "safety concerns."
The model learned from this training data that Black individuals in the target population were less likely to successfully register, and it dutifully encoded that as a lower-priority prediction. The model was accurate in a narrow sense — it was accurately predicting the outcome of the flawed pilot. It was deeply wrong in the sense that mattered: it was predicting not who was reachable but who had been poorly served.
The Implications
If ODA deploys this model without correction, it will concentrate outreach resources on the least-Black portions of the target population — the precise inverse of an equitable allocation in a population that is already 63% Black and in which Black residents bear a grossly disproportionate incarceration burden.
The model has learned racial discrimination from the training data and would reproduce it at scale.
Sam's Report to Adaeze
Sam presents the audit findings to Adaeze with a proposed fix: remove the geographic features from the model (census tract characteristics, county of release), which are functioning as racial proxies, and retrain using only individual-level features (recency, offense category). Alternatively, apply a fairness constraint to the model that equalizes predicted scores by race conditional on other features.
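Sam's two proposed fixes can be sketched as below. The feature names are assumptions, and the group-mean equalization shown is only one simple post-processing variant of a fairness constraint (there are stronger formulations that equalize conditional distributions rather than group means).

```python
import numpy as np

def drop_proxy_features(X, feature_names, proxies):
    """Fix 1: retrain on individual-level features only, by removing
    geographic columns (tract characteristics, county of release)
    that function as racial proxies."""
    keep = [i for i, name in enumerate(feature_names) if name not in proxies]
    return X[:, keep], [feature_names[i] for i in keep]

def equalize_group_means(scores, group):
    """Fix 2 (one simple variant): shift each group's scores so every
    group shares the overall mean priority score. This equalizes
    average priority by group, but not the full conditional
    distribution of scores given the other features."""
    adjusted = scores.astype(float)
    overall = scores.mean()
    for g in np.unique(group):
        mask = group == g
        adjusted[mask] += overall - scores[mask].mean()
    return adjusted
```

Neither sketch addresses the deeper problem Adaeze raises next: both still train on outcomes generated by a flawed pilot.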
Adaeze listens carefully. Then she asks a question Sam hadn't anticipated: "Should we be using a machine learning model at all for this population?"
She explains: the training data for this model is a pilot conducted with significant methodological flaws. The sample is 3,200 people out of a target population of 45,000 — a relatively small training set for a gradient boosting model. And the outcome the model is trying to predict, successful registration, is as much a function of how well the outreach operation is conducted as of any characteristic of the individual. A model that learns from flawed outreach will systematically encode the flaws as individual characteristics.
Maybe, Adaeze says, the right answer is a simple prioritization rule: contact everyone with rights restored in the last 36 months first, then work through older restorations chronologically, with geographic stratification to ensure all areas are covered. No machine learning. No opportunity to encode discrimination from biased training data.
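Adaeze's rule-based alternative is simple enough to express directly in code. The field names are assumptions, and the round-robin interleaving across counties is one illustrative way to implement "geographic stratification"; a real drive might stratify by canvass turf instead.

```python
from collections import defaultdict
from datetime import date
from itertools import zip_longest

def stratified_order(tier):
    """Round-robin across counties so no geographic area is skipped
    within a priority tier."""
    by_county = defaultdict(list)
    for person in tier:
        by_county[person["county"]].append(person)
    ordered = []
    for batch in zip_longest(*by_county.values()):
        ordered.extend(p for p in batch if p is not None)
    return ordered

def contact_order(people, as_of):
    """Adaeze's rule: everyone restored in the last 36 months first,
    then older restorations, most recent first, with geographic
    stratification inside each tier. No model, no training data."""
    def months_since(d):
        # Whole-month difference; ignores day of month for simplicity.
        return (as_of.year - d.year) * 12 + (as_of.month - d.month)
    recent = sorted((p for p in people if months_since(p["restored"]) <= 36),
                    key=lambda p: p["restored"], reverse=True)
    older = sorted((p for p in people if months_since(p["restored"]) > 36),
                   key=lambda p: p["restored"], reverse=True)
    return stratified_order(recent) + stratified_order(older)
```

Because the rule uses only the restoration date and a coverage guarantee, there is no training step through which the pilot's discrimination could enter.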
Discussion Questions
1. Identify the specific mechanism through which the ODA model developed racial bias. Use the concepts from Chapter 39 (algorithmic bias, training data, historical patterns) to describe precisely what happened.
2. Priya's discovery was the result of running disaggregated model performance statistics by race — a type of algorithm audit. Was this audit required by ODA's affirmative data practice commitments? Who should be responsible for conducting such audits, and at what stage of model development?
3. Evaluate Sam's proposed fix (removing geographic proxy variables and/or applying a fairness constraint). What are the trade-offs of each approach? Will either approach fully resolve the problem?
4. Adaeze's question — "Should we be using a machine learning model at all?" — challenges the assumption that more sophisticated methods are always better. Under what conditions is a simple rule-based prioritization approach actually superior to a machine learning model? Apply this analysis to the specific situation ODA faces.
5. The training data flaw — the underperformance of the pilot in majority-Black neighborhoods — was created partly by a coordinator who dropped two precincts citing "safety concerns." How should organizations handle training data that reflects past operational discrimination? What are the options, and what are the consequences of each?
6. Apply the data justice framework to this case: whose data is being used, and what consent applies? Who benefits from the model if it is deployed without correction? Who is harmed? How does ODA's handling of accuracy limitations align with its stated commitments?
7. Suppose ODA deploys the corrected model and the coalition uses it for its mobilization campaign. Six months later, a journalist asks Adaeze about ODA's model for the formerly incarcerated voter drive. What should Adaeze say about the audit, the initial bias finding, and the correction? What are the transparency obligations in this situation?
8. What structural changes to ODA's intake and contracting process — before a project like this begins — might have surfaced the training data quality issues earlier? Design a three-step pre-project data quality review process that includes explicit checks for racial bias in available training data.