Chapter 39 Key Takeaways: Race, Representation, and Data Justice

The Census Undercount

Differential undercount is documented and significant. The 2020 Census Post-Enumeration Survey found net undercount rates of 4.99% for Hispanic residents, 3.30% for Black residents, and 5.64% for American Indian and Alaska Native people on reservations — compared to a net overcount of 1.64% for non-Hispanic whites.

The mechanisms are structural, not incidental: informal housing absent from address lists; lower response rates among households with limited English proficiency, recent immigration history, or high residential mobility; challenges in counting young Black men; and imputation failures in nonresponse follow-up.

The consequences are concrete: congressional apportionment, Electoral College allocation, redistricting, and over 300 federal funding formulas all flow from Census counts. Systematic undercounting of minority communities produces systematic underrepresentation in political institutions and underfunding of community services.

Polling and Differential Representation

Standard survey sampling produces racially unequal results because politically engaged, educated, and white voters respond to survey requests at higher rates. Demographic weighting corrects for compositional imbalance but cannot correct for selection effects within demographic groups.
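The compositional correction can be sketched in a few lines. The numbers below are entirely hypothetical; the point is the mechanism: each group's weight is its population share divided by its sample share, so the weight rescales group totals but applies identically to everyone in the group, leaving within-group selection effects untouched.

```python
# Sketch of demographic cell weighting. All shares and counts are
# hypothetical, for illustration only.

# Known population shares for a single demographic dimension.
population_share = {"white": 0.60, "black": 0.12, "hispanic": 0.19, "other": 0.09}

# Hypothetical respondent counts from a 1,000-person survey.
sample_count = {"white": 720, "black": 90, "hispanic": 120, "other": 70}

n = sum(sample_count.values())

# Weight = population share / sample share: overrepresented groups get
# weights below 1, underrepresented groups weights above 1.
weights = {g: population_share[g] / (sample_count[g] / n) for g in sample_count}

for group, w in sorted(weights.items()):
    print(f"{group:8s} weight = {w:.2f}")
```

Note what the weight cannot do: the Black respondents who did answer are upweighted to stand in for those who did not, which is only valid if responders and nonresponders within the group hold similar views.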

The thin-cell problem means that standard sample sizes produce unreliable subgroup estimates for minority communities. Reliable minority subgroup analysis requires intentional oversampling and appropriately wide confidence intervals.
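The arithmetic behind the thin-cell problem is simple: the margin of error scales with one over the square root of the subgroup's sample size, not the full sample's. A quick sketch (sample sizes hypothetical):

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error for a proportion estimated from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# A 1,000-person poll with a 10% Black subgroup yields only ~100 Black
# respondents (counts hypothetical).
full_sample_moe = margin_of_error(0.5, 1000)  # ~±3.1 points
subgroup_moe = margin_of_error(0.5, 100)      # ~±9.8 points

print(f"full sample: ±{full_sample_moe:.1%}")
print(f"subgroup:    ±{subgroup_moe:.1%}")
```

A ±10-point interval cannot distinguish 45% support from 60% support, which is why subgroup toplines reported without intervals can badly mislead.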

Differential response rates, unadjusted, produce polling signals that systematically emphasize the preferences of overrepresented groups — which translates into campaign strategy and media coverage that reflects those preferences disproportionately.

Affirmative practice: oversample minority communities; field surveys in multiple languages; report subgroup results with confidence intervals; document limitations explicitly.

Algorithmic Bias in Targeting

Machine learning models trained on historical campaign data inherit the patterns of that history, including racial underinvestment in campaign strategy. The result is predictive models that encode racial disparities as predictive features, not as biases to correct.

Proxy variables (geographic location, consumer behavior, media consumption) can function as effective racial proxies, producing targeting outcomes that are racially structured even when race is not an explicit model variable. This creates both legal ambiguity and ethical opacity.
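One way to see the proxy mechanism is to check whether model scores correlate with racial composition even when race was never a model input. The sketch below uses entirely synthetic tract-level data in which a "consumer behavior" feature happens to track minority share; a model built only on that feature still produces racially structured scores.

```python
# Synthetic demonstration of proxy-variable effects. All data are
# fabricated for illustration; no real model or tracts are involved.
import random

random.seed(0)

# Each tract gets a minority share and a proxy feature that correlates
# with it (the proxy mechanism in miniature).
tracts = []
for _ in range(500):
    minority_share = random.random()
    proxy_feature = 0.8 * minority_share + 0.2 * random.random()
    score = 1.0 - proxy_feature  # model output built only on the proxy
    tracts.append((minority_share, score))

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

r = pearson([t[0] for t in tracts], [t[1] for t in tracts])
print(f"score vs. minority share: r = {r:.2f}")  # strongly negative
```

The race variable never appears in the score, yet the correlation is unmistakable; this is the pattern an audit should look for.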

"Algorithmic redlining" — systematic patterns in which modeled persuasion or contact priority scores are lower for voters in majority-minority areas than for demographically similar voters in majority-white areas — has been documented in research on recent electoral cycles.

Algorithm auditing — running model performance statistics disaggregated by race — is an essential step before deploying any predictive model in political targeting contexts. Bias discovered before deployment is correctable; bias discovered after is a harm already done.
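A minimal audit is just the model's deployment-relevant statistics computed per group rather than in aggregate. The sketch below, with a hypothetical validation set and threshold, disaggregates the contact-selection rate by race; a large gap between groups is the signal to investigate before deployment.

```python
# Minimal audit sketch: disaggregating a model's contact-selection rate
# by race. Validation records and threshold are hypothetical.
from collections import defaultdict

# (race_group, model_score) pairs from a hypothetical validation set.
validation = [
    ("white", 0.72), ("white", 0.55), ("white", 0.81), ("white", 0.60),
    ("black", 0.48), ("black", 0.35), ("black", 0.62), ("black", 0.41),
]
THRESHOLD = 0.5  # scores at or above this trigger voter contact

selected = defaultdict(int)
total = defaultdict(int)
for group, score in validation:
    total[group] += 1
    selected[group] += score >= THRESHOLD

for group in sorted(total):
    rate = selected[group] / total[group]
    print(f"{group:6s} contact rate: {rate:.0%}")
```

The same disaggregation applies to accuracy, false-positive rate, or any other metric; libraries such as Fairlearn package this pattern, but nothing about it requires more than a group-by.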

Voter Suppression and Race

Data-enabled voter suppression targeting specifically aimed at minority communities has been documented in recent election cycles. The same voter file and commercial data infrastructure that enables mobilization targeting enables demobilization targeting.

The distinction between under-targeting and suppression is important but often blurry in practice. Under-targeting can reflect negligence, strategic calculation, or algorithmic bias — all with similar functional effects on minority community participation.

Proxy-based racial targeting for suppression — using demographic proxies rather than explicit racial categories — may produce racially structured effects while maintaining plausible deniability. This does not resolve the ethical concern.

The Data Justice Framework

Data justice asks: Who owns the data? Who benefits from its collection and use? Who is harmed? Whose interests were considered in system design?

Ruha Benjamin's "New Jim Code": Algorithmic systems can reproduce racial hierarchy without explicit racist instructions, by encoding historical inequities from training data as predictive features. There is no neutral baseline; every data system reflects design choices that serve particular interests.

Joy Buolamwini's measurement accountability principle: Systems developed and validated on data from one group perform significantly worse on underrepresented groups. The populations that bear the costs of measurement error should have meaningful input into system design and evaluation.

Safiya Umoja Noble's representational harm framework: Harm can arise from how groups are represented in data systems — through stereotyping, erasure, or reductive aggregation — as well as from inaccurate counting.

Affirmative Data Practices

Affirmative data practices are not passive avoidance of discrimination but active investment in equitable outcomes. Core practices include:

  • Disaggregated analysis and reporting by race, ethnicity, and other relevant characteristics, with appropriate confidence intervals
  • Oversampling minority communities in survey research
  • Multilingual fielding for populations with significant non-English language use
  • Community partnership in question design, variable selection, and model validation
  • Algorithm auditing for differential performance across racial groups before deployment
  • Explicit transparency about accuracy limitations for minority community estimates
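The first and last practices above can be combined in reporting code. The sketch below computes a Wilson score interval per subgroup (counts hypothetical); the Wilson interval is one standard choice that behaves better than the normal approximation at small subgroup sizes.

```python
# Disaggregated reporting with confidence intervals. The Wilson score
# interval is used here; all counts are hypothetical.
import math

def wilson_ci(successes: int, n: int, z: float = 1.96):
    """95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# Hypothetical "supports policy X" counts by group.
results = {"white": (312, 600), "black": (96, 150), "hispanic": (88, 160)}

for group, (yes, n) in results.items():
    lo, hi = wilson_ci(yes, n)
    print(f"{group:8s} {yes / n:.0%} support (95% CI {lo:.0%}-{hi:.0%}, n={n})")
```

Printing the interval and the subgroup n alongside every estimate is the "explicit transparency" practice made mechanical: readers can see at a glance which estimates rest on thin cells.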

The Surveillance Asymmetry

Minority communities are subject to detailed data collection in formats that serve strategic control (criminal justice records, financial distress indicators, commercial consumer data) while being inadequately represented in the survey and polling data that would translate their political preferences into political signal. This asymmetry reinforces political underrepresentation even when it is not the result of any individual's deliberate discriminatory intent.

Systemic Change and Individual Practice

Individual organizations' adoption of affirmative data practices — as ODA has done — does not fix the voter file, the Census, or the commercial targeting model ecosystem. But it demonstrates that a different approach is possible, produces better information for the communities it serves, and builds the evidentiary and organizational foundation for systemic change. Individual practice and systemic advocacy are complements, not substitutes.