Case Study 40.2: The Optimization That Worked Too Well

Background

Stratagem Analytics is a mid-sized political technology firm that provides AI-powered voter contact optimization to campaigns. In the most recent election cycle, Stratagem deployed a new deep learning system for a gubernatorial campaign — call it the Alvarez campaign — that represented a significant capability upgrade over their previous gradient-boosted models.

The new system used a neural network architecture that integrated voter file data, consumer behavioral data, social media signal data, and historical campaign response data to generate contact priority scores. The system was trained on data from 23 previous campaigns that Stratagem had run. It updated its predictions in real time as canvassers submitted contact results, creating a feedback loop that continuously refined its estimates of which voters were most likely to be persuaded.
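In outline, such an online-updating priority scorer might look like the following sketch. This is a minimal toy stand-in, not Stratagem's actual architecture (the case does not describe it): a simple logistic score over a generic feature vector, with hypothetical class and parameter names, updated by one gradient step per submitted contact result.

```python
import math

class ContactScorer:
    """Toy stand-in for an online-updating contact priority model.

    A logistic score over a voter feature vector, nudged by one SGD
    step each time a canvasser submits a contact result. This mirrors
    the loop in the case: the model's own outputs decide who gets
    contacted, and contact results are the only new training signal."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features  # feature weights
        self.b = 0.0                 # intercept
        self.lr = lr                 # learning rate

    def priority(self, x):
        # Predicted persuasion probability, used as the contact score.
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, persuaded):
        # One stochastic-gradient step on log-loss for a single result.
        err = self.priority(x) - (1.0 if persuaded else 0.0)
        self.b -= self.lr * err
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
```

Note the asymmetry built into this design: `update` only ever runs for voters the model chose to contact, which is what later becomes the compounding problem.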

The system performed remarkably well by its primary metric: voter contact efficiency. In field tests during the primary, the system produced a 31% higher rate of successful persuasion contacts per canvassing hour compared to the control (traditional priority-score) approach. The campaign's field director was enthusiastic.

The Problem That Emerged

Six weeks before Election Day, Stratagem's data team ran a routine geographic analysis of the system's recommendations. They noticed something unexpected: the system was consistently assigning very low priority scores to voters in three specific urban zip codes. These zip codes were predominantly Black and Hispanic communities with historically lower response rates to campaign contacts.

The field director had noticed this as well and had initially attributed it to the model correctly identifying that those areas were harder to contact: lower response rates meant lower expected persuasion per contact hour, which was exactly the quantity the model was built to maximize.

But Stratagem's lead analyst, a 28-year-old named Devon, ran a deeper analysis. Devon found that the response rate differential explained only part of the priority gap. After controlling for historical response rates, census tract socioeconomic characteristics, and individual-level predictors, the model was still assigning substantially lower priority to Black and Hispanic voters than to white voters with similar individual-level contact likelihood.
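The shape of Devon's residual analysis can be illustrated with a small regression sketch. Everything here is synthetic and hypothetical (the case's actual covariates are historical response rates, tract-level socioeconomic characteristics, and individual-level predictors): the coefficient on a group-membership indicator, after regressing priority scores on the controls, is the unexplained gap.

```python
import numpy as np

def residual_group_gap(priority, controls, group):
    """OLS of priority score on control covariates plus a group
    indicator. Returns the coefficient on the indicator: the portion
    of the priority gap NOT explained by the observed controls.
    A large negative value is the pattern Devon found."""
    X = np.column_stack([np.ones(len(priority)), controls, group])
    coef, *_ = np.linalg.lstsq(X, priority, rcond=None)
    return coef[-1]

# Synthetic check: build scores with a known -0.2 gap after controls.
rng = np.random.default_rng(0)
n = 2000
controls = rng.normal(size=(n, 1))                 # stand-in covariates
group = rng.integers(0, 2, size=n).astype(float)   # group indicator
priority = 0.5 * controls[:, 0] - 0.2 * group + rng.normal(0.0, 0.05, n)
gap = residual_group_gap(priority, controls, group)
```

The point of the sketch is the audit logic, not the estimator; a production audit would use richer controls and report uncertainty, not a single point estimate.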

Devon investigated the training data. The 23 historical campaigns had all been run for previous Stratagem clients, a roster that skewed toward competitive suburban and rural districts where the majority of the contact universe was white. The historical campaign response data from Black and Hispanic communities was thin. The model appeared to have learned, from this skewed training data, that Black and Hispanic voters were lower-value contacts: not because they were actually less persuadable, but because they had historically been contacted less, and the model had encoded that historical under-investment as a preference.

The Compounding Factor

Devon's analysis also revealed a feedback loop problem. Because the model was continuously updating from current campaign data, its early-cycle under-prioritization of minority communities was reducing current contact rates in those communities — which was then feeding back into the model as additional evidence that these voters were lower-value. The model was becoming more biased over time, not less, because its own recommendations were generating the data it was learning from.
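The compounding dynamic can be reproduced in a toy simulation. Assume two voter groups with identical true persuasion rates, a model that inherits a slightly lower estimate for group B from skewed training data, and a greedy allocator that only learns from the group it contacts. All parameters are hypothetical; the `epsilon` option sketches how forced exploration changes the dynamic.

```python
import random

def simulate(rounds=500, epsilon=0.0, lr=0.1, seed=7):
    """Toy model of the feedback loop: groups A and B are EQUALLY
    persuadable, but the model starts with a lower estimate for B.
    Each round it contacts the higher-estimated group (or, with
    probability epsilon, a random group) and nudges only that
    group's estimate toward its true rate."""
    rng = random.Random(seed)
    true_rate = {"A": 0.30, "B": 0.30}   # equally persuadable
    est = {"A": 0.30, "B": 0.25}         # skew inherited from training
    contacts = {"A": 0, "B": 0}
    for _ in range(rounds):
        if rng.random() < epsilon:
            g = rng.choice(["A", "B"])   # forced exploration
        else:
            g = "A" if est["A"] >= est["B"] else "B"  # greedy allocation
        contacts[g] += 1
        est[g] += lr * (true_rate[g] - est[g])  # learn only from contacts
    return est, contacts
```

With `epsilon = 0`, group B receives zero contacts and its estimate never corrects, even though it is exactly as persuadable as group A; a small exploration rate gives the model the data it needs to unlearn the inherited skew.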

By the time Devon flagged this, the model had run for four weeks and the under-prioritization had compounded significantly. Correcting the model now — recalibrating it to prioritize the under-targeted communities — would require a significant resource reallocation in the final six weeks of a close campaign.

The Strategic Dilemma

Devon brings the finding to Stratagem's CEO, and together they bring it to the Alvarez campaign manager.

The campaign manager's reaction is complicated. On one hand, she recognizes that the model has been producing a racially discriminatory resource allocation — and she is running on a platform that explicitly includes racial equity commitments. The optics of the campaign's field strategy being racially biased would be devastating.

On the other hand, correcting the model six weeks out requires reallocating field resources from the areas where the model is working well (suburban persuasion targets) to areas where the campaign has historically struggled to translate contact into votes. The campaign's internal tracking shows a 3-point deficit. The race is close. A strategic field reallocation carries real risk.

And there is a third complication: the campaign has a legal team that has raised questions about whether the racially disparate resource allocation in the field program — even unintentionally produced — could create liability under state civil rights law. The legal team wants to understand the scope of the problem before deciding what to do.

Discussion Questions

1. Identify the specific mechanism through which algorithmic bias entered Stratagem's model. Use the concepts from Chapter 39 (training data bias, historical underinvestment, feedback loops) to describe precisely what happened.

2. Devon's discovery was the result of a geographic analysis followed by a causal investigation. Should this audit have been built into Stratagem's standard deployment process rather than conducted retroactively? Design a pre-deployment algorithm audit process for a voter contact prioritization model that would have caught this problem before the campaign began.

3. The feedback loop — where the model's recommendations generate the data it learns from — is a particularly serious problem in real-time updating systems. Explain the specific mechanism of this feedback loop and identify a technical design choice that would interrupt it.

4. The campaign manager faces a genuine tension between correcting the bias and maintaining campaign effectiveness in a close race. Evaluate her options: (a) correct the model fully and reallocate field resources; (b) correct the model partially, reallocating some resources but not disrupting the most effective targeting; (c) stop using the model's recommendations for the affected zip codes and revert to traditional methods for those areas; (d) continue with the current model but add supplementary outreach in the affected communities through a separate program. What are the strategic, ethical, and legal implications of each?

5. The legal team's concern about civil rights liability raises a question that Chapter 39 leaves partly open: can an unintentional, algorithmic resource allocation that produces racially disparate effects create legal liability? Research the legal landscape and develop your best assessment.

6. Devon is a relatively junior analyst who surfaced a problem that the organization had not caught. What organizational culture conditions made it possible for Devon to run this analysis and bring the findings forward? What organizational culture conditions might have prevented it?

7. Stratagem has 22 other active campaigns this cycle, all using versions of the same model architecture. When Devon's finding is confirmed, what are Stratagem's obligations to those other campaigns? Are they obligated to conduct retroactive audits of all 22? What if some of those campaigns did not want to know about potential bias — or would face similar resource reallocation dilemmas?

8. After the election, Stratagem wants to publish a technical paper documenting the bias problem and the correction methodology, to contribute to the field's understanding of feedback loop bias in campaign contact models. The Alvarez campaign objects — they say the paper would publicize a flaw in their operation that they corrected privately and that they don't want associated with the candidate's next campaign. How should Stratagem respond? What are their obligations to the research community versus their client?