Chapter 33 Key Takeaways
Technical Concepts
Dashboard design starts with questions, not data. The single most important principle in campaign analytics dashboard design is to begin with the questions decision-makers need to answer and work backward to the data and visualizations that answer them. Nadia's six core questions — Are we on track? Where are we behind? Are we reaching the right voters? What is the support score profile of contacts? Who should we contact next? How are metrics trending? — drove every design decision. Visualizations that don't answer a specific question don't belong in the dashboard.
Data cleaning is non-negotiable and time-consuming. The ODA voter file cleaning pipeline handled string standardization, out-of-range numeric validation, age validation, vote history binary validation, and record deduplication before any analysis began. In real campaigns, data cleaning typically consumes 40-60% of a data analyst's time on a new dataset. The clean_voter_data() function in Example 01 illustrates the systematic approach: document what you cleaned, why, and how many records were affected.
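The chapter's clean_voter_data() performs these steps; a minimal sketch of the same pipeline, using illustrative column names (not necessarily the ones in Example 01), might look like this. Note that it returns a report documenting what was cleaned and how many records were affected:

```python
import pandas as pd

def clean_voter_data(df: pd.DataFrame) -> tuple[pd.DataFrame, dict]:
    """Sketch of a voter-file cleaning pipeline; column names are illustrative."""
    report = {"rows_in": len(df)}
    # String standardization: trim whitespace, normalize case.
    for col in ("first_name", "last_name", "county"):
        df[col] = df[col].str.strip().str.title()
    # Out-of-range numeric validation: model scores must fall in 0-100.
    for col in ("support_score", "persuadability_score"):
        bad = ~df[col].between(0, 100)
        report[f"{col}_out_of_range"] = int(bad.sum())
        df = df[~bad]
    # Age validation: registered voters must be a plausible adult age.
    bad_age = ~df["age"].between(18, 115)
    report["invalid_age"] = int(bad_age.sum())
    df = df[~bad_age]
    # Vote history must be binary 0/1.
    bad_vh = ~df["voted_2022"].isin([0, 1])
    report["invalid_vote_history"] = int(bad_vh.sum())
    df = df[~bad_vh]
    # Deduplicate on voter ID, keeping the first record.
    before = len(df)
    df = df.drop_duplicates(subset="voter_id")
    report["duplicates_removed"] = before - len(df)
    report["rows_out"] = len(df)
    return df, report
```

The returned report dict is the documentation habit the chapter insists on: every dropped record is counted by reason.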
Voter segments are the foundation of quality analysis. Classifying voters into segments (Hard Support, Soft Support - Persuadable, True Persuadable, Soft Opposition - Persuadable, Hard Opposition) based on the joint distribution of support score and persuadability score enables nuanced analysis of contact quality — not just whether voters are being contacted, but whether the right voters are being contacted.
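A segment classifier over the joint (support, persuadability) distribution can be sketched as follows; the thresholds here are illustrative assumptions, not the chapter's exact cutoffs:

```python
def classify_segment(support: float, persuadability: float) -> str:
    """Map a voter's support and persuadability scores (0-100) to a segment.
    Thresholds are illustrative; the chapter's cutoffs may differ."""
    if support >= 70:
        return "Hard Support"
    if support < 30:
        return "Hard Opposition"
    if persuadability < 40:
        # Middling support but unlikely to move: lean to whichever side.
        return "Hard Support" if support >= 50 else "Hard Opposition"
    if support >= 55:
        return "Soft Support - Persuadable"
    if support > 45:
        return "True Persuadable"
    return "Soft Opposition - Persuadable"
```

The point of the joint classification is that support alone cannot distinguish a movable 50 from an immovable 50; persuadability supplies that second axis.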
Pace ratio is the most important daily KPI. Total contacts and percentage of goal are important, but the pace ratio — actual contacts per day divided by contacts per day needed to hit the goal — is the most operationally actionable metric. A pace ratio below 1.0 means the campaign needs to change something. The pace ratio changes every day as both the numerator (actual pace) and denominator (needed pace, which accelerates as time runs out) move.
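The pace ratio can be computed in a few lines; the dates and contact counts below are hypothetical, not the Garza campaign's actual figures:

```python
from datetime import date

def pace_ratio(contacts_so_far: int, goal: int,
               start: date, today: date, end: date) -> float:
    """Actual contacts/day divided by the contacts/day still needed.
    The denominator accelerates as time runs out, so the ratio moves daily."""
    days_elapsed = (today - start).days
    days_remaining = (end - today).days
    actual_pace = contacts_so_far / days_elapsed
    needed_pace = (goal - contacts_so_far) / days_remaining
    return actual_pace / needed_pace

# Hypothetical: 20,000 of 87,000 contacts done on day 10 of a 35-day window.
r = pace_ratio(20000, 87000, date(2024, 10, 1), date(2024, 10, 11), date(2024, 11, 5))
```

A ratio below 1.0 quantifies exactly how far behind schedule the field program is, which is what makes it actionable.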
Persuadability targeting lift measures contact quality. Subtracting the share of persuadable voters in the universe from the share of persuadable voters among contacted voters gives a clean measure of whether the campaign is successfully concentrating contact on high-value targets. Positive lift means the prioritization is working; lift near zero means canvassers are ignoring the priority lists.
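The lift calculation is a simple difference of shares, expressed in percentage points. The counts below are illustrative, not the chapter's actual Riverside figures:

```python
def persuadability_lift(contacted_persuadable: int, contacted_total: int,
                        universe_persuadable: int, universe_total: int) -> float:
    """Share of persuadables among contacted voters minus their share in the
    full universe, in percentage points. Positive lift = targeting is working."""
    contacted_share = contacted_persuadable / contacted_total
    universe_share = universe_persuadable / universe_total
    return 100 * (contacted_share - universe_share)

# Illustrative: 41% of contacts are persuadable vs. 30% of the universe -> +11pp.
lift = persuadability_lift(410, 1000, 30000, 100000)
```

Lift near zero is the diagnostic signal: contact is happening, but the priority lists are being ignored.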
The prioritization model is a composite of persuadability, support score targeting, vote likelihood, and county priority. The model in this chapter assigns 50% weight to persuadability, 30% to a Gaussian function that peaks at support score 55 (the genuine swing voter), 15% to vote frequency, and 5% to a county-priority adjustment. These weights are explicit choices that should be documented and revisited as the campaign accumulates contact data.
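The composite can be sketched as a weighted sum; the Gaussian width (sigma=15) and the 0-1 scaling of the vote-frequency and county-priority inputs are illustrative assumptions, only the weights and the peak at 55 come from the chapter:

```python
import math

def priority_score(persuadability: float, support: float,
                   vote_frequency: float, county_priority: float,
                   sigma: float = 15.0) -> float:
    """Composite priority on a 0-100 scale.
    persuadability/support: 0-100; vote_frequency/county_priority: 0-1.
    Weights (0.50 / 0.30 / 0.15 / 0.05) follow the chapter's model."""
    # Gaussian component peaks at support score 55 (the genuine swing voter)
    # and decays symmetrically toward hard support and hard opposition.
    swing = math.exp(-((support - 55) ** 2) / (2 * sigma ** 2))
    return 100 * (0.50 * persuadability / 100
                  + 0.30 * swing
                  + 0.15 * vote_frequency
                  + 0.05 * county_priority)
```

Making the weights explicit function arguments of a documented formula is what lets the campaign revisit them as contact data accumulates.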
Plotly enables interactive dashboards that non-technical users can actually use. Static matplotlib charts are appropriate for printed reports and email attachments. Interactive Plotly HTML dashboards allow field directors to filter by county, zoom into specific time windows, and hover for specific data values — capabilities that dramatically increase the dashboard's utility for non-analysts.
Conceptual Themes
Measurement shapes reality. The metrics a campaign chooses to track shape the behavior of the people being measured. Tracking contacts per day incentivizes canvassers to maximize contact volume, sometimes at the expense of quality. Tracking persuadability targeting lift incentivizes following the priority list. Tracking conversion rate incentivizes focusing on the most receptive voters at the expense of reaching harder-to-contact persuadables. No KPI framework is incentive-neutral; understanding the perverse incentives embedded in your measurement choices is part of using data responsibly.
Data in Democracy: Tool or Weapon? The voter contact prioritization dashboard is not a neutral tool. It makes choices — about whose vote is most worth targeting, about which communities' data is well-calibrated, about what it means to be persuadable — that have differential effects on different communities. The systematic underrepresentation of certain communities in the training data for support score and persuadability models means that prioritization tools are more accurate for some voters than others, and those accuracy differences correlate with demographic characteristics. This is not a reason to stop building these tools; it is a reason to build them more carefully and use them more honestly.
The map is not the territory. Support scores and persuadability scores are model predictions, not measurements of individual voter preferences. A canvasser dispatched to contact a "True Persuadable" voter based on a model score of 52 may encounter someone whose actual views bear no resemblance to the model prediction. Campaign analytics provides a better-than-random prioritization of scarce contact resources; it does not replace the qualitative judgment that experienced canvassers bring back from doors.
The ODA Framework
The chapter introduces ODA (OpenDemocracy Analytics) and its mission of making advanced civic analytics accessible to campaigns and organizations working toward a more equitable democracy. The framework's key contributions: standardized voter file integration that eliminates the data cleaning burden that otherwise consumes weeks of analytical capacity, open-source dashboard templates with documented assumptions, and optional equity weighting in prioritization models.
The ODA equity debate — whether tools that require analytical capacity to use effectively systematically advantage well-resourced campaigns — is unresolved. Sam Harding's calibration analysis showing systematically lower model accuracy for certain demographic groups points to a structural issue that the "neutral tools" framing cannot fully address.
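The chapter does not specify ODA's equity-weighting formula; purely as an illustration of the idea, one common structure is to shrink a model-driven priority toward a neutral midpoint in proportion to how poorly calibrated the model is for the voter's demographic group:

```python
def equity_weighted_priority(base_priority: float, group_accuracy: float) -> float:
    """Illustrative sketch only, not ODA's actual formula.
    Shrinks a 0-100 priority toward a neutral midpoint (50) as the model's
    calibration accuracy for the voter's group falls (the chapter's observed
    range is 0.49 to 0.72)."""
    return group_accuracy * base_priority + (1 - group_accuracy) * 50.0
```

The effect is that extreme priorities (very high or very low) are trusted less for groups where the model is known to be less accurate, which directly addresses the structural issue in Sam Harding's calibration analysis.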
Code Architecture Summary
| File | Purpose |
|---|---|
| example-01-voter-file-analysis.py | Data loading, cleaning, and universe profiling |
| example-02-contact-kpis.py | KPI computation, daily briefing generation, pace visualization |
| example-03-dashboard-visualization.py | Interactive Plotly dashboard + prioritization visualization |
| exercise-solutions.py | Five exercises: pace sensitivity, equity weighting, canvasser efficiency, A/B test, weekly KPI tracker |
All four files are designed to be self-contained (they generate synthetic data if oda_voters.csv is not present) and modular (functions are documented and reusable).
Numbers to Anchor the Chapter
- Garza campaign contact goal: 87,000 voters in 35 days
- Pace ratio at the time Nadia presents the dashboard: 0.87 (87% of needed pace)
- Riverside County: 61% of goal, but +11pp persuadability targeting lift and 24% conversion rate
- ODA calibration range: 0.49 (Black/African American urban) to 0.72 (white suburban) support score model accuracy
- The 6-question dashboard design framework: track, target, quality, trend, geography, priority