Key Takeaways: The Age of Political Data
- Political data is everywhere---and growing. The convergence of digitized public records, the internet and social media, advances in computing, commercial data infrastructure, and cultural demand for data has created a political information ecosystem of unprecedented size and complexity. A single registered voter may have hundreds of data points associated with them across voter files, consumer databases, Census records, social media, and campaign contact histories.
- More data does not automatically mean better understanding. The history of political analytics includes spectacular failures---the 1936 Literary Digest poll, 2016 election forecasting errors---in which large volumes of data were undermined by flawed methods, biased samples, or overconfident interpretations. Volume without rigor is dangerous.
- Three worlds produce and consume political data for different purposes. Campaigns use data to win elections (targeting, persuasion, turnout). Media and researchers use data to explain and inform (polling, forecasting, journalism). Civic organizations use data to promote transparency and participation (open data, voter tools, accountability projects). These worlds overlap and sometimes conflict.
- Measurement shapes reality. How we define categories (race, ethnicity, "likely voter"), what we choose to count, and whom we include in our samples all have real political consequences. Every dataset reflects the choices of its creators, and those choices are never neutral.
- Who gets counted determines who gets heard. Political data systems can empower underrepresented communities by making them visible, or they can further marginalize them by undercounting them, miscategorizing them, or excluding them from samples and models.
- Analytics differs from punditry in its commitment to quantified uncertainty. An analyst does not say "this candidate will win"; they say "this candidate has a 62 percent probability of winning, with these caveats." Calibrated confidence---being as confident as the evidence warrants, and no more---is a core professional virtue.
- Prediction and explanation are different goals that are not always aligned. A model that predicts outcomes accurately may not explain why those outcomes occur, and vice versa. Different stakeholders---campaigns, media, researchers, civic organizations---prioritize these goals differently.
- Political data can be a tool for democracy or a weapon against it. The same voter file can be used to engage underserved communities or to target vulnerable voters with disinformation. The ethics of political analytics depend on the intent, transparency, and accountability of the people who use the data.
- The analyst's core toolkit includes data acquisition and management, descriptive analysis, statistical modeling, visualization and communication, and critical evaluation of others' work. These skills apply across all three worlds of political data.
- The Garza-Whitfield Senate race, Meridian Research Group, and OpenDemocracy Analytics will serve as running examples throughout this book, illustrating how analytical concepts play out in the campaign world, the media/research world, and the civic world, respectively.
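
The takeaway that volume without rigor is dangerous can be made concrete with a small simulation. The sketch below is illustrative only: the electorate, the 30 percent "sampling frame," and the support rates are invented assumptions, not figures from the 1936 poll or any real survey. It compares a very large sample drawn only from an unrepresentative frame with a much smaller simple random sample of all voters.

```python
import random


def simulate_polls(seed=1936):
    """Compare a large biased sample with a small random sample.

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Build a hypothetical electorate of 200,000 voters.
    # About 30% are "in frame" (reachable by the poll) and support
    # candidate A at 40%; everyone else supports A at 62%.
    population = []
    for _ in range(200_000):
        in_frame = rng.random() < 0.30
        support_rate = 0.40 if in_frame else 0.62
        supports_a = rng.random() < support_rate
        population.append((in_frame, supports_a))

    true_share = sum(s for _, s in population) / len(population)

    # Large sample, but drawn only from the reachable frame.
    frame_only = [s for in_frame, s in population if in_frame]
    big_biased = rng.sample(frame_only, 50_000)

    # Small sample, but drawn at random from the whole electorate.
    everyone = [s for _, s in population]
    small_random = rng.sample(everyone, 2_000)

    return {
        "true_share": true_share,
        "big_biased_estimate": sum(big_biased) / len(big_biased),
        "small_random_estimate": sum(small_random) / len(small_random),
    }


result = simulate_polls()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, the 50,000-person sample misses the true share by a wide margin because every respondent comes from the same skewed frame, while the 2,000-person random sample lands close to it. Adding more respondents from the same frame would not reduce that error.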
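
The takeaway on quantified uncertainty can also be sketched in code. One common way to check calibrated confidence is to compare stated probabilities with observed outcome frequencies and to summarize accuracy with the Brier score (the mean squared difference between forecast probability and outcome, where lower is better). The forecasts and outcomes below are hypothetical, and the helper names `brier_score` and `calibration_table` are this sketch's own, not functions from any particular library.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def calibration_table(probs, outcomes, n_bins=5):
    """Group forecasts into equal-width probability bins.

    Returns one (mean_forecast, observed_frequency, count) tuple
    per non-empty bin, ordered from lowest to highest bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, o))

    rows = []
    for items in bins:
        if not items:
            continue
        mean_forecast = sum(p for p, _ in items) / len(items)
        observed = sum(o for _, o in items) / len(items)
        rows.append((mean_forecast, observed, len(items)))
    return rows


# Hypothetical record: 20 races, 1 = the favored candidate won.
outcomes = [1] * 6 + [0] * 4 + [1] * 9 + [0] * 1

# An analyst who states probabilities: 62% for the first ten races,
# 90% for the last ten.
analyst = [0.62] * 10 + [0.90] * 10

# A pundit who says "this candidate will win" every time.
pundit = [1.0] * 20

analyst_brier = brier_score(analyst, outcomes)
pundit_brier = brier_score(pundit, outcomes)
analyst_rows = calibration_table(analyst, outcomes)

print(f"analyst Brier score: {analyst_brier:.4f}")
print(f"pundit Brier score:  {pundit_brier:.4f}")
for mean_forecast, observed, count in analyst_rows:
    print(f"forecast {mean_forecast:.2f} -> observed {observed:.2f} (n={count})")
```

In this invented record the analyst's 62 percent forecasts come true 60 percent of the time and the 90 percent forecasts come true 90 percent of the time, so the stated confidence roughly matches the evidence. The pundit picks the same favorites but is penalized for claiming certainty in the five races the favorite lost.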