systematic patterns in which modeled persuasion or contact priority scores are lower for voters in majority-minority areas than for demographically similar voters in majority-white areas — has been documented in research on recent electoral cycles. → Chapter 39 Key Takeaways: Race, Representation, and Data Justice
Average Commission meeting attendance by press: 1.2 reporters per meeting - Average meeting length: 47 minutes - Frequency of contested votes: 31% of agenda items - Frequency of public comment: 24% of meetings included public speakers on contested items - County budget variance (actual vs. approved) → Case Study 23.1: The Collapse of the Rocky Mountain Clarion
2019-2022 (skeleton newsroom):
Average Commission attendance by press: 0.1 reporters per meeting (essentially none) - Average meeting length: 28 minutes (meetings shortened after press presence declined) - Frequency of contested votes: 14% of agenda items - Frequency of public comment: 7% of meetings included public speakers - Co → Case Study 23.1: The Collapse of the Rocky Mountain Clarion
4.D1
Jake Rourke argues that his 25 years of campaign experience give him better political judgment than any model. Nadia argues that models are more reliable than individual judgment because they are not subject to cognitive bias. Who is right, and under what conditions? → Chapter 4 Exercises: Thinking Like a Political Analyst
4.D2
The chapter argues that intellectual humility — expressing calibrated uncertainty — is a professional virtue for political analysts. But campaign environments reward confident projections. How should an analyst navigate this tension? What are the ethical dimensions of overstating certainty to satisf → Chapter 4 Exercises: Thinking Like a Political Analyst
4.D3
We noted that the ecological fallacy — drawing individual-level conclusions from aggregate data — has distorted understanding of populism. Can you think of a specific political narrative in recent American politics that may have been driven partly by ecological fallacy reasoning? How would you test → Chapter 4 Exercises: Thinking Like a Political Analyst
4.D4
The chapter distinguishes prediction and explanation as different analytical goals requiring different approaches. Is it ever possible for a single analysis to serve both goals well? What are the conditions under which the tension between them is most acute? → Chapter 4 Exercises: Thinking Like a Political Analyst
5% threshold
The minimum national vote share required for a party to receive proportional representation seats in the German Bundestag; creates distinctive forecasting uncertainty for parties polling near this boundary. → Chapter 22: Down-Ballot and Global Forecasting
`(election_date - df["date"]).dt.days`
This subtracts each poll date from the election date, producing a Series of `timedelta` values. The `.dt.days` accessor extracts the number of days from each value as an integer. A poll fielded on October 20 for a November 5 election produces `(2024-11-05 - 2024-10-20).days = 16`. The `.clip(lower=0)` call ensures that polls → Chapter 21: Building a Simple Election Model (Python Lab)
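A minimal sketch of this computation, assuming a hypothetical three-row `df` and a November 5, 2024 election date. The `days_out` column name is illustrative and does not come from the lab.

```python
import pandas as pd

# Hypothetical poll dates; the lab's real data would be loaded from a file.
df = pd.DataFrame({"date": pd.to_datetime(["2024-10-20", "2024-11-01", "2024-11-07"])})
election_date = pd.Timestamp("2024-11-05")

# Timestamp minus a datetime Series yields a Series of timedeltas;
# .dt.days turns each one into an integer day count.
days_out = (election_date - df["date"]).dt.days

# Dates after the election give negative counts, which clip(lower=0) raises to 0.
df["days_out"] = days_out.clip(lower=0)
print(df["days_out"].tolist())  # [16, 4, 0]
```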
`(simulated_margins > 0).mean()`
This is a vectorized computation that converts the array of simulated margins to a Boolean array (True where Garza wins, False where Whitfield wins) and takes the mean. Since True = 1 and False = 0, the mean of a Boolean array is the proportion of True values — which is the win probability. This is → Chapter 21: Building a Simple Election Model (Python Lab)
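The Boolean-mean idiom can be checked on a tiny array. This sketch assumes five made-up margins (Garza minus Whitfield, in points) in place of the lab's simulated draws.

```python
import numpy as np

# Hypothetical simulated margins; positive means Garza wins.
simulated_margins = np.array([2.1, -0.4, 1.3, 0.8, -1.7])

garza_wins = simulated_margins > 0   # Boolean array: True where margin > 0
win_probability = garza_wins.mean()  # True counts as 1, False as 0

print(win_probability)  # 0.6 (three of five margins are positive)
```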
`.copy()`
When you filter a DataFrame with boolean indexing, pandas does not always make clear whether the result is a "view" of the original data or an independent copy. If you then modify the result, pandas may warn you about setting values on a copy of a slice. The `.copy()` call creates an independent copy of the filtered data, preventing this ambiguity. Developing the habit of calling `.copy → Chapter 21: Building a Simple Election Model (Python Lab)
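A short sketch of the filter-then-copy habit, using a hypothetical `polls` table. The column names and the 0.5-point adjustment are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical polling data.
polls = pd.DataFrame({"pollster": ["A", "B", "A"], "margin": [2.0, -1.0, 3.5]})

# Filter with a Boolean mask, then take an explicit independent copy.
subset = polls[polls["pollster"] == "A"].copy()

# Adding a column to the copy is unambiguous and leaves `polls` untouched.
subset["margin_adj"] = subset["margin"] - 0.5
```

After this runs, `polls` still has only its two original columns.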
`np.random.normal(0, poll_avg_sd, n_simulations)`
This generates `n_simulations = 100,000` draws from a Normal distribution centered at 0 with standard deviation `poll_avg_sd = 1.5`. Each draw represents the polling error in one simulated election. The mean of 0 reflects our assumption that the polling average is unbiased; the standard d → Chapter 21: Building a Simple Election Model (Python Lab)
`np.random.seed(42)`
Setting the random seed makes the simulation reproducible: every time you run the code with the same seed, you get the same sequence of random draws. This is essential for debugging and for sharing results with colleagues who need to verify your numbers. In production, you might want to run multiple → Chapter 21: Building a Simple Election Model (Python Lab)
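The sketch below illustrates both the seeded draws and their reproducibility. The `simulate` helper and the `poll_avg_margin` value of 1.0 are assumptions added for illustration; only `n_simulations`, `poll_avg_sd`, and the seed of 42 come from the lab text.

```python
import numpy as np

n_simulations = 100_000
poll_avg_sd = 1.5
poll_avg_margin = 1.0  # hypothetical polling-average margin, not from the lab

def simulate(seed):
    """Return simulated margins for one seeded run."""
    np.random.seed(seed)  # same seed, same sequence of draws
    errors = np.random.normal(0, poll_avg_sd, n_simulations)
    return poll_avg_margin + errors

run_a = simulate(42)
run_b = simulate(42)

# Two runs with the same seed are identical element by element.
print(np.array_equal(run_a, run_b))  # True
```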
The `errors="coerce"` argument converts any value that cannot be parsed as a date to `NaT` (Not a Time — pandas' equivalent of NaN for datetime columns) rather than raising an error. This is safer than the default `errors="raise"` when processing datasets of unknown quality. After this call, scan fo → Chapter 21: Building a Simple Election Model (Python Lab)
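A minimal sketch of the coercion behavior, assuming a made-up `raw` Series with one unparseable entry. Counting the resulting `NaT` values is one possible follow-up check, not necessarily the one the lab uses.

```python
import pandas as pd

# Hypothetical raw date strings; the middle one cannot be parsed.
raw = pd.Series(["2024-10-20", "not a date", "2024-11-01"])

# errors="coerce" turns unparseable values into NaT instead of raising.
parsed = pd.to_datetime(raw, errors="coerce")

# Count how many rows failed to parse.
n_bad = int(parsed.isna().sum())
print(n_bad)  # 1
```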
`race_df['pollster'].value_counts()`
This diagnostic output is intentional. Before any analysis, you want to know which pollsters contributed data and how many polls each produced. A dataset where one pollster is responsible for 70 percent of the polls raises different issues than a dataset evenly distributed across many firms. → Chapter 21: Building a Simple Election Model (Python Lab)
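As a rough illustration, the same call with `normalize=True` reports each pollster's share directly, which makes a dominant firm easy to spot. The five-row `race_df` is hypothetical.

```python
import pandas as pd

# Hypothetical poll list; pollster names are placeholders.
race_df = pd.DataFrame({"pollster": ["Meridian", "Meridian", "Acme", "Meridian", "Other"]})

counts = race_df["pollster"].value_counts()                # polls per pollster
shares = race_df["pollster"].value_counts(normalize=True)  # proportion per pollster

print(counts["Meridian"], shares["Meridian"])  # 3 0.6
```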
A/B test
A randomized experiment comparing two (or more) versions of a communication (email subject lines, digital ads, mailer headlines) to determine which produces better outcomes (open rates, click-throughs, donations, volunteer signups). The campaign equivalent of a clinical trial. *See also: RCT.* (Ch. → Appendix G: Glossary of Key Terms in Political Analytics
AAPOR
American Association for Public Opinion Research. The primary professional organization for survey researchers in the United States. AAPOR publishes standards for reporting polling methodology (the Transparency Initiative), standard definitions of response rates, and a code of professional ethics. Its post-ele → Appendix G: Glossary of Key Terms in Political Analytics
accessibility
who can get it, under what conditions, and at what cost. Data accessibility is not just a technical issue; it is a political one, because it determines who can participate in data-driven analysis and who is excluded. → Chapter 3: The Political Data Ecosystem
Acquiescence bias
The tendency of survey respondents to agree with any proposition presented to them, regardless of its content. Respondents may agree with "The government should do more to reduce crime" and also agree with "The government should stay out of crime-fighting and leave it to local communities." Acquiesc → Appendix G: Glossary of Key Terms in Political Analytics
Affective polarization
The tendency of partisans to view members of their own party favorably and members of the opposing party with hostility, disgust, or contempt — independent of policy disagreements. Measured directly via *feeling thermometers* and *in-group/out-group* difference scores. Distinguished from *ideologica → Appendix G: Glossary of Key Terms in Political Analytics
Agenda-setting
The theory that media influence public opinion not by telling people what to think, but by telling them what to think about. Issues that receive extensive media coverage become more salient to citizens, who then weight those issues more heavily in political evaluations. *See also: priming, framing e → Appendix G: Glossary of Key Terms in Political Analytics
Aggregate bias
A systematic error that is consistent in direction across many polls or measurements, causing the aggregate of polls to be biased rather than merely variable. Distinguished from random error, which cancels out in averages. Aggregate bias persists even in large samples if the underlying methodology i → Appendix G: Glossary of Key Terms in Political Analytics
Alan Abramowitz's Time for Change model
one of the most celebrated and most scrutinized fundamentals models — includes presidential net approval (approval minus disapproval) in June of the election year as a key input. Abramowitz chose June as the reference point because it's when the model is typically published, long before the campaign → Chapter 18: Fundamentals Models: The Economy, Incumbency, and Structure
Algorithm auditing
running model performance statistics disaggregated by race — is an essential step before deploying any predictive model in political targeting contexts. Bias discovered before deployment is correctable; bias discovered after is a harm already done. → Chapter 39 Key Takeaways: Race, Representation, and Data Justice
What does the organic/paid reach ratio tell you about the two videos' relative performance? - The Whitfield video had significantly higher average watch time despite being longer (6 minutes vs. 4 minutes). What does this suggest about audience engagement? - The comments column shows a qualitative me → Case Study 31-2: Whitfield's Viral Moment and Garza's Response
Anchoring effect
A cognitive bias in which respondents' answers are influenced by a reference point (the "anchor") presented in the question or preceding questions. In political surveys, a question presenting a specific policy amount (e.g., "Should the minimum wage be raised to $15?") anchors subsequent judgments ab → Appendix G: Glossary of Key Terms in Political Analytics
**2.** The "paradox of participation" refers to the puzzle that: → Chapter 14 Quiz
Answer: c
**4.** Which of the following GOTV interventions has the strongest evidence base for increasing voter turnout? → Chapter 14 Quiz
Answer: d
**5.** Automatic Voter Registration (AVR) primarily addresses which barrier to participation? → Chapter 14 Quiz
Applications to contemporary analysis:
The Affordable Care Act's passage and the Republican Party's failure to repeal it created political opportunities for healthcare activist movements: the Democratic primary field's responsiveness to Medicare for All demands, town hall protests against repeal. - The George Floyd murder occurring durin → Chapter 35: Social Movements and Protest Analytics
Applications to contemporary movements:
Black Lives Matter's rapid growth after Ferguson (2014) reflected pre-existing infrastructure: Movement for Black Lives organizations, established activist networks, social media following of movement leaders, and philanthropic support from foundations that had been funding racial justice work for y → Chapter 35: Social Movements and Protest Analytics
Apply the model as if you were forecasting blind
use only data that was available before the election, not anything known afterward. This means not including polls fielded after Election Day, not incorporating post-election revised economic figures, and not using any knowledge of who actually won. 3. **Record the model's probability for each race* → Chapter 21: Building a Simple Election Model (Python Lab)
Area probability sampling
A form of *probability sampling* in which geographic areas are sampled at the first stage, then households within selected areas, then individuals within selected households. Used primarily in face-to-face surveys. Avoids the need for a complete sampling frame of individuals. *See also: probability → Appendix G: Glossary of Key Terms in Political Analytics
Authoritarianism
A personality orientation or value system characterized by preference for social conformity, deference to authority, hostility to out-groups, and desire for strong leadership. Measured in surveys via items about child-rearing values or directly via "social dominance orientation" and "right-wing auth → Appendix G: Glossary of Key Terms in Political Analytics
Available data:
2014 and 2016 presidential election results by municipality (from the Tribunal Superior Eleitoral — Brazil's electoral authority) - Party registration and candidate filing data - Economic indicators: state-level unemployment and GDP per capita - Social indicators: Bolsa Família (conditional cash tra → Case Study 22.2: Forecasting Without Polls — The 2018 Brazilian Municipal Elections
Ballot measure
A legislative question placed directly before voters, either through citizen initiative, legislative referral, or other mechanisms. Analytics for ballot measures require different approaches than candidate races: there is no voter history for the specific measure, framing and messaging research is e → Appendix G: Glossary of Key Terms in Political Analytics
Ballot test
A survey question that simulates the act of voting by presenting respondents with the names and offices of candidates and asking how they would vote if the election were held today. The most common ballot test format is the "head-to-head" ballot test between two candidates. Also called the "horse ra → Appendix G: Glossary of Key Terms in Political Analytics
Base rate
The unconditional probability of an event occurring, before any case-specific information is applied. In political analytics, the base rate for an incumbent seeking reelection, or for the party holding the presidency in a midterm, provides the prior probability in a *Bayesian* analysis before polls → Appendix G: Glossary of Key Terms in Political Analytics
Battleground state
A state in which neither major party has a reliable advantage in presidential elections, making it competitive and therefore a primary target of campaign resources. Also called "swing state." The definition of battleground states shifts over time as partisan composition and demographics change. Stat → Appendix G: Glossary of Key Terms in Political Analytics
Bayesian updating
The process of revising probability estimates in light of new evidence according to Bayes' theorem. A prior probability (based on historical base rates or structural fundamentals) is updated using the likelihood of observing the new evidence under each possible state of the world, producing a poster → Appendix G: Glossary of Key Terms in Political Analytics
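A worked sketch of a single update for a binary outcome. The `bayes_update` function and all three input probabilities are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a hypothesis after observing one piece of evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator

# Hypothetical inputs: a 60% prior that the incumbent wins, and a poll result
# that is 80% likely if she is ahead but 30% likely if she is behind.
posterior = bayes_update(prior=0.60, p_evidence_if_true=0.80, p_evidence_if_false=0.30)
print(round(posterior, 3))  # 0.8
```

Here 0.48 / (0.48 + 0.12) = 0.8, so the evidence raises the estimate from 60% to 80%.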
Bellwether
A county, precinct, state, or other geographic unit whose electoral results consistently predict the national outcome. Bellwethers are useful for tracking national results on election night, but their predictive reliability tends to degrade over time as demographic change alters their composition. → Appendix G: Glossary of Key Terms in Political Analytics
Benchmark poll
A comprehensive baseline survey conducted at the outset of a campaign, before significant public communication has occurred. Benchmark polls measure candidate name recognition, initial vote share, attribute ratings, issue salience, and opponent vulnerabilities. They establish the baseline against wh → Appendix G: Glossary of Key Terms in Political Analytics
Blocking
A procedure in which experimental subjects are divided into homogeneous groups (blocks) before random assignment, ensuring that treatment and control groups are balanced on known characteristics. In political experiments, common blocking variables include party affiliation, prior vote history, geogr → Appendix G: Glossary of Key Terms in Political Analytics
Boomerang effect
The tendency for persuasive messages to produce attitude change in the opposite direction from what was intended, particularly when the message is perceived as heavy-handed or the recipient has strong prior commitments. Campaign negative ads can produce boomerang effects when they are perceived as u → Appendix G: Glossary of Key Terms in Political Analytics
Brier score
the mean squared error between the predicted probability and the actual binary outcome — is the standard metric for evaluating probabilistic forecasts. A model that assigns 50 percent probability to every race achieves a Brier score of 0.25 (the "no-skill baseline"). A perfect model that assigns 100 → Chapter 21: Building a Simple Election Model (Python Lab)
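The definition translates into a few lines of code. This sketch assumes four hypothetical race outcomes; `brier_score` is an illustrative helper name, not one taken from the lab.

```python
import numpy as np

def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    return float(np.mean((probs - outcomes) ** 2))

outcomes = [1, 0, 1, 1]  # hypothetical results: 1 = forecasted candidate won

no_skill = brier_score([0.5, 0.5, 0.5, 0.5], outcomes)  # 50% on every race
perfect = brier_score([1.0, 0.0, 1.0, 1.0], outcomes)   # certain and always right

print(no_skill, perfect)  # 0.25 0.0
```

Lower scores are better, so a model is adding skill only if it scores below the 0.25 baseline.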
C
Calibration
A property of probabilistic forecasts: a forecast is well-calibrated if events predicted at 70% probability actually occur approximately 70% of the time, events predicted at 90% occur about 90% of the time, etc. Calibration is assessed across many predictions of similar probability. *See also: proba → Appendix G: Glossary of Key Terms in Political Analytics
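One simple way to check calibration is to group forecasts by stated probability and compare each group's observed win rate. The sketch below assumes twenty hypothetical forecasts already sitting at exactly 0.7 and 0.9; with real forecasts the probabilities would usually be binned first, for example with `pd.cut`.

```python
import pandas as pd

# Hypothetical forecasts: ten at 70% and ten at 90%, with made-up outcomes.
forecasts = pd.DataFrame({
    "prob": [0.7] * 10 + [0.9] * 10,
    "won": [1, 1, 1, 1, 1, 1, 1, 0, 0, 0] + [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
})

# Observed frequency and number of forecasts at each stated probability.
calibration = forecasts.groupby("prob")["won"].agg(["mean", "count"])
print(calibration)
```

In this constructed example the observed rates match the stated probabilities, which is what a well-calibrated forecaster would show over many predictions.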
Call back
The practice of attempting to contact a selected household at multiple times and on multiple days before giving up. Call-back protocols are critical for telephone surveys: households that are hard to reach (because residents work long hours, travel frequently, or are otherwise busy) tend to have dif → Appendix G: Glossary of Key Terms in Political Analytics
Campaigns can:
Mobilize base voters, particularly low-propensity supporters who would vote for their candidate if they vote at all - Frame issues — increase the salience of favorable issues and decrease the salience of unfavorable ones - Introduce challengers — build name recognition and positive associations for → Chapter 15: Campaign Effects — Do They Matter?
Campaigns cannot:
Convert committed strong partisans of the opposing party in meaningful numbers - Overcome large structural disadvantages (a 20-point structural deficit is beyond the reach of any documented campaign effect) - Produce effects that persist without continued reinforcement — effects decay, requiring con → Chapter 15: Campaign Effects — Do They Matter?
Canvassing
Door-to-door voter contact in which campaign volunteers or paid staff visit homes of targeted voters to deliver a scripted message and record contact information. The most effective GOTV and persuasion tactic per contact, but also the most expensive. Effectiveness is well-documented through randomiz → Appendix G: Glossary of Key Terms in Political Analytics
Canvassing: Estimating cost
Volunteer canvassing direct costs: $15-25 per completed contact (staff coordination, materials, training time) - If using paid canvassers: $40-60 per completed contact - Typically 80-90% of canvassing is done by volunteers; paid canvassers fill gaps → Capstone 3 Data Appendix: The Campaign Analytics Plan
Causal inference
The process of determining whether a relationship between two variables is causal (X causes Y) rather than merely correlational. In political analytics, causal inference requires either random assignment (as in *randomized controlled trials*) or quasi-experimental designs that approximate random ass → Appendix G: Glossary of Key Terms in Political Analytics
Cell phone problem
The methodological challenge created by the shift of American households from landline to cell phone usage. Telephone Consumer Protection Act provisions prohibit automated dialing to cell phones, requiring human interviewers (higher cost). Cell phone numbers cannot be localized geographically. Cell- → Appendix G: Glossary of Key Terms in Political Analytics
Chapter Appearances:
Ch. 1: Introduction of the race and all four characters; sets up the analytical stakes - Ch. 5: Nadia explores the ODA dataset; first look at state demographics - Ch. 8: Meridian polls the Garza-Whitfield race; sampling challenges in diverse state - Ch. 10: Python analysis of Garza-Whitfield polling → Continuity Tracking Document
Chapter Overview
Why this topic matters, what you'll learn - **Main Sections** (4–7 sections) — Core content with embedded activities - **Practical Considerations** — Real-world applications and common mistakes - **Chapter Summary** — Key concepts, studies, debates, and frameworks - **What's Next** — Preview of the → How to Use This Book
**Maria Garza** (D): Former state Attorney General, 47 years old, daughter of Mexican immigrants. Runs on healthcare access, education funding, immigration reform. Pragmatic progressive — not far-left but clearly progressive. Background: law degree, AG's office, built reputation prosecuting corporat → Continuity Tracking Document
Chart Type Matching
Choropleth maps: geographic variation - Bar charts: category comparison; stacked for composition, grouped for direct comparison - Line charts: temporal trends; only for sequential data - Scatter plots: bivariate relationships between continuous variables - Heatmaps: two-dimensional crosstabulation - → Chapter 16 Key Takeaways
Cluster randomization
Randomization at the group level (households, precincts, blocks) rather than the individual level, used to prevent spillover between treatment and control units. → Chapter 30: Field Experiments in Politics
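A minimal sketch of precinct-level assignment, assuming a hypothetical eight-voter file with four precincts. Half of the precincts are drawn at random, and every voter inherits the assignment of his or her precinct.

```python
import numpy as np
import pandas as pd

# Hypothetical voter file: two voters in each of four precincts.
voters = pd.DataFrame({
    "voter_id": range(8),
    "precinct": ["P1", "P1", "P2", "P2", "P3", "P3", "P4", "P4"],
})

np.random.seed(42)  # fixed seed so the assignment can be reproduced
precincts = voters["precinct"].unique()

# Randomize at the cluster level: pick half of the precincts for treatment.
treated_precincts = np.random.choice(precincts, size=len(precincts) // 2, replace=False)

# Each voter takes the treatment status of the whole precinct.
voters["treated"] = voters["precinct"].isin(list(treated_precincts))
```

Because assignment happens per precinct, no precinct contains a mix of treated and untreated voters.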
Cluster sampling
A sampling procedure in which the population is divided into naturally occurring clusters (e.g., counties, precincts, households), a random sample of clusters is selected, and all or a random subset of units within each selected cluster are interviewed. Cluster sampling is operationally efficient bu → Appendix G: Glossary of Key Terms in Political Analytics
Color Scale Principles
Sequential scales (one hue, varying lightness): for data with one-directional range - Diverging scales (two contrasting hues meeting at a neutral center): for data with a natural midpoint - Always use colorblind-accessible palettes (ColorBrewer, viridis family) - Maintain the conventional red (Repub → Chapter 16 Key Takeaways
Comparative Perspectives
Cleavage theory (Lipset & Rokkan): party systems frozen around historical social divisions — national revolution (church-state, center-periphery) and industrial revolution (class) - Valence model: competition over who can best deliver shared goals (competence, integrity) rather than competing vision → Chapter 11 Key Takeaways: The American Voter and Beyond
Confidence interval
A range of values computed from sample data within which the true population parameter is estimated to fall, at a specified confidence level (usually 90%, 95%, or 99%). A 95% confidence interval means that if the sampling procedure were repeated many times, 95% of the resulting intervals would contain t → Appendix G: Glossary of Key Terms in Political Analytics
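For a poll proportion, a common textbook approximation is the normal-approximation interval under simple random sampling. The sketch below assumes that method and a hypothetical poll of 1,000 respondents; real polls with weighting or clustering have wider intervals.

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Normal-approximation interval for a proportion; z=1.96 gives roughly 95%."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    return p_hat - z * se, p_hat + z * se

# Hypothetical poll: 50% support among 1,000 respondents.
low, high = proportion_ci(0.50, 1000)
margin_of_error = (high - low) / 2
print(round(margin_of_error, 3))  # 0.031, i.e. about plus or minus 3.1 points
```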
construct validity
whether the operationalization captures the theoretical concept — is the most debated form of validity. Does "political ideology" measured by self-placement on a 7-point liberal-conservative scale capture the multidimensional ideological space that political theorists describe? Probably not fully. T → Appendix A: Research Methods Primer
Contingent valuation
A survey method that asks respondents to express the monetary value they place on a good or outcome. Used in policy research to measure the value of public goods. Applied in political analytics to estimate willingness to pay for policy outcomes or to prioritize among policy options. → Appendix G: Glossary of Key Terms in Political Analytics
Correlated errors
The tendency for polling errors in different states to run in the same direction and by similar amounts due to shared underlying causes. Correlated errors eliminate the error-diversification that probabilistic models rely on, producing win probabilities that are far too extreme. → Chapter 20: When Models Fail: 2016, 2020, and Beyond
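A small simulation can show why this matters. The sketch below assumes five hypothetical states, each with the same 2-point lead and a total error standard deviation of 3 points, and compares the chance of losing all five when errors are independent versus when most of the error is shared. All numbers and the `sweep_loss_prob` helper are illustrative assumptions.

```python
import numpy as np

np.random.seed(0)
n_sims, n_states = 50_000, 5
lead, total_sd = 2.0, 3.0  # hypothetical lead and total error SD in every state

def sweep_loss_prob(shared_sd):
    """Share of simulations in which the leader loses every state."""
    # Split the total error variance into a shared part and a state-specific part.
    state_sd = np.sqrt(total_sd**2 - shared_sd**2)
    shared = np.random.normal(0, shared_sd, (n_sims, 1))        # same error in all states
    local = np.random.normal(0, state_sd, (n_sims, n_states))   # independent per state
    margins = lead + shared + local
    return (margins < 0).all(axis=1).mean()

independent = sweep_loss_prob(0.0)  # no shared component
correlated = sweep_loss_prob(2.5)   # most of the error is shared

print(independent, correlated)
```

Each state's individual error has the same spread in both runs, yet the all-states-lost scenario is far more likely in the correlated run. A model that assumes independence would understate that risk.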
Coverage error
Error arising when members of the target population have no chance of being selected because they are not included in the *sampling frame*. Examples include online polls that exclude people without internet access, telephone polls that exclude people without phones, and polls using listed telephone → Appendix G: Glossary of Key Terms in Political Analytics
Ipsos, Gallup, YouGov, and the various national academic survey programs that coordinate through the International Social Survey Programme and Comparative Study of Electoral Systems — have developed specialized techniques for navigating these cross-national differences. But the fundamental challenge → Chapter 6: What Is Public Opinion?
Cue-taking
The process by which citizens rely on information shortcuts (cues) from trusted sources — parties, interest groups, candidates, or peers — to make political judgments without processing detailed policy information. Party identification is the most powerful political cue in American politics. Elite c → Appendix G: Glossary of Key Terms in Political Analytics
Custom Audiences
Facebook's advertising feature that allows campaigns to target specific individuals by matching their email addresses or phone numbers to Facebook accounts. → Chapter 29: Voter Targeting and Microtargeting
D
Data merge
The process of combining records from two or more datasets using a common identifier (often a voter file record ID, or a combination of name, date of birth, and address). Merging commercial consumer data, voter file data, and survey data is the foundation of voter targeting. The quality of the merge → Appendix G: Glossary of Key Terms in Political Analytics
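A minimal sketch of a merge on a shared identifier, using two hypothetical tables. The `indicator=True` option of `DataFrame.merge` adds a `_merge` column that makes the match rate easy to compute, which is one way to monitor merge quality.

```python
import pandas as pd

# Hypothetical voter file and survey responses sharing a voter_id key.
voter_file = pd.DataFrame({"voter_id": [1, 2, 3], "precinct": ["P1", "P2", "P1"]})
survey = pd.DataFrame({"voter_id": [2, 3, 4], "support": [1, 0, 1]})

# Left merge keeps every voter-file record; _merge records where each row matched.
merged = voter_file.merge(survey, on="voter_id", how="left", indicator=True)

# Share of voter-file records that found a survey match.
match_rate = (merged["_merge"] == "both").mean()
print(round(match_rate, 3))  # 0.667 (two of three voters matched)
```

Survey respondent 4 has no voter-file record and is dropped by the left merge, so unmatched records on both sides deserve a look.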
28 LV polls, 14 RV polls, 5 Adult polls - Average sample size: 682 respondents - Methodologies: 13 CATI, 6 Online-Probability, 18 Online-Opt-in, 8 Mixed, 2 IVR - Pollster sponsors: 8 independent, 4 campaign-commissioned, 3 party-affiliated, 32 commercial - Date range: 120 days before election to 3 d → Case Study 10-2: Building a Race Rating System — The Forecaster's Perspective
Demographic Destiny: The Fallacy
Population growth of Democratic-leaning groups does not automatically translate into Democratic electoral gains - Three mechanisms decouple population and electoral trends: differential turnout, within-group political change, party adaptation and counter-mobilization - "Emerging majority" projection → Chapter 13 Key Takeaways: Demographics and the Electorate
Demographic determinism
The error of assuming that an individual's political preferences can be reliably predicted from their demographic characteristics (race, gender, education, age) alone. While demographics are powerful predictors at the aggregate level, individual-level prediction is far more uncertain. Demographic de → Appendix G: Glossary of Key Terms in Political Analytics
Difference-in-differences
A quasi-experimental method that estimates causal effects by comparing changes over time between treatment and comparison groups. → Chapter 30: Field Experiments in Politics
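The core arithmetic can be sketched with four group averages. The turnout figures below are hypothetical, and the estimate is only meaningful if the two groups would have followed parallel trends without the treatment.

```python
def diff_in_diff(treat_before, treat_after, comp_before, comp_after):
    """Change in the treatment group minus change in the comparison group."""
    return (treat_after - treat_before) - (comp_after - comp_before)

# Hypothetical turnout percentages before and after a program.
effect = diff_in_diff(treat_before=40.0, treat_after=46.0,
                      comp_before=41.0, comp_after=44.0)
print(effect)  # 3.0 (a 6-point rise minus a 3-point rise)
```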
Differential nonresponse
The general phenomenon in which certain subpopulations are less likely to respond to surveys, creating unrepresentative samples if weighting does not correct for the disparity. → Chapter 20: When Models Fail: 2016, 2020, and Beyond
email open rates, click-through rates, ad engagement — provides signal about which voters are paying attention and which messages are resonating. But digital engagement is heavily selected: the voters who engage with political ads are different in systematic ways from those who don't. → Chapter 28: The Modern Data-Driven Campaign
Divergence framing
A news framing strategy that emphasizes conflict, disagreement, and partisan division in coverage of political events. Divergence framing can amplify the appearance of polarization beyond what actually exists in public opinion. Contrasted with *consensus framing* (emphasizing agreement) or *complica → Appendix G: Glossary of Key Terms in Political Analytics
donor network analysis
mapping the connections between donors, committees, and candidates to reveal patterns of coordination, influence, and shared interest that are not visible in any single filing. → Chapter 36: Money in Politics: Following the Data
Down-ballot
Referring to candidates or measures appearing below the top-of-ticket (presidential or gubernatorial) races on the ballot. Down-ballot races (state legislature, local office, ballot measures) typically receive less media attention, suffer from lower voter familiarity, and experience substantial *coattail effects* from the → Appendix G: Glossary of Key Terms in Political Analytics
E
Ecological fallacy
The error of inferring individual-level relationships from aggregate-level data. For example, if precincts with more college-educated voters tend to be more Democratic, it does not necessarily follow that college-educated individuals within those precincts are more Democratic — there could be other → Appendix G: Glossary of Key Terms in Political Analytics
Ecological inference
Statistical methods for estimating individual-level relationships from aggregate data. Used extensively in political science where individual-level data (surveys) are unavailable or unreliable, particularly for historical elections or in international contexts. Associated with Gary King's EI method. → Appendix G: Glossary of Key Terms in Political Analytics
Economic voting
The tendency of voters to reward or punish incumbent parties for economic conditions, especially income growth and unemployment. The central mechanism of the *fundamentals model*. Classic studies by Kramer (1971), Fair (1978), and Hibbs (1987) established the empirical basis; ongoing debate concerns → Appendix G: Glossary of Key Terms in Political Analytics
Effective sample size
The sample size that accounts for the statistical efficiency loss introduced by weighting; always less than or equal to the nominal sample size when weights are applied. → Chapter 20: When Models Fail: 2016, 2020, and Beyond
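One widely used formula is Kish's approximation, the squared sum of the weights divided by the sum of squared weights. The source does not say which formula the chapter uses, so treat this as an assumption; the weights are hypothetical.

```python
import numpy as np

def kish_effective_n(weights):
    """Kish approximation: (sum of weights)^2 / (sum of squared weights)."""
    w = np.asarray(weights, dtype=float)
    return float(w.sum() ** 2 / (w ** 2).sum())

equal = kish_effective_n([1, 1, 1, 1])            # equal weights: no efficiency loss
unequal = kish_effective_n([0.5, 0.5, 1.5, 1.5])  # unequal weights shrink the sample

print(equal, unequal)  # 4.0 3.2
```

With equal weights the effective size equals the nominal size of 4; the unequal weights reduce it to 3.2.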
Electoral College
The system by which the U.S. president is elected, in which each state receives a number of electors equal to its congressional delegation (House seats + 2 senators). In 48 states, the winner of the popular vote in that state receives all of its electoral votes (winner-take-all). This structure mean → Appendix G: Glossary of Key Terms in Political Analytics
Electoral victory
the populist leader or party wins office through democratic means, often with genuine majority support. 2. **Institutional capture** — using the mandate of electoral victory to pack courts, subordinate independent prosecutors, capture public media, and weaken oversight bodies. 3. **Legal weaponizati → Chapter 34: Populism: Measurement, Causes, and Consequences
Emphasis framing
A type of *framing effect* in which a particular attribute, consideration, or value is made more salient in the presentation of an issue, affecting how audiences evaluate it, without changing the underlying facts. "Emphasizing" the economic cost of immigration policy versus its cultural effects, or → Appendix G: Glossary of Key Terms in Political Analytics
Endorsement experiment
A survey experiment that measures support for a policy by varying whether it is attributed to a particular leader, group, or country. If the same policy is more supported when attributed to a popular leader, the difference estimates the endorsement effect. Used to study cue-taking and to measure pol → Appendix G: Glossary of Key Terms in Political Analytics
Entry-level (0-2 years experience):
Campaign analytics: $50,000-$70,000 (plus boom-bust volatility) - Survey research firms: $50,000-$70,000 - Civic tech nonprofits: $45,000-$65,000 - Political consulting firms: $55,000-$80,000 - Government/legislative staff: $45,000-$65,000 - Academic (PhD stipend): $25,000-$40,000 - Data journalism: → Chapter 41: Careers in Political Analytics
Equivalence framing
Presenting logically equivalent information in different surface forms to demonstrate that framing shapes judgment. The classic example: describing a medical treatment as having a "95% survival rate" versus a "5% death rate" produces different evaluations even though the two statements are logically equivalent. In politics: "97 out of 100 sc → Appendix G: Glossary of Key Terms in Political Analytics
Estimated spread:
Facebook shares/engagements: ___ - Twitter/X impressions: ___ - TikTok views: ___ - Mainstream media pickups: ☐ None ☐ Local ☐ National — outlets: _______________ → Appendix F: Templates and Worksheets
Evidence against simple translation:
Tufekci herself notes a paradox: social media lowers the organizational costs of getting people into the streets, but it may also lower the organizational strength that sustained campaigns require. A movement that can assemble 100,000 people through viral mobilization but has no organizational struc → Chapter 35: Social Movements and Protest Analytics
Evidence against:
Many Trump supporters in 2016 had above-average incomes; similar findings in European right populism - Thomas Piketty's analyses showed the populist right disproportionately attracts less-educated voters, but education correlates only loosely with economic hardship - Post-2008 austerity hit Southern → Chapter 34: Populism: Measurement, Causes, and Consequences
Evidence for translation:
Philip Howard and Muzammil Hussain's research on the Arab Spring shows that Twitter activity in Egypt and Tunisia in the weeks before uprisings began predicted the geographic spread of subsequent protests, even when controlling for pre-existing political grievance levels. - Zeynep Tufekci's research → Chapter 35: Social Movements and Protest Analytics
Evidence in favor:
The geographic correlation between Trump 2016 support and deindustrialized communities (Autor, Dorn, and Hanson's research on Chinese import competition) - Populist surges in post-2008 austerity Europe (Spain's Podemos, Greece's SYRIZA, Italy's Five Star Movement) - Historical cases: Great Depressio → Chapter 34: Populism: Measurement, Causes, and Consequences
Exit poll
A survey conducted at polling places on Election Day in which voters who have just cast their ballots are asked to complete a questionnaire about their vote choices, demographic characteristics, and issue priorities. Conducted by Edison Research for the National Election Pool (ABC, CBS, NBC, CNN, Fo → Appendix G: Glossary of Key Terms in Political Analytics
Feeling thermometer
A survey question that asks respondents to rate their warmth or coolness toward a person, group, or institution on a scale from 0° (very cold/negative) to 100° (very warm/positive). Used extensively in the *American National Election Studies* (ANES). Feeling thermometers capture *affective* (emotion → Appendix G: Glossary of Key Terms in Political Analytics
Field data manager
The campaign staff member responsible for ensuring data flows correctly between the field organizing program and the analytics infrastructure. → Chapter 28: The Modern Data-Driven Campaign
estimated credit score ranges, debt-related consumer patterns, income volatility markers — are commonly present in commercial data packages. Campaigns that use financial anxiety as a mobilizing frame have an incentive to target those experiencing it. The ethics of targeting people in moments of fina → Chapter 38: Ethics of Political Analytics
Focus group
A small-group qualitative research method in which a moderator facilitates discussion among 6–12 participants on a defined topic (a candidate, a message, an advertisement, a policy position). Focus groups generate hypotheses and reveal the language people use to think about issues; they are not repr → Appendix G: Glossary of Key Terms in Political Analytics
For campaign decision-making under uncertainty:
A 75% win probability means a real 25% loss probability — plan for it - Scenario analysis tells you which scenarios require attention and which resources to deploy where - High uncertainty is a signal to gather more information, not to assume the comfortable outcome → Chapter 19 Key Takeaways: Probabilistic Forecasting and Uncertainty
For campaign strategy:
Campaigns operating in unfavorable structural environments face a steeper climb; campaigns in favorable environments have more room to make mistakes. Knowing the structural baseline helps set realistic expectations. - The fundamentals don't determine outcomes — they constrain the range of plausible → Chapter 18 Key Takeaways: Fundamentals Models
For each platform, document:
Follower/subscriber count - Posting frequency (how many posts in the last 7 days) - Average engagement rate (likes + comments + shares ÷ followers) - Content mix (video vs. image vs. text) - Evidence of paid promotion (look for "Sponsored" labels) - Your assessment of content quality and strategy → Chapter 31 Exercises
For election analysis:
Start any race analysis with a structural baseline before looking at polls. What do economic conditions and approval ratings suggest? - When polls and fundamentals diverge, investigate why — both are providing information about the same race from different angles. - Use fundamentals as an anchor for → Chapter 18 Key Takeaways: Fundamentals Models
For producing polls that enter aggregations:
Methodological transparency improves quality ratings and ultimately quality weighting - Disclosing likely voter screen design, weighting procedures, and crosstabs earns transparency credit - Timing matters enormously — early polls contribute little to late aggregates due to recency weighting - Under → Chapter 17 Key Takeaways: Poll Aggregation
For reading aggregations:
Always look at which polls are included, not just the headline average - Check whether aggregators are adjusting for house effects - Be more confident when multiple methodologically different aggregators agree; investigate divergences - Remember that the aggregate measures current opinion, not a pre → Chapter 17 Key Takeaways: Poll Aggregation
For reading probabilistic forecasts:
Read the confidence interval or scenario range, not just the win probability headline - Ask: what conditions would have to be true for the minority-probability outcome to materialize? - Evaluate forecasters by their calibration record across many predictions, not by any single election - Be especial → Chapter 19 Key Takeaways: Probabilistic Forecasting and Uncertainty
Foundations of Political Analytics
The age of political data, history of polling, political data ecosystem, thinking like an analyst 2. **Public Opinion and Survey Research** — Survey design, sampling, question wording, weighting, and aggregation 3. **Elections and Voting Behavior** — Electoral systems, voter turnout, geographic anal → Political Analytics: From Populism to Polling
Framing effect
The phenomenon in which the way an issue is presented — the words used, the aspects highlighted, the comparisons invoked — influences the attitudes and judgments respondents express, independent of the underlying facts. Framing effects are among the most robust findings in political communication re → Appendix G: Glossary of Key Terms in Political Analytics
Fundamentals model
An election forecasting model that predicts electoral outcomes from pre-campaign variables, namely economic conditions (GDP growth, unemployment, income), presidential approval, and structural factors (incumbency, time-in-office), without incorporating any polling data. Fundamentals models are → Appendix G: Glossary of Key Terms in Political Analytics
Gender gap
The difference in political preferences between men and women. In the United States, women have voted more Democratic than men since at least 1980 (the "gender gap"), and this gap has widened substantially since 2016. The gender gap is distinct from the marriage gap, the education gap, and other dem → Appendix G: Glossary of Key Terms in Political Analytics
Generational replacement
the slow process by which older cohorts die and younger cohorts enter the electorate — is one of the two mechanisms through which demographic change can produce political change. (The other is individual conversion — people changing their minds.) Generational replacement is slow but cumulative: if a → Chapter 13: Demographics and the Electorate
Generic ballot
A survey question asking respondents whether they prefer to vote for the "Republican Party candidate" or the "Democratic Party candidate" for Congress, without naming specific candidates. The generic ballot is a widely used indicator of the national political environment and a key input into congres → Appendix G: Glossary of Key Terms in Political Analytics
Geographic Information System (GIS)
Software systems that store, analyze, and visualize spatial data. In political analytics, GIS tools are used for mapping partisan composition by precinct, planning canvassing routes, analyzing redistricting proposals, and visualizing demographic data. ArcGIS and QGIS are the most common platforms. * → Appendix G: Glossary of Key Terms in Political Analytics
GOTV (Get Out the Vote)
Campaign activities designed to increase turnout among voters who already support the campaign. GOTV operations typically focus on high-support, low-propensity voters — those with *support scores* above a threshold but *turnout scores* below a threshold — and use phone banking, door knocking, ride-t → Appendix G: Glossary of Key Terms in Political Analytics
GOTV Resource Requirement:
Estimated doors/phone dials needed per conversion: ___ - Total contacts needed: ___ - Volunteer hours required: ___ - Priority subgroups within GOTV universe: _______________________________________________ → Appendix F: Templates and Worksheets
All five criteria fully Met: Grade A - Four criteria fully Met, one Partial: Grade A - Three criteria fully Met, two Partial or one Not Met: Grade B+ - Two criteria fully Met, significant Not Met: Grade B - Majority of criteria Not Met: Grade C - Methodology not disclosed: Grade F → Capstone 1 Data Appendix: Data Sources, Methodology Notes, and Analysis Steps
gross rating point (GRP)
a unit that represents 1 percent of the target audience exposed to an advertisement once. Buying 500 GRPs means that, on average, your target audience was exposed to your advertisement five times each (100% coverage, 5 times = 500 GRPs; or 50% coverage, 10 times = 500 GRPs). GRP planning requires de → Chapter 25: Political Advertising: From TV Spots to Targeted Ads
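The reach-times-frequency arithmetic in the parenthetical above can be checked with a short sketch. The helper name `gross_rating_points` is hypothetical and is not drawn from any ad-buying tool; it simply multiplies percent reach by average exposures.

```python
def gross_rating_points(reach_pct: float, avg_frequency: float) -> float:
    """GRPs = percent of the target audience reached x average exposures per person."""
    if not 0 <= reach_pct <= 100:
        raise ValueError("reach_pct must be between 0 and 100")
    if avg_frequency < 0:
        raise ValueError("avg_frequency must be non-negative")
    return reach_pct * avg_frequency


# The two buys described in the entry: different reach, same GRP total.
full_reach = gross_rating_points(100, 5)
half_reach = gross_rating_points(50, 10)
print(full_reach, half_reach)
```

Because very different reach and frequency mixes produce the same total, a GRP figure alone does not say how many distinct people saw the ad.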
Ground game
The field operations component of a campaign, including voter registration, canvassing, phone banking, early vote programs, and Election Day turnout operations. The ground game represents the operational application of analytics: models identify the target universe, and field staff execute the contacts. *See also: GO → Appendix G: Glossary of Key Terms in Political Analytics
Herding
The tendency of polling firms to publish results that cluster around the consensus estimate or around results from prestigious pollsters, suppressing genuine variation in polling results. Herding occurs when pollsters who obtain outlier results adjust their methodology or withhold publication rather → Appendix G: Glossary of Key Terms in Political Analytics
Heuristic
A mental shortcut or rule of thumb used to simplify complex decisions. In voting, heuristics include party identification, candidate attributes (likability, competence), endorsements, and "Who does this candidate look like?" Low-information voters rely more heavily on heuristics; political knowledge → Appendix G: Glossary of Key Terms in Political Analytics
House effect
The systematic tendency of a particular polling firm to produce results that favor one party or candidate more than the average of other polls. House effects can arise from methodological choices (likely voter screen, weighting variables, mode of interviewing) or from deliberate decisions. A firm wi → Appendix G: Glossary of Key Terms in Political Analytics
party manifestos, speeches, social media posts. 2. **Define a populism dictionary** — words and phrases associated with the people-elite distinction. Their original dictionary includes terms like "elite," "establishment," "corrupt," "people's will," "ordinary citizens," "political class." 3. **Calcu → Chapter 34: Populism: Measurement, Causes, and Consequences
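A dictionary approach of this kind can be sketched in a few lines. The term list below is the one quoted in the entry; the scoring rule (share of word tokens that fall inside a dictionary match) is an assumption made for illustration, since the calculation step is cut off here, and `populism_share` is a hypothetical helper name.

```python
import re

# Terms quoted in the entry above; a working dictionary would be longer.
POPULISM_TERMS = [
    "elite", "establishment", "corrupt",
    "people's will", "ordinary citizens", "political class",
]


def populism_share(text: str, terms=POPULISM_TERMS) -> float:
    """Share of word tokens covered by dictionary matches (assumed scoring rule)."""
    lowered = text.lower()
    total_words = len(re.findall(r"[a-z']+", lowered))
    if total_words == 0:
        return 0.0
    matched_words = 0
    for term in terms:
        # Whole-word matching only: "elite" will not match "elites".
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = len(re.findall(pattern, lowered))
        matched_words += hits * len(term.split())
    return matched_words / total_words


sample = "The corrupt establishment ignores ordinary citizens."
score = populism_share(sample)
print(round(score, 3))
```

Exact whole-word matching is the simplest choice; stemming or wildcard patterns would catch plural and inflected forms at the cost of more false positives.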
Identity-based targeting
Digital advertising directed at specific, identified individuals (via voter file matching), as opposed to demographic or interest-based targeting. → Chapter 29: Voter Targeting and Microtargeting
Ideological sorting
The process by which the two major political parties have become more internally homogeneous ideologically and more distinct from each other. "Sorted" parties mean that liberals are concentrated in the Democratic Party and conservatives in the Republican Party — a development that was much less true → Appendix G: Glossary of Key Terms in Political Analytics
campaign communications designed to look like election authority notices, government documents, or official candidate communications from the other party. → Chapter 38: Ethics of Political Analytics
In-group favoritism / out-group derogation
The tendency to evaluate members of one's own social group (in-group) more favorably than members of other groups (out-groups). A core mechanism of *affective polarization*: partisans rate their own party warmly and the opposing party coldly. *See also: affective polarization, feeling thermometer.* → Appendix G: Glossary of Key Terms in Political Analytics
Incumbency effects and candidate quality metrics
**Fundraising data** (which the Federal Election Commission makes public on a quarterly basis) - **Redistricting adjustments** (how new district lines changed the partisan composition) - **Forecaster ratings** from organizations like the Cook Political Report, the Rothenberg-Stewart Political Report → Chapter 22: Down-Ballot and Global Forecasting
Index (composite)
A single summary measure created by combining multiple related survey questions or variables. For example, an economic anxiety index might combine questions about job security, financial situation, and economic optimism. Composite indexes reduce the noise in individual questions and capture latent c → Appendix G: Glossary of Key Terms in Political Analytics
Inferential statistics
Statistical methods that draw conclusions about a *population* from a *sample*. In survey research, inferential statistics include confidence intervals, margins of error, hypothesis tests (t-tests, chi-square), and regression analysis. Distinguished from *descriptive statistics*, which summarize the → Appendix G: Glossary of Key Terms in Political Analytics
Information environment
The totality of political information available to citizens through all channels — news media, social media, interpersonal communication, campaign communication, entertainment media. The information environment shapes the salience of issues, the attributes of candidates that citizens use to evaluate → Appendix G: Glossary of Key Terms in Political Analytics
Polling average (quality-weighted, house-effect adjusted): Garza +3.8 - Structural baseline (Chapter 18 analysis): Garza +4.3 (integrated estimate) - Historical polling error standard deviation in comparable races: 3.2 points - Correlated error adjustment: using historical within-cycle correlations → Chapter 19: Probabilistic Forecasting and Uncertainty
Instrumental variable
A variable used in observational research to approximate the conditions of a randomized experiment. An instrumental variable is correlated with the treatment (independent variable) but affects the outcome only through the treatment, not through any other path. Valid instruments are rare but powerful → Appendix G: Glossary of Key Terms in Political Analytics
Intent-to-treat (ITT)
Analysis comparing treatment-assigned and control-assigned groups regardless of whether treatment was actually received. The operationally relevant estimate for campaigns. → Chapter 30: Field Experiments in Politics
Intraclass correlation (ICC)
A measure of the degree to which units within a cluster (e.g., households within a precinct, precincts within a county) are more similar to each other than to units in other clusters. In *cluster sampling*, a high ICC means that clusters are very homogeneous, and sampling more clusters rather than m → Appendix G: Glossary of Key Terms in Political Analytics
Issue ownership
The phenomenon in which parties or candidates are perceived as more competent and trustworthy on certain issues (Republicans on defense and crime; Democrats on healthcare and education). Issue ownership affects which issues are activated in campaigns and how frames resonate with different audiences. → Appendix G: Glossary of Key Terms in Political Analytics
Issue Voting
Positional issues: specific policy alternatives exist; spatial model predicts choosing the closer candidate - Symbolic issues: rooted in values and identity; directional theory predicts choosing on the direction and intensity of a candidate's position, not proximity - Many important political issues (border security, "law and order") → Chapter 11 Key Takeaways: The American Voter and Beyond
K
Key data characteristics:
14,782 speeches from 2018 through early 2026 - 847 unique speakers - 50 states represented - Median word count: 1,847 words; mean word count: 2,341 words (right-skewed) - Populism score range: 0.0–1.0; median: 0.21; mean: 0.24 (right-skewed) - Approximately 23% of records have null `full_text` (only → Chapter 37: Tracking Populist Rhetoric (Python Lab)
Key design parameters:
Target population: Low-to-moderate propensity registered Democrats and unaffiliated voters in the three target counties (CEF's existing target universe) - Baseline turnout estimate: 55% based on previous comparable elections - Target effect size to detect: 2.5 percentage points (consistent with publ → Case Study 30.1: The Meridian Canvassing Experiment — Design, Execution, and What Went Wrong
Key filing types consulted:
**Form 3** (Campaign committee receipts and disbursements): Used for candidate campaign totals, itemized contributions, and operating expenditures. - **Form 3X** (Super PAC receipts and disbursements): Used for Senate Majority PAC and American Leadership Fund data. - **Form 3N** (Non-connected commi → Capstone 1 Data Appendix: Data Sources, Methodology Notes, and Analysis Steps
Key limitations of reported MOE:
Assumes probability sampling - Captures only random sampling error - Does NOT capture coverage bias, nonresponse bias, question wording effects, or weighting errors - Subgroup MOE is larger: use √(subgroup n) for denominator → Chapter 8 Key Takeaways
Key Race Dynamics:
State demographics: ~38% white non-Hispanic, ~32% Hispanic/Latino, ~18% Black, ~12% Asian/other - Registration: D+2 but trending more competitive - Key issues: healthcare costs, border security, water/drought, education funding, housing affordability - The suburban-rural divide is central - Both cand → Continuity Tracking Document
Known data notes:
Alaska reports results at the borough/census area level in some years; FIPS codes for Alaska may differ across years. - A small number of counties were created or dissolved during 1980–2024 (notably in Virginia, where independent cities appear as counties). The ODA dataset standardizes these to 2020 → Appendix B: Python and Data Toolkit Reference
L
Labeled training data
a set of texts that human expert coders have already classified (populist/not populist, or a continuous populism score). 2. **Feature extraction** — converting text into numerical features (word frequencies, n-grams, sentiment scores, structural features). 3. **Model training** — fitting a classifie → Chapter 34: Populism: Measurement, Causes, and Consequences
last-click attribution
crediting the final touchpoint before conversion — is the default in most campaign analytics platforms, but it systematically undervalues channels that operate earlier in the voter engagement journey. → Chapter 31: Digital Campaigning and Social Media Strategy
Likely voter
A survey respondent classified as likely to cast a ballot in the upcoming election. Likely voter screens attempt to improve election predictions by restricting the sample to people who will actually vote. Common screening criteria include stated intention to vote, past voting history, interest in th → Appendix G: Glossary of Key Terms in Political Analytics
likely voter electorate
filtered for registration, predicted turnout, and election-specific mobilization — looks considerably different: - Approximately 44% white non-Hispanic (overrepresented relative to CVAP) - Approximately 26-28% Hispanic/Latino (significantly underrepresented) - Approximately 20-21% Black (slightly ov → Chapter 13: Demographics and the Electorate
likely voter screen
a set of survey questions designed to distinguish respondents who will actually vote on Election Day from those who will not. This distinction matters because registered voters and likely voters can have quite different preferences. If a candidate has strong support among people who are unlikely to → Chapter 2: A Brief History of Polling and Political Measurement
List Experiment (Item Count Technique)
Randomly assign respondents to control (N items) or treatment (N+1 items, including sensitive item) - Ask how many items apply — not which ones - Estimate sensitive item prevalence from the difference in group means - Requires larger samples due to increased variance → Chapter 7 Key Takeaways
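The difference-in-means estimator described above can be sketched as follows. The item counts are made-up illustrative responses, not data from the book, and `list_experiment_estimate` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance


def list_experiment_estimate(control_counts, treatment_counts):
    """Estimate sensitive-item prevalence as the difference in mean item counts.

    Returns (estimate, standard_error), using the usual two-sample
    standard error for a difference in means.
    """
    estimate = mean(treatment_counts) - mean(control_counts)
    se = sqrt(
        variance(treatment_counts) / len(treatment_counts)
        + variance(control_counts) / len(control_counts)
    )
    return estimate, se


# Hypothetical responses: number of list items each respondent says apply.
control = [1, 2, 2, 3, 1, 2, 2, 1]     # list of N items
treatment = [2, 2, 3, 3, 1, 3, 2, 2]   # list of N + 1 items (adds the sensitive item)

estimate, se = list_experiment_estimate(control, treatment)
print(estimate, round(se, 3))
```

The standard error combines the variance of both groups, which is why the technique needs larger samples than a direct question would.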
Local average treatment effect (LATE)
The estimated effect of actual contact among the voters who would have been contacted if assigned to treatment, derived by dividing the ITT by the contact rate. → Chapter 30: Field Experiments in Politics
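The relationship between the ITT and the LATE can be shown with a small calculation. The turnout figures and contact rate below are hypothetical, and the sketch assumes that nobody assigned to control was contacted (one-sided noncompliance).

```python
def itt_and_late(turnout_treatment_group, turnout_control_group, contact_rate):
    """Return (ITT, LATE) for a canvassing experiment.

    ITT compares groups as assigned; LATE rescales the ITT by the share of the
    treatment group actually contacted. Assumes no contact in the control group.
    """
    if not 0 < contact_rate <= 1:
        raise ValueError("contact_rate must be in (0, 1]")
    itt = turnout_treatment_group - turnout_control_group
    late = itt / contact_rate
    return itt, late


# Hypothetical experiment: 57.0% turnout among those assigned to canvassing,
# 55.5% among those assigned to control, and 30% of the treatment group reached.
itt, late = itt_and_late(0.570, 0.555, 0.30)
print(round(itt, 3), round(late, 3))
```

The ITT answers "what did running the program buy us?", while the LATE answers "what does a completed contact do?"; with a low contact rate the two can differ by a large factor.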
Longitudinal consistency (if applicable):
[ ] Are tracking questions identically worded to prior waves? - [ ] Are response scales reproduced identically from prior waves? - [ ] Is question position within the instrument stable across waves? → Chapter 7: Survey Design: From Questions to Questionnaires
Longitudinal survey
A survey design in which the same respondents are interviewed at multiple points in time. *Panel surveys* are the most common form, following the same individuals across survey waves. Allows measurement of individual-level change (vote switching, opinion change) rather than just aggregate-level tren → Appendix G: Glossary of Key Terms in Political Analytics
Margin of error
The range above and below a poll estimate within which the true population value is estimated to fall with a specified probability (usually 95%). For a simple random sample, MoE ≈ 1/√N for a 95% confidence interval, where N is the sample size. A poll of 1,000 respondents has an MoE of approximately → Appendix G: Glossary of Key Terms in Political Analytics
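The 1/√N rule of thumb can be compared with the standard formula for a proportion in a short sketch. It assumes a simple random sample and the worst case p = 0.5; the function names are hypothetical helpers.

```python
from math import sqrt


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """95% margin of error for a proportion from a simple random sample."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * sqrt(p * (1 - p) / n)


def rule_of_thumb(n: int) -> float:
    """The 1/sqrt(N) approximation quoted in the entry."""
    return 1 / sqrt(n)


n = 1000
exact = margin_of_error(n)
approx = rule_of_thumb(n)
subgroup = margin_of_error(250)  # e.g. a subgroup that is a quarter of the sample

print(f"formula: {exact:.4f}  rule of thumb: {approx:.4f}  subgroup: {subgroup:.4f}")
```

The rule of thumb slightly overstates the formula value, and the subgroup figure shows why crosstab estimates deserve wider error bars than the topline.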
Mean reversion
The tendency of extreme values to move toward the long-term average over time. In election forecasting, mean reversion is a key concept: polls showing very large leads or deficits typically overstate the eventual margin because some of the observed gap reflects sampling variation and temporary facto → Appendix G: Glossary of Key Terms in Political Analytics
Measurement accountability
the principle that the populations that bear the costs of measurement error should have meaningful input into how measurement systems are designed and evaluated — is an affirmative data practice that follows from Buolamwini's framework. → Chapter 39: Race, Representation, and Data Justice
Media market
A geographic area defined by which television stations' signals reach it, used by the Nielsen company to measure television audiences. Media markets are the primary unit for television advertising planning in political campaigns. The United States has approximately 210 media markets, ranging from Ne → Appendix G: Glossary of Key Terms in Political Analytics
Methodology transparency (weight: 25% of grade):
Met: Pollster discloses specific method (not just "telephone" but "live caller using RDD with cell phone supplements"), discloses whether cells or landlines or both are called, and discloses field dates. - Partial: Method is named but cell/landline breakdown is missing or field dates are only approx → Capstone 1 Data Appendix: Data Sources, Methodology Notes, and Analysis Steps
Microtargeting
The use of individual-level data to identify specific voters who are likely to respond favorably to specific messages or mobilization efforts, and to target those voters with differentiated communication. Modern microtargeting combines *voter file* data, commercial consumer data, and *predictive mod → Appendix G: Glossary of Key Terms in Political Analytics
Mid-level (3-7 years experience):
Campaign analytics director: $80,000-$120,000 (for major campaigns) - Survey research senior analyst: $75,000-$100,000 - Consulting firm director: $90,000-$140,000 - Civic tech organization director: $75,000-$100,000 - Government senior analyst: $75,000-$100,000 - Academic (assistant professor): $85 → Chapter 41: Careers in Political Analytics
Truncated y-axes on bar charts (makes small differences look large) - Cherry-picked time windows (shows only the favorable portion of a trend) - Asymmetric color scales (distorts visual balance in diverging displays) - Non-comparable denominators (mixing vote share with absolute votes) - Non-percept → Chapter 16 Key Takeaways
Mobile and mode considerations:
[ ] Has the instrument been tested on mobile devices? - [ ] Have grid questions been replaced with sequential single-item questions for mobile? - [ ] Are response option lengths appropriate for mobile display? → Chapter 7: Survey Design: From Questions to Questionnaires
Mode effect
The difference in survey results produced by different methods of data collection (telephone vs. online vs. face-to-face vs. text). Mode effects arise because the medium of interviewing affects question interpretation, social desirability pressures, respondent attention, and the populations reached. → Appendix G: Glossary of Key Terms in Political Analytics
Model accuracy
A measure of how well a predictive model's outputs match actual outcomes. In political analytics, model accuracy is assessed by comparing predicted probabilities (support scores, turnout probabilities) to actual behavior. Key metrics include *Brier score* (for probabilistic predictions), AUC-ROC (di → Appendix G: Glossary of Key Terms in Political Analytics
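As an illustration of one of the metrics named above, the Brier score is the mean squared difference between predicted probabilities and 0/1 outcomes. The predictions and outcomes below are hypothetical.

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    0.0 is a perfect score; always predicting 0.5 scores 0.25.
    """
    if len(predicted_probs) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical turnout scores for four voters and whether each actually voted.
probs = [0.9, 0.7, 0.2, 0.6]
voted = [1, 1, 0, 0]

score = brier_score(probs, voted)
print(round(score, 3))
```

Lower is better, and confident misses are penalized most heavily: the fourth voter (scored 0.6, did not vote) contributes most of the error here.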
Models are tools, not oracles
turnout models improve resource allocation but require continuous validation, appropriate uncertainty quantification, and integration with qualitative organizational knowledge. → Chapter 14: Turnout — Who Votes and Why
modular network structures
loosely coupled clusters that each have internal coordination capacity but maintain limited inter-cluster connectivity except through specific bridging nodes. This structure is organizationally analogous to cell-based clandestine organization: if one cell is compromised, the others remain operationa → Chapter 35: Social Movements and Protest Analytics
modularity
a measure of how much denser within-cluster connections are than would be expected by chance. High modularity indicates strong community structure; low modularity indicates a relatively homogeneous network without clear subgroup divisions. → Chapter 35: Social Movements and Protest Analytics
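One standard formalization of this idea is Newman-Girvan modularity; whether the chapter uses exactly this definition is an assumption. The sketch below computes it for an undirected, unweighted network given as an edge list, using a hypothetical toy graph of two triangles joined by a single bridging tie.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity for an undirected, unweighted simple graph.

    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2,
    where m is the total number of edges.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal_edges = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal_edges[cu] += 1
    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy network: two triangles (a-b-c and d-e-f) connected by one bridge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

two_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_cluster = {node: 0 for node in "abcdef"}

q_split = modularity(edges, two_clusters)
q_single = modularity(edges, one_cluster)
print(round(q_split, 3), round(q_single, 3))
```

Splitting the graph at the bridge yields a clearly positive score, while lumping every node into one community yields zero, which matches the high-versus-low contrast in the definition.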
Monte Carlo procedure:
Drew 50,000 random election-day margins from a normal distribution centered on the integrated estimate of Garza +4.0, with standard deviation 3.2 - Applied correlated error adjustment: in 40% of simulations, added a national environment shift drawn from a separate distribution (mean 0, SD 1.5), repr → Chapter 19: Probabilistic Forecasting and Uncertainty
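The first two steps of this procedure can be sketched with the standard library. The parameters (50,000 draws, mean +4.0, SD 3.2, a shift applied in 40% of simulations with mean 0 and SD 1.5) come from the entry; treating the shift as simply added to the margin is an assumption, since the description is cut off, and the seed and function name are illustrative.

```python
import random


def simulate_win_probability(n_sims=50_000, mean_margin=4.0, sd_margin=3.2,
                             shift_share=0.40, shift_sd=1.5, seed=2024):
    """Share of simulated election-day margins that end up above zero."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    wins = 0
    for _ in range(n_sims):
        margin = rng.gauss(mean_margin, sd_margin)
        # Assumed treatment of correlated error: in a fraction of simulations,
        # add a national environment shift to the margin.
        if rng.random() < shift_share:
            margin += rng.gauss(0.0, shift_sd)
        if margin > 0:
            wins += 1
    return wins / n_sims


win_prob = simulate_win_probability()
print(f"Simulated win probability: {win_prob:.3f}")
```

Adding the shift widens the distribution in the affected simulations, so it pulls the win probability slightly toward 50% compared with a model that ignores correlated error.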
MRP (Multilevel Regression and Poststratification)
A statistical method for estimating opinion in small geographic areas using data from national surveys. The multilevel regression step models individual opinion as a function of individual-level demographics (age, sex, race, education) and geographic-level characteristics (state partisanship, urbani → Appendix G: Glossary of Key Terms in Political Analytics
no single mode is reliable enough to absorb all operational problems. 2. **Demographic imbalances create substantive, not merely statistical, problems** when underrepresented groups hold distinctive political views. 3. **Client pressure for speed is almost always in tension with methodological sound → Case Study 9-1: Meridian's Multi-Mode Crisis
Multilevel Regression and Poststratification
universally abbreviated as MRP, and sometimes called "Mister P" — is a more sophisticated approach that has become increasingly influential in political polling, especially for estimating opinion in small geographies. → Chapter 8: Sampling: Who Speaks for the Public?
Multilevel Regression and Poststratification (MRP)
A statistical technique that combines individual-level survey data with population-level demographic data to produce subnational opinion estimates without requiring direct sampling in each subnational unit. → Chapter 22: Down-Ballot and Global Forecasting
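The poststratification half of the technique can be sketched as a population-weighted average of cell-level estimates. In a real MRP workflow the cell estimates would come from a fitted multilevel regression; here they are hard-coded hypothetical numbers, as are the population counts and the helper name `poststratify`.

```python
def poststratify(cell_estimates, cell_populations):
    """Weight each demographic cell's estimated support by its population count."""
    missing = set(cell_populations) - set(cell_estimates)
    if missing:
        raise KeyError(f"no estimate for cells: {sorted(missing)}")
    total = sum(cell_populations.values())
    weighted = sum(cell_estimates[cell] * count
                   for cell, count in cell_populations.items())
    return weighted / total


# Hypothetical model output: estimated support within each (age, education) cell.
cell_support = {
    ("18-44", "college"): 0.62,
    ("18-44", "no college"): 0.51,
    ("45+", "college"): 0.55,
    ("45+", "no college"): 0.40,
}

# Hypothetical census-style counts for one district.
district_counts = {
    ("18-44", "college"): 20_000,
    ("18-44", "no college"): 30_000,
    ("45+", "college"): 15_000,
    ("45+", "no college"): 35_000,
}

district_estimate = poststratify(cell_support, district_counts)
print(round(district_estimate, 4))
```

The same cell estimates can be reweighted with another district's counts, which is how one national survey can yield estimates for many small areas.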
Down-ballot races (Senate, House, state) increasingly track presidential vote share rather than local candidate quality or local issues - Driven by: sorting of elected officials, collapse of local news, campaign finance nationalization, party cue-taking - Consequence: incumbency advantage has dimini → Chapter 12 Key Takeaways: Partisanship, Polarization, and Sorting
Negative partisanship
The phenomenon in which partisan loyalty is driven primarily by dislike of the opposing party rather than enthusiasm for one's own party. Voters with negative partisanship vote reliably for their party not because they enthusiastically support its candidates but because they find the opposing party → Appendix G: Glossary of Key Terms in Political Analytics
No comment
standard campaign policy; protects the campaign but says nothing - **Denial** — the campaign has not implemented demobilization analytics; this is technically true - **Off-the-record background** — she could explain to the reporter, without attribution, that she has in fact *not* implemented such a → Case Study 38.2: The Reluctance Model — Nadia's Final Decision
non-attitudes
fabricated on the spot with no stable underlying content - **Ideological constraint** (the logical coherence of positions across issues) is rare in mass publics - Panel data shows many respondents give randomly different answers to the same question across survey waves - True ideologues constitute o → Chapter 6 Key Takeaways
Non-probability sample
A sample in which the probability of selection for each member of the population is unknown. Online opt-in panels, self-selected web polls, and convenience samples are non-probability samples. Traditional *inferential statistics* (margin of error, confidence intervals) do not apply directly to non-p → Appendix G: Glossary of Key Terms in Political Analytics
Non-response bias
Bias arising when people who do not respond to a survey differ systematically from those who do, in ways that affect the measured variable. If politically engaged voters are more likely to respond to political surveys, and political engagement correlates with partisan preference, the resulting bias → Appendix G: Glossary of Key Terms in Political Analytics
nonresponse bias
whether those who respond are systematically different from those who don't in ways that affect survey estimates. If nonresponse is random with respect to the variables being measured, low response rates produce estimates with higher variance (more uncertainty) but not systematically wrong ones. If → Chapter 9: Fielding and Data Collection
nonresponse biased
it misrepresents the population not because the sampling frame was wrong, but because the fraction of the frame that responded was unrepresentative. → Chapter 8: Sampling: Who Speaks for the Public?
normal distribution
the famous bell curve — is symmetric, with most values clustered around the mean and fewer values as you move away from the mean in either direction. Many naturally occurring quantities approximate normality. The normal distribution is foundational to classical statistics because it has predictable → Appendix A: Research Methods Primer
a realistic synthetic dataset of political data - Include complete, runnable code with step-by-step walkthroughs - Have corresponding files in the `code/` subdirectory - Can be completed in a standard 2–3 hour lab session → How to Use This Book
Online panel
A pool of pre-recruited volunteers who have agreed to complete surveys in exchange for points, cash, or other incentives. Online panels can be surveyed quickly and cheaply, making them attractive for political polling. Because panelists self-select, online panels are not probability samples; r → Appendix G: Glossary of Key Terms in Political Analytics
Open with engaging, non-threatening questions
right/wrong track, local issues before sensitive national issues 2. **Group related questions** — don't jump between topics; coherent topical flow reduces cognitive load 3. **Place demographics at the end** — avoids identity priming of substantive questions; reduces early dropout 4. **Use skip logic → Chapter 7 Key Takeaways
Open-ended question
A survey question that allows respondents to answer in their own words rather than selecting from predetermined response options. Open-ended questions capture the range and texture of opinion and reveal the language respondents use naturally. More expensive to analyze (require coding or NLP analysis → Appendix G: Glossary of Key Terms in Political Analytics
operational layer
kept internal — includes: - Claims under active investigation (publishing these before verification is complete would constitute spreading the claim) - Identity of confidential sources - Expert consultation records not approved for public attribution - Campaign response correspondence (the correspon → Chapter 43: Capstone 2 — The Misinformation Tracker
Opinions are distributions, not fixed points
any individual's "true" position is a probability cloud, not a location 2. **Context is data** — question wording, order, and mode are not contaminants; they are part of what you measure 3. **Stability varies** — partisan identity is stable; specific policy positions on unfamiliar issues are volatil → Chapter 6 Key Takeaways
Order and flow:
[ ] Does question order prime later responses in unintended ways? - [ ] Are related questions grouped appropriately? - [ ] Is skip/branch logic complete and correctly specified? - [ ] Are sensitive or personal questions near the end? - [ ] Is the instrument an appropriate length for the mode and pop → Chapter 7: Survey Design: From Questions to Questionnaires
Organizational Characteristics:
Revenue model: media clients (30%), political clients (25%), corporate (25%), academic partnerships (20%) - Methodological approach: multi-mode (phone + online + address-based), probability-based, transparent about weighting - Reputation: respected but mid-sized — not Pew or Gallup, so they face com → Continuity Tracking Document
Overall credibility rating:
☐ **High** (17–20 criteria met, 0–1 red flags): Poll meets professional standards; results can be used with appropriate uncertainty - ☐ **Moderate** (12–16 criteria met, 2–3 red flags): Important information missing; treat results with caution; do not treat as definitive - ☐ **Low** (fewer than 12 c → Appendix F: Templates and Worksheets
Overall ethics assessment:
☐ **Approved:** All criteria met; no concerns identified. Project may proceed. - ☐ **Conditional approval:** Concerns identified; proceed with listed mitigations. - ☐ **Hold:** Significant concerns require resolution before proceeding. See notes. - ☐ **Stop:** Violation identified. Project must not → Appendix F: Templates and Worksheets
Oversample
The practice of including a larger proportion of a minority group in a survey than exists in the general population, ensuring enough cases for statistically meaningful analysis of that group. Oversamples are common for racial minorities, residents of specific geographic areas, or specific age groups → Appendix G: Glossary of Key Terms in Political Analytics
P
Page 1: Where We Are
County choropleth: Garza support score (shaded) with bubble size = registered voters - Sorted horizontal bar chart: counties ranked by expected net votes (total voters × (support score / 100 - Whitfield share estimate)) — the "opportunity counties" → Chapter 16: Visualizing the Electorate (Python Lab)
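The ranking formula in the bar-chart description can be sketched in a few lines. This is a minimal illustration, not the lab's actual code: the county names and numbers are hypothetical, and it assumes the support score is on a 0–100 scale while the Whitfield share estimate is a 0–1 fraction.

```python
def expected_net_votes(registered_voters, support_score, opponent_share):
    """Expected net votes = total voters x (support score / 100 - opponent share).

    Assumes support_score is on a 0-100 scale and opponent_share is a 0-1 fraction.
    """
    return registered_voters * (support_score / 100 - opponent_share)


# Hypothetical counties, for illustration only.
counties = [
    {"county": "C", "registered_voters": 200_000, "support_score": 47.0, "opponent_share": 0.51},
    {"county": "A", "registered_voters": 120_000, "support_score": 54.0, "opponent_share": 0.44},
    {"county": "B", "registered_voters": 40_000, "support_score": 61.0, "opponent_share": 0.37},
]

# Sort so the largest expected net gain ("opportunity counties") comes first.
ranked = sorted(
    counties,
    key=lambda c: expected_net_votes(
        c["registered_voters"], c["support_score"], c["opponent_share"]
    ),
    reverse=True,
)

for c in ranked:
    net = expected_net_votes(c["registered_voters"], c["support_score"], c["opponent_share"])
    print(f'{c["county"]}: {net:,.0f}')
```

Note that a large county with a narrow deficit can rank below a small county with a wide lead, which is why the chart sorts on net votes rather than on the support score alone.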
Page 2: Where to Focus
Heatmap: turnout propensity × support score (2×2 grid showing which cell is the best GOTV target: high support + low propensity) - Time series: week-by-week tracking of modeled support in key counties, updated daily → Chapter 16: Visualizing the Electorate (Python Lab)
Panel attrition
The dropout of respondents between waves of a *longitudinal panel survey*. Attrition is typically non-random: lower-engagement respondents, younger respondents, and mobile respondents are more likely to drop out. This creates bias over time as the panel becomes increasingly composed of engaged, stab → Appendix G: Glossary of Key Terms in Political Analytics
Panel conditioning
The effect of prior survey participation on respondents' subsequent attitudes and behavior. Respondents who have been asked about their voting intentions may become more likely to vote; those asked about issue positions may develop more crystallized views. Panel conditioning is a concern in studies → Appendix G: Glossary of Key Terms in Political Analytics
panel study
a survey that interviews the same respondents multiple times before and after events of interest. The ANES has panel components that allow researchers to observe how changes in economic attitudes, candidate evaluations, and party identification over time relate to vote choice. → Chapter 11: The American Voter and Beyond
Parallel vote tabulation
A method of election verification that samples a random selection of polling stations and tallies results in real time to project overall outcomes independent of official counts. → Chapter 22: Down-Ballot and Global Forecasting
Parliamentary system
A system of government in which the executive emerges from the legislature and is accountable to it; contrasted with presidential systems where the executive is independently elected. → Chapter 22: Down-Ballot and Global Forecasting
Partisan nonresponse
A specific form of *differential response* in which partisans of one party are less likely to respond to surveys during periods when their political enthusiasm or hostility is activated. When Republicans are politically engaged and resistant to outreach (as some analysts argue was the case in 2020), → Appendix G: Glossary of Key Terms in Political Analytics
a genuine restructuring of which social groups belong to which party. The most significant ongoing realignment is the class realignment: as we'll see in greater depth in Chapter 13, college-educated voters have been moving toward Democrats while non-college voters have moved toward Republicans, cutt → Chapter 12: Partisanship, Polarization, and Sorting
Partisan sorting
the alignment of ideology with party identity — has produced parties that are more ideologically coherent and more predictable than at any time in the survey era. **Affective polarization** — the growth of negative feelings toward the opposing party — has intensified partisan identity and made cross → Chapter 12: Partisanship, Polarization, and Sorting
Partisan Sorting (Not the Same as Polarization)
Sorting: the process by which liberals have become Democrats and conservatives have become Republicans — ideological alignment with party identity - Polarization: the movement of policy opinions toward the extremes — a distribution shift, not just a realignment - You can have sorting without polariz → Chapter 12 Key Takeaways: Partisanship, Polarization, and Sorting
Party committees
the Democratic National Committee (DNC), Republican National Committee (RNC), and their House (DCCC/NRCC) and Senate (DSCC/NRSC) campaign arms — operate under distinct rules. They can: - Accept contributions up to the hard money limits from individuals and PACs - Make "coordinated expenditures" on b → Chapter 36: Money in Politics: Following the Data
Party identification
A long-standing psychological attachment to a political party that shapes how individuals perceive and evaluate political information. The concept was introduced by Angus Campbell, Philip Converse, Warren Miller, and Donald Stokes in *The American Voter* (1960). Party identification is typically mea → Appendix G: Glossary of Key Terms in Political Analytics
Party Identification (Michigan Model)
A psychological attachment to a political party — distinct from registration, voting behavior, and ideology - Acts as a "perceptual screen": shapes how voters interpret candidates, issues, and political events - Organized in the "funnel of causality": long-term forces (social background, party ID) → → Chapter 11 Key Takeaways: The American Voter and Beyond
perceptual screen
a lens through which voters process all incoming political information. When Rosa encounters a news story about Garza's healthcare proposal, her party ID shapes whether she finds the proposal credible, whether she remembers it, and whether she evaluates it positively. Republicans watching the same s → Chapter 11: The American Voter and Beyond
Personal contact mobilizes
the Green-Gerber research tradition has firmly established that high-quality personal outreach increases turnout, with effects that are meaningful at the margin of competitive elections. → Chapter 14: Turnout — Who Votes and Why
Personalization at scale
true individual-level message generation calibrated to each voter's specific profile — is qualitatively different from traditional segment-based microtargeting. It raises the persuasion-manipulation threshold question more acutely because the gap between what voters know about how they are being tar → Chapter 40 Key Takeaways: AI, Automation, and the Future of Political Analytics
Persuadability score
An individual-level model estimate of how responsive a voter is to campaign communication. High persuadability scores indicate voters whose opinions are weakly held, who have considered voting for both parties, or whose demographic and attitudinal profile suggests openness to persuasion. Persuadabil → Appendix G: Glossary of Key Terms in Political Analytics
Persuasion rate
In campaign field experiments, the estimated percentage of contacted voters who changed their vote intention as a result of the contact. Typical persuasion rates for canvassing and phone banking are very small (1–5%) on a per-contact basis, but aggregate to meaningful numbers in large-scale operatio → Appendix G: Glossary of Key Terms in Political Analytics
Persuasion Strategy:
Primary contact mode: ☐ Direct mail ☐ Digital ads ☐ Phone ☐ Door ☐ Other: ___ - Number of touches planned: ___ - Priority message themes (based on polling data): _______________________________________________ → Appendix F: Templates and Worksheets
Point estimate
A single value calculated from sample data as the best estimate of a population parameter. In polling, the percentage of respondents who prefer a candidate is the point estimate of that candidate's support in the population. Distinguished from an *interval estimate* (confidence interval), which expr → Appendix G: Glossary of Key Terms in Political Analytics
Polarization
The divergence of political attitudes, party coalitions, or governing behavior toward opposite extremes. Political scientists distinguish: (1) *mass polarization* (whether ordinary citizens' views have become more extreme); (2) *elite polarization* (documented increase in ideological distance betwee → Appendix G: Glossary of Key Terms in Political Analytics
Polarization's Effects on Polling
Partisan differential nonresponse: the party whose voters are more enthusiastic is overrepresented in poll samples - Social desirability bias: some voters may underreport preference for candidates perceived as socially unacceptable - Herding: pollsters adjust toward consensus to avoid being outli → Chapter 12 Key Takeaways: Partisanship, Polarization, and Sorting
Political knowledge
Respondents' demonstrated recall of factual information about politics (names of officials, positions of parties, constitutional provisions). Political knowledge is associated with more consistent political attitudes, greater resistance to persuasion attempts, and greater use of policy information i → Appendix G: Glossary of Key Terms in Political Analytics
Polling average
A combined estimate of candidate support calculated by averaging across multiple polls, often with weights based on poll quality, sample size, recency, and pollster track record. Polling averages reduce the influence of any single poll's idiosyncratic error and provide more reliable signal than indi → Appendix G: Glossary of Key Terms in Political Analytics
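A weighted polling average can be sketched as follows. The weighting scheme here is an assumption made for illustration (square root of sample size, multiplied by an exponential recency decay); it is not the scheme of any particular aggregator, and the poll values are hypothetical. Quality and track-record weights would enter as additional multipliers.

```python
import math


def poll_weight(sample_size, days_old, half_life_days=14.0):
    """Illustrative weight: larger and more recent polls count for more.

    The sqrt(n) term and the 14-day half-life are assumptions, not a standard.
    """
    return math.sqrt(sample_size) * 0.5 ** (days_old / half_life_days)


def polling_average(polls, half_life_days=14.0):
    """Weighted mean of candidate support across polls."""
    weights = [poll_weight(p["n"], p["days_old"], half_life_days) for p in polls]
    total = sum(weights)
    return sum(w * p["support"] for w, p in zip(weights, polls)) / total


# Hypothetical polls: support in percentage points.
polls = [
    {"support": 48.0, "n": 800, "days_old": 2},
    {"support": 51.0, "n": 1200, "days_old": 10},
    {"support": 45.0, "n": 500, "days_old": 21},
]

average = polling_average(polls)
print(f"Weighted average: {average:.1f}")
```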
Polling blackout
Legally mandated periods before an election during which publication of new polls is prohibited. Blackouts are common in European democracies and increase late-stage forecasting uncertainty. → Chapter 22: Down-Ballot and Global Forecasting
Populism
A political ideology or rhetorical style that posits a virtuous, unified "people" in opposition to a corrupt, self-serving "elite." In its thin-centered form, populism is compatible with many different thick ideological programs (left-wing or right-wing), and its specific content depends on who the → Appendix G: Glossary of Key Terms in Political Analytics
Populism needs media
even anti-media populism. The paradox of Whitfield's "fake media" attacks is that he delivers them on television, and his attacks generate the coverage that keeps him visible. Populist politicians are often highly media-savvy, understanding that conflict generates attention and that anti-media attac → Chapter 34: Populism: Measurement, Causes, and Consequences
Post-stratification weighting
A statistical adjustment applied to survey data to make the sample match known population characteristics. If women are 52% of the adult population but 55% of respondents, women's responses are downweighted so that women make up 52% of the weighted sample, and men's responses are upweighted so that men make up 48%. Standard weighting variables include age, sex, race/ethni → Appendix G: Glossary of Key Terms in Political Analytics
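The 52%/55% example can be worked through directly. The sketch below weights on a single variable (sex) using the shares from the definition; the within-group support figures are hypothetical and included only to show how the weights change an estimate.

```python
def poststrat_weights(sample_shares, population_shares):
    """Weight for each group = population share / sample share."""
    return {g: population_shares[g] / sample_shares[g] for g in sample_shares}


# Shares from the glossary example.
sample_shares = {"women": 0.55, "men": 0.45}
population_shares = {"women": 0.52, "men": 0.48}

weights = poststrat_weights(sample_shares, population_shares)

# After weighting, each group's share of the sample matches the population.
weighted_shares = {g: sample_shares[g] * weights[g] for g in sample_shares}

# Hypothetical candidate support within each group (illustration only).
support = {"women": 0.60, "men": 0.50}
raw_estimate = sum(sample_shares[g] * support[g] for g in support)
weighted_estimate = sum(weighted_shares[g] * support[g] for g in support)

print(weights)
print(f"raw={raw_estimate:.3f} weighted={weighted_estimate:.3f}")
```

Women receive a weight below 1 and men a weight above 1, so the weighted estimate moves toward the men's level of support.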
Poststratification
The step in MRP where regression-based predictions for each demographic cell are combined with census data on the demographic composition of each geographic unit to produce weighted estimates. → Chapter 22: Down-Ballot and Global Forecasting
Pre-registration
The practice of publicly specifying an analysis plan, including hypotheses, data sources, and analytical methods, before data collection begins, to prevent post-hoc specification of analyses designed to produce significant results. → Chapter 27: Analyzing Political Text and Media (Python Lab)
prebunking
inoculating people against misinformation before they encounter it — has shown more consistent positive results. Inoculation theory, developed by William McGuire in the 1960s as a model of persuasion resistance, has been applied to misinformation by Sander van der Linden and colleagues. → Chapter 26: Misinformation, Disinformation, and Fact-Checking
Prediction markets
platforms where participants bet on election outcomes — aggregate the beliefs of market participants through price signals. When Garza trades at 62 cents on a prediction market, the market is expressing a collective belief that she has a 62% chance of winning. Prediction markets have shown some accu → Chapter 10: Reading and Evaluating Polls
Pretesting:
[ ] Has the instrument been reviewed internally? - [ ] Have cognitive interviews been conducted? - [ ] Has a pilot been fielded to check logic, timing, and response distributions? → Chapter 7: Survey Design: From Questions to Questionnaires
Priming
The process by which media coverage or campaign communication raises the salience of particular issues, causing citizens to weight those issues more heavily in their political evaluations. If media devote extensive coverage to immigration, voters may evaluate the president more heavily on immigratio → Appendix G: Glossary of Key Terms in Political Analytics
Probabilistic forecasting
An approach to election prediction that expresses outcomes as probabilities rather than point predictions. Instead of "Candidate X will win," a probabilistic forecast states "Candidate X has a 72% probability of winning." Probabilistic forecasts are more honest about uncertainty, allow readers to re → Appendix G: Glossary of Key Terms in Political Analytics
Probability sample
A sample in which every member of the target population has a known, non-zero probability of being selected. Probability samples are the foundation of classical inferential statistics; their theoretical guarantees (unbiasedness, calculable variance) underpin the margin of error and confidence interv → Appendix G: Glossary of Key Terms in Political Analytics
exemplified by the AmeriSpeak panel at NORC/University of Chicago, Ipsos's KnowledgePanel, and similar designs — recruit respondents through probability methods (often address-based sampling, ABS, drawing from the USPS delivery sequence file). Households without internet access are provided tablets → Chapter 9: Fielding and Data Collection
Push poll
A type of campaign tactic disguised as a poll in which respondents are not genuinely sampled to measure opinion but are contacted to be exposed to negative (often false or misleading) information about an opponent. "If you knew that Candidate X had done Y, would you still vote for them?" Push polls → Appendix G: Glossary of Key Terms in Political Analytics
Python 3.9 or higher
We recommend the Anaconda distribution 2. **Jupyter Notebook or JupyterLab** — For interactive coding 3. **Required packages** — Listed in `requirements.txt`; install with `pip install -r requirements.txt` → Prerequisites
Quanteda
An R package for text analysis widely used in political science, with methods parallel to those covered in this chapter. For analysts fluent in R, Quanteda's documentation and tutorials at quanteda.io are the best entry point. → Chapter 27 Further Reading
Quota
In targeting, the number of contacts or conversations a volunteer or field organizer is expected to complete in a given time period. Quotas are set based on model predictions of voter density in walking lists or phone lists, and are used to assess field program performance. → Appendix G: Glossary of Key Terms in Political Analytics
Quota sampling
A sampling method in which interviewers are instructed to recruit a specified number of respondents in each demographic category (e.g., 50% women, 30% age 18–34, 40% non-college). Used by Gallup, Roper, and Crossley through 1948. Quota sampling does not require a random mechanism for selection withi → Appendix G: Glossary of Key Terms in Political Analytics
In Exercise 5.12, the county targeting analysis assigns a single "opportunity score" to each county. What assumptions is this approach making? What important information is it ignoring? → Chapter 5 Exercises: Your First Political Dataset
R3
Sam Harding's realization in Section 5.12 — that likely voter models systematically underrepresent communities whose mobilization is actively being contested — applies to many political analyses. Can you think of two other situations where the filtering or weighting choices in a political analysis s → Chapter 5 Exercises: Your First Political Dataset
Race and Partisan Coalitions
Black voters: historically stable at 85-95% Democratic; exit poll subgroup data is often unreliable for detecting small shifts - Hispanic/Latino voters: highly heterogeneous by national origin, generation, and region; treat as plural, not singular - Asian American voters: fastest-growing group, most → Chapter 13 Key Takeaways: Demographics and the Electorate
Randomized controlled trial (RCT)
An experiment in which participants are randomly assigned to treatment and control conditions, enabling causal inference. In political analytics, RCTs are used to test the effects of canvassing, mailers, advertising, and messaging on vote choice and turnout. The gold standard for causal claims about → Appendix G: Glossary of Key Terms in Political Analytics
Randomized Response Technique (RRT)
Introduce random element (coin flip) that gives respondents plausible deniability - Individual answers uninterpretable; aggregate proportions estimable from known probability structure - Best for high-stigma behaviors; adds complexity to interviewing → Chapter 7 Key Takeaways
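How aggregate proportions are recovered from the known probability structure can be shown with one common variant, the forced-response design. The sketch assumes a fair coin where heads means "answer truthfully" and tails means "say yes regardless"; other RRT designs use different formulas, and the observed share below is hypothetical.

```python
def rrt_prevalence(observed_yes_share, p_truth=0.5, p_forced_yes=1.0):
    """Estimate true prevalence under a forced-response design.

    Observed P(yes) = p_truth * prevalence + (1 - p_truth) * p_forced_yes,
    so prevalence = (observed - (1 - p_truth) * p_forced_yes) / p_truth.
    """
    if not 0 < p_truth <= 1:
        raise ValueError("p_truth must be in (0, 1]")
    forced_yes_share = (1 - p_truth) * p_forced_yes
    return (observed_yes_share - forced_yes_share) / p_truth


# Hypothetical: 60% of respondents said "yes" under the coin-flip design.
estimate = rrt_prevalence(0.60)
print(f"Estimated prevalence: {estimate:.2f}")
```

No individual "yes" reveals anything, because half of all respondents were instructed to say it. The cost is precision: dividing by `p_truth` inflates the variance of the estimate relative to a direct question.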
RAS model
The Receive-Accept-Sample model of attitude formation developed by John Zaller. Individuals receive political messages (Receive), accept messages consistent with their predispositions (Accept), and sample from their considerations when asked for an opinion (Sample). The RAS model explains why educat → Appendix G: Glossary of Key Terms in Political Analytics
Real comparable datasets:
Comparative Manifesto Project (MARPOR): manifesto-project.wzb.eu — Coded party manifestos for 50+ countries. Can be used for populism text analysis research. - US Congressional Record: congress.gov/congressional-record — All floor speeches in Congress, accessible programmatically. - VoxPopuli: voxpo → Chapter 37 Further Reading
it shapes campaigns, policies, and political careers — but it is not a fact about the world the way a physical measurement is. It is an approximation, always context-dependent, always incomplete, always shaped by the instruments of measurement. The best political analysts hold both truths simultaneo → Chapter 6 Key Takeaways
Recall bias
Error arising when respondents inaccurately remember past events or behavior. In political surveys, respondents systematically over-report past voting, over-report having voted for the winner (a bandwagon effect), and misremember their own past policy positions to align with current views. Vote recall → Appendix G: Glossary of Key Terms in Political Analytics
Redistricting
The process of redrawing legislative district boundaries, typically following each decennial census. Redistricting determines the partisan composition of congressional and state legislative districts and has profound effects on electoral competitiveness. Analytics play a central role in redistrictin → Appendix G: Glossary of Key Terms in Political Analytics
Regression discontinuity
A quasi-experimental research design that exploits a discontinuity (a threshold or cutoff) in treatment assignment to estimate causal effects. In political science: comparing candidates who barely won an election (received just over 50% of the vote) to those who barely lost. Candidates on either sid → Appendix G: Glossary of Key Terms in Political Analytics
Regression discontinuity design
A quasi-experimental method that exploits threshold-based treatment assignment to estimate causal effects for units near the threshold. → Chapter 30: Field Experiments in Politics
Regression to the mean
The statistical phenomenon whereby extreme observations in a sample tend to be followed by less extreme observations on subsequent measurement. In polling, a poll showing an unusually large lead is more likely to be followed by a more moderate result because the initial result contains sampling erro → Appendix G: Glossary of Key Terms in Political Analytics
Regularization
A technique in machine learning and regression analysis that penalizes model complexity to prevent *overfitting* — the tendency of models trained on sample data to fit noise as well as signal, performing well in-sample but poorly out-of-sample. Common regularization methods include ridge regression → Appendix G: Glossary of Key Terms in Political Analytics
Reliability
The degree to which a survey measure produces consistent results when administered under similar conditions (test-retest reliability) or when multiple indicators of the same construct are used (internal consistency reliability). A reliable measure of partisanship should produce similar results if th → Appendix G: Glossary of Key Terms in Political Analytics
religious change
particularly the growth of the nones — is restructuring the cultural coalitional landscape; and **urban-rural-suburban geography** has emerged as a dominant organizing cleavage. → Chapter 13: Demographics and the Electorate
Response rate
The proportion of selected sample units from which completed interviews are obtained. AAPOR defines multiple response rate formulas of varying stringency (RR1 through RR6). Declining response rates in telephone polling (from ~36% in 1997 to ~6% in 2018) have intensified concerns about *non-response → Appendix G: Glossary of Key Terms in Political Analytics
Response rate and field period:
[ ] Is the response rate reported, or at least approximated? - [ ] Is the field period disclosed? (Multi-day polls provide more stability than single-day polls) - [ ] Is the mode described? (phone, online, text-to-web, ABS mail) → Chapter 8: Sampling: Who Speaks for the Public?
Response scales:
[ ] Does the scale have an appropriate number of points for the construct? - [ ] Is there a midpoint option for respondents with no opinion? - [ ] Is "don't know" or "refused" available where appropriate? - [ ] Are response options exhaustive and mutually exclusive? - [ ] Are response options presen → Chapter 7: Survey Design: From Questions to Questionnaires
Results:
Garza wins in 79.4% of simulations - Whitfield wins in 20.6% of simulations - Median simulated Garza margin: +4.1 - 80% confidence interval on actual margin: Garza +0.1 to Garza +8.0 - 90% confidence interval: Whitfield +1.3 to Garza +9.4 → Chapter 19: Probabilistic Forecasting and Uncertainty
Retrospective voting
The theory that voters evaluate incumbents or governing parties based on past performance (particularly economic performance) rather than future promises. The key empirical prediction: when the economy is strong, the incumbent party gains votes; when it is weak, the incumbent loses votes. The domina → Appendix G: Glossary of Key Terms in Political Analytics
Revisionism and the Ideology Debate
Converse (1964): most Americans lack ideological constraint — their views on one issue don't predict views on others - Nie, Verba, Petrocik (1979): constraint increased from the 1950s to 1970s — but critics showed this was partly a survey artifact - Lesson: changes in survey instruments can produce → Chapter 11 Key Takeaways: The American Voter and Beyond
S
Sample design:
[ ] Is the sampling method described? (probability vs. nonprobability; stratification design if any) - [ ] If the sample is nonprobability (online opt-in), is it disclosed as such? - [ ] Is the sample size reported for both the full sample and the relevant subgroups? → Chapter 8: Sampling: Who Speaks for the Public?
Sampling error
The random variation in sample statistics that arises because a sample, rather than the entire population, was measured. Sampling error decreases as sample size increases (proportional to 1/√N for simple random samples). Sampling error is the only form of survey error quantified by the *margin of er → Appendix G: Glossary of Key Terms in Political Analytics
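The 1/√N relationship is easiest to see in the margin of error for a proportion. The sketch below uses the standard normal-approximation formula for a simple random sample at 95% confidence; it ignores design effects from weighting or clustering, which enlarge the margin in practice.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a proportion p with sample size n.

    z=1.96 corresponds to a 95% confidence level; assumes simple random sampling.
    """
    return z * math.sqrt(p * (1 - p) / n)


for n in (400, 1000, 1600):
    moe = margin_of_error(0.5, n)
    print(f"n={n}: +/- {100 * moe:.1f} points")
```

Quadrupling the sample size from 400 to 1,600 only halves the margin of error, which is why very large samples yield diminishing returns against sampling error.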
Sampling frame
The list or rule from which a sample is drawn. The quality of a sampling frame determines the extent of *coverage bias* in the resulting sample. Telephone directories excluded unlisted numbers; landline RDD sampling excluded cell-only households; online panels exclude people without internet access. → Appendix G: Glossary of Key Terms in Political Analytics
Sampling frame:
[ ] Is the sampling frame described? (voter file, RDD, ABS, online panel?) - [ ] Is the frame appropriate for the target population? (A poll of likely voters should not use a general adult panel without adjustment) - [ ] Are obvious coverage gaps acknowledged? (cell-phone-only households, non-Englis → Chapter 8: Sampling: Who Speaks for the Public?
Sampling variance in individual polls
captured by sample size and MOE. 2. **Fundamental model uncertainty** — our fundamentals estimates are themselves uncertain. 3. **Systematic polling error** — the possibility that all polls are biased in the same direction by some unknown amount (the mechanism from Chapter 20). 4. **Late movement** → Chapter 21: Building a Simple Election Model (Python Lab)
Satire and parody
No intent to deceive, but potential to be mistaken for genuine news when stripped of satirical markers 2. **Misleading content** — Misleading framing of genuine information (a real video, a real quote, but presented in a context that distorts its meaning) 3. **Imposter content** — Genuine sources im → Chapter 26: Misinformation, Disinformation, and Fact-Checking
Search advertising
paid placement in response to specific search queries—is often overlooked in discussions of political advertising but is strategically significant. Search advertising reaches voters at the moment they are actively seeking information about candidates or issues, rather than interrupting them with mes → Chapter 25: Political Advertising: From TV Spots to Targeted Ads
Search trend data
Google Trends, Bing query volumes — has been proposed as a predictor of election outcomes on the theory that people who plan to vote for a candidate will search for that candidate. Some studies have found correlations between search volume and vote share in presidential primaries, where candidate na → Chapter 10: Reading and Evaluating Polls
Seats-votes curve
The historical relationship between a party's national vote share and its seat share in a legislature; typically nonlinear, with higher swing ratios near the 50-50 threshold. → Chapter 22: Down-Ballot and Global Forecasting
Section A: Financial Record
Total campaign contributions received in their most recent race (use FEC or state equivalent) - Top five industry categories of donors - Any individual contributions of $5,000 or more from sources with potential conflicts of interest before the official's relevant body or jurisdiction - Any politica → Chapter 32 Exercises
Section B: Public Statement Record
At least two public statements on a policy issue where you can document a clear position - At least one case where you can find an earlier statement on the same issue that could be characterized as inconsistent (if one exists) → Chapter 32 Exercises
Section C: Official Record
For legislators: three votes that you would flag as potentially difficult to defend in a general election - For executives: two regulatory, enforcement, or budgetary decisions with potential political vulnerability → Chapter 32 Exercises
Selective exposure
The tendency of individuals to seek out and consume information that confirms their existing beliefs and avoid information that challenges them. Selective exposure to partisan media limits the corrective function of new information and contributes to *affective polarization*. While the empirical evi → Appendix G: Glossary of Key Terms in Political Analytics
Senior level (8+ years experience):
Campaign analytics leadership at major national campaign: $150,000-$200,000+ - Polling firm founder/principal: highly variable (ownership stake) - Senior consulting firm partner: $150,000-$250,000+ - Civic tech executive director: $100,000-$160,000 - Federal senior executive service: $175,000+ - Ten → Chapter 41: Careers in Political Analytics
Sensitive data categories
religious affiliation, health information, financial distress indicators, precise location history — receive no special protection under political analytics norms but carry heightened ethical concern. The same data that enables effective targeted communication can enable targeted exploitation of vul → Chapter 38 Key Takeaways: Ethics of Political Analytics
Sensitive topics:
[ ] Have sensitive questions been designed to minimize social desirability bias? - [ ] Have sensitive question techniques (list experiment, RRT) been considered where bias risk is high? - [ ] Is there a "refused" option for highly personal questions? → Chapter 7: Survey Design: From Questions to Questionnaires
Sleeper effect
The phenomenon in which the persuasive impact of a low-credibility message increases over time, because recipients remember the message but forget the source. Implies that negative ads from disreputable sources may be more effective than initially appears. An area of ongoing debate in political comm → Appendix G: Glossary of Key Terms in Political Analytics
Social desirability bias
The tendency of survey respondents to report attitudes and behaviors that they believe are socially expected or approved, rather than their true views. Particularly significant for questions about race, immigration, prejudice, voter turnout (over-reporting), candidate support (when one candidate is → Appendix G: Glossary of Key Terms in Political Analytics
Social Identity Theory
People derive self-concept from group memberships and are motivated to view their groups favorably - Party becomes an identity, not just a preference — voting for your party is an expression of who you are - Helps explain in-group favoritism, out-group hostility, and the emotional dimensions of part → Chapter 11 Key Takeaways: The American Voter and Beyond
Social pressure mailer
A type of GOTV direct mail that shows recipients their own and their neighbors' voting histories, found in experiments to produce large turnout effects. → Chapter 30: Field Experiments in Politics
Sociology of forecasting
The social dynamics — reputational incentives, professional culture, media pressures — that shape how forecasters produce and publish predictions, often introducing herding and over-correction biases independent of measurement quality. → Chapter 20: When Models Fail: 2016, 2020, and Beyond
The organization's leadership team (who is featured on their website's About page?) - Job postings (what qualifications are required? what signals does the language send?) - Press coverage (what types of people are quoted representing this organization?) - Any published diversity, equity, and inclus → Chapter 41 Exercises: Careers in Political Analytics
Spatial Model and Its Limits
Downs (1957): rational candidates converge to the median voter; rational voters choose the nearest candidate - Real elections violate key assumptions: multi-dimensionality, strategic ambiguity, non-spatial motivations - Directional theory offers a partial alternative: voters care which side of a cul → Chapter 11 Key Takeaways: The American Voter and Beyond
Spillover
Contamination of the control group through treatment-to-control transmission mechanisms such as social networks or geographic proximity. → Chapter 30: Field Experiments in Politics
Spiral of silence
Elisabeth Noelle-Neumann's theory that individuals who believe their opinion is in the minority will be less likely to express it publicly, out of fear of social isolation. This silencing then further reduces the perceived prevalence of the minority view, creating a feedback loop. In political analy → Appendix G: Glossary of Key Terms in Political Analytics
Split sample
An experimental design embedded in a survey in which different randomly assigned subgroups of respondents receive different versions of a question or item set. Split samples allow researchers to test the effect of question wording, framing, or information on attitude expression while controlling for → Appendix G: Glossary of Key Terms in Political Analytics
Statistical power
The probability that an experiment will detect a true treatment effect if one exists; depends on sample size, effect size, and outcome variance. → Chapter 30: Field Experiments in Politics
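The dependence on sample size, effect size, and outcome variance can be made concrete with a rough calculation. The sketch below is a normal-approximation power estimate for a two-arm experiment with equal group sizes; the function name `power_two_arm` and the example numbers are illustrative assumptions, not taken from the chapter.

```python
from math import sqrt
from statistics import NormalDist


def power_two_arm(effect, sd, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test on a difference in means.

    effect    : true treatment effect (difference in means or proportions)
    sd        : standard deviation of the outcome within each arm
    n_per_arm : units in each of the two arms (assumed equal)
    """
    z = NormalDist()
    se = sd * sqrt(2 / n_per_arm)        # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    shift = abs(effect) / se             # effect measured in standard errors
    # Probability the test statistic lands beyond either critical value.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical GOTV test: 2-point turnout effect, binary outcome near 50%.
for n in (1_000, 5_000, 20_000):
    print(n, round(power_two_arm(0.02, 0.5, n), 2))
```

Under these assumed inputs, power rises with sample size and falls as outcome variance grows, which is the relationship the definition describes.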
statistical weighting
the process of adjusting survey data so that the sample matches the known characteristics of the target population. In the golden age of telephone polling, weighting was a relatively minor adjustment, because high response rates meant the raw sample was reasonably close to the population. With onlin → Chapter 2: A Brief History of Polling and Political Measurement
*Windows*: Run the `.exe` installer. When prompted, check "Add Anaconda to my PATH environment variable" only if you know what you are doing; otherwise, use the Anaconda Prompt application that the installer creates. - *macOS/Linux*: Run `bash Anaconda3--MacOSX-x86_64.sh` from your terminal → Appendix B: Python and Data Toolkit Reference
Step 3: Apply the three tests
**The disclosure test:** Would you be comfortable if the full details of this action appeared in a well-reported news story? If the answer is no, ask why — and whether the reason reflects genuine ethical concern or merely reputational risk. - **The consent of the governed test:** Would the people wh → Chapter 38: Ethics of Political Analytics
Stratified sampling
A sampling procedure in which the population is divided into non-overlapping groups (strata), and separate samples are drawn from each stratum. Stratified sampling increases precision when strata are internally homogeneous and different from each other. In political polling, common strata include ge → Appendix G: Glossary of Key Terms in Political Analytics
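As a minimal sketch of the procedure, the example below divides a list of units into strata and draws a separate simple random sample from each, using proportional allocation. The helper name `stratified_sample`, the `stratum_of` callback, and the urban/rural frame are hypothetical choices for illustration.

```python
import random


def stratified_sample(units, stratum_of, fraction, seed=0):
    """Draw the same sampling fraction independently within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        # Each unit belongs to exactly one stratum (non-overlapping groups).
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(len(members) * fraction))  # proportional allocation
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical frame: 600 urban and 400 rural households.
frame = [(i, "urban" if i < 600 else "rural") for i in range(1000)]
picked = stratified_sample(frame, stratum_of=lambda u: u[1], fraction=0.1)
print(len(picked))  # 100
```

Proportional allocation is only one option; a design could instead oversample small strata and correct with weights.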
straw poll
an informal, unscientific survey conducted by a newspaper or organization, typically by approaching people in public places and asking their candidate preference. The term itself comes from the practice of throwing straw in the air to see which way the wind was blowing. → Chapter 2: A Brief History of Polling and Political Measurement
Structural barriers matter
registration requirements, ID laws, roll purges, and polling place accessibility all affect participation rates, often in ways that are differentially burdensome by race, income, and age. → Chapter 14: Turnout — Who Votes and Why
Structural interventions
expanding hiring networks beyond elite institutions, addressing internship accessibility, investing in mentorship for underrepresented junior analysts — are practical organizational changes, not just aspirational commitments. → Chapter 41 Key Takeaways: Careers in Political Analytics
Structural model
An election forecasting model based on stable, measurable features of the political environment (presidential approval, economic conditions, seat exposure, incumbency) that predict electoral outcomes based on historical patterns. Distinguished from *poll-only* models by the inclusion of non-poll pre → Appendix G: Glossary of Key Terms in Political Analytics
Super PAC
A type of political action committee that may raise unlimited funds from corporations, unions, and individuals, but may not contribute to or coordinate directly with candidates or political parties. Super PACs conduct "independent expenditures" — advertising and organizing activities conducted indep → Appendix G: Glossary of Key Terms in Political Analytics
Support score
An individual-level model estimate of a voter's probability of supporting a specific candidate or party, expressed on a 0–100 scale. Support scores are generated by applying predictive models (trained on surveys and past voting data) to individual voter records. The primary targeting tool in modern → Appendix G: Glossary of Key Terms in Political Analytics
Suppression analytics
using data to discourage specific voters from participating — represents the clearest application of the manipulation framework to political strategy. The consent of the governed test applies: voters would not, if they understood what was happening, recognize targeted demobilization as legitimate de → Chapter 38 Key Takeaways: Ethics of Political Analytics
Survey experiment
A research design that uses random assignment of question wording, frames, information, or conditions to measure causal effects of communication on attitudes. Allows causal inference within survey settings. *See also: split sample, RCT.* (Ch. 5) → Appendix G: Glossary of Key Terms in Political Analytics
Swing voter
A voter who does not have a strong attachment to either major party and whose vote choice varies from election to election. In campaigns, swing voters are the primary target of persuasion efforts. The definition and size of the "true" swing voter population is contested: some analysts argue it has s → Appendix G: Glossary of Key Terms in Political Analytics
Synthetic respondents
AI-generated simulated survey responses without real respondents — are least reliable for the local-specific questions that matter most in electoral contexts. Using synthetic respondents as a substitute for actual polling in high-stakes electoral applications is a methodological failure, not an effi → Chapter 40 Key Takeaways: AI, Automation, and the Future of Political Analytics
Systematic sampling
A probability sampling procedure in which every kth unit is selected from a list after a random starting point. For example, to select 100 households from a list of 10,000, choose a random starting position between 1 and 100, then select that household and every 100th household after it. Systematic sampling approximates simpl → Appendix G: Glossary of Key Terms in Political Analytics
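The household example can be sketched in a few lines. The function name `systematic_sample` is an assumption for illustration, and the sketch assumes the list length is at least the desired sample size; positions are 0-based, so a start in 0..99 corresponds to the 1..100 in the text.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select n units by taking every k-th unit after a random start."""
    k = len(frame) // n                  # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds list length")
    start = random.Random(seed).randrange(k)   # random start in [0, k)
    return [frame[start + i * k] for i in range(n)]


households = list(range(10_000))         # stand-in for an address list
chosen = systematic_sample(households, 100, seed=1)
print(len(chosen), chosen[1] - chosen[0])  # 100 100
```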
T
Targeting
The systematic allocation of campaign contact resources toward specific voters based on their estimated support, turnout propensity, and persuadability. → Chapter 29: Voter Targeting and Microtargeting
Americans are living in increasingly partisan communities, partly through self-selection into different types of places - The red-and-blue map overstates Republican geographic dominance by weighting area rather than population - Geographic sorting reduces cross-partisan contact, reinforces affective → Chapter 12 Key Takeaways: Partisanship, Polarization, and Sorting
The Calibration Verdict:
Forecasters who gave Clinton 98-99% win probability experienced a roughly 50-70-sigma event (essentially impossible under their models). Their models were severely miscalibrated. - Forecasters who gave Clinton 85% experienced a roughly 3-sigma event — unusual but within the range of plausible outcom → Case Study 19-1: The 2016 Presidential Forecast and the Correlated Error Problem
The Education Realignment
College-educated white voters have moved toward Democrats; non-college white voters have moved toward Republicans - This is distinct from class voting: it reflects cultural identity, institutional exposure, and economic attribution, not income alone - The realignment is most pronounced among white v → Chapter 13 Key Takeaways: Demographics and the Electorate
The Endogeneity Debate
Is party ID prior to and causative of issue positions, or do issue positions also shape party ID? - Modern evidence suggests a recursive relationship: party ID is stable but not immutable; it is updated by political experience - Partisan sorting (covered in Ch. 12) makes this debate harder to resolv → Chapter 11 Key Takeaways: The American Voter and Beyond
The field has a significant diversity deficit
disproportionately white, disproportionately male, disproportionately from elite institutions. This is not incidental; it reflects informal hiring networks, internship accessibility barriers, and mentorship gaps that systematically disadvantage candidates from underrepresented backgrounds. → Chapter 41 Key Takeaways: Careers in Political Analytics
The four domains of ethical concern
privacy, manipulation, representation, and accountability — provide a systematic structure for analyzing ethical dilemmas in political analytics. Real situations typically implicate more than one domain simultaneously; the framework's value is in making the relevant considerations explicit rather th → Chapter 38 Key Takeaways: Ethics of Political Analytics
The Four Layers of the Electorate
Citizen voting-age population (CVAP) → registered voters → likely voters → actual voters - Each filter (naturalization, registration, turnout) reduces participation disproportionately for younger, lower-income, and minority voters - Demographic change in the CVAP translates into electoral change onl → Chapter 13 Key Takeaways: Demographics and the Electorate
The Gender Gap
Women vote approximately 10-15 points more Democratic than men in recent presidential elections - The gap emerged primarily from men moving Republican (1980s onward), not women moving dramatically Democratic - Largest among unmarried women, younger women, college-educated women - Non-college and old → Chapter 13 Key Takeaways: Demographics and the Electorate
The hallucination problem
LLMs generating fluent, confident text that is factually incorrect — is a fundamental limitation for political communications use. LLMs are fluency machines, not accuracy machines. Robust human review and fact-checking infrastructure is an ethical requirement for any LLM deployment in political cont → Chapter 40 Key Takeaways: AI, Automation, and the Future of Political Analytics
The human analysts' disagreement:
Sam Harding: "The speech is using people-centric vocabulary, but there's no anti-elite critique. The senator is praising Washington's ability to deliver for communities, not attacking Washington as corrupt. The frame is 'government can work for you,' which is the opposite of populism." - A second an → Case Study 37.2: The Gap Between Map and Territory — Three Cases of Classifier Failure
The Polarization Dataset
A longitudinal synthetic public opinion dataset tracking ideological sorting, issue positions, and media consumption across demographic groups. → Political Analytics: From Populism to Polling
The Three Types of Polarization
**Ideological polarization**: actual movement of policy opinions toward extremes (mixed evidence among voters; clear evidence among elites) - **Affective polarization**: growing emotional hostility toward the out-party (strong evidence; growing faster than ideological polarization) - **Elite polariz → Chapter 12 Key Takeaways: Partisanship, Polarization, and Sorting
Thin ideology
A concept developed by political theorist Michael Freeden to describe ideologies (like populism) that occupy only part of the ideological spectrum and require combination with a "thick" or full ideology (socialism, nationalism, liberalism) to specify complete policy positions. Thin ideologies are fl → Appendix G: Glossary of Key Terms in Political Analytics
Third-party consumer data
Data about individuals' purchasing behavior, lifestyle, and demographic characteristics purchased from commercial data brokers and used to enrich voter file records. → Chapter 29: Voter Targeting and Microtargeting
Topline
The summary document of a poll's results, presenting all questions and response distributions for the full sample and key subgroups. Professional polling organizations publish topline documents alongside their press releases. Analysis of a poll without access to the topline — relying only on press c → Appendix G: Glossary of Key Terms in Political Analytics
Tracking poll
A poll conducted continuously (daily or every few days) using a rolling sample that is averaged over a defined window (typically 3–7 days). Each day, a new set of interviews is added and the oldest set is dropped, providing a rolling estimate of public opinion. Used by campaigns to monitor opinion t → Appendix G: Glossary of Key Terms in Political Analytics
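The add-one-day, drop-one-day mechanics can be illustrated as follows. This is a sketch under one assumption: interviews inside the window are pooled, so days with more interviews count for more. A pollster could instead average the daily percentages. The function name and the daily counts are made up.

```python
def rolling_estimate(daily, window=3):
    """Pooled candidate share over a sliding window of days.

    daily: list of (supporters, interviews) tuples, oldest day first.
    """
    estimates = []
    for end in range(window, len(daily) + 1):
        chunk = daily[end - window:end]      # newest day in, oldest day out
        supporters = sum(s for s, _ in chunk)
        interviews = sum(n for _, n in chunk)
        estimates.append(supporters / interviews)
    return estimates


# Hypothetical nightly results: (supporters, completed interviews).
nightly = [(50, 100), (60, 100), (40, 100), (70, 100)]
print([round(x, 3) for x in rolling_estimate(nightly)])  # [0.5, 0.567]
```

Because consecutive windows share most of their interviews, day-to-day movement in the rolling figure is smoother than movement in the nightly samples.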
Transfer rates
In two-round or preferential voting systems, the proportion of first-choice voters for an eliminated candidate who transfer their support to each remaining candidate. → Chapter 22: Down-Ballot and Global Forecasting
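A small sketch of how transfer rates feed a second-round projection: the eliminated candidate's first-round share is split among the remaining candidates by the assumed rates, and any remainder is treated as abstaining. The names and all numbers are hypothetical.

```python
def apply_transfers(first_round, eliminated, transfer_rates):
    """Project the next round by reallocating an eliminated candidate's vote.

    first_round    : {candidate: vote share or count}
    transfer_rates : {remaining candidate: share of the eliminated
                      candidate's voters who move to them}
    Rates may sum to less than 1; the remainder is assumed to abstain.
    """
    projected = {c: v for c, v in first_round.items() if c != eliminated}
    pool = first_round[eliminated]
    for candidate, rate in transfer_rates.items():
        projected[candidate] += pool * rate
    return projected


round_one = {"A": 40.0, "B": 35.0, "C": 25.0}
print(apply_transfers(round_one, "C", {"A": 0.3, "B": 0.6}))
```

In this made-up case the first-round leader A is overtaken, because C's voters break two-to-one for B.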
transparency ratings
how much of their methodology they disclose. A pollster who publishes full crosstabs, discloses their likely voter screen, explains their weighting procedure, and provides complete question wording gets credit for transparency even if their historical sample is small. Opacity, conversely, is a red f → Chapter 17: Poll Aggregation: From RealClearPolitics to FiveThirtyEight
Turnout gap
Differences in turnout rates between demographic groups. The most politically consequential turnout gaps in American elections are between younger and older voters (older voters turn out at substantially higher rates), between high- and low-income voters, and (in recent elections) between college-ed → Appendix G: Glossary of Key Terms in Political Analytics
Turnout model
A predictive model that estimates the probability that an individual registered voter will cast a ballot in a specific upcoming election. Inputs typically include past voting history (the single strongest predictor), party registration, age, geographic location, and consumer data indicators. Turnout → Appendix G: Glossary of Key Terms in Political Analytics
[ ] Is the margin of error reported? - [ ] Does the MOE appropriately account for the design effect (if stratified or clustered sampling was used)? - [ ] Are subgroup results reported with their own (larger) margins of error? - [ ] Does the methodology statement acknowledge nonsampling sources of un → Chapter 8: Sampling: Who Speaks for the Public?
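To make the design-effect and subgroup items concrete, the sketch below uses the standard large-sample formula for a proportion, inflating the variance by a design effect `deff`. The function name and the sample sizes are illustrative assumptions; the formula covers sampling error only, not the nonsampling sources the checklist also mentions.

```python
from math import sqrt


def margin_of_error(p, n, deff=1.0, z=1.96):
    """Approximate 95% margin of error for a proportion.

    deff is the design effect: the ratio of the design's variance to
    that of a simple random sample of the same size.
    """
    return z * sqrt(deff * p * (1 - p) / n)


print(round(margin_of_error(0.5, 1000), 3))            # 0.031
print(round(margin_of_error(0.5, 1000, deff=1.5), 3))  # 0.038
print(round(margin_of_error(0.5, 200), 3))             # 0.069, a subgroup
```

The subgroup line shows why crosstab results need their own, larger margin: a 200-person subgroup carries more than twice the margin of the full 1,000-person sample.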
Undecided voter
A survey respondent who does not express a preference between candidates when asked the *ballot test* question. Depending on the poll, undecideds may be pushed toward a preference ("If you had to choose..."), left undecided, or further probed for leanings. Undecideds are not equivalent to swing vote → Appendix G: Glossary of Key Terms in Political Analytics
Understanding sector differences
not just in pay but in culture, professional norms, ethical environment, and career trajectory — is prerequisite to making informed choices. The right sector depends on your values, risk tolerance, financial situation, and what you find intrinsically motivating. → Chapter 41 Key Takeaways: Careers in Political Analytics
Universe (targeting)
In campaign parlance, a defined list of voters who meet specified criteria and are the target of a particular campaign activity (GOTV, persuasion, fundraising, opposition targeting). Defining universes — who is included, what criteria are used, how large the list is — is the central operational task → Appendix G: Glossary of Key Terms in Political Analytics
Universe segmentation
The analytical process of dividing the electorate into distinct groups based on support, turnout propensity, and persuadability for targeting purposes. → Chapter 28: The Modern Data-Driven Campaign
Urban-Rural-Suburban Geography
Urban density is an independent predictor of Democratic voting (controlling for other demographics) - Suburbs are not monolithic: inner suburbs trend Democratic; outer suburbs/exurbs trend Republican; middle suburbs are the battleground - The visual dominance of rural areas on political maps is a ge → Chapter 13 Key Takeaways: Demographics and the Electorate
Use interactive visualizations when:
The audience will explore the data themselves (not just receive a pre-constructed argument) - Multiple questions need to be answered from the same dataset and you can't anticipate which ones - The data has multiple dimensions and you want to let users slice by any of them - The visualization will be → Chapter 16: Visualizing the Electorate (Python Lab)
Use static visualizations when:
The audience is receiving a presentation and won't be able to interact - The key message is specific and should not be obscured by exploratory complexity - The visualization will be printed, included in a PDF, or embedded in a slide deck - The design needs to be precisely controlled and reproducible → Chapter 16: Visualizing the Electorate (Python Lab)
Using inappropriate fundamentals
historical correlations that no longer hold due to structural changes. 3. **Ignoring candidate-specific factors** that deviate from the partisan baseline. 4. **Applying the model outside its scope conditions** — too few polls, structural breaks, multi-candidate dynamics. 5. **Treating the model as c → Chapter 21 Key Takeaways: Building a Simple Election Model
valence
the quality dimension of politics. In valence models (associated with Donald Stokes's work), the most important political competition is not between left and right but between competence and incompetence, corruption and integrity, management and mismanagement. Voters all want the same things (econom → Chapter 11: The American Voter and Beyond
Validity
The degree to which a survey measure actually captures the construct it is intended to measure. A measure can be *reliable* (consistent) but not *valid* (not measuring the intended thing). Forms of validity include: *face validity* (the measure looks like what it's supposed to measure), *content val → Appendix G: Glossary of Key Terms in Political Analytics
VAN platform costs
State Democratic Party VAN access: $8,000-15,000 for the campaign cycle (varies by state and negotiation) - Additional modules (data integration, texting): $5,000-12,000 - Field director and data staff time for VAN administration: significant; typically 0.5 FTE during peak campaign period → Capstone 3 Data Appendix: The Campaign Analytics Plan
Voter file
A database of registered voters maintained by state election authorities, containing each voter's name, address, date of birth, party registration (in states with party registration), and voting history in past elections. State voter files are publicly available (at varying costs and under varying t → Appendix G: Glossary of Key Terms in Political Analytics
Voter persuasion
Campaign activities designed to change the preferences of persuadable voters who have not yet committed to supporting the campaign's candidate. Persuasion targets are identified via *persuadability scores* and contacted via direct mail, digital advertising, canvassing, or phone banking with messages → Appendix G: Glossary of Key Terms in Political Analytics
past behavior is the strongest predictor of future behavior, which means campaigns' mobilization decisions have consequences beyond the current election. → Chapter 14: Turnout — Who Votes and Why
W
Walter Lippmann
*Public Opinion* (1922): the pseudo-environment - **Philip Converse** — "The Nature of Belief Systems in Mass Publics" (1964): non-attitudes, ideological constraint - **John Zaller** — *The Nature and Origins of Mass Opinion* (1992): RAS model - **Christopher Wlezien / Robert Erikson** — Thermostati → Chapter 6 Key Takeaways
watch time
the total minutes a user spends watching content attributed to a recommendation. This means YouTube favors longer content that viewers actually complete or nearly complete over shorter content that is abandoned. For political campaigns, this creates an unusual algorithmic environment: detailed polic → Chapter 31: Digital Campaigning and Social Media Strategy
Weighting
The statistical adjustment of survey data to correct for imbalances between the sample and the target population. Standard demographic weights correct for overrepresentation of some groups (e.g., college-educated respondents) and underrepresentation of others (e.g., younger respondents). Advanced we → Appendix G: Glossary of Key Terms in Political Analytics
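The simplest form of this adjustment, post-stratification on a single variable, can be sketched as follows: each respondent's weight is the group's population share divided by its sample share. The function name and the education shares are hypothetical; production weighting typically balances several variables at once (for example by raking).

```python
from collections import Counter


def poststrat_weights(sample_groups, population_shares):
    """One weight per respondent: population share / sample share."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


# Hypothetical sample that over-represents college graduates.
respondents = ["college"] * 70 + ["non_college"] * 30
targets = {"college": 0.40, "non_college": 0.60}
weights = poststrat_weights(respondents, targets)
print(round(weights[0], 3), round(weights[-1], 3))  # 0.571 2.0
```

Overrepresented respondents are weighted down and underrepresented ones up, and the weights sum to the original sample size.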
Weighting transparency (weight: 20% of grade):
Met: Variables used for weighting are disclosed (e.g., age, gender, race/ethnicity, education, geography) and the targets for each weight variable are described (e.g., "weighted to 2020 Census estimates for race/ethnicity"). - Partial: Some weighting variables are disclosed but targets are not descr → Capstone 1 Data Appendix: Data Sources, Methodology Notes, and Analysis Steps
Weighting:
[ ] Are the weighting variables and targets described? - [ ] Is the weighting target population appropriate? (Adult population vs. registered voters vs. likely voters) - [ ] Are the weighting sources cited? (Census, voter file, election authority projections) → Chapter 8: Sampling: Who Speaks for the Public?
The research was placed without providing the reporter with the known explanation (the witness issue), which the research team either did not know or did not surface - The framing explicitly invited a corruption inference the evidence did not support - The campaign did not seek expert assessment of → Case Study 32-2: The Pharmaceutical Settlement Story
because demographic analysis is never politically neutral. The decision about which groups to count, how to categorize them, and what to infer from their behavior has real consequences for whether those groups are treated as electoral targets worthy of investment or as afterthoughts. → Chapter 13: Demographics and the Electorate
whose voice counts
Likely voter polls vs. all-adult polls produce different results with different political implications - Arrow's Impossibility Theorem: no aggregation method satisfies all reasonable democratic criteria simultaneously - Transparency about aggregation choices is an ethical, not just methodological, r → Chapter 6 Key Takeaways
Word embedding
A computational technique in natural language processing (NLP) in which words or phrases are represented as dense numerical vectors in a high-dimensional space, such that words with similar meanings are close together in the space. Word embeddings (Word2Vec, GloVe, fastText) enable machine learning → Appendix G: Glossary of Key Terms in Political Analytics
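"Close together" is usually measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings are learned from text and have far more dimensions.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors, invented for this example (not from a trained model).
toy = {
    "tax":   [0.9, 0.1, 0.0],
    "levy":  [0.8, 0.2, 0.1],
    "rally": [0.1, 0.9, 0.3],
}
print(round(cosine_similarity(toy["tax"], toy["levy"]), 2))   # 0.98
print(round(cosine_similarity(toy["tax"], toy["rally"]), 2))  # 0.21
```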
Current best estimate (with confidence interval) - Win probability estimate - Key uncertainties and assumptions - What would change the forecast → Appendix F: Templates and Worksheets