Exercises: The Political Data Ecosystem

Tier 1: Foundational (Comprehension and Recall)

Exercise 3.1: Ecosystem Vocabulary Define each of the following terms in your own words and identify which layer of the five-layer ecosystem framework it belongs to: (a) voter file, (b) American Community Survey, (c) data broker, (d) enriched voter file, (e) open data, (f) civic technology, (g) administrative data, (h) FEC filing.

Exercise 3.2: Source Identification For each of the following pieces of political information, identify the most likely original data source and whether that source is government, academic, media, campaign, or commercial:
- (a) The median household income in a Congressional district
- (b) A voter's party registration and voting history
- (c) The total amount raised by a Senate candidate
- (d) The percentage of Americans who approve of the president's job performance
- (e) A voter's estimated probability of supporting a Democratic candidate
- (f) Precinct-level election results from the most recent presidential election

Exercise 3.3: Access Barriers Create a table listing five types of political data from this chapter, one per row. In three columns alongside each data type, indicate: (a) whether the data is free, fee-based, or proprietary, (b) what technical skills are needed to use it, and (c) one barrier that might prevent an ordinary citizen from accessing or understanding it.

Exercise 3.4: Government Data Scavenger Hunt Visit two of the following websites and spend 15 minutes exploring each: (a) data.census.gov (Census Bureau), (b) fec.gov (Federal Election Commission), (c) bls.gov (Bureau of Labor Statistics). For each site, write a short paragraph describing: what data is available, how easy it is to navigate, and one thing you found surprising or confusing.

Exercise 3.5: The Voter File In your own words, explain the difference between a "raw" state voter file and an "enriched" voter file. What additional information does an enriched file contain? Who produces the enrichment? Who uses it, and for what purpose?

Exercise 3.6: ODA's Mission Summarize OpenDemocracy Analytics' mission in three sentences. Then identify two specific challenges that Adaeze Nwosu faces in pursuing that mission, as described in the chapter.

Tier 2: Analytical (Application and Analysis)

Exercise 3.7: Data Flow Mapping Choose one of the following political data products and map its flow through the five-layer ecosystem framework (raw production, processing, analysis, communication, decision):
- (a) A public poll of the Garza-Whitfield Senate race published by a news organization
- (b) A targeted campaign mailer sent to specific voters based on modeled persuasion scores
- (c) An ODA visualization showing campaign finance data for a competitive race

For each layer, identify the specific data sources, organizations, and processes involved.

Exercise 3.8: Privacy Analysis A registered voter named James discovers that a political campaign has a file containing his name, address, voting history, party registration, estimated income, car type, magazine subscriptions, and a "persuasion score" predicting his likelihood of being influenced by campaign messaging. (a) Which of these data points came from the voter file? (b) Which came from commercial data brokers? (c) Which was generated by the campaign's modeling team? (d) Did James consent to any of this data collection? Discuss the privacy implications.

Exercise 3.9: Comparing Data Sources You want to know how Hispanic voters in the Garza-Whitfield state feel about immigration policy. Compare the strengths and limitations of the following data sources for answering this question: (a) the state voter file, (b) an ANES survey, (c) a Meridian Research Group poll, (d) social media posts by Hispanic users in the state, (e) ACS demographic data. Which would you use, and why?

Exercise 3.10: The Standardization Problem The chapter notes that precinct-level election data is published in "dozens of different formats by hundreds of different county offices." Explain why this lack of standardization is a problem for political analysis. Then propose a realistic strategy for standardizing this data across a single state. What resources would be needed? What obstacles would you encounter?
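
As a starting point for the strategy portion, the sketch below shows one possible shape a standardization step could take: each county's export is translated into a shared schema through a per-county column mapping. Everything in it is an illustrative assumption, including the two county layouts, the column names, the vote counts, and the `standardize` helper; none of it reflects the format of any real county office or a method described in the chapter.

```python
import csv
import io

# Hypothetical county exports: the same information in two different layouts.
COUNTY_A = "Precinct,Candidate,Votes\n001,Garza,412\n001,Whitfield,388\n"
COUNTY_B = "pct_id;cand_name;total\nP-7;GARZA;150\nP-7;WHITFIELD;201\n"

# Hypothetical per-county specs: the delimiter used, plus a mapping from
# each local column name to the shared schema (precinct, candidate, votes).
COUNTY_SPECS = {
    "A": {
        "delimiter": ",",
        "columns": {"Precinct": "precinct", "Candidate": "candidate", "Votes": "votes"},
    },
    "B": {
        "delimiter": ";",
        "columns": {"pct_id": "precinct", "cand_name": "candidate", "total": "votes"},
    },
}


def standardize(raw_text, county):
    """Convert one county's export into rows that follow the shared schema."""
    spec = COUNTY_SPECS[county]
    reader = csv.DictReader(io.StringIO(raw_text), delimiter=spec["delimiter"])
    rows = []
    for record in reader:
        # Rename each local column to its shared-schema name.
        row = {spec["columns"][name]: value for name, value in record.items()}
        row["county"] = county
        # Normalize capitalization and convert vote counts to integers.
        row["candidate"] = row["candidate"].strip().title()
        row["votes"] = int(row["votes"])
        rows.append(row)
    return rows


# Rows from both counties now share one structure and can be analyzed together.
combined = standardize(COUNTY_A, "A") + standardize(COUNTY_B, "B")
for row in combined:
    print(row)
```

Even in this toy version, every new county format requires someone to write and maintain another mapping entry by hand, which is worth keeping in mind when you estimate the resources and obstacles for a full state.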

Exercise 3.11: Information Asymmetry The chapter describes several information asymmetries in the political data ecosystem (campaigns vs. citizens, large vs. small campaigns, national vs. local, data-rich vs. data-poor communities). Choose one asymmetry and analyze it in 500 words. Include: (a) a specific example, (b) the consequences for democratic participation, and (c) one realistic proposal for reducing the asymmetry.

Exercise 3.12: Evaluating ODA's Tools Based on the descriptions of ODA's four core products (Campaign Finance Explorer, Voter Information Portal, District Data Dashboard, Open Election Data Repository), evaluate which product is likely to have the greatest impact on democratic participation and why. Consider factors like audience, usability, uniqueness, and the type of decision the tool supports.

Tier 3: Advanced (Synthesis and Evaluation)

Exercise 3.13: Designing a Data Portal You have been hired to design a political data portal for your state. The portal should make key political data accessible to three distinct audiences: (a) ordinary voters, (b) local journalists, and (c) community organizers. For each audience, identify: (i) what data they most need, (ii) what format would be most useful, (iii) one specific feature of the portal designed for them. Then discuss the trade-offs involved in serving all three audiences with a single tool.

Exercise 3.14: The Ethics of Data Brokerage Write a 750-word essay arguing for or against the following proposition: "The sale of consumer data for political purposes should be regulated as strictly as the sale of financial or health data." Address the strongest counterargument to your position and propose a specific regulatory framework.

Exercise 3.15: Data Ecosystem Comparison Research the political data ecosystem in one country outside the United States (e.g., the United Kingdom, India, Brazil, Germany, South Korea). Write a 500-word comparison with the U.S. ecosystem described in this chapter. Consider: (a) government data availability, (b) campaign data infrastructure, (c) data privacy regulations, (d) the role of civic technology. What does the comparison reveal about the relationship between data infrastructure and democratic culture?

Exercise 3.16: The Feedback Loop The chapter describes the data ecosystem as a cycle, not a pipeline: decisions at Layer 5 generate new data that feeds back into Layer 1. Identify a specific example of this feedback loop in the context of the Garza-Whitfield race and trace the complete cycle. How does the existence of feedback loops complicate the interpretation of political data?

Exercise 3.17: A Grant Proposal for ODA You are Adaeze Nwosu, writing a grant proposal to a civic technology foundation requesting $500,000 over two years. The foundation's priorities include "expanding data access to underserved communities" and "building sustainable civic data infrastructure." Write a two-page proposal that includes: (a) a problem statement grounded in the data gaps described in this chapter, (b) a proposed project with specific deliverables, (c) a plan for reaching underserved communities (addressing Adaeze's concern about building "a beautiful library in a neighborhood where nobody reads"), and (d) a sustainability plan for after the grant period ends.

Exercise 3.18: Adversarial Analysis Consider the political data ecosystem from the perspective of a bad actor, someone who wants to use data to undermine democratic processes. Identify three specific vulnerabilities in the ecosystem as described in this chapter, and for each vulnerability, describe: (a) how it could be exploited, (b) what damage could result, and (c) what safeguards currently exist (or should exist) to prevent exploitation. This exercise is designed to develop your ability to think critically about data security and integrity.

Exercise 3.19: Building the Bridge Adaeze identifies the gap between "technically available" and "practically accessible" data as the central problem ODA was created to solve. Analyze this gap by choosing one specific type of political data (e.g., state-level campaign finance data, precinct-level election results, legislative voting records) and documenting: (a) where the raw data lives, (b) what format it is in, (c) what technical skills are needed to access and analyze it, (d) what an accessible version would look like, and (e) the estimated effort required to create that accessible version. This exercise mirrors the work that ODA does every day.