Learning Objectives
- Explain resource mobilization theory, political opportunity structure theory, and framing theory as complementary frameworks for understanding social movement success
- Describe protest event analysis (PEA) methodology and its major datasets (GDELT, ACLED, Mass Mobilization Project, Crowd Counting Consortium)
- Identify and analyze coverage bias in protest data
- Evaluate evidence for online-to-offline mobilization translation
- Apply social network concepts to activist community analysis
- Analyze the data profiles of Black Lives Matter and climate movement mobilization
In This Chapter
- 35.1 Theoretical Foundations: Why Movements Form and Succeed
- 35.2 Protest Event Analysis: Measuring Collective Action at Scale
- 35.3 Sam Harding's Protest Analytics Work
- 35.4 Social Media as Mobilization Infrastructure
- 35.5 Network Analysis of Activist Communities
- 35.6 The Radicalization Pipeline
- 35.7 Climate Movement Analytics
- 35.8 The Methodological Frontier
- 35.8a The History and Limitations of Protest Event Analysis as a Method
- 35.8b Social Network Analysis of Activist Communities: A Deeper Treatment
- 35.8c Platform Transformations and Movement Organizing: Arab Spring Through Black Lives Matter
- 35.8d The Ethics of Researching Social Movements
- 35.9 Chapter Summary
- 35.9 Movements and Electoral Politics: The Interface Problem
- Key Terms
- Discussion Questions
Chapter 35: Social Movements and Protest Analytics
On May 25, 2020, a Minneapolis police officer knelt on George Floyd's neck for nine minutes and twenty-nine seconds. Within 24 hours, protest erupted in Minneapolis. Within a week, protests had been recorded in all 50 US states and more than 60 countries. Within a month, the Crowd Counting Consortium had logged over 7,750 distinct protest events in the United States alone — making the summer of 2020 the largest protest wave in American history by most quantitative measures.
Sam Harding, data journalist at OpenDemocracy Analytics (ODA), spent that summer building a protest tracker. Sam — who goes by they/them pronouns — had been analyzing social movement data for three years, but nothing had prepared them for the scale, speed, and geographic breadth of what unfolded. "The data was coming in faster than I could process it," Sam told a journalism conference in 2021. "Every methodology question I thought we'd settled — how to count, who to count, what counts as a protest — got stress-tested in real time."
Sam's experience captures the central tension of this chapter: social movements generate data at scale, but the gap between that data and the underlying political reality it claims to represent is vast. Understanding that gap — knowing what the data shows, what it misses, and why — is the essential skill for anyone who wants to analyze collective action empirically.
This chapter develops the theoretical foundations of social movement research, examines the major protest datasets and their methodological choices, analyzes online mobilization and its relationship to offline action, and applies these frameworks to the most significant protest waves of the past decade. Throughout, we keep ODA's analytical work — and the "Who Gets Counted" question at its heart — in focus.
35.1 Theoretical Foundations: Why Movements Form and Succeed
Three theoretical frameworks dominate social movement research, each emphasizing different factors in the formation, growth, and success of collective action. Understanding all three is necessary because they are complementary, not competing.
35.1.1 Resource Mobilization Theory
Resource mobilization theory (RMT), developed by John McCarthy and Mayer Zald in the 1970s, challenged the then-dominant view that social movements were irrational eruptions of collective grievance. RMT argued instead that movements are organized enterprises that succeed or fail based on their access to and management of resources: money, organizational capacity, networks of sympathizers, media access, and leadership talent.
The core insight of RMT is that grievances are necessary but insufficient to explain movement emergence. Grievances severe enough to motivate protest typically existed long before movements actually formed; what changed was the availability of resources to channel those grievances into organized collective action. The civil rights movement, on this account, did not emerge in the 1950s because grievances against Jim Crow suddenly intensified, since those grievances had been present for generations, but because a specific set of resources (Black church infrastructure, HBCU-trained leadership, Northern white liberal philanthropy, NAACP legal capacity) coalesced in a way that made sustained collective action possible.
Key analytical implications: When studying why a movement emerged when it did, look for changes in resource availability, not just changes in grievance intensity. When studying why some movements succeed and others fail, look at organizational structure, financial sustainability, and network connectivity, not just the quality of the cause.
Applications to contemporary movements:
- Black Lives Matter's rapid growth after Ferguson (2014) reflected pre-existing infrastructure: Movement for Black Lives organizations, established activist networks, social media following of movement leaders, and philanthropic support from foundations that had been funding racial justice work for years.
- The Tea Party's emergence (2009–2010) was substantially funded and organizationally supported by established conservative networks (FreedomWorks, Americans for Prosperity); its apparent "grassroots" character coexisted with significant resource mobilization from institutional sources.
35.1.2 Political Opportunity Structure Theory
Political opportunity structure (POS) theory, associated with Sidney Tarrow, Peter Eisinger, and Charles Tilly, emphasizes the importance of the political environment in enabling or constraining collective action. Movements do not form in a vacuum; they form in political contexts that open or close "opportunities" for effective challenge.
Key dimensions of political opportunity include:
- Openness of the political system: More permeable systems (multiple veto points, federalism, weak party discipline) offer more entry points for movement demands
- Electoral volatility: When party coalitions are in flux, movements may find previously closed doors opening as parties compete for new constituencies
- Elite divisions: When powerful actors within the state disagree, movements can find allies inside institutions who are willing to champion their cause
- State capacity for repression: When states are willing and able to repress protest, movements face much higher costs of collective action
The timing puzzle. POS theory is most useful for explaining timing — why movements emerge when they do rather than when grievances were equally intense but political conditions were less favorable. The women's suffrage movement succeeded in the 1910s not primarily because women's grievances intensified but because World War I created political opportunities (women's wartime contribution, international legitimacy concerns) that the pre-war political structure had denied.
Applications to contemporary analysis:
- The Affordable Care Act's passage and the Republican Party's failure to repeal it created political opportunities for healthcare activist movements, visible in the Democratic primary field's responsiveness to Medicare for All demands and in town hall protests against repeal.
- George Floyd's murder occurred during an election year, with a Democratic primary concluding and a general election approaching. That timing created political opportunity: parties competed for Black voter mobilization, and the stakes of the moment were visible in ways they had not been after earlier police killings.
35.1.3 Framing Theory
Framing theory, developed by David Snow and Robert Benford drawing on Erving Goffman's earlier work, focuses on the cultural and cognitive work that movements must do to mobilize support. Movements must construct frames — interpretive schemas that define a problem, assign blame, and suggest solutions — that resonate with potential supporters and broader publics.
Three framing tasks are analytically distinct:
Diagnostic framing: Defining the problem and identifying the agent responsible. "The problem is police violence against Black people, enabled by systemic racism in law enforcement and criminal justice." Diagnostic frames must resonate with the lived experience of potential recruits and the cultural common sense of the broader public.
Prognostic framing: Prescribing solutions. "The solution is defunding/abolishing/reforming police, investing in community-based public safety, eliminating racial disparities in criminal justice." Prognostic frames are often sites of internal movement conflict: different factions within a movement may share a diagnostic frame but disagree sharply about prescriptions.
Motivational framing: Providing the rhetorical "vocabulary of motives" that induces people to act. "This is a moral emergency. History will judge those who stayed silent. Your action matters." Motivational frames connect individual agency to collective necessity.
Frame resonance is the degree to which a frame connects to existing cultural narratives, values, and experiences. Frames that resonate with dominant cultural narratives (American ideals of equality and justice) mobilize more broadly; frames that require accepting radically new worldviews face higher barriers.
📊 Real-World Application: Competing Frames in 2020
The Black Lives Matter movement deployed a diagnostic frame ("police violence is a symptom of systemic racism") and multiple competing prognostic frames (from "reform policing" to "defund the police" to "abolish the police"). Research by political scientists including Omar Wasow showed that different frames, and especially the specific phrase "defund the police", resonated very differently with different audiences. The prognostic frame conflict within BLM was simultaneously a political science framing problem, a communication strategy problem, and a data problem: what did "defund" mean to different audiences, and how did those meanings shape support for the broader movement?
35.2 Protest Event Analysis: Measuring Collective Action at Scale
Protest event analysis (PEA) is the systematic collection of data about individual protest events from media sources, government records, organizational archives, or direct observation, enabling large-scale quantitative analysis of collective action patterns. PEA was developed by Charles Tilly and colleagues in the 1970s and has become the dominant approach to protest data collection.
35.2.1 PEA Methodology
A protest event, in PEA methodology, is a discrete occurrence of collective action involving at least some public display of challenge to authority or claim-making. Coding a protest event involves capturing:
- Date, location, and duration of the event
- Organizer(s): Which movement organizations, if any, organized the event?
- Participants: How many attended? What demographic profile?
- Claims: What demands or messages were expressed?
- Tactics: Peaceful march, sit-in, property destruction, strike, rally, online protest?
- Response: Police presence, arrests, repression, official response?
- Outcomes: Any policy changes, negotiations, or concessions?
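The coding fields above map naturally onto a structured record. The following is a minimal Python sketch, not the schema of any dataset named in this chapter: the class name `ProtestEvent`, its field names, and the example values are all hypothetical, chosen only to illustrate how a coded event might be stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ProtestEvent:
    """One coded protest event. Field names are illustrative, not a standard."""
    event_date: date
    location: str
    duration_days: int = 1
    organizers: List[str] = field(default_factory=list)
    size_low: Optional[int] = None    # lower bound of the crowd estimate
    size_high: Optional[int] = None   # upper bound of the crowd estimate
    claims: List[str] = field(default_factory=list)
    tactics: List[str] = field(default_factory=list)
    arrests: Optional[int] = None     # None means "not reported", not zero
    outcomes: List[str] = field(default_factory=list)

    def size_midpoint(self) -> Optional[float]:
        """Midpoint of the crowd estimate, or None if a bound is missing."""
        if self.size_low is None or self.size_high is None:
            return None
        return (self.size_low + self.size_high) / 2


# Hypothetical example record
event = ProtestEvent(
    event_date=date(2020, 6, 6),
    location="Example City, ST",
    organizers=["Example Neighborhood Coalition"],
    size_low=400,
    size_high=600,
    claims=["police accountability"],
    tactics=["march"],
)
print(event.size_midpoint())  # 500.0
```

Keeping "not reported" (`None`) distinct from zero matters for later analysis: a missing arrest count is a fact about the source, not about the event.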
Data sources for PEA include newspaper archives (historically dominant), social media (increasingly used for recent events), police records, government intelligence files (where accessible), and organizational records. Each source introduces systematic biases that shape what gets coded.
35.2.2 The Major Protest Datasets
GDELT (Global Database of Events, Language, and Tone) uses automated text analysis of global news media to code events including protests. It covers 250+ countries with near-real-time updates and contains hundreds of millions of event records. Its geographic scope and currency are unmatched.
Strengths: Global coverage, real-time, free, enormous scale. Weaknesses: Automated coding introduces significant classification errors; heavy bias toward events covered by Anglophone media; in regions with limited news infrastructure, GDELT dramatically undercounts protest; duplicate event records are common.
ACLED (Armed Conflict Location and Event Data Project) was originally designed for conflict data but has expanded to cover "demonstration" events (riots, protests, mob violence) in over 100 countries. Human coders read source material and code events, producing higher accuracy than automated approaches.
Strengths: Human-coded accuracy, good coverage of conflict-affected regions, granular event-level data. Weaknesses: Originally optimized for conflict, so protest coding has developed unevenly; not all countries are covered equally; proprietary (academic access is free; commercial access is charged).
Mass Mobilization Project focuses specifically on anti-government protest in autocracies and transitional regimes, coding protest events from 1990–2019. Designed to answer questions about when protest succeeds in changing autocratic behavior.
Strengths: Theoretically focused, good for regime-change questions. Weaknesses: Scope limited to anti-government protest; ends in 2019; excludes democracies.
Crowd Counting Consortium (CCC) is a US-focused dataset maintained by researchers Erica Chenoweth (Harvard Kennedy School, previously University of Denver) and Jeremy Pressman (University of Connecticut). Unlike automated systems, CCC uses trained research assistants to read news reports and code protest events, with transparent methodology and regular public releases.
Strengths: High accuracy for US events, transparent methodology, consistent coding rules, tracks crowd size estimates, downloadable and free. Weaknesses: US only; dependent on media coverage, creating the coverage bias problem discussed below.
35.2.3 The Coverage Bias Problem: Who Gets Counted
This is the most important methodological issue in protest data analysis, and it connects directly to the chapter's "Who Gets Counted" theme.
News media are not neutral observers of protest. They cover events based on news values: novelty, conflict, size, proximity to power, visual interest, relevance to existing narratives. This creates systematic patterns in which protests appear in the data record and which do not:
Size bias: Large protests generate coverage; small protests usually don't. This means protest datasets effectively measure "major protest" rather than "all protest." Small, localized, or repeated protests — which may be extremely important for understanding movement infrastructure — are systematically undercounted.
Location bias: Protests in media hubs (large cities, capitals) receive more coverage than protests in rural areas or small towns, even when controlling for event size. A 500-person protest in Washington DC is more likely to be covered than a 500-person protest in rural Mississippi.
Repertoire bias: Some protest tactics are considered more "newsworthy" than others. Dramatic confrontations, property destruction, and civil disobedience arrests generate coverage; legal marches, letter-writing campaigns, and sustained vigils often do not, even when they represent significant mobilization.
Racial and political bias: Research by Davin Phoenix and others demonstrates that protests by Black, Indigenous, and minority communities receive less coverage per event than equivalent protests by white communities. In the data record, this means minority-community protest is systematically underrepresented.
The implication for analysis: When Sam Harding at ODA analyzes protest data, every statistical finding is conditional on the coverage bias of the underlying source. A finding that "protest in urban areas is more frequent" may primarily reflect urban media coverage density rather than actual differences in protest frequency. A finding that "right-wing protests have increased" may partly reflect changing media interest in that repertoire rather than actual changes in mobilization. Responsible protest data analysis always includes a coverage bias section.
⚠️ Common Pitfall: Treating Coverage as Ground Truth
Perhaps the most dangerous mistake in protest data analysis is treating newspaper-based event counts as direct measures of protest activity. They are measures of reported protest activity, which is a function of both actual protest and media coverage decisions. Studies that use GDELT or newspaper archives without bias correction are measuring, in part, changes in media behavior rather than changes in collective action. Always interrogate your data source: what systematic factors determine whether an event in this database exists at all?
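The idea that reported counts are a function of both actual protest and coverage decisions can be made concrete with a small calculation. The sketch below is purely illustrative: the three strata, their event counts, and the coverage probabilities are invented numbers, not estimates from any dataset. It multiplies each stratum's actual count by an assumed probability of coverage and compares each stratum's share of actual events with its share of the expected reported record.

```python
# Hypothetical strata: (label, actual events, assumed probability of coverage).
# None of these numbers come from a real dataset.
STRATA = [
    ("large urban", 100, 0.90),
    ("small urban", 300, 0.40),
    ("small rural", 300, 0.15),
]


def expected_reported(actual: int, p_cover: float) -> float:
    """Expected number of events that reach the media record."""
    return actual * p_cover


def shares(counts: dict) -> dict:
    """Convert counts per stratum into shares of the total."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


actual_counts = {label: n for label, n, _ in STRATA}
reported_counts = {label: expected_reported(n, p) for label, n, p in STRATA}

actual_shares = shares(actual_counts)
reported_shares = shares(reported_counts)

for label in actual_counts:
    print(f"{label:12s} actual share {actual_shares[label]:.2f}  "
          f"reported share {reported_shares[label]:.2f}")
```

Under these assumed probabilities, small rural events make up about 43 percent of actual events but only about 18 percent of the expected reported record, even though nothing about the protests themselves differs between the two columns.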
35.3 Sam Harding's Protest Analytics Work
Sam Harding joined OpenDemocracy Analytics after a previous role at a regional newspaper, where they had covered city hall and noticed that protests outside city hall were covered only when they achieved a certain size threshold — smaller, more sustained efforts by neighborhood organizations were systematically ignored in the paper's coverage. "I was creating the bias in real time," Sam explains. "I got interested in understanding what I was missing."
At ODA, Sam has developed a multi-source approach to protest tracking that attempts to mitigate coverage bias through triangulation.
35.3.1 ODA's Multi-Source Methodology
ODA's protest tracker draws from four source categories:
- Major newspaper archives (national and large regional papers): captured through keyword search and automated classification, providing broad coverage of large events.
- Local and community media (alternative weeklies, ethnic-community newspapers, neighborhood news sites): captured through a network of volunteer stringers and targeted web scraping. This source captures smaller and minority-community events systematically missed by major media.
- Social media (primarily Twitter/X and Facebook public posts, Instagram when accessible): monitored using event-specific hashtags and location data. Provides near-real-time event detection but requires manual verification to distinguish actual protest events from discussion, news sharing, and counter-movement activity.
- Organizational self-reporting: ODA has relationships with approximately 400 US civil society organizations that share event data directly. This is the most accurate source for events those organizations hold but the most biased toward organizations with ODA relationships.
For each event, ODA coders attempt to reconcile information across sources, producing a consensus event record with confidence levels for each data field (size estimate, claims, organizer identity, etc.). Events with single-source documentation are flagged as lower confidence; events with 3+ source confirmation are flagged as higher confidence.
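The flagging rule in the paragraph above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than ODA's actual implementation: the function name `confidence_flag` and the category identifiers are hypothetical, and because the text does not say how two-source events are treated, the sketch labels them "intermediate" as an assumption.

```python
# The four source categories described above; the identifiers are hypothetical.
SOURCE_CATEGORIES = frozenset(
    {"major_newspaper", "local_media", "social_media", "org_self_report"}
)


def confidence_flag(sources):
    """Flag an event record by how many distinct source categories document it."""
    sources = set(sources)
    unknown = sources - SOURCE_CATEGORIES
    if unknown:
        raise ValueError(f"unknown source categories: {sorted(unknown)}")
    if not sources:
        raise ValueError("an event needs at least one source")
    if len(sources) == 1:
        return "lower"
    if len(sources) >= 3:
        return "higher"
    # Assumption: the text does not say how two-source events are flagged.
    return "intermediate"


print(confidence_flag(["social_media"]))  # lower
print(confidence_flag(["major_newspaper", "local_media", "org_self_report"]))  # higher
```

Counting distinct categories rather than raw documents is a deliberate choice in this sketch: ten tweets about one event are still a single kind of evidence.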
35.3.2 The 2020 Uprising Analysis
Sam's analysis of the George Floyd protests provides a case study in responsible protest data analysis.
Scale and spread: ODA's dataset for the period May 25–August 31, 2020 contains 9,847 protest events in the United States, substantially more than the Crowd Counting Consortium's count for the same period (approximately 7,750). The discrepancy reflects ODA's additional local and social media sources capturing smaller events that CCC's newspaper-based methodology missed.
Geographic distribution: Contrary to narratives that portrayed the protests as primarily urban phenomena, ODA's data shows protest events in counties representing 96.8 percent of the US population, including 2,143 counties that had no record of a Black Lives Matter-related protest event in the previous five years. This geographic spread was enabled by social media mobilization and the networked structure of existing activist communities.
Racial demographic composition: Using a combination of news reports, organizational data, and survey research (particularly the Kaiser Family Foundation's polling), Sam's analysis estimates that approximately 15–26 million US adults participated in a protest event in the summer of 2020, the largest single-movement participation level in US history. Demographic analysis suggests the protests were notably multiracial, with white participants accounting for roughly 40–50 percent of attendees in many events outside the South.
Coverage bias analysis: Comparing ODA's local-source data to major-newspaper-only data reveals significant disparities. Major newspapers captured approximately 68 percent of events that ODA documented, with the gap concentrated in: (1) small cities and rural areas (major newspapers captured ~45 percent of these); (2) events organized by minority-community organizations rather than nationally branded organizations; (3) events without arrests or notable incidents.
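A capture-rate comparison of this kind reduces to a grouped proportion. The sketch below uses invented toy records, with counts chosen so that the resulting rates echo the percentages quoted above; the underlying event counts are hypothetical and are not ODA's data. Each record is a pair of a stratum label and a flag for whether the event also appeared in a major newspaper.

```python
from collections import defaultdict

# Hypothetical records: (stratum, also captured by a major newspaper?).
# The counts are invented so the rates echo the figures in the text.
events = (
    [("large city", True)] * 25
    + [("large city", False)] * 5
    + [("small city or rural", True)] * 9
    + [("small city or rural", False)] * 11
)


def capture_rates(records):
    """Share of events in each stratum that major newspapers also captured."""
    totals = defaultdict(int)
    captured = defaultdict(int)
    for stratum, in_major in records:
        totals[stratum] += 1
        captured[stratum] += int(in_major)
    return {stratum: captured[stratum] / totals[stratum] for stratum in totals}


rates = capture_rates(events)
overall = sum(in_major for _, in_major in events) / len(events)

print(f"overall capture rate: {overall:.2f}")
for stratum, rate in rates.items():
    print(f"{stratum}: {rate:.2f}")
```

Reporting the stratum-level rates alongside the overall rate is the point of the exercise: a single headline figure hides where the missing events are concentrated.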
🔗 Connection to Chapter 34: Movement and Populism Overlap
The Black Lives Matter protests and the populist right's response to them illustrate how social movements and populism interact. BLM's diagnostic frame ("systemic racism") activated Whitfield-style populist counter-framing: "the radical left" and "Marxist agitators" versus "law-and-order Americans." Research by Omar Wasow and others suggests that violent elements in or near protest events (whether initiated by protesters, counter-protesters, or police) activated conservative backlash that hurt Democratic electoral prospects in adjacent congressional districts. The protest data here is not just descriptive; it becomes politically consequential through the frames that attach to it.
35.4 Social Media as Mobilization Infrastructure
The role of social media in social movement mobilization is one of the most debated questions in contemporary political sociology. Claims range from "social media caused the Arab Spring" to "social media produces slacktivism and weakens actual movements." The evidence is more nuanced than either extreme.
35.4.1 The Online-Offline Translation Problem
The central analytical question is whether online activity translates into offline action. Signing an online petition, sharing a protest announcement, tweeting in solidarity — these activities generate data that is easy to count but may have little effect on whether people actually show up in the streets.
Evidence for translation:
- Philip Howard and Muzammil Hussain's research on the Arab Spring shows that Twitter activity in Egypt and Tunisia in the weeks before uprisings began predicted the geographic spread of subsequent protests, even when controlling for pre-existing political grievance levels.
- Zeynep Tufekci's research on contemporary movement organizing shows that social media dramatically reduces the organizing costs that resource mobilization theory identified as the key barrier to movement formation: a Facebook event can mobilize thousands without any of the infrastructure (phone trees, mailing lists, meeting spaces) that earlier movements required.
- Studies of the Ferguson protests, the Women's March, and the 2020 BLM protests all show rapid translation from viral social media moments to mass street protest, with timeline compressions that would have been impossible in pre-social-media organizing environments.
Evidence against simple translation:
- Tufekci herself notes a paradox: social media lowers the organizational costs of getting people into the streets, but it may also lower the organizational strength that sustained campaigns require. A movement that can assemble 100,000 people through viral mobilization but has no organizational structure to follow up is a "network without power": impressive in a crisis, fragile over time.
- The Arab Spring evidence is disputed: many countries with high Twitter activity did not experience uprisings, and countries with low Twitter penetration (Libya, Syria) did. Social media may amplify movements already underway more than it creates movements from scratch.
- Research on online petition signing shows that online participation often substitutes for, rather than complements, offline participation; the "slacktivism" hypothesis has partial empirical support for certain types of movement activity.
35.4.2 Platform-Specific Dynamics
Different platforms have different properties that shape their role in movement organizing:
Twitter/X: Historically the dominant platform for political movement communication, with strengths in rapid information spread, cross-network bridging, and media visibility (journalists over-index on Twitter). Weaknesses include bot manipulation, harassment campaigns, and, since Elon Musk's acquisition of the platform, instability and selective enforcement that disproportionately affect activist communities.
Facebook: Dominant for local and community organizing, event creation, and reaching older demographics. The Facebook event system has been a primary mobilization mechanism for everything from the Women's March to anti-vaccine campaigns. Weaknesses: algorithmic amplification of engagement-optimized content (often inflammatory), opaque group infrastructure that enables both organizing and radicalization.
WhatsApp and Telegram: End-to-end encrypted messaging enables both secure movement organizing and the spread of unverifiable information. Brazilian social movements and anti-government protests have used both platforms extensively. The encryption makes these platforms difficult to monitor for analytical purposes — meaning protest data based on public social media significantly undercounts mobilization that happens in private encrypted channels.
TikTok: Increasingly important for youth mobilization; algorithmically advantaged for content discovery across networks. The 2023 pro-Palestinian campus mobilizations used TikTok extensively for both recruitment and narrative framing. Limited public data access makes systematic analysis difficult.
🔵 Debate: Does Social Media Weaken Movements?
Zeynep Tufekci's influential argument in Twitter and Tear Gas (2017) suggests that social media creates a paradox for movements: by drastically lowering the cost of mobilization, it also lowers the organizational capacity that comes from overcoming those costs. Movements that build slowly through difficult organizing develop the internal governance, conflict resolution, and strategic decision-making capacity to sustain campaigns over time. Movements that scale quickly through viral social media may lack this capacity, appearing powerful while remaining organizationally fragile. The Arab Spring uprisings produced few durable political outcomes; movements built over decades of organizing (the Indian independence movement, the US civil rights movement, the Polish Solidarity movement) achieved fundamental change. Is this causal? Or does it reflect the political contexts in which these movements operated?
35.5 Network Analysis of Activist Communities
Social network analysis (SNA) provides tools for mapping and measuring the structural properties of activist communities — who is connected to whom, how information flows, which actors are central, and where the movement's organizational vulnerabilities lie.
35.5.1 Key Network Concepts for Movement Analysis
Nodes and edges: In a movement network, nodes are typically individual activists or organizations; edges represent communication, coordination, or resource flows between them.
Degree centrality: The number of direct connections a node has. High-degree nodes are the most connected individuals or organizations in the movement — often (but not always) leaders or coordinators.
Betweenness centrality: The frequency with which a node appears on the shortest path between other pairs of nodes. High-betweenness nodes are "bridges" between different parts of the network — critical for information flow across organizational boundaries. Research shows high-betweenness activists are disproportionately important for coordination and coalition building.
Network density: The proportion of possible connections that actually exist. Dense networks communicate more efficiently but may also be more insular, limiting the movement's ability to reach beyond its existing base.
Structural holes: Gaps in the network — pairs of nodes that are not connected and have no mutual connections. Activists who can bridge structural holes gain influence by serving as intermediaries between disconnected communities.
Clustering coefficient: The degree to which a node's neighbors are also connected to each other. High clustering produces tight communities (good for solidarity and sustained engagement) but can limit information access and cross-community coordination.
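These measures are easy to compute on a small example. The sketch below uses only the Python standard library and an invented seven-node network (two tight triangles joined by a single bridge node `X`); the node names and function names are hypothetical. Degree centrality is normalized by the number of other nodes, density is the share of possible edges present, and betweenness is computed with Brandes' algorithm for unweighted graphs, reported as a raw count of shortest paths passing through each node. In practice a graph library would normally be used instead of hand-written functions.

```python
from collections import deque

# Hypothetical network: two tight clusters (A-B-C and D-E-F) joined by bridge X.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"),
         ("D", "E"), ("D", "F"), ("E", "F"),
         ("C", "X"), ("X", "D")]


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree divided by the number of other nodes."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def density(adj):
    """Share of possible undirected edges that actually exist."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def clustering(adj, v):
    """Share of pairs of v's neighbors that are themselves connected."""
    nbrs = sorted(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return links / (k * (k - 1) / 2)


def betweenness(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted graph)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of distance from s
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):        # accumulate from farthest node back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice


adj = build_adjacency(EDGES)
bc = betweenness(adj)
print("density:", round(density(adj), 3))
print("degree centrality of C:", degree_centrality(adj)["C"])
print("betweenness of X:", bc["X"])
print("clustering of A:", clustering(adj, "A"), "and of X:", clustering(adj, "X"))
```

In this toy network the bridge node `X` has only two connections, yet it has the highest betweenness (9.0) because every shortest path between the two clusters runs through it. Its clustering coefficient is 0.0 because its two neighbors are not connected to each other, which is the signature of a node spanning a structural hole.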
35.5.2 What Network Structure Predicts
Research by Sidney Tarrow, Doug McAdam, and network-oriented movement scholars suggests several empirical regularities:
- Movements with more decentralized networks are more resilient to repression: removing any single node (arresting a leader, disrupting an organization) does not disconnect the network. Centralized movements (high-degree, high-betweenness nodes concentrated in a few individuals) are more vulnerable.
- Inter-organizational ties predict coalition formation. Movements whose component organizations have prior working relationships form effective coalitions more quickly than movements whose organizations are encountering each other for the first time.
- Network bridges predict geographic spread. Movements spread geographically through the pre-existing personal networks of activists who are connected across geographic communities — college friends in different cities, relatives in different states, conference connections across regions.
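The resilience claim in the first regularity above can be illustrated with a toy comparison. The sketch below is a simplified assumption-laden model, not an empirical test: it builds a fully centralized "star" network and a decentralized "ring" network of the same size (both hypothetical), removes the most connected node from each, and measures the largest group of nodes that can still reach one another.

```python
def star(n):
    """Centralized network: node 0 is a hub connected to every other node."""
    adj = {i: set() for i in range(n)}
    for i in range(1, n):
        adj[0].add(i)
        adj[i].add(0)
    return adj


def ring(n):
    """Decentralized network: each node is connected to its two neighbors."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        adj[i].add((i + 1) % n)
        adj[(i + 1) % n].add(i)
    return adj


def largest_component_after_removal(adj, removed):
    """Size of the largest connected group once one node is taken out."""
    remaining = set(adj) - {removed}
    seen = set()
    best = 0
    for start in remaining:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w in remaining and w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best


def most_connected(adj):
    """Node with the highest degree (first one found if there is a tie)."""
    return max(adj, key=lambda v: len(adj[v]))


for name, network in [("star", star(10)), ("ring", ring(10))]:
    hub = most_connected(network)
    print(name, largest_component_after_removal(network, hub))
```

With ten nodes each, removing the star's hub leaves every remaining node isolated (largest group of 1), while removing any node from the ring leaves the other nine still connected. Real movement networks sit between these extremes, so the comparison shows the direction of the effect rather than its size.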
35.5.3 ODA's Network Tracking
Sam Harding's network analysis work at ODA uses Twitter/X follower and mention data (collected before API access changes in 2023) to map activist community structure. For the 2020 BLM protests, the analysis revealed:
- A highly decentralized national network structure, with hundreds of local Black Lives Matter chapters and allied organizations independently connected to a relatively small national coordination layer
- High betweenness among a set of "movement bridges" — organizations like the Movement for Black Lives policy table — that connected otherwise separate local networks
- Rapid network expansion during June 2020, with thousands of new accounts entering the movement's network perimeter and existing high-betweenness actors becoming critical bottlenecks for information flow
The network analysis also revealed vulnerabilities: the small number of high-betweenness bridging organizations meant that coordinated harassment campaigns targeting those organizations could significantly disrupt network-wide information flow — which is precisely what happened in August–September 2020 as organized counter-campaigns targeted key movement organizations on multiple platforms simultaneously.
35.6 The Radicalization Pipeline
Among the most consequential questions in protest analytics is the relationship between online political communities and real-world political violence. The "radicalization pipeline" concept describes pathways through which individuals move from mainstream political engagement to extremist views to potential willingness to participate in or support violent action.
35.6.1 The Pipeline Concept and Its Limits
The pipeline model suggests a sequential process: individuals begin in politically moderate online spaces, are algorithmically or socially directed toward progressively more extreme content, and eventually arrive at communities where violent action is normalized and encouraged. This model has been applied to far-right radicalization (from mainstream conservatism → Fox News → MAGA forums → 4chan/8chan → QAnon/Proud Boys) and, with less evidence, to left-wing radicalization.
The pipeline metaphor has been criticized for several reasons:
Most people don't complete the journey. Millions of people consume extreme content online without ever acting on it. The pipeline model overestimates the behavioral consequences of content exposure.
It de-emphasizes agency. The pipeline framing presents individuals as passive recipients of algorithmic manipulation rather than active choosers of content that affirms preexisting beliefs and identities. Research by Brendan Nyhan and others shows that algorithmic recommendation is less important than individual active choice in explaining extreme content exposure.
It conflates media consumption with action. Consuming extremist content and participating in political violence are very different behaviors with very different determinants. The analytical challenge is identifying which environmental and individual factors predict the step from consumption to action — and research suggests social ties, economic strain, mental health, and perceived grievances matter more than content consumption per se.
35.6.2 Evidence on Online-to-Violence Translation
The January 6, 2021 Capitol attack provides the most studied case of online mobilization translating into political violence. Research by Cynthia Miller-Idriss and others shows:
- Participants were not primarily "extreme" individuals with long radicalization histories. Many were otherwise ordinary citizens mobilized by the specific rhetoric of the "stolen election" framing.
- Social ties were critical: many participants traveled in groups, attended with family members, or were connected to specific organizations (Proud Boys, Oath Keepers, Three Percenters) that provided both social pressure and organizational logistics.
- Online platforms served primarily a coordination function (assembling people in a specific time and place) rather than an ideological conversion function.
The implication for protest analytics: measuring online activity is not the same as measuring mobilization potential. The relationship between online political community activity and real-world collective action (violent or nonviolent) is mediated by social ties, organizational infrastructure, and specific mobilizing events — not primarily by ideological content exposure.
⚖️ Ethical Analysis: Protest Surveillance and Counter-Extremism Analytics The same analytical tools used by Sam Harding at ODA to understand social movements are also used by law enforcement and intelligence agencies to monitor and preemptively disrupt political organizing. Social network analysis of activist communities, protest event prediction models, and online mobilization monitoring systems have been used by DHS, the FBI, and local police departments — sometimes targeting constitutionally protected political activity. The Standing Rock pipeline protests in 2016 were monitored by private intelligence contractors using protest analytics tools. Black Lives Matter organizers have documented extensive law enforcement surveillance. The data tools of protest analytics are not politically neutral: in unequal power relationships, they tend to be turned toward monitoring the less powerful. This is a fundamental ethical constraint that must inform how protest analytics data is collected, stored, and made available.
35.7 Climate Movement Analytics
The climate movement offers a data-rich case for applying protest analytics frameworks, with the additional complexity of a global movement with multiple competing organizational poles.
35.7.1 The Global Climate Strike Wave
The Fridays for Future (FFF) movement, initiated by Greta Thunberg's school strikes beginning in 2018, produced the largest globally coordinated protest wave on a single issue in history. The September 2019 Global Climate Strike, a week of action running from September 20 to 27, saw approximately 7.6 million participants across 185 countries, according to FFF's own count. That figure cannot be independently verified, but it is supported by protest data from multiple regional sources.
Applying PEA methodology to the FFF data reveals several interesting patterns:
The power of a simple, replicable frame. FFF's core action (school strike, every Friday) was extraordinarily easy to replicate without organizational infrastructure. Any individual with a hand-lettered sign could participate in the "movement" by sitting outside their school on a Friday afternoon. This enabled extremely rapid geographic spread — from Sweden to virtually every country in the world within 18 months — but also produced enormous variation in what participants were actually demanding and doing.
The bifurcation between Global North and Global South movements. FFF's Northern chapters (Sweden, Germany, US, UK) emphasized fossil fuel divestment, emissions targets, and international climate agreements. Southern chapters (Philippines, Uganda, India, Brazil) increasingly emphasized climate justice, loss and damage compensation, and the historical responsibility of wealthy nations. These are related but distinct agendas, and protest data that aggregates all FFF activity treats them as a single movement when they are better understood as a loosely coupled coalition.
35.7.2 Sunrise Movement and the Policy Pivot
The Sunrise Movement's approach illustrates a different analytical pattern: a US-focused organization explicitly designed around influencing the Democratic Party rather than staging public protest per se. Sunrise's framing — the Green New Deal as comprehensive economic and environmental policy — targeted specific institutional actors (Democratic congressional leadership, presidential candidates) rather than diffuse public opinion.
Sunrise's data footprint looks different from mass-mobilization protest data: fewer large public events, more targeted lobbying, congressional office occupations, and candidate endorsements. Standard protest event analysis, focused on public demonstrations, significantly undercounts Sunrise's actual political activity.
📊 Real-World Application: ODA's Climate Tracking Sam Harding's climate movement tracking work for ODA uses a multi-source approach that distinguishes five types of climate movement activity: (1) large public demonstrations, (2) civil disobedience arrests, (3) legislative and regulatory interventions (testimony, lobbying), (4) institutional targeting campaigns (divestment campaigns, board seat contests), and (5) legal actions (climate liability litigation). Standard protest data captures primarily types 1 and 2; ODA's expanded framework captures the full portfolio. The policy impact of climate movements, Sam argues, has come disproportionately from types 3–5, which are systematically undercounted in most protest datasets.
35.8 The Methodological Frontier
Protest analytics is a rapidly evolving field, with several important methodological developments underway.
Computer vision for crowd estimation. Research teams have developed neural network models that estimate crowd sizes from aerial and satellite imagery, addressing one of the most intractable problems in protest data: crowd size counts by event organizers (who overestimate) and police (who often underestimate) are both unreliable. Drone footage and Google Maps satellite imagery have been used in experimental crowd estimation work with promising results.
Event stream processing. Rather than building a protest database from media archives retrospectively, real-time event stream processing (using platforms like Kafka or AWS Kinesis) can detect protest events as they emerge from social media feeds, enabling near-real-time tracking. Sam Harding's 2020 tracker used a simplified version of this approach.
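The detection step in such a pipeline can be sketched in a few lines. This is a minimal illustration, assuming posts have already been filtered for protest keywords and tagged with a location upstream, and that they arrive in timestamp order. The class name `BurstDetector`, the one-hour window, the threshold of three posts, and the toy stream are all hypothetical; this is not a description of Sam Harding's tracker or of any Kafka or Kinesis API.

```python
from collections import deque
from datetime import datetime, timedelta


class BurstDetector:
    """Flag a location when posts inside a sliding time window reach a threshold."""

    def __init__(self, window=timedelta(hours=1), threshold=3):
        self.window = window        # assumed window length
        self.threshold = threshold  # assumed alert threshold
        self.recent = {}            # location -> deque of timestamps still in the window

    def observe(self, location, timestamp):
        """Record one post; return True if the location now meets the threshold."""
        q = self.recent.setdefault(location, deque())
        q.append(timestamp)
        # Drop timestamps that have aged out of the window.
        while q and timestamp - q[0] > self.window:
            q.popleft()
        return len(q) >= self.threshold


# Hypothetical stream of (location, timestamp) pairs, in arrival order.
t0 = datetime(2020, 1, 1, 18, 0)
stream = [
    ("city_a", t0),
    ("city_a", t0 + timedelta(minutes=10)),
    ("city_b", t0 + timedelta(minutes=15)),
    ("city_a", t0 + timedelta(minutes=20)),
    ("city_b", t0 + timedelta(hours=3)),
]

detector = BurstDetector()
alerts = [(loc, ts) for loc, ts in stream if detector.observe(loc, ts)]
print(alerts)
```

In a production setting the list would be replaced by a stream consumer, and an alert would trigger human verification rather than being written directly to the event database.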
Multinomial event classification. Standard PEA codes events as "protest" or "not protest," but the repertoire of collective action is actually a multi-dimensional space (march, rally, vigil, strike, blockade, occupation, riot, etc.). Multinomial classification models trained on labeled data can code events into fine-grained repertoire categories that capture analytically meaningful distinctions.
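One simple way to illustrate repertoire coding is a multinomial naive Bayes text classifier. The sketch below is written from scratch in plain Python so that it is self-contained; the training sentences, the label set, and the function names `train` and `classify` are hypothetical, and a real project would use far more labeled data and most likely an established machine learning library.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Fit a multinomial naive Bayes model from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log posterior for the given text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-labeled event descriptions.
training = [
    ("protesters marched down the avenue to city hall", "march"),
    ("thousands marched through downtown", "march"),
    ("mourners held a candlelight vigil", "vigil"),
    ("candlelight vigil held outside the church", "vigil"),
    ("workers walked off the job in a strike", "strike"),
    ("union workers began a strike at the plant", "strike"),
    ("activists blocked the highway with a blockade", "blockade"),
    ("a blockade of the port entrance stopped trucks", "blockade"),
]

model = train(training)
print(classify(model, "students held a candlelight vigil on campus"))
print(classify(model, "dockworkers began a strike"))
```

The point of the example is the output space: the model chooses among several repertoire categories rather than answering a yes/no "is this a protest" question.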
The API access crisis. Following Twitter's 2023 API access changes and similar restrictions by other platforms, protest data analysts have lost access to the data streams that made social-media-based protest tracking feasible. ODA and similar organizations are actively building alternative data pipelines, but the field is experiencing a significant methodological disruption that has not been fully resolved.
35.8a The History and Limitations of Protest Event Analysis as a Method
Having surveyed recent methodological developments, it is worth stepping back to understand protest event analysis as a scholarly tradition: how it was developed, what debates it generated, and what constraints remain fundamental to the approach rather than incidental to particular implementations.
35.9.1 PEA's Scholarly Origins
Protest event analysis as a systematic methodology was developed primarily by Charles Tilly and his collaborators in the 1960s and 1970s, as part of a broader effort to bring quantitative rigor to the study of collective action. Tilly's foundational research on French contentious politics — eventually published as The Contentious French (1986) — involved coding events from newspaper archives spanning centuries, enabling the first truly longitudinal analysis of how repertoires of collective action change over historical time.
The ambition was transformative. Rather than studying individual social movements as case studies, PEA promised to make social movement research cumulative and comparative: if researchers collected data on protest events using consistent coding rules, findings could be aggregated, compared across countries and time periods, and ultimately contribute to genuine theoretical generalizations about collective action. The approach reflected a broader movement in political science toward quantification and cross-national comparison.
The critical scholarly debate that followed PEA's development focused on the validity of newspaper archives as a data source. Tilly himself acknowledged the problem explicitly, devoting substantial methodological discussion to what he called "reactivity" — the way that newspapers and other archival sources respond to, and therefore partly construct, the collective action they record. A protest that does not appear in any newspaper record is, for PEA purposes, as if it never happened. But the absence is about media coverage decisions, not about whether the protest occurred.
Several major validation studies have tested the extent of coverage bias. William Gamson and Emilie Schmeidler's classic 1984 study compared newspaper-based event counts with police records for the same set of events, finding that newspaper coverage captured only 60–70% of events that appeared in police records — and that the uncovered events were not randomly distributed. They were concentrated among smaller events, minority-community events, and events without dramatic incidents. Research by Davin Phoenix in the 2010s extended this work to the racial dimension specifically, finding consistent undercoverage of Black-led protest relative to white-led protest of equivalent size and type.
35.9.2 The Automation Problem and Its Consequences
The development of automated text analysis — first keyword-based event extraction, then machine learning classifiers — dramatically expanded the scale at which PEA could be conducted. GDELT, the most ambitious attempt, codes events from global news media in near-real-time. This scale comes with costs that have been documented in subsequent validation research.
False positive rates in automated coding are substantially higher than in human coding. GDELT's event classification system produces spurious protest event records from articles about protest anniversaries, historical references to past events, and opinion pieces discussing protest in the abstract. Studies comparing GDELT protest counts to manually coded data for the same time periods and geographies have found false positive rates of 20–40% in some contexts, with rates higher in lower-resource languages where the underlying NLP models are weaker.
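The kind of validation comparison described here reduces to set arithmetic once events from both sources share a common key. The sketch below assumes events can be matched on a (date, city) tuple, which is itself a simplification, since real validation work has to resolve duplicate reports and inconsistent place names first. The function name `validate` and all records are hypothetical.

```python
def validate(automated, hand_coded):
    """Compare automated event records against a hand-coded reference set.

    Both inputs are iterables of hashable event keys, here (date, city) tuples.
    """
    automated, hand_coded = set(automated), set(hand_coded)
    matched = automated & hand_coded
    spurious = automated - hand_coded  # machine-coded, absent from the reference
    missed = hand_coded - automated    # in the reference, never machine-coded
    return {
        # Share of automated records with no counterpart in the reference set.
        "false_positive_share": len(spurious) / len(automated),
        # Share of reference events that the automated system recovered.
        "recall": len(matched) / len(hand_coded),
        "spurious": sorted(spurious),
        "missed": sorted(missed),
    }


# Hypothetical records for one week in one region.
automated_events = [
    ("2020-06-01", "city_a"),
    ("2020-06-01", "city_b"),
    ("2020-06-02", "city_a"),
    ("2020-06-03", "city_c"),  # e.g. an anniversary article miscoded as an event
    ("2020-06-04", "city_a"),  # e.g. an opinion piece miscoded as an event
]
hand_coded_events = [
    ("2020-06-01", "city_a"),
    ("2020-06-01", "city_b"),
    ("2020-06-02", "city_a"),
    ("2020-06-02", "city_d"),  # a small event the automated system never picked up
]

report = validate(automated_events, hand_coded_events)
print(report["false_positive_share"], report["recall"])
```

In this toy data two of the five automated records are spurious, so the false positive share is 0.4, and three of the four reference events were recovered, so recall is 0.75. Reporting both numbers matters because an automated dataset can over-count and under-count at the same time.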
Geographic bias compounds media bias. Automated systems that code from English-language news media inherit English media's geographic coverage priorities. Events in the United States, United Kingdom, and Anglophone Sub-Saharan Africa receive disproportionate coding relative to equivalent events in Latin America, East Asia, or non-Anglophone Europe. For research questions about global protest patterns, this geographic bias is a fundamental limit on what automated global datasets can support.
The temporal granularity problem. Newspaper archives, the primary historical PEA source, vary in their reporting cadences. Weekly newspapers, daily newspapers, and wire service feeds all have different lag structures between event occurrence and publication. Research that aggregates events at the weekly or monthly level may be less sensitive to this problem; research that depends on day-level event timing (for example, to study how governments respond to protest within 24–48 hours) faces serious data quality issues because the reported event date may reflect publication date rather than event date.
35.9.3 What PEA Can and Cannot Answer
Despite these limitations, PEA remains the best available approach for several important research questions — and is genuinely inadequate for others.
Where PEA has produced reliable findings: Long-run trends in protest frequency and repertoire; comparative analysis of protest across countries with similar media systems; the relationship between large, newsworthy protest events and policy change; the geographic spread of protest waves across regions with comparable coverage.
Where PEA findings require strong caveats: Comparisons between groups with different media visibility (racial minority vs. majority, rural vs. urban, left vs. right in contexts with politically biased media); precise size estimates for individual events; research on sustained, low-profile organizing that does not generate newsworthy events; research in countries or time periods with limited media infrastructure.
Where PEA is fundamentally inadequate: Study of online-only mobilization, private organizing through encrypted channels, and the pre-public phases of movement formation before any newsworthy event occurs. These dimensions of contemporary collective action require methodological approaches that go beyond PEA.
35.8b Social Network Analysis of Activist Communities: A Deeper Treatment
The NetworkX preview in section 35.5 introduced the basic concepts of social network analysis applied to activist communities. This section develops the methodology further, examining what SNA can and cannot reveal about movement dynamics and what data sources make real-world activist network analysis possible.
35.10.1 Data Sources for Movement Network Analysis
The fundamental challenge of activist community SNA is data collection. Unlike corporate networks (where organizational ties are often a matter of public record) or citation networks (where connections are explicitly documented), activist communities often maintain deliberately informal and opaque organizational structures — particularly when operating in repressive environments or under law enforcement surveillance.
Twitter/X network data was the dominant source for activist network analysis from approximately 2010–2023, when the platform allowed academic researchers to collect follower, follow, and mention network data through its API. The methodology: identify a set of seed accounts known to be movement-affiliated (based on public self-identification, organizational affiliation, or prior research), then collect their follower/following networks and the follower/following networks of their followers (two-degree expansion), and use that expanded network as the basis for community detection and centrality analysis.
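The seed-and-expand procedure is a depth-limited breadth-first search. The sketch below assumes a caller-supplied function `get_neighbors` that returns the accounts connected to a given account; in historical practice that function would have wrapped API calls, but here it reads from a small hypothetical dictionary. The function name `expand_from_seeds` and the account names are illustrative.

```python
from collections import deque


def expand_from_seeds(seeds, get_neighbors, max_depth=2):
    """Breadth-first expansion; return {account: depth at which it was first reached}."""
    depth = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        account = queue.popleft()
        if depth[account] == max_depth:
            continue  # do not expand beyond the chosen number of degrees
        for neighbor in get_neighbors(account):
            if neighbor not in depth:
                depth[neighbor] = depth[account] + 1
                queue.append(neighbor)
    return depth


# Hypothetical follower/following data standing in for collected API results.
toy_connections = {
    "seed_org": ["chapter_a", "chapter_b"],
    "chapter_a": ["member_1", "seed_org"],
    "chapter_b": ["member_2"],
    "member_1": ["outsider"],
}

sample = expand_from_seeds(["seed_org"], lambda a: toy_connections.get(a, []))
print(sample)
```

With `max_depth=2` the account `outsider`, three steps from the seed, is excluded. The choice of seeds and depth determines the network boundary, so both should be reported alongside any findings.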
The strength of this approach is that Twitter networks reflect actual, self-selected connections between accounts — a genuine measure of relational structure. The limitation: Twitter following and Twitter mentions are proxies for actual coordination and information exchange, not direct measures of those processes. Two accounts that follow each other may never have communicated; two accounts that communicate extensively through private messages are invisible to network analysis based on public connections.
Hyperlink networks offer an alternative based on organizational websites. When a social movement organization's website links to other organizations' websites, that link can be treated as an organizational tie. Hyperlink network analysis of climate movement organizations, for instance, has been used to map the structure of the international climate movement and identify bridging organizations that connect otherwise separate national or issue-specific clusters.
Co-occurrence in event data is a less rich but more accessible approach: if multiple organizations co-organize a protest event, they are connected in the network. Coding organizational co-sponsorship from protest event records produces an organizational co-occurrence network that captures operational coordination (organizations that work together) rather than just communication ties (organizations that follow each other online).
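Building a co-sponsorship network from event records can be sketched as follows, assuming each event record has already been reduced to a list of organizer names. The function name `cosponsorship_edges` and the organization names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cosponsorship_edges(events):
    """Count, for each pair of organizations, how many events they co-organized."""
    weights = Counter()
    for organizers in events:
        # sorted(set(...)) removes duplicates and gives each pair a canonical order.
        for a, b in combinations(sorted(set(organizers)), 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical organizer lists, one per coded protest event.
events = [
    ["org_a", "org_b", "org_c"],
    ["org_a", "org_b"],
    ["org_c"],              # a solo event contributes no edges
    ["org_b", "org_d"],
]

edges = cosponsorship_edges(events)
print(edges)
```

The resulting weights can be loaded into any graph library as a weighted undirected network. Edge weight here measures repeated operational cooperation, which is the property that distinguishes this network from a follower graph.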
35.10.2 Community Detection in Movement Networks
Once a movement network is constructed, community detection algorithms can identify subgroups — clusters of nodes that are more densely connected internally than they are to the rest of the network. In movement research, these communities often correspond to meaningful organizational divisions: ideological factions, geographic clusters, issue-based coalitions, or organizational generations (founding organizations vs. later entrants).
The Louvain algorithm and the Girvan-Newman algorithm are the two most commonly applied community detection methods in movement network research. Both rely on network modularity, a measure of how much denser within-cluster connections are than would be expected by chance. Louvain optimizes modularity directly through greedy local moves; Girvan-Newman repeatedly removes the edge with the highest betweenness, and modularity is typically used to decide where to cut the resulting hierarchy. High modularity indicates strong community structure; low modularity indicates a relatively homogeneous network without clear subgroup divisions.
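To make the modularity score concrete, the sketch below computes it for an undirected, unweighted graph using the standard form Q = Σ_c [L_c / m − (d_c / 2m)²], where m is the number of edges, L_c the number of edges inside community c, and d_c the summed degree of its members. The function name and the toy graph (two triangles joined by one bridging edge) are illustrative. NetworkX ships its own modularity and Louvain routines in its community module, which would be the practical choice for real data.

```python
def modularity(edges, communities):
    """Newman modularity for an undirected, unweighted graph without self-loops."""
    m = len(edges)
    community_of = {
        node: index
        for index, members in enumerate(communities)
        for node in members
    }
    internal = [0] * len(communities)    # L_c: edges with both ends in community c
    degree_sum = [0] * len(communities)  # d_c: total degree of nodes in community c
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(l / m - (d / (2 * m)) ** 2 for l, d in zip(internal, degree_sum))


# Toy graph: two triangles connected by a single bridge edge (c-d).
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]

q_split = modularity(toy_edges, [{"a", "b", "c"}, {"d", "e", "f"}])
q_single = modularity(toy_edges, [{"a", "b", "c", "d", "e", "f"}])
print(q_split, q_single)
```

Splitting the graph into its two triangles gives Q = 5/14 (about 0.357), while treating the whole graph as one community gives Q = 0. A community detection algorithm searching for high modularity would therefore prefer the two-cluster partition.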
Sam Harding's 2020 BLM network analysis used Louvain community detection on the Twitter follower network of accounts identified as BLM-affiliated, identifying seven distinct communities: a national organizing cluster (Movement for Black Lives and affiliated organizations), local chapter clusters (geographically defined), a media and commentary cluster (journalists and commentators covering BLM), a mutual aid cluster (organizations focused on direct community support), a coalition cluster (organizations whose primary mission was not racial justice but who had become BLM allies), an academic and research cluster, and a counter-movement cluster (accounts that had entered the network perimeter through conflict rather than alignment).
This community structure was analytically significant: the relative density of connections between the national organizing cluster and local chapter clusters revealed which local chapters were most integrated into the national movement versus which were operating more independently. Chapters with low connectivity to the national cluster showed different tactical repertoires, less organizational support, and higher rates of burnout and dissolution after the 2020 protest wave peaked.
35.10.3 Network Vulnerability and Resilience
The practical implications of network structure for movement strategy deserve explicit attention. SNA is not only a descriptive research tool — it reveals structural vulnerabilities that both movement strategists and law enforcement analysts have used.
High-betweenness vulnerability: A movement network with two or three nodes carrying extremely high betweenness centrality — bridging most of the connections between different parts of the network — is structurally vulnerable to disruption through targeted removal of those nodes. "Node removal" can mean: arrest of key organizers (legal system as disruption mechanism), sustained harassment campaigns (social media pile-ons, doxxing, threats that force activists to withdraw from public-facing roles), organizational defunding (philanthropic withdrawal from key bridging organizations), or platform banning (removal of high-betweenness accounts from social media platforms used as organizing infrastructure).
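The structural effect of node removal can be illustrated by measuring how much of the network stays connected after a node is taken out. This is a minimal sketch on a hypothetical seven-node network in which one bridging organization links three local clusters; the function name `largest_component_share` and the node names are illustrative, and real movement networks are far larger and rarely this clean.

```python
def largest_component_share(adjacency, removed=()):
    """Share of remaining nodes that sit in the largest connected component."""
    removed = set(removed)
    remaining = [node for node in adjacency if node not in removed]
    seen, largest = set(), 0
    for start in remaining:
        if start in seen:
            continue
        # Depth-first search over nodes that have not been removed.
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in removed and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        largest = max(largest, size)
    return largest / len(remaining) if remaining else 0.0


# Hypothetical network: one bridge organization connecting three local clusters.
network = {
    "bridge": ["a1", "b1", "c1"],
    "a1": ["bridge", "a2"], "a2": ["a1"],
    "b1": ["bridge", "b2"], "b2": ["b1"],
    "c1": ["bridge", "c2"], "c2": ["c1"],
}

intact = largest_component_share(network)
without_bridge = largest_component_share(network, removed=["bridge"])
without_peripheral = largest_component_share(network, removed=["a2"])
print(intact, without_bridge, without_peripheral)
```

Removing the bridge leaves three disconnected pairs, so the largest component falls from the whole network to one third of the remaining nodes. Removing a peripheral node leaves the rest fully connected. The same calculation serves both a movement strategist assessing resilience and an adversary looking for targets, which is why the ethical questions raised elsewhere in this chapter apply to it directly.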
Resilience through decentralization: The historical comparison between the US civil rights movement's hierarchical structure (SCLC with centralized charismatic leadership, NAACP with formal membership structures) and the decentralized structure of the 2020 BLM network is analytically interesting. Centralized structures can execute coordinated campaigns more effectively but are more vulnerable to decapitation strategies; decentralized networks are more resilient to disruption but may have difficulty achieving the sustained coordination that durable policy change requires.
The cell structure parallel: Research on both historical labor organizing and contemporary online activist communities shows a tendency toward modular network structures — loosely coupled clusters that each have internal coordination capacity but maintain limited inter-cluster connectivity except through specific bridging nodes. This structure is organizationally analogous to cell-based clandestine organization: if one cell is compromised, the others remain operational. For above-ground social movements, modular structure protects against organizational collapse following leader arrests or organizational defeats, though it complicates coalition-building and unified messaging.
35.8c Platform Transformations and Movement Organizing: Arab Spring Through Black Lives Matter
The relationship between digital platforms and social movement organizing has evolved substantially over the past fifteen years. A chronological review reveals both recurring patterns and important changes across movement waves.
35.11.1 The Arab Spring and the Techno-Optimist Hypothesis
The 2010–2012 Arab Spring protests — Tunisia, Egypt, Libya, Syria, Bahrain, and beyond — were interpreted at the time through an intensely techno-optimist lens: "Facebook revolutions," "Twitter revolutions," movements enabled and perhaps caused by social media platforms that connected activists, spread information, and coordinated action across repressive states.
Philip Howard and Muzammil Hussain's research provided empirical support for the social media connection: they showed that spikes in Twitter activity about Arab political topics preceded the spread of protest from country to country, and that social media usage in a country was associated with higher subsequent protest intensity. The mechanism they proposed: social media enabled the rapid diffusion of information about what was happening in neighboring countries, making visible that effective challenge to authoritarian governments was possible — lowering the psychological barriers that had previously made mass protest feel futile in many of these contexts.
The techno-optimist interpretation was subsequently critiqued on multiple grounds:
- Selection effects: Countries where Twitter was heavily used were also countries with larger urban middle classes, stronger civil society traditions, and more experienced activist communities, any of which might independently explain higher protest capacity.
- Means-ends confusion: Social media may have accelerated the spread of protests that would have occurred (perhaps more slowly) through other information channels, without being causally necessary for any given protest.
- Survivorship bias: Researchers studied the Arab Spring countries where protest occurred and found social media usage was high; they paid less attention to the many countries with comparable social media usage where protest did not occur.
Most importantly, the Arab Spring produced few durable political outcomes. Egypt's revolution brought a brief democratic transition, then a military coup and authoritarian restoration more repressive than the Mubarak regime it replaced. Syria descended into catastrophic civil war. Bahrain's protests were suppressed with external military assistance. Only Tunisia produced a genuine democratic transition — and even that eventually reversed in 2021. The movements that generated the most spectacular short-term social media mobilization proved organizationally fragile when confronted with sustained authoritarian resistance.
35.11.2 Occupy and the Limits of Horizontalism
The Occupy Wall Street movement (2011–2012) provides a crucial case for understanding how platform-enabled organizing interacts with movement strategy and organizational form. Occupy's combination of viral social media spread (the "We Are the 99%" frame was one of the most shared political messages of 2011) and radical organizational horizontalism (no leaders, no demands, consensus decision-making) illustrates both the potential and the limits of social media as organizing infrastructure.
Occupy's social media footprint was enormous: hundreds of Occupy encampments in cities worldwide, coordinated through Twitter hashtags, Facebook groups, and livestreaming video. The movement generated massive earned media coverage and produced a genuine shift in public discourse about economic inequality; polling in 2011 showed marked increases in public attention to the 1%/99% framing.
What Occupy did not produce was durable organizational capacity or policy change. Without organizational structure, the movement had no mechanism for translating mobilization into political demands that institutional actors could respond to. Without leadership, the movement could not make strategic decisions about when to escalate, when to negotiate, or when to accept partial victories. The "general assembly" governance model, while internally democratic, was too slow and too easily captured by disruptive participants to enable strategic adaptation.
The comparison between Occupy and the Tea Party movement is instructive. The Tea Party emerged in the same period and used many of the same social media tools, but it combined digital organizing with rapid organizational institutionalization through Republican Party primary structures. Despite its grassroots origins, it quickly developed organizational capacity that translated digital mobilization into electoral outcomes (2010 midterms). Occupy, despite comparable or greater digital mobilization, did not.
35.11.3 Black Lives Matter: The Platform as Infrastructure and Threat
The Black Lives Matter movement's evolution across three phases — the first emergence (2014 after Ferguson), the between-cycles organizing (2015–2019), and the 2020 uprising — provides the most detailed case study of how digital platforms have transformed contemporary movement organizing.
2014: Twitter as movement infrastructure. The phrase "Black Lives Matter" originated in a Facebook post by Alicia Garza after George Zimmerman's 2013 acquittal in the killing of Trayvon Martin; Patrisse Cullors turned it into a hashtag, and Opal Tometi helped build its online presence. It became nationally prominent after the Ferguson uprising in 2014. Twitter served multiple simultaneous functions: real-time information dissemination (what was happening in Ferguson as it happened), organizational coordination (where to meet, who was organizing), frame propagation (the specific language of BLM spreading across activist networks), and documentation (citizen journalism through tweeted photos and video creating a record that challenged official police accounts).
2015–2019: Organizational development and platform risk. The between-surge years saw both organizational development (the Movement for Black Lives' policy platform, the Black Lives Matter Global Network Foundation's institutionalization) and a growing recognition of platform dependency risk. After Facebook's algorithm changes in 2015–2016 dramatically reduced organic reach for movement organizations' posts, many BLM-affiliated organizations began building platform-independent infrastructure: mailing lists, text message programs, organizational websites, and offline organizing capacity.
2020: The multi-platform moment. The 2020 uprising was distinctive in its use of multiple platforms simultaneously: Twitter for journalist and activist coordination, Instagram for visual documentation and broader youth audience, TikTok for frame spread to the youngest activists, Facebook for local event organization and older community organizing, and encrypted apps (Signal, WhatsApp) for sensitive operational coordination. The movement's geographic spread — to all 50 states and 60+ countries within a week — was enabled by this multi-platform infrastructure existing and being actively used before May 2020. The organizational investment of 2015–2019 paid off.
The platform risks materialized in 2020 as well: accounts suspended, content moderated, algorithm changes that reduced movement content reach after initial amplification, and extensive government and private surveillance of platform activity. The Standing Rock precedent of private contractors using social media monitoring for protest surveillance was repeated and expanded in 2020. Platform companies' content moderation policies — designed primarily for commercial speech norms — were applied inconsistently to political organizing content in ways that disproportionately affected minority-community activists.
35.8d The Ethics of Researching Social Movements
The methodological tools of protest analytics — network analysis, location data, social media monitoring, protest event databases — are powerful precisely because they make activist community structure visible. This power creates ethical obligations that researchers must take seriously.
35.12.1 The Surveillance Dimension
Social movement research inherently involves collecting information about individuals engaged in constitutionally protected political activity. The same dataset that a researcher uses to study movement network structure could be used by law enforcement to identify movement leaders for prosecution, by employers to screen job applicants for political activism, by opposition organizations to target activists for harassment, or by authoritarian governments to surveil diaspora communities.
This is not a hypothetical risk. FBI documents obtained through FOIA requests have shown law enforcement use of open-source social media analysis to monitor BLM organizers, Standing Rock protesters, and immigrant rights activists using methods that overlap substantially with academic protest analytics. The Department of Homeland Security's social media monitoring programs have collected data on journalists and activists under programs whose legal authority was contested. Private intelligence contractors have sold protest analytics tools to law enforcement agencies with minimal oversight.
Researchers using the same methodological toolkit as law enforcement surveillance cannot credibly claim their work is neutral. The analytical tools that make social movement research rigorous also make it potentially harmful when data or findings reach actors who would use them to suppress movement activity.
35.12.2 IRB Considerations and Their Limits
Institutional Review Boards (IRBs) are the formal mechanism through which US academic institutions review research involving human subjects for ethical compliance. For protest analytics research, key IRB considerations include:
Public vs. private information. IRB regulations distinguish between research involving "publicly available" information (generally exempt from review) and research involving private information collected without participants' consent. Social media posts can fall into either category depending on privacy settings — but the public/private distinction is analytically contested for political activism contexts, where individuals may post publicly without anticipating that their posts will be systematically collected, analyzed, and archived as research data.
Minimal risk standard. For research to qualify as minimal risk (and thus receive expedited or exempt review), it must pose no more than minimal risk to participants. Protest analytics research poses potentially significant risks — identification of movement participants, exposure of organizational structure, creation of searchable databases linking individuals to political activity — that may exceed the minimal risk standard even when individual data is anonymized, because de-anonymization is increasingly feasible with even small amounts of contextual information.
The community consent gap. Standard IRB protocols involve consent from individual research participants. Social movement network analysis may identify, analyze, and publish about individual activists without their individual consent being feasible to obtain — the network has thousands of nodes, many of whom would not be identifiable to the researcher before analysis. Community-level consent processes (consulting with movement organizations before conducting research about their communities) represent an alternative that IRBs are beginning to recognize but that requires researchers to proactively develop community relationships rather than treating movements as passive data sources.
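The de-anonymization risk raised under the minimal risk standard can be made concrete with a simple uniqueness check. The sketch below is illustrative only: the records, field names, and helper functions are hypothetical, and it assumes a dataset from which names have already been removed. It counts how many records share each combination of contextual attributes (quasi-identifiers); any record that is alone in its group could plausibly be re-identified by someone who knows those attributes.

```python
from collections import Counter

def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def smallest_group_size(records, quasi_identifiers):
    """Return k in the k-anonymity sense: the size of the smallest group."""
    if not records:
        raise ValueError("records must not be empty")
    return min(group_sizes(records, quasi_identifiers).values())

def unique_fraction(records, quasi_identifiers):
    """Fraction of records that are the only member of their group."""
    sizes = group_sizes(records, quasi_identifiers)
    singletons = sum(
        1 for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] == 1
    )
    return singletons / len(records)

# Hypothetical "anonymized" records: names removed, context retained.
records = [
    {"city": "Minneapolis", "age_band": "18-24", "org": "student group"},
    {"city": "Minneapolis", "age_band": "18-24", "org": "student group"},
    {"city": "Minneapolis", "age_band": "25-34", "org": "tenant union"},
    {"city": "Duluth", "age_band": "18-24", "org": "student group"},
    {"city": "Duluth", "age_band": "45-54", "org": "faith coalition"},
]

k_city = smallest_group_size(records, ["city"])
k_all = smallest_group_size(records, ["city", "age_band", "org"])
frac_unique = unique_fraction(records, ["city", "age_band", "org"])
```

With city alone, every record shares its value with at least one other record. Once age band and organization type are added, three of the five records become unique. This is the sense in which small amounts of contextual information can undo anonymization.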
35.12.3 Sam Harding's Ethical Framework
Sam Harding at ODA has developed a working ethical framework for protest analytics that shapes ODA's research practices:
Purpose primacy. Before collecting any data about activist communities, Sam articulates the research purpose and asks: would this analysis primarily benefit public understanding or movement accountability, or would it primarily benefit actors who seek to monitor and constrain movement activity? Research that primarily serves the former receives priority; research that primarily serves the latter receives higher ethical scrutiny and, in some cases, is declined.
Data minimization. ODA does not collect or retain more data about individual activists than is necessary for the stated research purpose. Protest event analysis requires knowing that an event occurred, its location, and its approximate size — it does not require archiving the names and social media profiles of individual attendees. Where individual-level data is used for network analysis, ODA's practice is to publish findings at the aggregate community level rather than reporting on individual network positions.
Differential vulnerability. Sam applies heightened ethical scrutiny to research involving communities with higher vulnerability to state surveillance: undocumented immigrant activists, racial minority communities with documented law enforcement targeting, activists in authoritarian-adjacent environments, and youth activists whose participation could affect their educational and employment futures.
Research transparency. ODA publishes its methodology, including its limitations and the ways in which coverage bias shapes its findings. This transparency is partly intellectual honesty and partly ethical practice: it prevents findings from being cited with more confidence than the data supports, and it makes visible the ways in which the research reflects ODA's own decisions about what to measure and how.
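The data minimization principle above can be sketched in a few lines. This is a minimal illustration, not ODA's actual pipeline: the field names, the `minimize_event` and `community_summary` helpers, and the small-group suppression threshold (`min_size`) are all assumptions introduced for the example. The first function keeps only the event-level fields the text names (occurrence date, location, approximate size). The second reports network results as community sizes rather than individual positions.

```python
from collections import Counter

# Fields the text says protest event analysis needs; everything else is dropped.
EVENT_FIELDS = ("date", "location", "approx_size")

def minimize_event(raw_event):
    """Return a copy of the event containing only the required fields."""
    return {field: raw_event[field] for field in EVENT_FIELDS}

def community_summary(membership, min_size=3):
    """Summarize a node -> community mapping as community sizes only.

    Communities smaller than min_size are suppressed, since very small
    groups can point back to individuals (the threshold is an assumption).
    """
    sizes = Counter(membership.values())
    return {community: n for community, n in sizes.items() if n >= min_size}

# Hypothetical raw record containing more than the analysis needs.
raw = {
    "date": "2020-06-01",
    "location": "Minneapolis, MN",
    "approx_size": 500,
    "attendee_handles": ["@a", "@b"],
    "organizer_name": "placeholder",
}
event = minimize_event(raw)

# Hypothetical community assignments from a network analysis.
membership = {"n1": "A", "n2": "A", "n3": "A", "n4": "B", "n5": "B"}
summary = community_summary(membership)
```

Here `event` retains no attendee or organizer fields, and `summary` reports only community A, because community B falls below the illustrative threshold.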
⚖️ Ethical Analysis: Protest Surveillance and Counter-Extremism Analytics
The same analytical tools used by Sam Harding at ODA to understand social movements are also used by law enforcement and intelligence agencies to monitor and preemptively disrupt political organizing. Social network analysis of activist communities, protest event prediction models, and online mobilization monitoring systems have been used by DHS, the FBI, and local police departments, sometimes targeting constitutionally protected political activity. The Standing Rock pipeline protests in 2016 were monitored by private intelligence contractors using protest analytics tools. Black Lives Matter organizers have documented extensive law enforcement surveillance. The data tools of protest analytics are not politically neutral: in unequal power relationships, they tend to be turned toward monitoring the less powerful. This is a fundamental ethical constraint that must inform how protest analytics data is collected, stored, and made available. The researcher who develops more powerful protest analytics tools bears some responsibility for who uses those tools and to what ends, a responsibility that cannot be discharged simply by publishing in academic journals.
35.9 Chapter Summary
Social movements are among the most important forces shaping democratic politics, and protest analytics provides the empirical tools to study them systematically. Resource mobilization theory tells us to look at organizational resources; political opportunity structure theory tells us to look at the political environment; framing theory tells us to look at how movements construct and communicate their demands.
Protest event analysis (PEA), implemented through datasets like GDELT, ACLED, the Mass Mobilization Project, and the Crowd Counting Consortium, provides the data infrastructure for quantitative movement research — but all of these datasets embed systematic coverage biases that shape what "we know" about protest. Media coverage is not ground truth; it is a biased sample of actual collective action.
The "Who Gets Counted" theme runs through every dimension of protest analytics. Minority-community protests are undercovered. Rural protests are undercovered. Non-dramatic, legally conventional protest is undercovered. Protest organized through encrypted messaging channels is largely invisible to standard data collection. ODA's multi-source approach, developed by Sam Harding, represents a best practice for mitigating these biases, but it cannot eliminate them.
Social media has transformed mobilization dynamics, lowering the cost of assembling people at a specific moment while potentially reducing the organizational depth that sustained campaigns require. Network analysis provides tools for understanding the structural properties of activist communities that shape both their resilience and their vulnerability. And the radicalization pipeline, while real as a phenomenon, is more mediated by social ties and specific mobilizing events than by algorithmic content exposure alone.
35.9 Movements and Electoral Politics: The Interface Problem
Social movements and electoral campaigns are not the same thing, but they interact constantly — sometimes productively, sometimes destructively, and always in ways that require careful analytical attention.
35.9.1 The Inside-Outside Strategy
Social movements face a fundamental strategic choice about their relationship to electoral politics. The inside strategy involves working through parties and candidates — endorsing allied politicians, mobilizing movement-aligned voters, lobbying legislators, and accepting incremental policy change as the achievable goal within existing institutional constraints. The outside strategy involves maintaining movement independence from electoral politics — prioritizing mass mobilization, direct action, and prefigurative politics (modeling the social relations the movement seeks to create) over electoral engagement.
Most sophisticated movements pursue both simultaneously through different organizational components: a non-profit advocacy wing operates independently of elections (legally required for 501(c)(3) organizations), while affiliated voter mobilization organizations engage electoral politics through legally separated structures. The Sierra Club's 501(c)(3) educational wing, its 501(c)(4) advocacy wing, and its affiliated PAC represent three legally distinct organizational layers pursuing the same ultimate goals through different legal-strategic means.
Analytically, inside-outside strategy coordination creates a data interpretation challenge. When movement-aligned organizations increase activity before an election, is that evidence of genuine social movement mobilization or of strategic electoral ground operation dressed in movement clothing? Protest data, voter registration data, and campaign finance data each capture a different part of the picture; none alone is adequate.
35.9.2 Movement Backlash and the Countermobilization Problem
One of the most robust findings in social movement research is that successful movements generate countermovements. The civil rights movement generated resistance organizations; the women's movement generated anti-feminist mobilization; the LGBTQ rights movement generated religious liberty advocacy coalitions; Black Lives Matter generated "Blue Lives Matter" and "All Lives Matter" countermobilization.
Countermovements present a specific challenge for protest analytics: they use many of the same repertoires (marches, rallies, media campaigns) as the original movement, making comparative analysis methodologically difficult. When the Crowd Counting Consortium records both a BLM march and a police-support rally in the same city on the same day, what does the relative attendance tell us? The interpretive frame — which is the movement and which is the countermovement — significantly shapes the meaning we assign to the data.
The countermobilization problem also applies temporally: movements that achieve initial policy success may find that the backlash they generate exceeds their gains. Research by sociologist Susan Olzak and others shows that civil rights legislation in the US was followed by increased racial violence in some contexts, not because the legislation caused the violence but because it activated previously latent opposition. Protest analytics that tracks mobilization without tracking countermobilization captures only half the political ecology.
35.9.3 The Electoral Consequences of Protest
The relationship between protest and electoral outcomes is one of the most consequential and most contested questions in social movement research. Three patterns appear consistently in the evidence:
Protest mobilizes movement supporters electorally. Research consistently shows that protest activity increases voter registration, voter turnout, and party identification among people who participate in protests. The mobilization mechanism is both direct (campaigns use protest events for voter contact) and indirect (participation creates political identity and commitment that persists beyond any single event).
Violent protest activates opposition backlash. Omar Wasow's research on 1960s protests and 2020 protests finds consistent evidence that protest events associated with violence — whether initiated by protesters, police, or counter-protesters — generate media coverage framed around law-and-order concerns and activate conservative voters. The effect is asymmetric: nonviolent protest generally helps movement-aligned electoral candidates; violence-adjacent protest often hurts them.
The aggregate effect is contingent. Whether protest on balance helps or hurts movement-aligned electoral candidates depends heavily on context: the specific issue, the political competitiveness of affected districts, the media framing of protest events, and the counter-mobilization capacity of the opposition. No generalizable rule governs the protest-elections relationship.
ODA's electoral impact tracker attempts to link protest data to electoral outcomes by measuring: changes in voter registration rates in counties with significant protest activity, changes in turnout in subsequent elections for districts with above-average protest activity, and changes in party primary outcomes for candidates who actively engaged with or distanced themselves from movement activity.
35.9.4 Sam Harding's Electoral Analysis Caution
Sam Harding is careful about one specific claim that journalists and advocates frequently make: that large protest events "changed the election." The causal inference problems are severe. Large protests and electoral shifts often co-occur because they are both driven by underlying political conditions (high mobilization context, competitive race, galvanizing political event) rather than because protest caused the electoral outcome.
"The most I can honestly say about most protest-election relationships is: they co-occurred, and here's the pattern," Sam explains. "Establishing that the protest caused the electoral shift requires a counterfactual I can't observe — what would the election have looked like without the protest? The data doesn't answer that question directly."
This epistemic caution is not a counsel of analytical despair. Regression discontinuity designs (comparing otherwise similar places on either side of an arbitrary boundary or threshold that determines protest exposure), instrumental variable approaches (using plausibly exogenous shocks, such as bad weather on a planned protest day, as instruments for protest intensity), and difference-in-differences estimators (comparing the change in outcomes in places with protest against the change in places without) can all provide credible causal estimates in the right circumstances. Sam's point is that these approaches require careful design, that they are not always available, and that journalists should not report correlation as causation just because the co-occurrence fits a preferred narrative.
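To make the difference-in-differences logic concrete, the sketch below computes the simplest two-group, two-period estimate: the change in an outcome (here, turnout) in counties with protest minus the change in counties without. The data are invented for illustration, and `did_estimate` is a hypothetical helper, not part of ODA's tracker. The estimate has a causal interpretation only under the parallel trends assumption, namely that both groups would have changed by the same amount in the absence of protest.

```python
from statistics import mean

def did_estimate(rows):
    """Two-by-two difference-in-differences.

    rows: iterable of (treated, post, outcome) tuples, where treated marks
    counties with protest activity and post marks the later election.
    """
    rows = list(rows)

    def cell_mean(treated, post):
        values = [y for t, p, y in rows if t == treated and p == post]
        if not values:
            raise ValueError("every treated/post cell needs at least one row")
        return mean(values)

    treated_change = cell_mean(True, True) - cell_mean(True, False)
    control_change = cell_mean(False, True) - cell_mean(False, False)
    return treated_change - control_change

# Invented county turnout percentages, for illustration only.
rows = [
    (True, False, 60.0), (True, False, 62.0),    # protest counties, before
    (True, True, 66.0), (True, True, 68.0),      # protest counties, after
    (False, False, 58.0), (False, False, 60.0),  # comparison counties, before
    (False, True, 61.0), (False, True, 63.0),    # comparison counties, after
]
effect = did_estimate(rows)
```

In this toy example turnout rises by 6 points in protest counties and by 3 points in comparison counties, so the estimate is 3 points. A raw before-and-after comparison in protest counties alone would have attributed the full 6 points to protest, which is the overstatement Sam's caution is aimed at.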
Key Terms
- Resource mobilization theory: Framework emphasizing organizational resources (money, networks, leadership) as the key determinant of movement formation and success
- Political opportunity structure: The political environment's openness or closure to movement demands, including elite divisions, state capacity, and institutional permeability
- Framing theory: The study of how movements construct and communicate diagnostic, prognostic, and motivational frames
- Protest event analysis (PEA): Systematic data collection about individual protest events, typically from media sources, enabling large-scale quantitative analysis
- Coverage bias: The systematic underrepresentation of certain events in media-based protest datasets due to news media coverage decisions
- GDELT: Global Database of Events, Language, and Tone; automated protest data from global news media
- ACLED: Armed Conflict Location and Event Data Project; human-coded event data focused on conflict and demonstration
- Crowd Counting Consortium: US-focused, human-coded protest data with transparent methodology
- Betweenness centrality: A network measure identifying nodes that serve as bridges between different parts of the network
- Radicalization pipeline: The sequence by which individuals move from mainstream political engagement toward extremist views and potential violent action
Discussion Questions
- Resource mobilization theory and framing theory offer very different accounts of what makes movements succeed. Apply both frameworks to a movement of your choice: what does each explain that the other misses?
- The coverage bias problem means that protest datasets systematically underrepresent minority-community and rural protest. What are the political consequences of this underrepresentation for policy-makers and researchers who rely on these datasets?
- The "slacktivism" critique of social media organizing suggests that online participation often substitutes for offline action. But some research suggests online participation can build toward offline action. How would you design a study to test which effect predominates?
- ODA's multi-source methodology captures more events than newspaper-only approaches, but it also introduces new biases (toward organizations with ODA relationships, toward protest forms that generate social media activity). Is a multi-source approach better than a consistent single-source approach? Justify your answer.
- Law enforcement and intelligence agencies use protest analytics tools to monitor political organizing. What regulatory framework, if any, should govern this use? What are the competing interests at stake?