Chapter 24: Exercises — Computational Propaganda and Bot Detection
Conceptual Exercises
Exercise 24.1 — Defining Computational Propaganda
a) Woolley and Howard define computational propaganda as involving algorithms, automation, and human curation. Provide a concrete example of each component from a documented information operation (IRA, 50 Cent Army, or another case from the chapter).
b) Explain why the human curation component is essential — why can't effective computational propaganda be fully automated?
c) Big data micro-targeting is described as a third pillar of computational propaganda alongside automation. How does micro-targeting change the threat model compared to traditional mass propaganda? Use the Cambridge Analytica case as a reference point.
d) Some researchers argue that computational propaganda is simply a more efficient form of political advertising. Others argue it is qualitatively different and uniquely threatening to democracy. Which view do you find more compelling, and why?
Exercise 24.2 — The Bot Ecosystem Taxonomy
For each account type below, specify: (a) whether it is fully automated, semi-automated, or fully human; (b) which detection methods from Section 24.4 would most effectively identify it; (c) what the appropriate platform policy response should be.
- A news organization that automatically posts headlines from its RSS feed using a Twitter API integration.
- A state-sponsored troll account operated by a human full-time employee who writes original content and engages in debates, but uses automation to follow/unfollow strategically.
- A network of 500 accounts created over a single weekend, all with stock photo profile pictures, that post only retweets of a political candidate's tweets.
- A commercial bot service that sells "engagement" (fake likes and follows) to paying customers who want to inflate their social media metrics.
- A single person operating 30 accounts with distinct personas, each posting manually but all promoting the same political position.
Exercise 24.3 — The 50 Cent Army's Distraction Strategy
Based on the King, Pan & Roberts research discussed in Section 24.3:
a) Why does the 50 Cent Army primarily post distracting cheerful content rather than directly countering criticism? What theory of political persuasion does this strategy implicitly rely on?
b) The operation produces an estimated 448 million posts per year. Even if each post reaches only a small number of users, what is the cumulative effect on the information environment?
c) Design a counter-strategy that a domestic civil society organization might use against a distraction-based manipulation campaign. What are the technical and practical challenges?
d) The 50 Cent Army operates domestically (targeting Chinese citizens). The IRA operates internationally (targeting foreign populations). Why might these different targets require different operational strategies?
Exercise 24.4 — Bot Detection Feature Analysis
A researcher collects the following data on five Twitter accounts:
| Account | Age (days) | Tweets/day | Follower/Following | Retweet% | Profile Complete | Verified |
|---|---|---|---|---|---|---|
| A | 1,200 | 8 | 1.2 | 45% | Yes | No |
| B | 3 | 180 | 0.05 | 95% | No | No |
| C | 800 | 25 | 0.8 | 70% | Yes | Yes |
| D | 45 | 55 | 0.12 | 90% | Partial | No |
| E | 2,400 | 3 | 3.5 | 20% | Yes | Yes |
a) Based on these features, rank the accounts from most to least likely to be bots. Justify your ranking.
b) Account C has a high tweet frequency and retweet rate but is verified and has an established account. What does this suggest about the limitations of feature-based detection?
c) What additional features — not listed in the table — would you want to collect to improve your classification?
d) If a platform wanted to use these features to automatically suspend accounts, which threshold would you recommend for tweets/day? Justify your choice by discussing precision-recall tradeoffs.
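One way to sanity-check a ranking for part (a) is to encode the table and score each account with an explicit heuristic. The weights and thresholds below are illustrative assumptions for exploration, not values endorsed by the chapter, and the output is a starting point for discussion rather than an answer key:

```python
# Naive heuristic bot score over the accounts in the table above.
# Weights and cutoffs are this sketch's own assumptions.
accounts = {
    #     age_days, tweets_day, ff_ratio, retweet_pct, profile, verified
    "A": (1200,   8, 1.20, 45, True,  False),
    "B": (   3, 180, 0.05, 95, False, False),
    "C": ( 800,  25, 0.80, 70, True,  True),
    "D": (  45,  55, 0.12, 90, None,  False),  # None = partial profile
    "E": (2400,   3, 3.50, 20, True,  True),
}

def bot_score(age, rate, ff, rt, profile, verified):
    score = 0.0
    score += 2.0 if age < 30 else (1.0 if age < 90 else 0.0)     # very new account
    score += 2.0 if rate > 100 else (1.0 if rate > 40 else 0.0)  # hyperactive
    score += 1.0 if ff < 0.1 else 0.0                            # follows far more than followed
    score += 1.0 if rt > 85 else 0.0                             # almost pure amplification
    score += 1.0 if profile is not True else 0.0                 # missing or partial profile
    score -= 2.0 if verified else 0.0                            # verification lowers suspicion
    return score

ranked = sorted(accounts, key=lambda k: bot_score(*accounts[k]), reverse=True)
print(ranked)  # B and D surface at the top under these weights
```

Note how verified account C ends up with a negative score despite its high activity, which previews the limitation probed in part (b).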
Exercise 24.5 — Botometer and Machine Learning
a) Botometer uses more than 1,200 features. Why are so many features used rather than a small set of the most predictive ones? What are the disadvantages of using very large feature sets?
b) Botometer outputs a probability score rather than a binary label. As a researcher studying bot prevalence, at what threshold would you classify accounts as bots? At what threshold would you use it if you were a platform enforcing account suspensions? Why are these thresholds different?
c) Explain the concept of "dataset shift" in the context of bot detection. Why does this make it difficult to train a bot detector that remains accurate over time?
d) A news article reports that "30% of all Twitter accounts are bots, according to Botometer." What questions would you ask to evaluate this claim before reporting it?
e) Botometer has been shown to have higher false positive rates for accounts of Black Americans and non-English speakers. Propose two specific technical interventions and two process-level interventions that could address this disparity.
Exercise 24.6 — Coordinated Inauthentic Behavior
a) Explain the key conceptual difference between bot detection and CIB detection. Why is coordination evidence sufficient for removal even if each individual account is operated by a real human?
b) A group of 50 real activists simultaneously post the same hashtag as part of a coordinated Twitter campaign they organized through a private group chat. Is this CIB? Why or why not?
c) Design a formal definition of "temporal coordination" that could be implemented as an algorithm. What time window would you use? What action similarity threshold? How would you distinguish organic trending behavior from coordinated inauthentic behavior?
d) Meta's CIB reports describe operations as "deceptive" when participants conceal their organizational affiliation. Some argue that political parties routinely coordinate their members' social media activity without disclosing it. Is this CIB? Should disclosure of political coordination be legally required?
Exercise 24.7 — Astroturfing Analysis
a) Explain why low posting pattern entropy is a signal of astroturfing. What is the intuition, and what are the exceptions (legitimate accounts that might also have low entropy)?
b) A pharmaceutical company funds a patient advocacy group that runs a social media campaign promoting a specific medication. The patient accounts posting in the campaign are all real patients who genuinely believe in the medication. Is this astroturfing? Does it matter for your answer whether the funding is disclosed?
c) Design a study to distinguish between an organic community of fans sharing enthusiasm about a new product versus an astroturfing campaign funded by the product's manufacturer. What data would you collect and what analyses would you run?
d) Geographic analysis is listed as an astroturfing detection method. What specific geographic features would you look for, and what are the privacy implications of using location data for this purpose?
Exercise 24.8 — Platform Transparency Reports
Read Meta's most recent CIB report (available at transparency.fb.com) or review the summary provided in Section 24.8. Then:
a) Identify one documented operation and describe: its apparent origin country, its target audience, the platforms it used, and its apparent goal.
b) The report does not disclose how many accounts were reviewed that were NOT removed because they did not meet the CIB threshold. Why does this matter for interpreting the report?
c) Meta states that CIB removals are based on behavior, not content. What are the advantages and disadvantages of this behavior-first approach?
d) Compare the level of detail in Meta's CIB reports with Twitter's Elections Integrity Data releases. Which provides more value for academic researchers? Why?
e) If you were designing a legal mandate for platform transparency (for a national legislature), what specific information would you require platforms to disclose, and what would you not require them to disclose?
Exercise 24.9 — Arms Race Dynamics
a) The chapter describes bot evolution from first-generation (simple) to emerging fourth-generation (LLM-powered). For each generation, identify: (i) the primary detection method that forced the evolution, and (ii) the evasion strategy adopted.
b) Large language models make it possible to generate unlimited, human-indistinguishable text at near-zero marginal cost. What implications does this have for content-based bot detection methods?
c) "Any publicly described detection method can be evaded by sufficiently motivated actors." If this is true, should detection methods be published in academic journals? What are the arguments for and against open publication?
d) Design an adversarial test for a bot detection system: describe how you would attempt to create a bot that evades detection while still performing its intended manipulation function.
e) The chapter notes that infrastructure analysis (IP addresses, hosting providers, payment methods) is more robust to evasion than behavioral analysis. Why is this? What are the civil liberties implications of infrastructure-level detection?
Exercise 24.10 — Ethics of Computational Propaganda Research
a) Researchers studying computational propaganda typically need to collect data on suspected bot accounts, which may include data from real people wrongly suspected of being bots. What ethical protocols should govern this research?
b) A bot detection system you developed is licensed to a government agency that wants to use it to monitor political dissidents. What ethical obligations do you have as the developer?
c) The same technical tools used for bot detection can be used by authoritarian governments to identify and suppress genuine political dissent. How should researchers navigate this dual-use problem?
d) IRB (Institutional Review Board) protocols for human subjects research may not cover research on publicly posted social media data. Do you think social media users whose public posts are analyzed for bot detection research are "human subjects"? What ethical frameworks apply?
Coding Exercises
Exercise 24.11 — Bot Feature Engineering
Write a Python script that:
a) Generates a synthetic dataset of 200 accounts (100 human, 100 bot) with realistic feature distributions.
b) Bot features: high tweet frequency, low follower/following ratio, low original content ratio, low temporal entropy.
c) Human features: variable tweet frequency, realistic follower/following ratio, higher original content ratio, higher temporal entropy.
d) Computes all features from raw posting timestamps and content data.
e) Trains a logistic regression classifier and a random forest classifier.
f) Evaluates both with 5-fold cross-validation, reporting precision, recall, F1, and AUC-ROC.
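Parts (a)–(c) can start from a sketch like the one below. The distribution parameters are illustrative assumptions chosen to loosely separate the classes, not values taken from the chapter, and the single-feature threshold at the end is a baseline worth beating before reaching for the classifiers in part (e):

```python
import random

random.seed(0)

# Synthetic account generator; distribution parameters are assumptions.
def make_account(is_bot):
    if is_bot:
        return {"tweets_day": random.gauss(120, 30),
                "ff_ratio": random.uniform(0.01, 0.3),
                "orig_ratio": random.uniform(0.0, 0.2),
                "bot": 1}
    return {"tweets_day": random.gauss(10, 5),
            "ff_ratio": random.uniform(0.5, 3.0),
            "orig_ratio": random.uniform(0.3, 0.9),
            "bot": 0}

data = [make_account(i < 100) for i in range(200)]  # first 100 are bots

# One-feature threshold baseline: flag anything over 50 tweets/day.
preds = [int(a["tweets_day"] > 50) for a in data]
tp = sum(p * a["bot"] for p, a in zip(preds, data))
fp = sum(p * (1 - a["bot"]) for p, a in zip(preds, data))
fn = sum((1 - p) * a["bot"] for p, a in zip(preds, data))
precision, recall = tp / (tp + fp), tp / (tp + fn)
print(f"baseline precision={precision:.2f} recall={recall:.2f}")
```

On this deliberately easy synthetic data the baseline scores near 1.0; the interesting work in the full exercise is shrinking the class separation until the baseline fails and the learned classifiers earn their keep.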
Exercise 24.12 — Temporal Coordination Detection
Write a Python script that:
a) Generates a dataset with 50 accounts: 30 organic and 20 coordinated (posting identical hashtags within 60-second windows).
b) Computes pairwise co-occurrence counts for all account pairs within a sliding 60-second window.
c) Constructs a co-occurrence network and applies community detection.
d) Computes the precision and recall of using co-occurrence threshold > K (for various K) to flag coordination.
e) Plots the precision-recall curve.
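The counting logic in part (b) can be sketched as follows. The toy data and account names are invented for illustration; the nested loop is O(n²) per pair, which is fine here but should be replaced by a sort-and-sweep over merged timestamps at scale:

```python
from collections import Counter
from itertools import combinations

# `posts` maps account -> list of (timestamp_seconds, hashtag).
posts = {
    "acct1": [(0, "#x"), (300, "#y")],
    "acct2": [(20, "#x"), (310, "#y")],   # coordinated with acct1
    "acct3": [(5000, "#x")],              # same hashtag, far apart in time
}

def cooccurrences(posts, window=60):
    """Count same-hashtag posts within `window` seconds, per account pair."""
    counts = Counter()
    for a, b in combinations(sorted(posts), 2):
        for ta, ha in posts[a]:
            for tb, hb in posts[b]:
                if ha == hb and abs(ta - tb) <= window:
                    counts[(a, b)] += 1
    return counts

counts = cooccurrences(posts)
print(counts)  # acct1/acct2 co-occur twice; acct3 pairs with no one
```

Thresholding these counts (part d) then reduces to sweeping K over the Counter values and comparing flagged pairs against the known coordinated set.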
Exercise 24.13 — Botometer Feature Replication
Write a Python script that:
a) Given a simulated account's posting history (timestamps, retweet flags, URLs, hashtags), computes the following Botometer-inspired features: tweet frequency, retweet ratio, URL ratio, hashtag density, temporal entropy.
b) Applies these features to classify 100 synthetic accounts using a random forest.
c) Analyzes feature importance scores and plots them as a horizontal bar chart.
d) Tests classifier robustness to adversarial evasion: for the 10 accounts classified as bots with highest confidence, modify their features to minimize the classifier's bot probability score while keeping the modifications realistic.
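Temporal entropy is the least obvious feature in part (a); a minimal implementation over hour-of-day buckets looks like this (the example posting histories are invented):

```python
import math
from collections import Counter

def temporal_entropy(hours):
    """Shannon entropy (bits) of the distribution of posts over hours of the day."""
    counts = Counter(hours)
    total = len(hours)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

scheduled_bot = [14] * 50                  # fires at the same hour every day
human = list(range(24)) + [9, 12, 18, 21]  # spread across the day, mild peaks
print(temporal_entropy(scheduled_bot))     # 0.0 — fully predictable
print(temporal_entropy(human))             # near the 24-bucket maximum, log2(24) ≈ 4.58
```

The same function works over finer buckets (minute of hour, day of week); the bucket choice is itself a modeling decision worth discussing in part (c)'s feature-importance analysis.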
Exercise 24.14 — Content Similarity CIB Detection
Write a Python script that:
a) Generates a dataset of tweets from 100 accounts: 70 organic (diverse content) and 30 coordinated (near-duplicate content from a shared template with random word substitutions).
b) Computes pairwise cosine similarity on TF-IDF vectors of each account's aggregate content.
c) Applies hierarchical clustering to the similarity matrix to identify coordinated groups.
d) Evaluates clustering performance against the known labels using NMI and purity.
e) Visualizes the similarity matrix as a heatmap with account type annotations.
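The core of part (b) is cosine similarity over text vectors. The sketch below uses raw term counts instead of TF-IDF to stay dependency-free (scikit-learn's TfidfVectorizer is the natural upgrade); the example strings are invented:

```python
import math
from collections import Counter

def cosine(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors of two texts."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * \
           math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

template_a = "great product buy now before the sale ends today"
template_b = "great product buy now before the big sale ends today"
organic = "honestly mixed feelings after a week of daily use"
print(cosine(template_a, template_b))  # near-duplicates score high
print(cosine(template_a, organic))     # disjoint content scores zero
```

TF-IDF weighting matters once templates share common stop words with organic content: raw counts over-reward "the" and "a", which inflates similarity between unrelated accounts.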
Exercise 24.15 — Astroturfing vs. Organic Campaign Analysis
Write a Python script that:
a) Simulates two campaigns with 100 accounts each: one organic (variable posting times, diverse hashtags, original content), one astroturfing (coordinated posting times, uniform hashtags, duplicated content).
b) Computes the following features for each campaign: posting time entropy, hashtag diversity, content originality ratio, account age distribution.
c) Visualizes the feature distributions side by side for both campaigns.
d) Trains a classifier to distinguish the campaigns based on these features.
e) Applies the classifier to a third synthetic dataset with mixed astroturfing and organic accounts, and evaluates performance.
Exercise 24.16 — Random Forest Bot Classifier
Extend Exercise 24.11 to:
a) Use 10-fold cross-validation with stratified splits.
b) Plot learning curves (training vs. validation accuracy as training set size increases).
c) Compute and plot the confusion matrix.
d) Compute the ROC curve and AUC score.
e) Analyze which accounts in the test set are most frequently misclassified, and identify the features that contribute to misclassification.
Exercise 24.17 — Platform Transparency Data Analysis
Using the Twitter Elections Integrity dataset (or the synthetic equivalent in code/case-study-code.py):
a) Compute the distribution of account ages at time of suspension.
b) Compute the distribution of tweet frequencies across accounts.
c) Identify the top 10 hashtags used.
d) Compute the fraction of tweets that are retweets vs. original.
e) Compute the fraction of accounts with profile pictures.
f) Compare all metrics to a control sample of verified authentic accounts to quantify the contrast.
g) Create a one-page summary report with visualizations.
Exercise 24.18 — Adversarial Bot Evasion Simulation
Write a Python script that:
a) Trains a bot detector on a labeled dataset.
b) Identifies the five most important features in the trained classifier.
c) Simulates an adversary who knows these top-5 features and modifies bot accounts to change those feature values toward the human distribution.
d) Re-applies the classifier to the modified accounts and measures how much accuracy drops.
e) Retrains the classifier with the modified bot accounts added to the training data and measures recovery.
f) Plots the adversarial accuracy degradation and recovery curve.
Exercise 24.19 — Temporal Entropy and Coordination Index
Write a Python script that:
a) Defines a "temporal entropy" function for an account's posting activity: compute the entropy of the distribution of posts across 24 hours.
b) Defines a "coordination index" for a pair of accounts: the fraction of the first account's posts that are within 30 seconds of the second account's posts.
c) Applies both functions to a simulated dataset of 60 accounts: 20 human, 20 simple bots, 20 sophisticated cyborgs.
d) Plots temporal entropy vs. coordination index as a scatter plot with account type color-coded.
e) Fits a logistic regression decision boundary and overlays it on the scatter plot.
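The coordination index in part (b) is directional (the ordered pair matters), and a binary search keeps it efficient. The timestamps below are invented; a minimal sketch:

```python
import bisect

def coordination_index(posts_a, posts_b, window=30):
    """Fraction of posts_a timestamps within `window` seconds of any posts_b timestamp."""
    if not posts_a:
        return 0.0
    posts_b = sorted(posts_b)
    near = 0
    for t in posts_a:
        # Smallest timestamp in posts_b that is >= t - window; if it is
        # also <= t + window, some post of b falls inside the window.
        i = bisect.bisect_left(posts_b, t - window)
        if i < len(posts_b) and posts_b[i] <= t + window:
            near += 1
    return near / len(posts_a)

bot_a = [0, 100, 200, 300]
bot_b = [5, 110, 205, 290]      # echoes bot_a within seconds
human = [40, 1000, 5000, 9000]  # unrelated schedule
print(coordination_index(bot_a, bot_b))  # 1.0
print(coordination_index(bot_a, human))  # 0.0
```

Because the index is asymmetric, a follower bot that echoes a leader scores high in one direction and low in the other; computing both directions per pair is worth it for part (d)'s scatter plot.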
Exercise 24.20 — End-to-End CIB Detection Pipeline
Design and implement a complete CIB detection pipeline:
a) Data ingestion: load a synthetic dataset of 500 accounts with posting histories.
b) Feature extraction: compute account-level, content-level, and temporal features.
c) Coordination graph construction: build a network connecting accounts with high co-occurrence.
d) Community detection: identify coordinated clusters.
e) Classification: label clusters as CIB or organic based on a composite score.
f) Evaluation: compare against ground truth labels; compute precision, recall, F1.
g) Reporting: generate a summary report including flagged networks, their estimated coordination scores, and top shared content.
Exercise 24.21 — Multilingual Bot Detection Challenges
a) Why might a bot classifier trained on English-language accounts have higher false positive rates for accounts tweeting in Arabic, Swahili, or Hindi?
b) Design a language-agnostic feature set for bot detection that does not rely on natural language content analysis.
c) Implement these language-agnostic features in Python and evaluate their discriminative power using a multilingual synthetic dataset.
d) Compare the performance of a content-based classifier vs. your language-agnostic classifier across different language groups in your dataset.
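A candidate feature set for part (b) can be sketched as below: nothing inspects the words themselves, so the features transfer across languages. The field names and the example account are this sketch's own, not from the chapter:

```python
def language_agnostic_features(account):
    """Behavioral features that ignore post text entirely."""
    n = max(len(account["post_times"]), 1)
    return {
        "posts_per_day": n / max(account["age_days"], 1),
        "retweet_ratio": account["n_retweets"] / n,
        "url_ratio":     account["n_with_url"] / n,
        "ff_ratio":      account["followers"] / max(account["following"], 1),
        # Mean seconds between first and last post; robotic schedules
        # show suspiciously regular gaps.
        "mean_gap_secs": (account["post_times"][-1] - account["post_times"][0])
                         / max(n - 1, 1),
    }

acct = {"post_times": [0, 60, 120, 180], "age_days": 2,
        "n_retweets": 3, "n_with_url": 4, "followers": 10, "following": 900}
print(language_agnostic_features(acct))
```

Gap-variance and hour-of-day entropy are natural additions; the point of part (d) is that these behavioral signals lose less accuracy across language groups than any text-based feature.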
Exercise 24.22 — Sockpuppet Detection via Stylometry
Write a Python script that:
a) Simulates a sockpuppet network: one "author" who creates text in a consistent style across 10 different accounts.
b) Creates a control group: 10 accounts each with different writing styles.
c) Extracts stylometric features: word frequency distributions, function word usage, sentence length distribution, punctuation patterns.
d) Applies hierarchical clustering to stylometric feature vectors.
e) Evaluates whether the sockpuppet accounts cluster together versus the control group.
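Function-word frequencies are the classic signal for part (c): content words change with topic, but an author's rates of "the", "of", "and" are hard to vary across personas. The miniature texts below are invented and far shorter than a real stylometric study needs:

```python
# Function-word rate vector per text; the word list is a tiny sample
# of the hundreds used in real stylometry.
FUNCTION_WORDS = ["the", "of", "and", "a", "to", "in", "that", "is"]

def style_vector(text):
    words = text.lower().split()
    return [words.count(w) / len(words) for w in FUNCTION_WORDS]

def dist(u, v):
    """Euclidean distance between two style vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5

sock_1 = "the plan is that the vote is a fraud and the count is wrong"
sock_2 = "the poll is that the race is a sham and the tally is off"
control = "wonderful weather today, going hiking with friends near the lake"

print(dist(style_vector(sock_1), style_vector(sock_2)))   # near zero: shared style
print(dist(style_vector(sock_1), style_vector(control)))  # clearly larger
```

Note that sock_1 and sock_2 discuss "different" topics yet have nearly identical function-word rates, which is exactly the fingerprint hierarchical clustering in part (d) should pick up.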
Exercise 24.23 — Bot Prevalence Estimation Under Uncertainty
Bot prevalence estimates from different classifiers with different thresholds can vary widely. Write a Python script that:
a) Generates a dataset of 1,000 accounts with known labels.
b) Trains three classifiers with different feature sets.
c) Applies each classifier at five different probability thresholds.
d) Computes bot prevalence estimates for all 15 combinations.
e) Plots the distribution of prevalence estimates as a bar chart with error bars.
f) Reports the range and median of estimates and discusses what this variability implies for interpreting published bot prevalence numbers.
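The core effect the exercise targets can be seen with a single classifier: the prevalence estimate moves substantially with the decision threshold alone. The scores below are invented draws (humans low, bots high, with realistic overlap) around a fixed true prevalence of 20%:

```python
import random

random.seed(1)
# 800 humans with low scores, 200 bots with high scores; the Beta
# parameters are assumptions chosen to give overlapping distributions.
scores = [random.betavariate(2, 8) for _ in range(800)]
scores += [random.betavariate(8, 2) for _ in range(200)]

estimates = {}
for thr in (0.3, 0.5, 0.7):
    estimates[thr] = sum(s >= thr for s in scores) / len(scores)
    print(f"threshold={thr}: estimated prevalence={estimates[thr]:.1%}")
```

With overlapping score distributions, no threshold recovers the true 20% exactly, and the spread across thresholds is large; multiplying this by three classifiers, as the exercise asks, widens the range further.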
Exercise 24.24 — Platform Transparency Report Comparison
Compare Meta's CIB report framework with Twitter's Elections Integrity Data:
a) For each platform, list three pieces of information that are disclosed and three that are not.
b) Write code to analyze and visualize patterns in a collection of Meta CIB report summaries (using synthetic data calibrated to actual reports if needed).
c) Compute year-over-year trends in: number of operations reported, countries of origin, accounts removed.
d) Identify which geographies are underrepresented in platform disclosures and propose explanations.
Exercise 24.25 — Full Bot Detection Pipeline Report
Design and implement a complete bot detection analysis and produce a 500-word analytical report (embedded as a docstring in your code) that:
a) Describes your feature engineering choices and justifications.
b) Reports classifier performance metrics.
c) Identifies the most common misclassification patterns.
d) Discusses the ethical implications of deploying this classifier at scale.
e) Proposes three improvements for future work.