Case Study 14.1: Cambridge Analytica — When Commercial Surveillance Met Political Manipulation
The Convergence of Data Brokerage, Academic Psychology, and Political Campaigning
Origins: The SCL Group and Political Data
To understand Cambridge Analytica, it is necessary to understand its parent organization: the SCL Group (Strategic Communication Laboratories), a British defense contractor and political consulting firm founded in 1993. SCL specialized in what it called "election management" and "influence operations" — helping political actors understand and target voter populations in elections around the world, primarily in developing countries where democratic oversight was limited. By 2016, SCL claimed to have worked on elections in more than 25 countries.
Cambridge Analytica was created in 2013 as SCL's American political data subsidiary, with significant funding from the Mercer family — billionaire hedge fund manager Robert Mercer and his daughter Rebekah Mercer. Steve Bannon, who would later become a senior adviser to the Trump campaign, was vice president of Cambridge Analytica and a board member. The organization was from its inception a political operation, not a neutral data analytics firm.
What distinguished Cambridge Analytica from conventional political data operations was its claim to use psychographic modeling — specifically, prediction of Big Five personality traits from Facebook behavioral data — to target political messaging with unprecedented precision. The company marketed itself to political clients on the claim that psychographic profiles would allow campaigns to craft messages that resonated at the personality level, not just the issue level.
The Data: How 87 Million Profiles Were Obtained
The acquisition of Facebook data at scale was accomplished through an academic intermediary. Aleksandr Kogan, a psychologist with an appointment at Cambridge University, created an app called "thisisyourdigitallife" — a personality quiz that approximately 270,000 Facebook users installed for a small financial incentive. The app collected user data under the guise of academic research.
The critical mechanism was Facebook's "friends permissions" API, which at the time allowed third-party apps to access not just the data of the user who installed the app but the data of all of that user's Facebook friends — people who had not installed the app and had not consented to the data access. Kogan's app, installed by 270,000 users, thereby accessed data from those users' friends — roughly 300 friends per user on average — producing a dataset that Facebook later estimated at approximately 87 million profiles.
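The scale of this mechanism can be checked with back-of-the-envelope arithmetic. The sketch below uses the figures reported publicly; the deduplication factor (for friends shared between installers) is an illustrative assumption, not a reported statistic:

```python
# Rough model of the "friends permissions" reach described above.
# installers * avg_friends overcounts, because two installers may share
# friends; dedup_factor discounts for that overlap (illustrative only).

def estimated_reach(installers: int, avg_friends: float, dedup_factor: float) -> int:
    """Approximate unique profiles reachable via one app with friends permissions."""
    raw = installers * avg_friends      # naive total: each friend counted once per installer
    return int(raw * dedup_factor)

installs = 270_000
print(estimated_reach(installs, avg_friends=300, dedup_factor=1.0))
# -> 81,000,000 with no overlap discount: the same order of magnitude as
#    Facebook's eventual estimate of ~87 million affected profiles.
```

The point of the sketch is the leverage ratio: consent from fewer than 300,000 people yielded data on tens of millions.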
Kogan transferred this dataset to Cambridge Analytica, which Facebook's terms of service at the time prohibited. (The terms allowed academic research but prohibited transferring data to third parties for commercial use.) Kogan did not disclose this transfer to Facebook until 2018, when the story became public.
Facebook's response to learning of the unauthorized transfer — in 2015, well before the public disclosure — was to ask Kogan, Cambridge Analytica, and affiliated parties to certify that the data had been deleted. They provided certifications. The data had not been deleted. Facebook did not audit the certifications.
The Methodology: What Psychographic Targeting Actually Involved
Cambridge Analytica's methodology, as described by whistleblowers including Christopher Wylie (a former employee who became the primary public source on the company's practices), involved several stages:
Personality modeling: Using the Facebook data, the company built predictive models for Big Five personality traits. The models were based on the Kosinski/Stillwell methodology from the 2013 PNAS study (discussed in Chapter 13), which had shown that Facebook Likes were predictive of personality traits. Cambridge Analytica's version extended this to behavioral data beyond Likes.
Voter file matching: The personality models were matched to U.S. voter registration data, creating a database of registered voters with associated personality profiles.
Message testing: The company conducted experimental research — focus groups, online testing, survey experiments — to identify which messages resonated most strongly with different personality profiles on different political issues.
Targeted delivery: The personality-profiled voter database was used on advertising platforms (Facebook, Google, programmatic display) to deliver personality-tailored messages to specific individuals.
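The four stages above can be sketched as a toy pipeline. Every name, weight, voter record, and message variant below is hypothetical — the real models, voter-file fields, and ad-platform integrations are not public — but the structure mirrors the stages as described:

```python
# Illustrative sketch of the four-stage pipeline described above.
# All weights, records, and messages are invented for illustration.

TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]

# Stage 1: personality modeling -- a linear score per trait from binary
# behavioral features (here, page Likes), standing in for the
# regression-on-Likes approach of the 2013 PNAS study.
WEIGHTS = {                                  # hypothetical per-Like trait weights
    "liked_philosophy_page": {"openness": 0.9},
    "liked_party_photos":    {"extraversion": 0.8},
    "liked_planner_app":     {"conscientiousness": 0.7},
}

def predict_traits(likes):
    scores = dict.fromkeys(TRAITS, 0.0)
    for like in likes:
        for trait, w in WEIGHTS.get(like, {}).items():
            scores[trait] += w
    return scores

# Stage 2: voter-file matching -- join behavioral profiles to voter
# registration records on a shared key (real matching would use
# name, address, date of birth, etc.).
voter_file = {"v001": {"name": "Jane Roe", "state": "PA"}}
profiles   = {"v001": ["liked_philosophy_page", "liked_party_photos"]}

# Stage 3: message testing output -- a lookup from dominant trait to the
# message variant that tested best for it (hypothetical results).
BEST_VARIANT = {"openness": "change_framing",
                "extraversion": "social_proof_framing"}

# Stage 4: targeted delivery -- choose a variant per individual voter.
def choose_message(voter_id):
    traits = predict_traits(profiles[voter_id])
    dominant = max(traits, key=traits.get)
    return BEST_VARIANT.get(dominant, "generic_framing")

print(choose_message("v001"))  # openness (0.9) outscores extraversion (0.8)
```

Note what the sketch makes visible: each stage is individually mundane (a regression, a database join, an A/B test, an ad buy). The ethical questions arise from the composition, not from any single step.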
The Effectiveness Question
The central empirical question about Cambridge Analytica — did psychographic targeting actually work? — is both important and contested. Cambridge Analytica's own claims were extensive: the company marketed itself as having "the most sophisticated targeting in U.S. political history" and having been decisive in both the Trump election victory and the Brexit referendum.
Independent academic analysis of these claims has been largely skeptical:
Studies of the 2016 election have not found evidence that Cambridge Analytica's targeting was responsible for measurable electoral effects. The Trump campaign's internal digital operation (run separately from Cambridge Analytica) was arguably more sophisticated. The areas where Cambridge Analytica's influence was most claimed — specific swing states — showed demographic and social shifts consistent with many other explanations.
Meta-analysis of digital political advertising effectiveness suggests that online advertising in general has modest effects on political attitudes and behavior, and there is no established evidence that psychographic targeting outperforms conventional targeting for political persuasion.
The underlying science of the Kosinski/Stillwell methodology has been partially replicated but its external validity — whether personality-trait predictions from Facebook data are accurate enough to meaningfully guide message tailoring — is disputed among psychologists.
Why, then, did Cambridge Analytica's approach generate so much concern? The concern is perhaps less about what Cambridge Analytica accomplished than about what it revealed was possible — and what it might become. If the methodology's current effectiveness is limited, the trajectory — more data, better models, more precisely targeted messages — points toward a future in which behavioral data-driven political manipulation could achieve effects that the 2016 version did not.
Regulatory Aftermath and Consequences
The public disclosure of Cambridge Analytica's practices in March 2018 — through Carole Cadwalladr's reporting in The Observer, parallel reporting in The New York Times, and Christopher Wylie's public testimony — triggered one of the largest privacy regulatory responses in U.S. and UK history:
FTC action against Facebook: The FTC settled with Facebook for $5 billion in July 2019 — the largest privacy fine in FTC history. The settlement also imposed structural changes, including the creation of an independent privacy oversight committee within Facebook's board structure, privacy-by-design requirements, and limitations on Facebook's ability to share user data with third parties without affirmative user consent.
UK ICO investigation: The Information Commissioner's Office issued Facebook a £500,000 fine (the maximum under pre-GDPR law) and conducted a comprehensive investigation into data analytics and political campaigning. The investigation found that multiple political campaigns, data companies, and brokers had engaged in practices inconsistent with UK data protection law.
Cambridge Analytica dissolution: The company ceased operations in May 2018, filing for bankruptcy shortly after the disclosures. However, the talent, data, and methodologies were dispersed rather than destroyed — many CA employees moved to other political data firms. Mark Turnbull, a key CA executive, later founded a successor entity.
Facebook data policy changes: Facebook had already retired the friends permissions API in 2014–2015, closing the specific mechanism that enabled the collection — roughly a year after the transfer had occurred. In the aftermath of the 2018 revelations, it further restricted third-party app access to user data across the platform, four years after the unauthorized transfer.
Analysis Questions
- Cambridge Analytica obtained 87 million profiles through the consent of 270,000. This mechanism — using willing participants to access data about non-participants — is structurally similar to the shadow profile problem in Chapter 13. What does this structural similarity reveal about the design of consent mechanisms in networked data systems?
- Facebook knew about the unauthorized data transfer in 2015 but relied on certifications (which proved false) rather than conducting audits. What does this response reveal about the relationship between terms-of-service enforcement and meaningful privacy protection?
- The empirical evidence on Cambridge Analytica's effectiveness is mixed to negative. Should the ethical analysis of psychographic political targeting depend on whether it works? Is attempting to psychologically manipulate voters using commercial surveillance data wrong only if it succeeds?
- Cambridge Analytica's parent company, SCL, had worked in elections in developing countries for years before the US/UK controversy. Why do you think this practice received limited attention compared to the US/UK cases? What does the discrepancy suggest about whose privacy and whose elections are considered important?
- The chapter notes that the commercial advertising and political targeting infrastructure are technically identical — same data, same platforms, same mechanisms. What policy implication follows from this? Should political advertising be treated differently from commercial advertising on behavioral platforms? If so, how?
Case Study 14.1 | Chapter 14 | Part 3: Commercial Surveillance