Chapter 29 Further Reading: A/B Testing Your Mind

The Facebook Emotional Contagion Study

1. Kramer, A. D. I., Guillory, J. E., & Hancock, J. T. (2014). "Experimental Evidence of Massive-Scale Emotional Contagion Through Social Networks." Proceedings of the National Academy of Sciences, 111(24), 8788–8790. The original study. Essential primary reading. The paper is short and accessible; reading it alongside the ethical commentary enables direct evaluation of the gap between the scientific content and the ethical controversy it generated.

2. Jouhki, J., Lauk, E., Penttinen, M., Rohila, J., Sormanen, N., & Uskali, T. (2016). "Facebook's Emotional Contagion Experiment as a Challenge to Research Ethics." Media and Communication, 4(4), 75–85. A detailed academic analysis of the ethical failures of the emotional contagion study, evaluating it against established research ethics principles. Comprehensive and carefully argued.

3. Frischmann, B., & Selinger, E. (2018). Re-Engineering Humanity. Cambridge University Press. A sustained philosophical examination of how behavioral engineering in digital environments affects human autonomy and agency. The emotional contagion experiment serves as a recurring case study for broader arguments about the ethics of "techno-social engineering."

A/B Testing Methods and Scale

4. Kohavi, R., Tang, D., & Xu, Y. (2020). Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press. The definitive practitioner's guide to A/B testing at scale, written by researchers with experience at Microsoft, Amazon, and LinkedIn. Essential for understanding the technical realities of large-scale platform experimentation: how experiments are designed, run, analyzed, and acted on.

5. Deng, A., Lu, J., & Chen, S. (2016). "Continuous Monitoring of Online Controlled Experiments: Applying Spotlight." Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 833–841. Technical paper on continuous experimentation monitoring at scale. Illustrates the sophistication of production A/B testing systems at major platforms and the continuous, real-time character of modern platform optimization.

6. Agrawal, S., & Goyal, N. (2012). "Analysis of Thompson Sampling for the Multi-Armed Bandit Problem." Proceedings of the 25th Annual Conference on Learning Theory (COLT), PMLR 23, 39.1–39.26. Foundational academic paper on Thompson Sampling, one of the primary multi-armed bandit algorithms used in production recommendation and A/B testing systems. Technical but important for understanding the continuous optimization mechanisms described in the chapter.
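
For readers who want a concrete picture of the algorithm analyzed in entry 6 before tackling the paper, the following is a minimal illustrative sketch of Beta-Bernoulli Thompson Sampling in Python. It is not taken from the paper or from any production system. The function name thompson_sampling, the simulated click-through rates, and the uniform Beta(1, 1) priors are all assumptions chosen for illustration.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling (illustrative sketch only).

    true_rates: hypothetical success probability of each variant ("arm").
    The algorithm never reads these directly; they only generate rewards.
    Returns (pulls, successes): per-arm counts of trials and of rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible success rate per arm from its Beta posterior.
        # The +1 terms encode a uniform Beta(1, 1) prior (an assumption).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Show the variant whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda i: samples[i])

        # Simulate the user's binary response and update that arm's counts.
        reward = rng.random() < true_rates[arm]
        pulls[arm] += 1
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return pulls, successes


if __name__ == "__main__":
    # Three hypothetical variants with made-up success rates.
    pulls, successes = thompson_sampling([0.05, 0.10, 0.30], rounds=5000)
    print("pulls per variant:", pulls)
    print("successes per variant:", successes)
```

In a simulation like this, traffic shifts toward the best-performing variant as evidence accumulates, with no fixed end date or separate analysis step. That is the continuous character of bandit-based optimization the annotation above refers to.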

Research Ethics: History and Principles

7. National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. (1979). The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research. US Department of Health, Education, and Welfare. The foundational document of US human subjects research ethics. Should be read in full by anyone evaluating the ethics of behavioral research. Freely available from the US Department of Health and Human Services.

8. Beecher, H. K. (1966). "Ethics and Clinical Research." New England Journal of Medicine, 274(24), 1354–1360. The landmark paper that exposed unethical research practices in American medicine and contributed to the development of modern research ethics frameworks. Historical context for understanding why the IRB system was created.

9. Ledford, J. L. (2009). Google Analytics. Wiley. Though focused on analytics rather than A/B testing specifically, provides context for understanding how commercial digital measurement practices developed alongside the research practices that preceded them. Useful background for understanding the scale of digital behavioral data collection.

OkCupid and Platform Transparency

10. Rudder, C. (2014, July 28). "We Experiment on Human Beings!" OkCupid Blog. The primary source. Essential reading for the OkCupid case study. Available through web archives; the original post was published on OkCupid's blog before the company significantly restructured its web presence.

11. Rudder, C. (2014). Dataclysm: Who We Are (When We Think No One's Looking). Crown Publishers. Rudder's book-length treatment of OkCupid's data analysis approach, extending the argument of the blog post into a full account of what user behavioral data reveals about human social and romantic behavior. Engaging and accessible; raises important questions about privacy, transparency, and the use of social data.

Optimization Target Problem and Wellbeing Metrics

12. Orben, A., & Przybylski, A. K. (2019). "The Association Between Adolescent Well-Being and Digital Technology Use." Nature Human Behaviour, 3(2), 173–182. Methodologically sophisticated analysis of the relationship between digital technology use, including social media, and adolescent wellbeing, using specification curve analysis to assess the robustness of findings across many different analytical choices. Essential for understanding the complexity of the engagement-wellbeing relationship.

13. Coyle, D. (2014). GDP: A Brief but Affectionate History. Princeton University Press. Coyle's analysis of the limitations of GDP as a measure of national wellbeing provides an accessible framework for understanding the optimization target problem generally. The "measuring what matters vs. what is measurable" problem she describes applies directly to social media engagement metrics.

14. Lazer, D. M. J., Pentland, A., Adamic, L., Aral, S., Barabási, A. L., Brewer, D., ... & Van Alstyne, M. (2009). "Computational Social Science." Science, 323(5915), 721–723. Foundational paper establishing "computational social science" as a research field that could leverage large digital datasets for social science research. Important for understanding the scientific potential of platform behavioral data and the governance challenges that potential creates.

Regulation and Policy

15. European Parliament and Council. (2022). Regulation (EU) 2022/2065 on a Single Market For Digital Services (Digital Services Act). Official Journal of the European Union. The text of the DSA, one of the most comprehensive regulatory frameworks for large platforms currently in force. Most relevant to platform experimentation oversight are Articles 34–35 (risk assessment and mitigation), Article 37 (independent audit), and Article 40 (data access and scrutiny, including researcher access).

16. Doshi-Velez, F., & Kim, B. (2017). "Towards a Rigorous Science of Interpretable Machine Learning." arXiv preprint arXiv:1702.08608. Technical paper on the interpretability of machine learning systems, directly relevant to the transparency challenges in understanding how platform A/B testing systems and multi-armed bandit algorithms make decisions. Useful background for policy discussions about algorithmic transparency requirements.

17. Zuboff, S. (2019). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. PublicAffairs. Zuboff's account of how behavioral data extraction and prediction have become the core of dominant technology companies' business models. The framework of "behavioral surplus" and "behavioral futures markets" provides important context for understanding platform A/B testing as part of a broader economic logic.

Critical Technology Ethics Perspectives

18. O'Neil, C. (2016). Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown Publishers. O'Neil's analysis of how algorithmic systems can produce harmful outcomes through opaque optimization processes, with accessible case studies. The optimization target problem as described in this chapter is closely related to the "feedback loop" problem that O'Neil documents across many domains.

19. Orlowski, J. (Director). (2020). The Social Dilemma (documentary film). Netflix. A documentary treatment of social media design ethics, featuring interviews with Tristan Harris and other former platform designers and executives. Makes the optimization target problem and behavioral manipulation concerns accessible to general audiences. Should be watched alongside critical academic commentary that identifies the film's limitations and simplifications.