Chapter 23: Further Reading — Data Privacy Fundamentals

Foundational Texts

1. Nissenbaum, Helen. Privacy in Context: Technology, Policy, and the Integrity of Social Life. Stanford University Press, 2010. The definitive statement of contextual integrity theory. Nissenbaum develops the concept that privacy is about appropriate information flows within social contexts, providing a more nuanced framework than the traditional "public vs. private" binary. Essential reading for anyone working on privacy policy or system design.

2. Solove, Daniel J. Understanding Privacy. Harvard University Press, 2008. Solove argues that privacy is not a single coherent concept but a plurality of related social interests. His taxonomy of privacy violations — including information collection, aggregation, insecurity, secondary use, exclusion, breach of confidentiality, disclosure, exposure, increased accessibility, blackmail, appropriation, and distortion — provides a rich framework for analyzing privacy harms.

3. Cavoukian, Ann. "Privacy by Design: The 7 Foundational Principles." Information and Privacy Commissioner of Ontario, 2009. The foundational document of Privacy by Design. Freely available at the IPC Ontario website. Required reading for anyone implementing privacy programs in technology organizations. Concise and practical.

4. Westin, Alan F. Privacy and Freedom. Atheneum, 1967. A foundational work on privacy theory, still widely cited despite its age. Westin's definition of privacy as the claim of individuals to determine for themselves when, how, and to what extent information about them is communicated to others remains influential. His survey research on privacy attitudes established an empirical approach to privacy scholarship.

5. European Data Protection Board. Guidelines, Opinions, and Recommendations. The EDPB produces authoritative guidance on GDPR interpretation and application. Key documents include guidelines on consent, data breach notification, automated decision-making, and data transfers. Available at edpb.europa.eu. Essential reference for GDPR compliance.

6. Information Commissioner's Office. "Guide to the UK GDPR." The ICO's comprehensive guide to UK GDPR is freely available online. The ICO also publishes enforcement decisions, investigation reports, and policy guidance. The ICO's decisions in the Cambridge Analytica and DeepMind/NHS cases are available in full on the ICO website.

7. California Privacy Protection Agency. "CPRA Regulations." The CPPA's implementing regulations for the CPRA, the first set of which was finalized in 2023, provide detailed guidance on compliance requirements, with further rulemaking addressing the automated decision-making provisions. Available at cppa.ca.gov.

8. Federal Trade Commission. "Privacy & Data Security Update." The FTC's annual update on privacy and data security enforcement actions provides a useful overview of US enforcement trends. FTC enforcement actions under Section 5 of the FTC Act have shaped US privacy norms in the absence of comprehensive federal law.

AI and Privacy

9. Mittelstadt, Brent Daniel, et al. "The Ethics of Algorithms: Mapping the Debate." Big Data & Society 3, no. 2 (2016). A systematic mapping of ethical concerns raised by algorithms, including privacy, autonomy, fairness, and accountability. Useful for understanding where AI privacy concerns fit within the broader AI ethics landscape.

10. Shokri, Reza, and Vitaly Shmatikov. "Privacy-Preserving Deep Learning." Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, 2015. A seminal technical paper on applying differential privacy techniques to deep learning. Accessible to non-technical readers as an introduction to the technical approaches available for privacy-preserving AI. The broader literature on federated learning (McMahan et al., 2017) is also relevant.
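The federated learning idea referenced above (McMahan et al., 2017) can be sketched in a few lines: each client trains on its own data and shares only model parameters, which the server averages weighted by client dataset size, so raw data never leaves the client. The function name and toy values below are illustrative, not drawn from the paper, and real deployments layer secure aggregation and noise on top of this step.

```python
# Minimal sketch of federated averaging (FedAvg): the server combines
# per-client model parameters weighted by how many examples each client
# holds. Names and toy values are illustrative.

def federated_average(client_weights, client_sizes):
    """Weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two clients with 2-parameter models; client 2 holds twice the data.
merged = federated_average([[1.0, 0.0], [4.0, 3.0]], [100, 200])
# merged = [(1*100 + 4*200)/300, (0*100 + 3*200)/300] = [3.0, 2.0]
```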

11. Carlini, Nicholas, et al. "Extracting Training Data from Large Language Models." USENIX Security 2021. Demonstrates that large language models memorize and can reproduce verbatim training data, including personal information. This paper raised significant concerns about the privacy of data used to train LLMs and is essential reading for anyone involved in LLM development or deployment.

12. Sweeney, Latanya. "Simple Demographics Often Identify People Uniquely." Carnegie Mellon University, 2000. Sweeney's landmark demonstration that 87% of the US population can be uniquely identified by ZIP code, date of birth, and sex alone. One of the foundational papers on re-identification risk and the inadequacy of simple anonymization techniques.
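Sweeney's point can be illustrated in a few lines of code: even in a table stripped of names, a combination of quasi-identifiers may be unique, singling an individual out. The toy records below are invented for illustration.

```python
# Illustrates Sweeney's re-identification result: quasi-identifiers
# (ZIP code, date of birth, sex) can uniquely identify people in
# "anonymized" data. All records here are invented.
from collections import Counter

records = [
    {"zip": "02139", "dob": "1970-01-09", "sex": "F", "diagnosis": "flu"},
    {"zip": "02139", "dob": "1970-01-09", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "dob": "1985-06-12", "sex": "F", "diagnosis": "flu"},
    {"zip": "02140", "dob": "1985-06-12", "sex": "M", "diagnosis": "flu"},
]

quasi = lambda r: (r["zip"], r["dob"], r["sex"])
counts = Counter(quasi(r) for r in records)

# A record is re-identifiable if its quasi-identifier combination is unique.
unique = [r for r in records if counts[quasi(r)] == 1]
print(f"{len(unique)} of {len(records)} records are unique on (zip, dob, sex)")
```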

Regulatory Case Studies

13. Information Commissioner's Office. "Investigation into the Use of Data Analytics in Political Campaigns." ICO, 2018. The ICO's investigation report on Cambridge Analytica, Facebook, and other actors in the political data ecosystem. Provides detailed factual findings and legal analysis. Available in full on the ICO website.

14. Powles, Julia, and Hal Hodson. "Google DeepMind and Healthcare in an Age of Algorithms." Health and Technology 7, no. 4 (2017): 351–367. An independent academic analysis of the DeepMind/NHS arrangement. Critical but rigorous examination of the data sharing agreement, the claims made about it, and the governance failures. Freely available online.

Advanced Privacy Topics

15. Dwork, Cynthia, and Aaron Roth. "The Algorithmic Foundations of Differential Privacy." Foundations and Trends in Theoretical Computer Science 9, nos. 3–4 (2014): 211–407. The authoritative technical reference on differential privacy — the mathematical framework for quantifying and limiting privacy loss in data analysis. Challenging for non-technical readers but invaluable for understanding what differential privacy can and cannot guarantee.
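For readers who want the core idea without the full monograph: the standard Laplace mechanism from differential privacy answers a numeric query and adds noise with scale equal to the query's sensitivity divided by the privacy parameter epsilon. A minimal sketch, with an invented dataset and query:

```python
# Minimal sketch of the Laplace mechanism from differential privacy:
# add Laplace(0, sensitivity/epsilon) noise to the true query answer.
# A counting query has sensitivity 1, since adding or removing one
# person changes the count by at most 1.
import random

def laplace_noise(scale, rng=random):
    # The difference of two Exponential(1) draws is Laplace(0, 1);
    # scaling by `scale` gives Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def private_count(values, predicate, epsilon, rng=random):
    """Differentially private count: true count plus Laplace(1/epsilon) noise."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)

ages = [34, 29, 41, 67, 52, 38]  # illustrative data
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5)
# Smaller epsilon means stronger privacy and more noise around the
# true count of 3.
```

The sketch shows what Dwork and Roth formalize: the epsilon parameter trades accuracy for a provable bound on how much any one individual's presence can shift the output distribution.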

16. Obermeyer, Ziad, et al. "Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations." Science 366, no. 6464 (2019): 447–453. Demonstrates that a widely used healthcare algorithm contained significant racial bias resulting partly from the use of healthcare cost as a proxy for healthcare need — a proxy that systematically disadvantaged Black patients. Illustrates how AI systems can encode discrimination through seemingly neutral data choices.

17. Zuboff, Shoshana. The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. PublicAffairs, 2019. While primarily covered in Chapter 24, Zuboff's foundational analysis of surveillance capitalism is essential reading for understanding the data economy that creates data privacy problems at scale. The first chapter's analysis of how Google developed its surveillance model is particularly relevant to understanding training data issues.

18. World Economic Forum. "Redesigning Data Privacy: Reimagining Notice and Choice." World Economic Forum, 2020. Examines the failure of notice-and-consent as a privacy protection mechanism and proposes alternative approaches. Includes practical recommendations for how organizations and regulators might move beyond consent as the primary privacy protection mechanism.