Further Reading: Accountability and Audit
The sources below provide deeper engagement with the themes introduced in Chapter 17. They are organized by topic and include a mix of foundational texts, empirical research, accessible popular works, and policy reports. Annotations describe what each source covers and why it is relevant to the chapter's core questions.
Accountability Frameworks and the Accountability Gap
Bovens, Mark. "Analysing and Assessing Accountability: A Conceptual Framework." European Law Journal 13, no. 4 (2007): 447-468. Bovens provides the most widely cited academic framework for analyzing accountability, distinguishing between accountability as a virtue (a normative standard) and accountability as a mechanism (a social arrangement). His three-element framework — information, discussion, and consequences — closely parallels the answerability-attributability-enforceability structure used in this chapter. Essential for understanding why algorithmic systems disrupt accountability at a structural level.
Diakopoulos, Nicholas. "Accountability in Algorithmic Decision Making." Communications of the ACM 59, no. 2 (2016): 56-62. An early and influential articulation of what algorithmic accountability means in practice. Diakopoulos proposes a framework that includes transparency, testability, and redress as components of algorithmic accountability, and argues that accountability requires not just technical auditing but institutional design. A concise, accessible entry point into the accountability literature.
Wieringa, Maranke. "What to Account for When Accounting for Algorithms: A Systematic Literature Review on Algorithmic Accountability." Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT*), 1-18. ACM, 2020. A comprehensive review of the academic literature on algorithmic accountability, mapping the various definitions, frameworks, and proposals that have emerged. Wieringa identifies tensions between technical and social conceptions of accountability and argues for an integrated approach. Useful as a literature map for students undertaking research projects on accountability.
Algorithmic Auditing: Methods and Practice
Raji, Inioluwa Deborah, et al. "Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing." Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT*), 33-44. ACM, 2020. The most detailed published framework for internal algorithmic auditing, developed from the authors' experience at a major technology company. Raji et al. propose a five-stage audit process — scoping, mapping, artifact collection, testing, and reflection — and demonstrate it through a case study. Essential reading for anyone interested in the operational details of how audits actually work.
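For readers who want a sense of how such a staged process might be operationalized, the sketch below models the five stages as a simple checklist of stages and deliverables. The artifact names are illustrative assumptions loosely inspired by the paper's examples, not its exact deliverables list.

```python
# A minimal sketch of a five-stage internal audit process in the spirit of
# Raji et al. (2020), modeled as a checklist. Artifact names are
# illustrative assumptions, not the paper's actual deliverables.
from dataclasses import dataclass, field

@dataclass
class AuditStage:
    name: str
    artifacts: list[str]                       # deliverables this stage should produce
    completed: list[str] = field(default_factory=list)

    def outstanding(self) -> list[str]:
        return [a for a in self.artifacts if a not in self.completed]

stages = [
    AuditStage("Scoping", ["use case description", "ethical review"]),
    AuditStage("Mapping", ["stakeholder map", "interview notes"]),
    AuditStage("Artifact Collection", ["model cards", "datasheets"]),
    AuditStage("Testing", ["adversarial test report", "metric results"]),
    AuditStage("Reflection", ["risk analysis", "remediation plan"]),
]

stages[0].completed.append("use case description")
for stage in stages:
    print(f"{stage.name}: outstanding -> {stage.outstanding()}")
```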
Sandvig, Christian, Kevin Hamilton, Karrie Karahalios, and Cedric Langbort. "Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms." Paper presented at the International Communication Association 64th Annual Conference, Seattle, WA, 2014. A foundational methodological paper on algorithmic audit studies. Sandvig et al. catalog five approaches to auditing algorithms — code audits, noninvasive user audits, scraping audits, sock puppet audits, and crowdsourced audits — and evaluate the strengths, limitations, and ethical considerations of each. The paper remains the standard reference for audit methodology classification.
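To make one of those methods concrete, here is a minimal sketch of a sock puppet audit: paired synthetic personas, identical except for a single attribute, issue the same query, and the auditor compares what each is shown. The `fetch_results` function is a hypothetical stand-in for the platform interface under study; a real audit would drive instrumented browser sessions and would need to weigh the ethical and legal issues Sandvig et al. discuss.

```python
# A minimal sketch of a sock puppet audit in Sandvig et al.'s sense.
# fetch_results is a hypothetical placeholder that simulates the platform;
# a real audit would use instrumented browsing or API access instead.
import random

def fetch_results(query: str, persona: dict) -> list[str]:
    # Simulated platform response, deterministic per (query, persona signal)
    rng = random.Random(f"{query}|{persona['signal']}")
    ads = ["mortgage ad", "rental ad", "credit repair ad", "background check ad"]
    return rng.sample(ads, k=2)

personas = [
    {"id": "puppet_a", "signal": "white-sounding name"},
    {"id": "puppet_b", "signal": "Black-sounding name"},
]

query = "first-time home buyer"
results = {p["id"]: fetch_results(query, p) for p in personas}
shared = set(results["puppet_a"]) & set(results["puppet_b"])
for pid, ads in results.items():
    print(pid, "->", ads)
print("shown to both personas:", shared or "none")
```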
Buolamwini, Joy, and Timnit Gebru. "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification." Proceedings of the 1st Conference on Fairness, Accountability and Transparency, 77-91. PMLR, 2018. The landmark study that audited commercial facial recognition systems (from Microsoft, IBM, and Face++) and found dramatic accuracy disparities across intersections of race and gender. Darker-skinned women faced error rates up to 34.7%, compared to 0.8% for lighter-skinned men. The study's methodology — intersectional benchmarking against a diverse dataset — became a model for subsequent audits. It also demonstrated the real-world impact of audit research: all three companies subsequently improved their systems.
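The core of the methodology is simple to express in code: disaggregate error rates over the cross product of skin type and gender rather than over either attribute alone. The sketch below uses a handful of fabricated toy records purely for illustration; it is not the study's data or pipeline.

```python
# A minimal sketch of intersectional benchmarking in the style of
# Gender Shades: error rates per (skin type, gender) subgroup.
# Records are fabricated toy data for illustration only.
from collections import defaultdict

# (skin_type, true_gender, predicted_gender) for each benchmark image
records = [
    ("darker", "female", "male"),
    ("darker", "female", "female"),
    ("darker", "male", "male"),
    ("lighter", "female", "female"),
    ("lighter", "male", "male"),
    ("lighter", "male", "male"),
]

tallies = defaultdict(lambda: [0, 0])   # subgroup -> [errors, total]
for skin, true, pred in records:
    tallies[(skin, true)][0] += int(pred != true)
    tallies[(skin, true)][1] += 1

# Aggregate accuracy can look fine while one subgroup fails badly
for (skin, gender), (errors, total) in sorted(tallies.items()):
    print(f"{skin} {gender}: {errors / total:.0%} error (n={total})")
```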
Angwin, Julia, Jeff Larson, Surya Mattu, and Lauren Kirchner. "Machine Bias." ProPublica, May 23, 2016. The investigative report that brought algorithmic bias into mainstream public awareness by auditing the COMPAS recidivism prediction algorithm used in bail and sentencing decisions. ProPublica's analysis found that COMPAS was roughly twice as likely to falsely flag Black defendants as high-risk compared to white defendants. The methodology and findings remain fiercely debated, which itself makes the piece essential reading: the debate reveals fundamental disagreements about how fairness should be defined and measured.
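The technical heart of that debate is easy to demonstrate. The sketch below computes two fairness metrics from the same invented confusion-matrix counts: false positive rate, the metric underlying ProPublica's critique, and positive predictive value, the metric the tool's vendor emphasized. When base rates differ across groups, equalizing both at once is mathematically impossible, which is why the disagreement cannot be settled by better measurement alone.

```python
# A minimal sketch of the metric conflict at the center of the COMPAS debate.
# Counts are invented to mirror the pattern, not ProPublica's data: the score
# is equally predictive for both groups (equal PPV), yet the group with the
# higher base rate bears a far higher false positive rate.
counts = {
    "group_a": {"tp": 48, "fp": 16, "tn": 24, "fn": 12},  # base rate 60%
    "group_b": {"tp": 24, "fp": 8, "tn": 62, "fn": 6},    # base rate 30%
}

for group, c in counts.items():
    fpr = c["fp"] / (c["fp"] + c["tn"])   # error-rate balance: ProPublica's focus
    ppv = c["tp"] / (c["tp"] + c["fp"])   # predictive parity: the vendor's focus
    print(f"{group}: FPR={fpr:.0%}, PPV={ppv:.0%}")
# group_a: FPR=40%, PPV=75%
# group_b: FPR=11%, PPV=75%
```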
Algorithmic Impact Assessments
Selbst, Andrew D. "An Institutional View of Algorithmic Impact Assessments." Harvard Journal of Law & Technology 35, no. 1 (2021): 117-191. The most comprehensive legal analysis of AIAs to date. Selbst examines AIAs through the lens of institutional design theory, drawing on the decades-long experience with Environmental Impact Assessments to identify conditions under which AIAs are likely to succeed or fail. His central argument — that AIAs work only when embedded in robust institutional frameworks with clear enforcement — directly informs the chapter's discussion of AIA limitations.
Metcalf, Jacob, et al. "Algorithmic Impact Assessments and Accountability: The Co-Construction of Impacts." Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT), 735-746. ACM, 2021. Metcalf et al. examine how the process of conducting an AIA can shape stakeholders' understanding of impacts — not just measuring harms but constructing the categories through which harms are recognized. The paper offers a constructivist perspective that complements the chapter's more pragmatic treatment, and raises important questions about who participates in the assessment process and whose definitions of harm prevail.
Government of Canada. "Directive on Automated Decision-Making." Treasury Board of Canada Secretariat, 2019. The Canadian government's directive requiring federal agencies to conduct Algorithmic Impact Assessments before deploying automated decision-making systems. The directive includes a publicly available AIA tool that agencies can use to evaluate their systems. As one of the first government-mandated AIA frameworks in the world, it provides a concrete template for students designing their own AIAs in the exercises.
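To give a flavor of how such a tool works, the sketch below implements the general questionnaire-scoring pattern: weighted risk questions minus mitigation credits, mapped to the Directive's impact Levels I through IV. All questions, weights, and thresholds here are invented for teaching purposes; the real tool and its scoring grid are published by the Treasury Board Secretariat.

```python
# A simplified illustration of questionnaire-based impact scoring in the
# style of Canada's AIA tool. Questions, weights, and thresholds are
# invented assumptions, not the Directive's actual values.
RISK_WEIGHTS = {
    "decision_affects_liberty": 4,
    "decision_is_irreversible": 3,
    "uses_personal_information": 2,
}
MITIGATION_WEIGHTS = {
    "human_reviews_each_decision": 2,
    "recourse_process_exists": 1,
}

def impact_level(answers: dict[str, bool]) -> str:
    """Map yes/no answers to an impact level via a weighted score."""
    score = sum(w for q, w in RISK_WEIGHTS.items() if answers.get(q))
    score -= sum(w for q, w in MITIGATION_WEIGHTS.items() if answers.get(q))
    for level, cutoff in (("IV", 7), ("III", 5), ("II", 3)):
        if score >= cutoff:
            return f"Level {level}"
    return "Level I"

answers = {
    "decision_affects_liberty": True,
    "decision_is_irreversible": True,
    "uses_personal_information": True,
    "human_reviews_each_decision": True,
}
print(impact_level(answers))   # (4 + 3 + 2) - 2 = 7 -> Level IV
```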
Liability and Legal Frameworks
Selbst, Andrew D., and Solon Barocas. "The Intuitive Appeal of Explainable Machines." Fordham Law Review 87, no. 3 (2018): 1085-1139. Selbst and Barocas analyze the legal demand for explainability in algorithmic systems and argue that the intuitive appeal of explanations obscures deeper questions about what explanations can and cannot achieve. Their analysis is relevant to the chapter's discussion of how answerability — one of accountability's three elements — is complicated by the opacity of machine learning models. The paper also explores how legal frameworks might adapt to systems that resist human-interpretable explanation.
Citron, Danielle Keats, and Frank Pasquale. "The Scored Society: Due Process for Automated Predictions." Washington Law Review 89 (2014): 1-33. An early and influential legal analysis of algorithmic scoring systems — credit scores, risk scores, predictive policing scores — and the due process implications of decisions made on their basis. Citron and Pasquale argue that individuals subjected to algorithmic scoring deserve procedural protections analogous to those available in traditional administrative processes: notice, explanation, and an opportunity to contest. Their framework directly addresses the enforceability element of accountability.
Vladeck, David C. "Machines Without Principals: Liability Rules and Artificial Intelligence." Washington Law Review 89, no. 1 (2014): 117-150. Vladeck examines the application of existing tort law to autonomous AI systems and finds that neither negligence nor strict liability frameworks map cleanly onto the AI context. He proposes modifications that would assign liability based on the entity best positioned to prevent harm — a principle with significant implications for the many hands problem discussed in the chapter.
Platform Accountability and Discrimination
Edelman, Benjamin, Michael Luca, and Dan Svirsky. "Racial Discrimination in the Sharing Economy: Evidence from a Field Experiment." American Economic Journal: Applied Economics 9, no. 2 (2017): 1-22. The landmark audit study of racial discrimination on Airbnb, examined in detail in Case Study 2. Edelman, Luca, and Svirsky demonstrate a 16% reduction in acceptance rates for guests with distinctively African American names — a finding with implications for platform design, anti-discrimination law, and the accountability of marketplace intermediaries. The study's clean methodology makes it an exemplary model for audit study design.
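The statistical core of such an audit study is a comparison of acceptance rates across the two name conditions. The sketch below runs a two-proportion z-test on illustrative counts chosen to approximate the roughly 50% versus 42% acceptance gap the paper reports; it is a simplification of the paper's actual regression analysis, which controls for listing characteristics.

```python
# A minimal sketch of the core comparison in an audit field experiment:
# acceptance rates by name condition plus a two-proportion z-test.
# Counts are illustrative approximations, not the study's raw data.
import math

def two_proportion_ztest(x_a: int, n_a: int, x_b: int, n_b: int):
    """Two-sided z-test for a difference in two proportions."""
    p_a, p_b = x_a / n_a, x_b / n_b
    pooled = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_a, p_b, z, p_value

p_white, p_black, z, p = two_proportion_ztest(1600, 3200, 1344, 3200)
print(f"acceptance: white-name {p_white:.0%}, Black-name {p_black:.0%}")
print(f"z = {z:.2f}, two-sided p = {p:.1e}")
```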
Noble, Safiya Umoja. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: NYU Press, 2018. Noble documents how search engine algorithms — particularly Google's — produce racist and sexist results, and argues that these outcomes are not bugs but features of systems optimized for advertising revenue. While focused on search rather than accountability per se, Noble's analysis demonstrates why external auditing is necessary: the harms of algorithmic systems are often invisible to those who design and profit from them.
These readings are starting points, not endpoints. As subsequent chapters address generative AI (Chapter 18) and autonomous systems (Chapter 19), the accountability frameworks introduced here will be extended and tested against new challenges.