Chapter 19: Further Reading — Auditing AI Systems
Foundational Works on AI Fairness and Auditing
- Angwin, Julia, Jeff Larson, Surya Mattu, and Lauren Kirchner. (2016). "Machine Bias." ProPublica. The investigative report that triggered the academic literature on AI fairness metrics and demonstrated the methodology of external AI auditing from output data. Essential primary source. Available at propublica.org.
- Buolamwini, Joy, and Timnit Gebru. (2018). "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification." Proceedings of Machine Learning Research, 81, 1–15. The study that documented dramatic accuracy disparities in commercial facial recognition systems by gender and skin type. A foundational paper in AI fairness research and a model of systematic performance auditing.
- Chouldechova, Alexandra. (2017). "Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments." Big Data, 5(2), 153–163. The paper that rigorously demonstrated the mathematical incompatibility of fairness criteria in recidivism prediction when base rates differ across groups. Essential for understanding why fairness metric selection is a normative choice.
- Kleinberg, Jon, Sendhil Mullainathan, and Manish Raghavan. (2017). "Inherent Trade-Offs in the Fair Determination of Risk Scores." Proceedings of Innovations in Theoretical Computer Science (ITCS). Independent confirmation of Chouldechova's incompatibility result, derived from a different mathematical framework. Together with Chouldechova (2017), establishes the fundamental impossibility result in AI fairness.
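The incompatibility result in the two papers above can be illustrated numerically. Chouldechova's identity ties a classifier's false positive rate to the group base rate once positive predictive value and false negative rate are fixed, so groups with different base rates cannot have all three equalized at once. A minimal sketch (the group base rates and metric values are illustrative, not from either paper):

```python
def fpr_from(base_rate, ppv, fnr):
    """Chouldechova's identity: with PPV and FNR fixed, the false
    positive rate is determined by the base rate p:
        FPR = p/(1-p) * (1-FNR) * (1-PPV)/PPV
    """
    p = base_rate
    return (p / (1 - p)) * (1 - fnr) * (1 - ppv) / ppv

# Two hypothetical groups with different base rates (prevalence).
base_rate_a, base_rate_b = 0.3, 0.5

# Suppose the tool satisfies predictive parity and equal FNR:
# the same PPV and FNR for both groups.
ppv, fnr = 0.7, 0.2

fpr_a = fpr_from(base_rate_a, ppv, fnr)
fpr_b = fpr_from(base_rate_b, ppv, fnr)

print(f"Group A FPR: {fpr_a:.3f}")  # ~0.147
print(f"Group B FPR: {fpr_b:.3f}")  # ~0.343

# Unequal base rates force unequal false positive rates.
assert abs(fpr_a - fpr_b) > 0.01
```

This is why an auditor must treat the choice among error-rate balance, predictive parity, and related criteria as a normative decision rather than a purely technical one.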
Technical Auditing Methods and Tools
- Mitchell, Margaret, et al. (2019). "Model Cards for Model Reporting." Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAccT). The paper introducing model cards — standardized documentation for AI models — now a widely adopted standard. Explains the rationale, content, and limitations of model cards as audit and accountability tools.
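The section headings proposed in the model cards paper can be summarized as a skeleton; the field values below are illustrative placeholders, not part of the paper's specification:

```python
# Section headings from Mitchell et al. (2019); all values here are
# hypothetical placeholders for a model under audit.
model_card = {
    "Model Details": {"developers": "...", "version": "...", "type": "..."},
    "Intended Use": {"primary uses": "...", "out-of-scope uses": "..."},
    "Factors": {"relevant groups": "...", "environments": "..."},
    "Metrics": {"performance measures": "...", "decision thresholds": "..."},
    "Evaluation Data": {"datasets": "...", "preprocessing": "..."},
    "Training Data": {"datasets": "...", "distribution notes": "..."},
    "Quantitative Analyses": {"unitary": "...", "intersectional": "..."},
    "Ethical Considerations": {},
    "Caveats and Recommendations": {},
}

for section in model_card:
    print(section)
```

For an auditor, the skeleton doubles as a checklist: an absent "Quantitative Analyses" or "Intended Use" section is itself a documentation finding.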
- Gebru, Timnit, et al. (2021). "Datasheets for Datasets." Communications of the ACM, 64(12), 86–92. The paper introducing the Datasheets for Datasets documentation standard, modeled on electronic component datasheets. Essential reference for data auditing and provenance documentation.
- Bellamy, Rachel K.E., et al. (2019). "AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias." IBM Journal of Research and Development, 63(4/5). Documentation of IBM's AI Fairness 360 toolkit, the most comprehensive open-source resource for AI fairness auditing. Describes fairness metrics, bias mitigation approaches, and the tool's architecture.
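Two of the group fairness metrics catalogued by toolkits like AI Fairness 360 reduce to simple selection-rate arithmetic, which an external auditor can compute directly from observed decisions. A self-contained sketch with no toolkit dependency (the group labels and outcome data are invented for illustration):

```python
# outcomes[group] = binary model decisions (1 = favorable outcome).
# All data below is hypothetical audit data.
outcomes = {
    "privileged":   [1, 1, 1, 0, 1, 0, 1, 1, 0, 1],
    "unprivileged": [1, 0, 0, 0, 1, 0, 0, 1, 0, 0],
}

def selection_rate(decisions):
    return sum(decisions) / len(decisions)

rate_priv = selection_rate(outcomes["privileged"])      # 0.7
rate_unpriv = selection_rate(outcomes["unprivileged"])  # 0.3

# Statistical parity difference: unprivileged rate minus privileged rate.
spd = rate_unpriv - rate_priv

# Disparate impact ratio: unprivileged rate over privileged rate.
# (US enforcement practice's "four-fifths rule" flags ratios below 0.8.)
di = rate_unpriv / rate_priv

print(f"statistical parity difference: {spd:+.2f}")  # -0.40
print(f"disparate impact ratio: {di:.2f}")           # 0.43
```

The toolkit's value lies less in these individual computations than in providing a consistent catalogue of dozens of such metrics plus mitigation algorithms under one API.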
- Selbst, Andrew D., and Solon Barocas. (2018). "The Intuitive Appeal of Explainable Machines." Fordham Law Review, 87, 1085–1139. A critical analysis of AI explainability requirements, examining what explainability can and cannot accomplish for accountability. Essential for understanding the limits of technical audit based on explainability methods.
Regulatory Frameworks
- New York City Department of Consumer and Worker Protection. (2023). "Rules Concerning Automated Employment Decision Tools." Chapter 5 of Title 6 of the Rules of the City of New York. The implementing rules for NYC Local Law 144. Essential primary source for understanding what LL 144 requires and how the DCWP has interpreted its requirements. Available at nyc.gov.
- Board of Governors of the Federal Reserve System. (2011). "SR 11-7: Guidance on Model Risk Management." Supervisory Letter SR 11-7. The Federal Reserve guidance that created mandatory model validation requirements for financial institutions — the most established U.S. regulatory AI audit framework. Available at federalreserve.gov.
- Government of Canada. (2019, revised 2023). "Directive on Automated Decision-Making." Treasury Board of Canada Secretariat. Canada's mandatory algorithmic impact assessment requirement for federal government AI systems. One of the most developed government AI governance frameworks in the world.
- European Parliament and Council of the European Union. (2024). "Regulation (EU) 2024/1689 Laying Down Harmonised Rules on Artificial Intelligence" [AI Act]. Official Journal of the European Union. The EU AI Act, including its conformity assessment requirements for high-risk AI. The most comprehensive mandatory AI audit framework in force globally.
External Auditing Theory and Practice
- Raji, Inioluwa Deborah, et al. (2020). "Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing." Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAccT). A practical framework for internal AI auditing, describing the audit process from initiation through action on findings. One of the most cited practical frameworks in AI auditing.
- Koshiyama, Adriano, et al. (2021). "Towards Algorithm Auditing: A Survey on Managing Legal, Ethical, and Technological Risks of AI, ML, and Associated Algorithms." SSRN Working Paper. A comprehensive survey of algorithm auditing approaches, covering technical, legal, and governance dimensions. Useful as an overview of the rapidly developing field.
- Vecchione, Briana, Solon Barocas, and Karen Levy. (2021). "Algorithmic Auditing and Social Justice: Lessons from the History of Audit Studies." Proceedings of the ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO). Examines AI auditing in the context of audit studies from social science — the longstanding research methodology of sending matched testers (of different races, genders, etc.) to document discriminatory treatment. Illuminates both the connections and the differences between traditional audit studies and AI auditing.
Generative AI and Advanced Model Auditing
- Ganguli, Deep, et al. (2022). "Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned." Anthropic Technical Report. Anthropic's report on red-teaming methodology for large language models. Describes the red-teaming process, findings, and limitations from one of the leading AI safety research organizations.
- Weidinger, Laura, et al. (2021). "Ethical and Social Risks of Harm from Language Models." arXiv preprint arXiv:2112.04359. DeepMind's taxonomy of potential harms from language models, organized by harm type, affected population, and causal mechanism. Essential framework for structuring generative AI auditing.
- UK AI Safety Institute. (2024). "Capabilities and Alignment Evaluations for Advanced AI Models." DSIT Technical Report. Documentation of the UK AI Safety Institute's approach to evaluating frontier AI models, including methodology for capability evaluations and safety assessments. The emerging standard for government-sponsored AI model evaluation.
Critical Perspectives
- Metcalf, Jacob, Emanuel Moss, and danah boyd. (2019). "Owning Ethics: Corporate Logics, Silicon Valley, and the Institutionalization of Ethics." Social Research: An International Quarterly, 86(2), 449–476. A critical analysis of the institutionalization of AI ethics in tech companies, arguing that corporate ethics processes often serve to manage reputation rather than to prevent harm. Provides essential critical perspective for evaluating internal AI audit processes.
- Raji, Inioluwa Deborah, and Joy Buolamwini. (2019). "Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products." Proceedings of AAAI/ACM Conference on AI, Ethics, and Society. Examines whether public disclosure of AI system bias actually changes company behavior. Finds that public naming of bias in commercial AI products led to measurable improvements in some cases. Essential empirical evidence for the effectiveness (and limitations) of disclosure-based accountability.