Further Reading: Chapter 33
Fairness, Bias, and Responsible ML
Foundational Papers
1. "Fairness and Abstraction in Sociotechnical Systems" --- Selbst et al. (2019) The most important conceptual paper in the ML fairness literature. The authors identify five "traps" that technical fairness work falls into: the framing trap (failing to model the entire sociotechnical system), the portability trap (assuming a fairness solution transfers across contexts), the formalism trap (failing to account for the social meaning of fairness metrics), the ripple effect trap (failing to understand how technology changes social dynamics), and the solutionism trap (treating fairness as a purely technical problem). Published at FAT* (now FAccT) 2019. Read this before any other paper on the list.
2. "Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments" --- Chouldechova (2017) The paper that proved the impossibility theorem: when base rates differ between groups, it is mathematically impossible to simultaneously equalize false positive rates, false negative rates, and predictive values. The proof is concise (three pages) and the implications are profound --- every fairness intervention involves a tradeoff. Published in Big Data, 5(2). This is the theoretical foundation for Part 3 of the chapter.
3. "Inherent Trade-Offs in the Fair Determination of Risk Scores" --- Kleinberg, Mullainathan, and Raghavan (2016) An independent proof of the impossibility result, arrived at from a different direction than Chouldechova. The authors show that calibration and equal error rates are incompatible when base rates differ, and they provide a formal framework for understanding why. Available on arXiv (1609.05807). Read alongside Chouldechova for the complete picture.
4. "Model Cards for Model Reporting" --- Mitchell et al. (2019) The paper that introduced model cards as a documentation standard for ML models. The authors (from Google) propose a structured format that includes model details, intended use, metrics disaggregated by relevant factors, ethical considerations, and caveats. Published at FAT* 2019. The model card template in this chapter is based on this paper. Available on arXiv (1810.03993).
5. "Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations" --- Obermeyer et al. (2019) The landmark study that discovered racial bias in a widely used healthcare algorithm. The algorithm used healthcare costs as a proxy for healthcare needs, but because Black patients had historically received less care (and therefore generated lower costs) at the same level of need, the algorithm systematically underestimated their needs. The fix increased the fraction of Black patients receiving additional care from 17.7% to 46.5%. Published in Science, 366(6464). This is the real-world case that makes the Metro General example concrete.
Tools and Libraries
6. AI Fairness 360 (AIF360) --- aif360.res.ibm.com IBM's open-source toolkit for detecting and mitigating bias in ML models. Includes 70+ fairness metrics, 11 bias mitigation algorithms (pre-processing, in-processing, post-processing), and dataset loaders for fairness benchmarks. The API uses custom data structures (BinaryLabelDataset) that require some adaptation from pandas workflows, but the breadth of algorithms is unmatched. Apache 2.0 license. Start with the "Credit Scoring" tutorial for the most relevant walkthrough.
7. Fairlearn --- fairlearn.org Microsoft's open-source toolkit focused on fairness-constrained model training and fairness assessment. More tightly integrated with scikit-learn than AIF360. Key features: MetricFrame for disaggregated metrics (compute any sklearn metric by group in one line), ThresholdOptimizer for post-processing threshold adjustment, and ExponentiatedGradient for in-processing fairness-constrained optimization. MIT license. The "Quickstart" and "Assessment" tutorials are excellent.
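The disaggregation pattern behind MetricFrame is easy to sketch in plain Python. This mimics the idea, not Fairlearn's actual API (the data here is made up):

```python
from collections import defaultdict

def by_group(metric, y_true, y_pred, sensitive):
    """Compute `metric` separately for each sensitive-feature value --
    the disaggregation idea that Fairlearn's MetricFrame implements."""
    buckets = defaultdict(lambda: ([], []))
    for yt, yp, g in zip(y_true, y_pred, sensitive):
        buckets[g][0].append(yt)
        buckets[g][1].append(yp)
    return {g: metric(yt, yp) for g, (yt, yp) in buckets.items()}

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

groups = by_group(accuracy,
                  y_true=[1, 0, 1, 1, 0, 1],
                  y_pred=[1, 0, 0, 1, 1, 0],
                  sensitive=["a", "a", "a", "b", "b", "b"])
# e.g. {"a": 2/3, "b": 1/3} -- the gap between groups is the quantity to audit.
```

MetricFrame does the same grouping for any sklearn-compatible metric and adds overall/difference/ratio aggregations on top.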
8. Aequitas --- aequitas.dssg.io An open-source bias audit toolkit from the University of Chicago's Center for Data Science and Public Policy. Aequitas takes a different approach than AIF360 and Fairlearn: it focuses on auditing rather than mitigation, producing clear "pass/fail" reports against configurable fairness criteria. The web-based tool (aequitas.dssg.io/upload) allows non-technical stakeholders to run audits by uploading a CSV. Particularly good for communicating results to policymakers.
9. What-If Tool --- pair-code.github.io/what-if-tool Google's interactive visual tool for exploring ML model performance and fairness. Integrates with TensorFlow, XGBoost, and scikit-learn models. The "Fairness" tab allows you to visually explore the fairness-accuracy tradeoff by adjusting thresholds for different groups and seeing the effect on multiple fairness metrics simultaneously. Available as a TensorBoard plugin or standalone Jupyter widget. Best for interactive exploration and stakeholder demos.
Books
10. Fairness and Machine Learning: Limitations and Opportunities --- Barocas, Hardt, and Narayanan (2023) The definitive textbook on ML fairness. Free online at fairmlbook.org. Covers the mathematical foundations of fairness definitions, the impossibility results, causal reasoning about discrimination, and the social context of algorithmic decision-making. Chapters 2 (classification) and 3 (legal background) are the most relevant to this chapter. The mathematical treatment is rigorous but accessible to readers comfortable with probability and statistics.
11. Weapons of Math Destruction --- Cathy O'Neil (2016) The popular science book that brought algorithmic bias to mainstream awareness. O'Neil, a former Wall Street quant, describes how opaque, unregulated, and uncontested mathematical models reinforce inequality in education, criminal justice, lending, and insurance. Not technically deep, but essential reading for understanding why fairness matters and how biased models cause real harm. Crown Publishing.
12. Algorithms of Oppression --- Safiya Umoja Noble (2018) An examination of how search engines and recommendation algorithms reproduce and amplify racial and gender stereotypes. Noble's analysis of Google Search results for terms related to Black women reveals how commercial algorithms can produce discriminatory outputs even without discriminatory intent. NYU Press. Provides the sociological context that technical fairness literature often lacks.
13. Race After Technology --- Ruha Benjamin (2019) Benjamin introduces the concept of the "New Jim Code" --- the ways that technology, particularly algorithmic systems, can encode and perpetuate racial hierarchy even (or especially) when designed with good intentions. The book connects technical fairness to broader structures of power, inequality, and social design. Polity Press. Essential for understanding why "fixing the algorithm" is necessary but insufficient.
Regulatory and Legal Context
14. EU AI Act (2024) The European Union's comprehensive regulation of AI systems, which classifies AI applications by risk level and imposes requirements on high-risk systems (including employment, credit, healthcare, and law enforcement). High-risk systems must undergo conformity assessments that include bias testing, documentation (similar to model cards), and human oversight. The AI Act makes fairness auditing a legal requirement for many ML applications deployed in or affecting EU residents. Available at eur-lex.europa.eu.
15. "Algorithmic Accountability Act" (U.S., proposed) Proposed U.S. legislation that would require companies to conduct impact assessments for automated decision systems, including assessments of bias and discriminatory effects. While not yet law as of 2025, the framework signals the direction of U.S. regulation and provides a useful template for voluntary fairness assessments.
16. EEOC and the Four-Fifths Rule --- Uniform Guidelines on Employee Selection Procedures (1978) The original regulatory source for the disparate impact ratio (the "80% rule" or "four-fifths rule") used in this chapter. If the selection rate for a protected group is less than four-fifths (80%) of the rate for the group with the highest selection rate, it constitutes evidence of adverse impact. While the Guidelines were written for employment, the four-fifths framework has been adopted as a practical threshold in many other fairness contexts.
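The four-fifths check itself is a one-line ratio. A sketch with hypothetical selection counts:

```python
def disparate_impact_ratio(selection_rates):
    """Lowest group selection rate over the highest group selection rate.
    Below 0.8, the Uniform Guidelines treat this as evidence of adverse impact."""
    rates = selection_rates.values()
    return min(rates) / max(rates)

# Hypothetical selection rates by group (selected / applicants).
rates = {"group_a": 60 / 100, "group_b": 42 / 100}
ratio = disparate_impact_ratio(rates)  # ~0.70, below the 0.8 threshold
flagged = ratio < 0.8                  # evidence of adverse impact
```

Note the ratio is always taken against the group with the highest selection rate, so it generalizes unchanged to more than two groups.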
Healthcare-Specific Fairness
17. "The Problem with Risk Scores" --- Paulus and Kent (2020) A critical examination of risk prediction models in clinical medicine, focusing on how models trained on biased data produce biased predictions that then inform clinical decisions. Published in the Annals of Internal Medicine. Directly relevant to the Metro General case study and the broader question of how algorithmic bias translates to healthcare disparities.
18. "Ensuring Fairness in Machine Learning to Advance Health Equity" --- Chen et al. (2021) A practical framework for assessing and mitigating fairness in clinical ML models. The authors propose a "fairness-aware" development process that integrates fairness considerations at every stage --- from problem formulation to deployment and monitoring. Published in the Annals of Internal Medicine. Includes a checklist that maps well to the model card framework in this chapter.
Advanced Topics
19. "On Formalizing Fairness in Prediction with Machine Learning" --- Hardt, Price, and Srebro (2016) The paper that formalized equalized odds and equal opportunity as fairness criteria. The authors also provide a simple post-processing algorithm for achieving equalized odds by randomizing predictions near the decision boundary. Published at NeurIPS 2016. The algorithm is the theoretical basis for threshold adjustment methods.
20. "Fairness Without Demographics in Repeated Loss Minimization" --- Hashimoto et al. (2018) Addresses a practical problem: what if you do not have access to protected attribute labels? The authors show that optimizing for the worst-off group (distributionally robust optimization) can improve fairness even without knowing group membership. Published at ICML 2018. Relevant when protected attributes are unavailable or when collecting them raises privacy concerns.
21. "Fairness is not Static: Deeper Understanding of Long Term Fairness via Simulation Studies" --- D'Amour et al. (2020) A study of how fairness interventions play out over time through feedback loops. The authors show that a model that is fair at deployment can become unfair as its predictions influence future outcomes (e.g., a lending model that denies loans to a group reduces that group's credit history, making future predictions worse). Published at FAT* 2020. Connects to the production monitoring discussion in Chapter 32.
Where to Start
If you read three things from this list, read:
- Obermeyer et al. (2019) --- the healthcare algorithm study. It makes the abstract concrete: bias is not hypothetical, it is measured, and it affects millions of patients.
- Barocas, Hardt, and Narayanan (2023) --- the textbook. Chapters 2 and 3 give you the mathematical and legal foundations. Free online.
- Fairlearn documentation --- the tool. It integrates with scikit-learn, computes disaggregated metrics with a single function call, and provides threshold optimization out of the box. You can run a fairness audit on your production model this afternoon.
This reading list accompanies Chapter 33: Fairness, Bias, and Responsible ML. Return to the chapter for full context.