Chapter 31: Further Reading
Essential Sources
1. Solon Barocas, Moritz Hardt, and Arvind Narayanan, Fairness and Machine Learning: Limitations and Opportunities (fairmlbook.org, 2023)
The definitive textbook on fairness in machine learning. Barocas, Hardt, and Narayanan provide a rigorous, mathematically grounded treatment of fairness definitions, impossibility results, and mitigation strategies, situated within the broader context of anti-discrimination law, moral philosophy, and social science. The book is freely available online at fairmlbook.org, and its depth of treatment far exceeds what any single chapter can cover.
Reading guidance: Chapter 2 (Classification) introduces the formal setup — protected attributes, predictions, outcomes, and the confusion matrix quantities that define every group fairness metric. This maps directly to the FairnessMetrics class in Section 31.2 of this chapter. Chapter 3 (Legal background and formal definitions) provides the legal foundations (ECOA, Title VII, disparate impact, disparate treatment) and the formal definitions of demographic parity, equalized odds, and calibration — with the impossibility theorem presented as a central result rather than a technical curiosity. Chapter 4 (Causality) covers counterfactual fairness and the causal perspective, connecting to our Chapter 17 (graphical causal models). Chapter 6 (A broader view of discrimination) addresses the limitations of the formal framework — when the math runs out and ethical judgment begins. For practitioners, Chapter 5 (Testing discrimination in practice) provides audit methodology that complements the Fairlearn-based approach in Section 31.11 of this chapter. The exercises throughout the book are research-grade and suitable for advanced courses. For a shorter introduction, Narayanan's FAT* 2018 tutorial "21 Fairness Definitions and Their Politics" (available on YouTube) provides a 1-hour overview of the landscape, including definitions not covered in this chapter (treatment equality, balance for the positive class, well-calibration).
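The formal setup described above — confusion-matrix quantities computed per protected group — can be sketched in a few lines of plain Python. This is an illustrative sketch only; the function and variable names here are ours, not those of the book's FairnessMetrics class.

```python
# Illustrative sketch of the confusion-matrix quantities behind the group
# fairness metrics (names are ours, not the chapter's implementation).

def rates(y_true, y_pred):
    """Selection rate, TPR, and FPR for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return {
        "selection_rate": sum(y_pred) / len(y_pred),  # demographic parity compares this
        "tpr": tp / pos if pos else 0.0,              # equal opportunity compares this
        "fpr": fp / neg if neg else 0.0,              # equalized odds adds this
    }

def rates_by_group(y_true, y_pred, group):
    """The same rates, disaggregated by protected-attribute value."""
    out = {}
    for g in set(group):
        idx = [i for i, gi in enumerate(group) if gi == g]
        out[g] = rates([y_true[i] for i in idx], [y_pred[i] for i in idx])
    return out

# Toy data: group B is selected more often and has a higher TPR.
y_true = [1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]
group  = ["A", "A", "A", "A", "B", "B", "B", "B"]
audit = rates_by_group(y_true, y_pred, group)
```

Comparing the per-group dictionaries in `audit` directly mirrors the definitions: a selection-rate gap is a demographic parity violation, a TPR gap violates equal opportunity, and a gap in either TPR or FPR violates equalized odds.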
2. Alexandra Chouldechova, "Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments" (Big Data, 2017)
The paper that proved the impossibility theorem in its clearest form. Chouldechova demonstrates that predictive parity (calibration), balance for the negative class (equal FPR), and balance for the positive class (equal FNR) cannot simultaneously hold when base rates differ — unless the classifier is perfect. The proof is concise (less than one page), and the paper is written for a policy audience, making it accessible to readers without a deep mathematical background.
Reading guidance: Section 2 defines the three fairness conditions and proves the impossibility result. The proof is essentially the same as the sketch in Section 31.6.2 of this chapter but with cleaner notation. Section 3 applies the result to the COMPAS recidivism prediction instrument, which was the subject of a ProPublica investigation alleging racial bias. Chouldechova shows that the COMPAS controversy — ProPublica argued the tool was unfair because FPR differed across races, while Northpointe (the tool's developer) argued it was fair because it was calibrated — was a direct manifestation of the impossibility theorem: both sides were correct, and the disagreement was about which fairness criterion mattered more, not about the empirical facts. This is the clearest real-world illustration of why the impossibility theorem has practical, not just theoretical, importance. For the independent and nearly simultaneous proof from a different angle, see Kleinberg, Mullainathan, and Raghavan, "Inherent Trade-Offs in the Fair Determination of Risk Scores" (ITCS, 2017), which proves the result using a calibration-based framework and discusses its implications for criminal justice and lending. For the tension between calibration and relaxed error-rate constraints, see Pleiss et al., "On Fairness and Calibration" (NeurIPS, 2017).
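The arithmetic behind the result can be checked directly. The confusion-matrix definitions imply FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR), where p is the base rate, so once equal PPV and equal FNR are imposed across groups, each group's FPR is fully determined by its base rate. A quick numerical check (the rates below are illustrative, not taken from the COMPAS data):

```python
def implied_fpr(base_rate, ppv, fnr):
    """FPR forced by the identity FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR),
    which follows from PPV = p*TPR / (p*TPR + (1-p)*FPR)."""
    p = base_rate
    return (p / (1 - p)) * ((1 - ppv) / ppv) * (1 - fnr)

# Impose equal PPV (0.7) and equal FNR (0.2) on two groups whose base
# rates differ: their FPRs are then forced apart.
fpr_low  = implied_fpr(0.3, 0.7, 0.2)  # base rate 30%
fpr_high = implied_fpr(0.5, 0.7, 0.2)  # base rate 50%, yields a larger FPR
```

Since `fpr_low` and `fpr_high` cannot coincide while the base rates differ (short of a degenerate or perfect classifier), all three conditions cannot hold at once.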
3. Fairlearn Documentation (https://fairlearn.org/)
The official documentation for Fairlearn, an open-source toolkit, originally developed at Microsoft, for assessing and mitigating fairness issues in machine learning. Fairlearn provides two core capabilities: assessment (the MetricFrame class for disaggregated metric computation) and mitigation (the ExponentiatedGradient and ThresholdOptimizer classes for constrained optimization and post-processing).
Reading guidance: The "User Guide" section provides conceptual grounding: the distinction between allocation harms (resources distributed unequally) and quality-of-service harms (model performance differs across groups), and the mapping from harm types to fairness metrics. The "Assessment" section documents MetricFrame, which is the workhorse for fairness auditing (Section 31.11.1 of this chapter) — it takes any scikit-learn-compatible metric and computes it disaggregated by sensitive features, with .difference() and .ratio() methods for computing disparities. The "Mitigation" section documents ExponentiatedGradient (Agarwal et al., 2018) and ThresholdOptimizer (Hardt et al., 2016), both of which are used in this chapter. The "Examples" gallery includes end-to-end fairness audit notebooks for credit scoring, hiring, and healthcare — directly relevant to the Meridian Financial case study. For the underlying algorithm, see Agarwal, Beygelzimer, Dudik, Langford, and Wallach, "A Reductions Approach to Fair Classification" (ICML, 2018), which formalizes the reduction of constrained fairness optimization to cost-sensitive classification and proves convergence guarantees.
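The core of the MetricFrame pattern — apply any metric per group, then summarize the spread — fits in a few lines of plain Python. The sketch below mimics the shape of the API only (the real class accepts pandas inputs, multiple metrics at once, and control features); `selection_rate` here stands in for any scikit-learn-style metric:

```python
def metric_by_group(metric, y_true, y_pred, sensitive_features):
    """Disaggregate metric(y_true, y_pred) by sensitive-feature value and
    summarize the spread, mimicking MetricFrame's by_group / .difference()
    / .ratio() pattern (a sketch, not the fairlearn API itself)."""
    by_group = {}
    for g in sorted(set(sensitive_features)):
        yt = [t for t, s in zip(y_true, sensitive_features) if s == g]
        yp = [p for p, s in zip(y_pred, sensitive_features) if s == g]
        by_group[g] = metric(yt, yp)
    vals = list(by_group.values())
    return {
        "by_group": by_group,
        "difference": max(vals) - min(vals),                            # like .difference()
        "ratio": min(vals) / max(vals) if max(vals) else float("nan"),  # like .ratio()
    }

def selection_rate(y_true, y_pred):
    """Fraction of positive predictions (ignores y_true)."""
    return sum(y_pred) / len(y_pred)

result = metric_by_group(
    selection_rate,
    y_true=[1, 0, 1, 0, 1, 0, 1, 0],
    y_pred=[1, 1, 0, 1, 1, 0, 0, 0],
    sensitive_features=["A", "A", "A", "A", "B", "B", "B", "B"],
)
```

The `difference` summary is the quantity reported as the demographic parity difference when the metric is the selection rate; the `ratio` form connects to the four-fifths rule.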
4. Moritz Hardt, Eric Price, and Nathan Srebro, "Equality of Opportunity in Supervised Learning" (NeurIPS, 2016)
The paper that introduced the equalized odds and equal opportunity fairness criteria and proposed the threshold adjustment post-processing method. Hardt, Price, and Srebro formalize the idea that a classifier should be "equally accurate" across groups — where "equally accurate" means equal TPR and FPR (equalized odds) or equal TPR alone (equal opportunity) — and show that these criteria can be achieved by post-processing a calibrated classifier with group-specific thresholds.
Reading guidance: Section 2 defines equalized odds and equal opportunity formally and argues for their desirability relative to demographic parity (which ignores the true label and can require accepting unqualified applicants to equalize rates). Section 3 presents the post-processing algorithm: given a score function and a fairness criterion, find the optimal randomized threshold policy that satisfies the criterion while maximizing accuracy. The key insight is that the optimal policy randomizes between at most two thresholds per group. Section 4 applies the method to the FICO credit score dataset (publicly available from the Federal Reserve's 2007 report on credit scoring), demonstrating that equalized odds can be achieved with modest accuracy cost — a result directly relevant to the Meridian Financial case study. For extensions to multi-class, multi-group, and intersectional settings, see Kearns, Neel, Roth, and Wu, "An Empirical Study of Rich Subgroup Fairness for Machine Learning" (FAT*, 2019), which addresses the challenge of satisfying fairness across exponentially many subgroups.
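A deterministic simplification of the post-processing idea can be sketched as follows: for each group, choose the score threshold whose TPR lands closest to a common target (the function and variable names below are ours). Hardt et al.'s actual method instead randomizes between two thresholds per group, which lets it hit the target exactly and handle the full equalized-odds constraint:

```python
def equal_opportunity_thresholds(scores, y_true, group, target_tpr=0.8):
    """Per-group threshold whose TPR is closest to target_tpr.
    TPR depends only on actual positives, so only their scores matter."""
    thresholds = {}
    for g in set(group):
        pos = sorted(s for s, t, gi in zip(scores, y_true, group)
                     if gi == g and t == 1)
        candidates = set(pos) | {max(pos) + 1.0}  # +1: the "accept no one" option
        best_gap, best_thr = None, None
        for thr in candidates:
            tpr = sum(s >= thr for s in pos) / len(pos)
            gap = abs(tpr - target_tpr)
            if best_gap is None or gap < best_gap:
                best_gap, best_thr = gap, thr
        thresholds[g] = best_thr
    return thresholds

# The groups' score distributions differ, so equalizing TPR assigns
# each group its own threshold rather than a single global cutoff.
scores = [0.9, 0.7, 0.5, 0.3, 0.1, 0.6, 0.8, 0.6, 0.4, 0.2, 0.5]
y_true = [1,   1,   1,   1,   1,   0,   1,   1,   1,   1,   0  ]
group  = ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
thr = equal_opportunity_thresholds(scores, y_true, group, target_tpr=0.8)
```

Searching only over the observed positive scores is enough because TPR can change only at those values; the randomization in the real algorithm interpolates between two adjacent candidates to remove the discretization gap this sketch leaves.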
5. IBM AI Fairness 360 Documentation and Bellamy et al., "AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias" (arXiv, 2018)
AI Fairness 360 (AIF360) is IBM's comprehensive open-source toolkit for fairness in machine learning, providing over 70 fairness metrics, 12 bias mitigation algorithms (spanning pre-processing, in-processing, and post-processing), and dataset wrappers for standard fairness benchmarks (Adult Income, German Credit, COMPAS).
Reading guidance: The toolkit paper (Bellamy et al.) provides an overview of the architecture and the design decisions behind the metric and algorithm taxonomy. Section 3 catalogs the metrics, organized by whether they measure bias in the dataset, the classifier, or the predictions — a taxonomy that maps to the dataset-level vs. classifier-level distinction in our aif360_audit() function (Section 31.11.2). Section 4 catalogs the mitigation algorithms, with each algorithm classified by intervention point (pre/in/post) and the fairness criterion it targets. The online documentation (https://aif360.readthedocs.io/) includes tutorial notebooks for each algorithm, applied to standard datasets. The most practically useful tutorials are: "Bias in Credit Decisions" (directly relevant to Case Study 1), "Medical Expenditure" (healthcare resource allocation), and "Detecting and Mitigating Age Bias" (relevant to StreamRec's user fairness). For a comparison of Fairlearn and AIF360 with additional toolkits (Themis-ML, FairML), see Lee and Singh, "The Landscape and Gaps in Open Source Fairness Toolkits" (CHI, 2021). For an enterprise deployment perspective, see Holstein et al., "Improving Fairness in Machine Learning Systems: What Do Industry Practitioners Need?" (CHI, 2019), which draws on interviews with 35 practitioners and identifies the gap between toolkit capabilities and production needs.
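The dataset-level vs. classifier-level distinction in the taxonomy can be illustrated with the disparate impact ratio: applied to raw labels it measures bias already present in the data, while applied to model predictions it measures bias introduced or preserved by the classifier. A minimal sketch of the quantity itself (not the AIF360 API):

```python
def disparate_impact(outcomes, group, unprivileged, privileged):
    """P(outcome = 1 | unprivileged) / P(outcome = 1 | privileged).
    Pass dataset labels for a dataset-level measure, or model predictions
    for a classifier-level one. Values below 0.8 fail the four-fifths rule."""
    def positive_rate(g):
        vals = [o for o, gi in zip(outcomes, group) if gi == g]
        return sum(vals) / len(vals)
    return positive_rate(unprivileged) / positive_rate(privileged)

# Toy labels: the privileged group receives favorable outcomes 3x as often.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
group  = ["priv", "priv", "priv", "priv",
          "unpriv", "unpriv", "unpriv", "unpriv"]
di = disparate_impact(labels, group, unprivileged="unpriv", privileged="priv")
```

Running the same function once on training labels and once on a model's predictions, then comparing the two ratios, is the pattern the dataset/classifier split in the taxonomy formalizes.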