Further Reading: Bias in Data, Bias in Machines

The sources below provide deeper engagement with the themes introduced in Chapter 14. They are organized by topic and include landmark studies, foundational texts, technical papers, and policy analyses. Annotations describe what each source covers and why it is relevant.


Landmark Studies in Algorithmic Bias

Angwin, Julia, Jeff Larson, Surya Mattu, and Lauren Kirchner. "Machine Bias." ProPublica, May 23, 2016. The investigation that brought the algorithmic fairness debate into public consciousness. ProPublica's analysis of COMPAS scores in Broward County revealed racially disparate false positive and false negative rates, sparking a national conversation about algorithmic accountability in criminal justice. The article is accessible, data-driven, and remains the essential primary source for the COMPAS case. ProPublica also released the underlying dataset and methodology, enabling independent replication — a model for algorithmic accountability journalism.
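
To make "racially disparate false positive and false negative rates" concrete, the sketch below computes both rates by group from ProPublica's released data. It is a minimal example, not ProPublica's full analysis: the file name, the column names (decile_score, two_year_recid, race), and the convention of treating a decile score of 5 or above as "high risk" are assumptions drawn from the public repository and should be verified against the published methodology.

```python
import pandas as pd

# ProPublica's released COMPAS data; file and column names are assumed
# from the public compas-scores-two-years.csv -- verify before use.
df = pd.read_csv("compas-scores-two-years.csv")

# Treat decile scores of 5 or above as a "high risk" prediction
# (an assumption about ProPublica's cutoff, used here for illustration).
df["high_risk"] = df["decile_score"] >= 5

def error_rates(g: pd.DataFrame) -> pd.Series:
    """False positive and false negative rates within one subgroup."""
    reoffended = g["two_year_recid"] == 1
    flagged = g["high_risk"]
    return pd.Series({
        "FPR": (flagged & ~reoffended).sum() / (~reoffended).sum(),
        "FNR": (~flagged & reoffended).sum() / reoffended.sum(),
        "n": len(g),
    })

# The disparity only becomes visible once the rates are disaggregated.
print(df.groupby("race").apply(error_rates))
```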

Obermeyer, Ziad, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. "Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations." Science 366, no. 6464 (2019): 447-453. The study that revealed racial bias in a healthcare allocation algorithm affecting approximately 200 million patients. Obermeyer et al. showed that using healthcare costs as a proxy for health need systematically underestimated the needs of Black patients. The paper is a model of empirical rigor and has become a touchstone for discussions of proxy variables and measurement bias. Reading the original paper (available with institutional access or through the authors' website) provides depth that no summary can match.
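
The mechanism the paper documents, a proxy label that understates need for one group, can be illustrated with synthetic data. The sketch below is purely illustrative and is not the authors' analysis: it assumes a population in which true health need is identically distributed across two groups but, because of an assumed access gap, the same need generates lower recorded costs for group B, so ranking patients by cost under-enrolls group B in the care program.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two groups with *identical* distributions of true health need
# (here a synthetic count of chronic conditions).
group = rng.choice(["A", "B"], size=n)
need = rng.poisson(lam=3.0, size=n)

# Measurement bias: for the same need, group B generates lower recorded
# costs (a 30% access gap, assumed purely for illustration).
access = np.where(group == "A", 1.0, 0.7)
cost = need * access * 1_000 + rng.normal(0.0, 500.0, size=n)

# Enroll the top 10% of patients under each ranking criterion.
k = n // 10
for label, score in [("true need", need), ("cost proxy", cost)]:
    enrolled = np.argsort(-score)[:k]
    share_b = (group[enrolled] == "B").mean()
    print(f"ranked by {label}: group B share of enrollees = {share_b:.2f}")
```

Ranking by the cost proxy cuts group B's share of program slots well below the roughly even split that ranking by true need produces, even though nothing about the two groups' health differs.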

Buolamwini, Joy, and Timnit Gebru. "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification." Proceedings of the 1st Conference on Fairness, Accountability and Transparency, 77-91. PMLR, 2018. The study that demonstrated intersectional bias in commercial facial analysis systems, finding error rates of up to 34.7% for darker-skinned women compared to less than 1% for lighter-skinned men. "Gender Shades" is essential reading for understanding representation bias, evaluation bias, and the necessity of intersectional analysis. The documentary Coded Bias (2020), which centers on Buolamwini's research, provides an accessible introduction to the findings and their implications.
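
"Intersectional analysis" here means auditing error rates on the cross of attributes rather than on each attribute alone, since a system can look acceptable by gender and by skin type separately while failing badly at one intersection. A minimal sketch of that bookkeeping, using a small hypothetical audit table rather than the Gender Shades data:

```python
import pandas as pd

# Hypothetical audit results: one row per test image, with group labels
# and whether the classifier's prediction was correct.
results = pd.DataFrame({
    "gender":    ["female", "female", "male", "male"] * 3,
    "skin_type": ["darker", "lighter"] * 6,
    "correct":   [0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
})

# Marginal error rates, computed on each attribute alone, look moderate...
print(1 - results.groupby("gender")["correct"].mean())
print(1 - results.groupby("skin_type")["correct"].mean())

# ...but only the intersection exposes the worst-served subgroup.
print(1 - results.groupby(["gender", "skin_type"])["correct"].mean())
```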

Dastin, Jeffrey. "Amazon Scraps Secret AI Recruiting Tool That Showed Bias against Women." Reuters, October 10, 2018. The Reuters report that broke the Amazon hiring algorithm story. Concise, well-sourced, and widely cited, this article provides the factual foundation for the case study in this chapter. It illustrates the journalistic genre of technology accountability reporting and demonstrates how much can be revealed through insider sources even about proprietary systems.


Foundational Texts on Algorithmic Bias and Race

Benjamin, Ruha. Race After Technology: Abolitionist Tools for the New Jim Code. Cambridge: Polity Press, 2019. Benjamin argues that algorithmic systems do not merely reflect racial inequality but actively produce it under the guise of neutrality — a phenomenon she calls the "New Jim Code." The book is theoretically sophisticated, drawing on critical race theory and science and technology studies, while remaining accessible to undergraduate readers. Particularly relevant for understanding historical bias and the structural conditions that make proxy variables discriminatory.

Noble, Safiya Umoja. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: New York University Press, 2018. While primarily focused on search algorithms (relevant to Chapter 13), Noble's analysis of how algorithmic systems encode racial stereotypes provides essential context for understanding the bias dynamics examined in this chapter. Her concept of "technological redlining" — the use of algorithms to reinforce racial and economic stratification — connects directly to the proxy variable discussion in Section 14.3.

Eubanks, Virginia. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. New York: St. Martin's Press, 2018. Eubanks examines algorithmic bias in social services — welfare eligibility, homelessness services, and child protective services — where the affected populations are predominantly low-income and disproportionately people of color. Her case studies illustrate how bias operates at every pipeline stage and how the Power Asymmetry shapes who is algorithmically sorted and who does the sorting.


Technical Foundations of Bias and Fairness

Suresh, Harini, and John Guttag. "A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle." Proceedings of the 1st ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO '21), Article 17. ACM, 2021. The taxonomy of bias types (historical, representation, measurement, aggregation, evaluation, deployment) used in this chapter draws on Suresh and Guttag's framework. The paper provides rigorous definitions, clear examples, and a structured approach to diagnosing bias. Essential for students who want to move from general awareness of bias to systematic diagnostic capability.

Barocas, Solon, and Andrew D. Selbst. "Big Data's Disparate Impact." California Law Review 104, no. 3 (2016): 671-732. A foundational legal analysis of how machine learning can produce disparate impact even without discriminatory intent. Barocas and Selbst trace the legal concept of disparate impact through its application to algorithmic systems, demonstrating that existing anti-discrimination law is poorly equipped to address the forms of bias documented in this chapter. Essential reading for students interested in the intersection of law and technology.

Chouldechova, Alexandra. "Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments." Big Data 5, no. 2 (2017): 153-163. Chouldechova's impossibility proof — that calibration and equal error rates cannot be simultaneously achieved when base rates differ — is one of the most important results in algorithmic fairness. While the mathematical proof is presented in Chapter 15, reading Chouldechova's original paper provides the formal rigor and the nuanced discussion of implications that a textbook summary necessarily compresses.
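
The result turns on a single accounting identity that links a group's base rate p (the prevalence of the predicted outcome), the positive predictive value, and the two error rates. A compact statement, developed in full in Chapter 15, is:

```latex
% Accounting identity for a binary classifier applied to a group with
% base rate p, positive predictive value PPV, false positive rate FPR,
% and false negative rate FNR:
\[
  \mathrm{FPR}
    = \frac{p}{1-p} \cdot \frac{1-\mathrm{PPV}}{\mathrm{PPV}}
      \cdot \bigl(1 - \mathrm{FNR}\bigr)
\]
% If two groups have different base rates, an instrument that equalizes
% PPV across them (calibration) cannot also equalize both FPR and FNR.
```

Because p appears on the right-hand side, holding PPV constant across groups with different base rates forces the error rates apart, which is the tension the paper makes precise.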

Corbett-Davies, Sam, and Sharad Goel. "The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning." arXiv preprint arXiv:1808.00023, 2018. A comprehensive survey of fairness definitions, their relationships, and their limitations. Corbett-Davies and Goel organize the landscape of fairness research into coherent categories and evaluate the strengths and weaknesses of each approach. An invaluable map for navigating a rapidly growing and sometimes confusing field.


The Hiring Algorithm Landscape

Raghavan, Manish, Solon Barocas, Jon Kleinberg, and Karen Levy. "Mitigating Bias in Algorithmic Hiring: Evaluating Claims and Practices." Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT*), 469-481. ACM, 2020. A systematic evaluation of the bias mitigation claims made by hiring technology vendors. Raghavan et al. find that many vendors' claims of fairness are vague, unsubstantiated, or based on inappropriate metrics. This paper is essential for developing critical evaluation skills — the ability to assess whether a company's claims about its algorithm's fairness are credible.

Kim, Pauline T. "Data-Driven Discrimination at Work." William & Mary Law Review 58, no. 3 (2017): 857-936. A legal scholar's analysis of how data-driven hiring tools create new forms of employment discrimination that existing law is not well equipped to address. Kim examines how proxy variables, feedback loops, and opacity in hiring algorithms interact with anti-discrimination law. Relevant for students considering the regulatory dimensions of algorithmic bias.


Criminal Justice and Prediction

Harcourt, Bernard E. Against Prediction: Profiling, Policing, and Punishing in an Actuarial Age. Chicago: University of Chicago Press, 2007. Written before the COMPAS controversy but prescient in its analysis, Harcourt argues that actuarial methods in criminal justice — predicting risk based on group characteristics — are fundamentally at odds with individual justice. His analysis provides philosophical depth for evaluating whether algorithmic risk assessment should be used in criminal justice at all, regardless of its accuracy.

Dressel, Julia, and Hany Farid. "The Accuracy, Fairness, and Limits of Predicting Recidivism." Science Advances 4, no. 1 (2018): eaao5580. Dressel and Farid demonstrate that untrained laypeople, given only a brief description of a defendant, predict recidivism as accurately as COMPAS, and that a simple classifier using just age and prior convictions performs equally well, raising questions about whether the complexity and opacity of proprietary risk tools are justified. The paper suggests that simpler, more transparent models can match the accuracy of proprietary instruments while being far easier to scrutinize.


Feedback Loops and Structural Dynamics

Ensign, Danielle, Sorelle A. Friedler, Scott Neville, Carlos Scheidegger, and Suresh Venkatasubramanian. "Runaway Feedback Loops in Predictive Policing." Proceedings of the 1st Conference on Fairness, Accountability and Transparency, 160-171. PMLR, 2018. A formal analysis of how predictive policing algorithms create feedback loops: more policing in an area leads to more detected crime, which leads the algorithm to predict more crime there, which sends still more police. Ensign et al. model the dynamics with urn models and characterize the conditions under which the loop converges toward the true crime distribution and those under which it runs away. Essential for understanding the formal structure behind the intuitive concept presented in Section 14.8.
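
A toy simulation conveys the runaway dynamic the paper formalizes. The sketch below is in the spirit of the urn models Ensign et al. analyze but is not their exact formulation: two districts have identical true crime rates, each day's patrol is dispatched in proportion to the incidents recorded so far, and incidents are only recorded where the patrol goes.

```python
import numpy as np

rng = np.random.default_rng(1)

true_rate = [0.3, 0.3]            # identical true crime rates
recorded = np.array([1.0, 1.0])   # one recorded incident each to start

for day in range(5_000):
    # Feedback step: today's patrol goes where past records point.
    weights = recorded / recorded.sum()
    district = rng.choice(2, p=weights)

    # Crime is only *recorded* where the patrol was sent.
    if rng.random() < true_rate[district]:
        recorded[district] += 1

print("recorded incidents:", recorded)
print("share of records in district 0:", recorded[0] / recorded.sum())
```

In this toy version the long-run split of records settles at an essentially arbitrary point rather than the even split the identical true rates would justify, because past records, not underlying crime, determine where new observations can occur.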


These readings deepen the analysis begun in Chapter 14 and prepare you for Chapter 15's exploration of fairness definitions. The impossibility theorem, previewed in the COMPAS discussion, will receive its full mathematical and ethical treatment in the next chapter.