Chapter 7: Exercises

Difficulty Scale:

- ⭐ Foundational — recall and basic application
- ⭐⭐ Intermediate — analysis and synthesis
- ⭐⭐⭐ Advanced — evaluation and design
- ⭐⭐⭐⭐ Capstone — open-ended research and argument

† = Recommended for class discussion or assignment submission


Part A: Foundational Knowledge (⭐)

Exercise 1 ⭐ Define the three levels of algorithmic bias — technical, emergent, and sociotechnical — in your own words. For each level, provide an original example drawn from a domain other than the examples used in Chapter 7.

Exercise 2 ⭐ Match each term on the left to its correct definition on the right:

| Term | Definition |
| --- | --- |
| Disparate impact | A variable that carries information about a protected characteristic |
| Proxy variable | Intentional differential treatment based on a protected characteristic |
| Feedback loop | When a neutral practice disproportionately harms a protected class |
| Disparate treatment | When outputs influence future training data, reinforcing bias |
| Protected class | A group protected by anti-discrimination law |

Exercise 3 ⭐ List the eight stages of the ML pipeline described in Section 7.3. For each stage, identify the single most significant bias risk and name one detection strategy.

Exercise 4 ⭐ In two to three sentences each, describe the bias pattern documented in each of the following domains: (a) employment, (b) criminal justice, (c) healthcare, and (d) facial recognition.

Exercise 5 ⭐ What did NIST's 2019 FRVT evaluation find about facial recognition accuracy across demographic groups? Summarize the key findings in no more than 150 words without using any technical jargon.


Part B: Conceptual Analysis (⭐⭐)

Exercise 6 ⭐⭐ † The Mirror Problem in Practice

A major bank wants to build an AI system to predict which loan applicants are most likely to default. They propose training the model on 20 years of their own historical loan outcomes. Explain, using the concepts from Section 7.2, at least four specific ways in which this approach could produce a biased model. For each mechanism you identify, describe what data or process generated the bias and who would be harmed.

Exercise 7 ⭐⭐ Proxy Variables

A tech company is hiring software engineers. An analyst proposes including the following features in a résumé screening model. For each feature, evaluate whether it might function as a proxy variable for a protected characteristic, explain the mechanism, and recommend whether it should be included, excluded, or modified:

a. Name of undergraduate university attended
b. Years of professional experience
c. Participation in hackathons
d. Programming languages listed
e. Employment gaps longer than six months
f. GitHub activity metrics (commits, followers)
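Before reasoning about each feature qualitatively, it can help to see how a proxy screen works numerically. The sketch below uses entirely invented data and a hypothetical binary protected attribute: if a candidate feature correlates strongly with protected-class membership, it can smuggle that information into the model even though the protected attribute itself is never an input.

```python
# Toy proxy screen (all data invented for illustration).

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# 1 = member of a protected class (hypothetical labels for 8 applicants)
protected = [1, 1, 1, 1, 0, 0, 0, 0]
# 1 = employment gap longer than six months (feature e above)
employment_gap = [1, 1, 1, 0, 1, 0, 0, 0]

r = pearson(protected, employment_gap)
print(f"correlation(protected, employment_gap) = {r:.2f}")  # 0.50
```

A single-feature correlation understates proxy risk; in practice you would also test whether the feature set jointly predicts the protected attribute.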

Exercise 8 ⭐⭐ Disparate Impact vs. Disparate Treatment

Review the following three scenarios and determine whether each describes disparate impact, disparate treatment, or neither. Justify your answer with reference to the legal definitions.

a. A retail company's scheduling algorithm systematically assigns fewer weekend shifts to employees who have listed childcare obligations in their HR profile.
b. A landlord personally rejects every application from a specific ethnic group, without using any algorithmic tool.
c. An insurance company uses a telematics algorithm that produces higher premiums for drivers in zip codes that are predominantly Black, without any intent to discriminate based on race.
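For scenarios like these, US enforcement practice often begins with the EEOC's four-fifths rule: a protected group's selection rate below 80% of the most favored group's rate is treated as preliminary evidence of adverse impact. A minimal sketch with invented applicant counts:

```python
# Four-fifths rule screen for adverse impact (counts are hypothetical).

def selection_rate(selected, applicants):
    return selected / applicants

rate_favored = selection_rate(30, 100)    # favored group: 30% selected
rate_protected = selection_rate(18, 100)  # protected group: 18% selected

ratio = rate_protected / rate_favored     # adverse impact ratio
print(f"impact ratio = {ratio:.2f}")      # 0.60
print("four-fifths rule flagged:", ratio < 0.8)  # True
```

Failing this screen does not by itself establish illegal disparate impact; it shifts the burden to the employer to show the practice is job-related and consistent with business necessity.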

Exercise 9 ⭐⭐ The Feedback Loop

Draw a diagram (or describe in structured text) of the credit scoring feedback loop described in Section 7.5. Your description should include: the initial state, the algorithmic decision, the real-world consequence, the effect on future data, and the mechanism by which bias intensifies over time. Then identify at least two intervention points where the loop could be interrupted.
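The loop's dynamics can be made concrete with a deliberately simple simulation. The threshold and point adjustments below are invented, not drawn from any real scoring model: approval lets an applicant build repayment history, which raises the score, while denial leaves a thin file and the score drifts down.

```python
# Toy credit scoring feedback loop (all parameters invented).

THRESHOLD = 600  # hypothetical approval cutoff

def step(score):
    if score >= THRESHOLD:
        return score + 20  # approved: repayment history improves score
    return score - 10      # denied: no history built, score decays

low, high = 590, 610  # two applicants starting 20 points apart
for _ in range(5):
    low, high = step(low), step(high)

print(f"after 5 rounds: low={low}, high={high}, gap={high - low}")
# after 5 rounds: low=540, high=710, gap=170
```

Every input and output of `step` is a candidate intervention point, for example capping the denial penalty or offering alternative ways to build repayment history.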

Exercise 10 ⭐⭐ † Intersectionality in Practice

A company is testing an AI candidate screening tool for gender bias. They evaluate the tool on two groups — all male applicants and all female applicants — and find that accuracy is similar for both groups. They conclude the tool is fair. Critique this conclusion using the concept of intersectionality. What additional analysis should they conduct? What categories should they examine? What would Buolamwini's Gender Shades research suggest about where the largest disparities are likely to be found?
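The flaw in the company's analysis can be shown with hypothetical counts: marginal accuracy can be identical for men and women while one intersectional subgroup performs far worse, which is the pattern Gender Shades documented for darker-skinned women.

```python
# Invented per-subgroup evaluation counts, for illustration only.

TOTAL_PER_SUBGROUP = 100  # evaluation samples per intersectional subgroup

correct = {  # number of correct predictions per subgroup
    ("male", "lighter-skin"): 95,
    ("male", "darker-skin"): 85,
    ("female", "lighter-skin"): 98,
    ("female", "darker-skin"): 82,
}

def accuracy(subgroups):
    """Pooled accuracy across a list of intersectional subgroups."""
    hits = sum(correct[g] for g in subgroups)
    return hits / (TOTAL_PER_SUBGROUP * len(subgroups))

males = [("male", "lighter-skin"), ("male", "darker-skin")]
females = [("female", "lighter-skin"), ("female", "darker-skin")]

print(f"male accuracy:   {accuracy(males):.2f}")    # 0.90
print(f"female accuracy: {accuracy(females):.2f}")  # 0.90

# The marginal comparison looks fair; the intersectional one does not.
for subgroup, hits in correct.items():
    print(subgroup, f"{hits / TOTAL_PER_SUBGROUP:.2f}")
```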


Part C: Applied Analysis (⭐⭐⭐)

Exercise 11 ⭐⭐⭐ † The COMPAS Fairness Dilemma

ProPublica found that COMPAS produced false positive rates approximately twice as high for Black defendants as for white defendants. Northpointe argued the tool was calibrated — that a score of 7, for example, corresponded to the same probability of recidivism regardless of race. Both claims are mathematically correct.

a. Explain the mathematical reason these two claims can both be true simultaneously.
b. Which criterion — equal false positive rates or calibration — do you believe should take precedence in a pretrial risk assessment context? Construct the strongest argument for your position.
c. Who should make this decision — the AI developer, the court system, the legislature, or some other body? Justify your answer.
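For part (a), a toy calculation with invented counts shows the mechanism. Calibration is approximated here by positive predictive value (PPV): among people the tool flags as high risk, the same fraction actually reoffends in each group. When base rates differ, equal PPV forces unequal false positive rates.

```python
# Invented counts showing how both claims in the COMPAS dispute can hold.

def rates(non_recidivists, true_pos, false_pos):
    ppv = true_pos / (true_pos + false_pos)  # P(reoffends | flagged)
    fpr = false_pos / non_recidivists        # P(flagged | does not reoffend)
    return ppv, fpr

# Group 1: 50% base rate (500 recidivists, 500 non-recidivists)
ppv_1, fpr_1 = rates(non_recidivists=500, true_pos=300, false_pos=200)
# Group 2: 30% base rate (300 recidivists, 700 non-recidivists)
ppv_2, fpr_2 = rates(non_recidivists=700, true_pos=180, false_pos=120)

print(f"group 1: PPV={ppv_1:.2f}, FPR={fpr_1:.2f}")  # PPV=0.60, FPR=0.40
print(f"group 2: PPV={ppv_2:.2f}, FPR={fpr_2:.2f}")  # PPV=0.60, FPR=0.17
```

With these hypothetical numbers the false positive rate ratio is about 2.3 to 1, even though a flag is equally predictive in both groups.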

Exercise 12 ⭐⭐⭐ Pre-Deployment Audit Design

You are the Chief Ethics Officer of a large logistics company considering purchasing an AI-based hiring tool from a vendor. The vendor claims the tool is "rigorously tested for bias." Design a pre-deployment audit process you would require before authorizing purchase and deployment. Specify:

- What information and documentation you will require from the vendor
- What independent testing you will conduct
- Which demographic subgroups you will require disaggregated performance data for
- What minimum performance thresholds you will set
- What ongoing monitoring you will require post-deployment

Exercise 13 ⭐⭐⭐ The Disclosure Decision

Amazon discovered in 2017 that its hiring algorithm systematically penalized female candidates. It shut the tool down without public disclosure. Construct a structured argument evaluating Amazon's disclosure decision, addressing:

a. The legal obligations Amazon had under applicable law at the time
b. The ethical obligations Amazon had under a stakeholder ethics framework
c. The business risk implications of disclosure vs. non-disclosure
d. What a different company in the same position should do today, given the current regulatory environment

Exercise 14 ⭐⭐⭐ Global Variation in Bias Law

A multinational corporation operates in the United States, the United Kingdom, Germany, and Japan. It is developing an AI performance management system that will evaluate employee performance and recommend promotion decisions across all four jurisdictions. Research and compare the anti-discrimination legal frameworks in each jurisdiction as they apply to AI systems. Identify at least three specific tensions the company will need to manage across jurisdictions, and propose a compliance approach.

Exercise 15 ⭐⭐⭐ † Affected Community Engagement

A municipal government is considering deploying a predictive policing algorithm to optimize patrol resource allocation. You have been hired to design a community engagement process that will meet the standard of genuine engagement — not performative consultation — described in Section 7.9. Design the engagement process, specifying:

- Which community groups you would seek to involve and how you would identify them
- How you would structure the engagement to ensure it has substantive influence on the deployment decision, not just advisory status
- How you would handle disagreements within the community about the system
- What you would do if the engagement process produced a community preference for non-deployment


Part D: Research and Argumentation (⭐⭐⭐⭐)

Exercise 16 ⭐⭐⭐⭐ Original Case Research

Identify an AI system bias case from the past five years that is not discussed in Chapter 7. Using the analytical frameworks from this chapter, write a 1,000–1,500 word case analysis covering:

- Description of the system and its purpose
- The bias that was discovered (or is alleged)
- Classification of the bias type using the technical/emergent/sociotechnical taxonomy
- The mechanism(s) through which bias entered the system (training data, proxy variables, feedback loop, etc.)
- The harm caused and who bore it
- The organizational and regulatory response
- Lessons for practitioners

Exercise 17 ⭐⭐⭐⭐ Policy Proposal: AI Bias Regulation

Draft a 1,000–1,200 word policy proposal for a federal regulation governing AI systems in employment contexts. Your proposal should specify:

- Which systems are covered and how "employment AI" is defined for regulatory purposes
- Required documentation and testing standards (including disaggregated evaluation)
- Mandatory disclosure requirements to applicants and regulators
- Enforcement mechanism and penalty structure
- Safe harbor provisions, if any, for compliant vendors and employers
- How the regulation addresses the problem of bias discovered post-deployment

Your proposal should engage explicitly with the tension between regulatory burden and innovation incentives, and should address the interests of at least three distinct stakeholder groups: employers, job applicants, and AI vendors.

Exercise 18 ⭐⭐⭐⭐ † Debate: Should Facial Recognition Be Banned?

Prepare a structured argument on one side of the following proposition: "Municipal governments should ban all law enforcement use of facial recognition technology until minimum accuracy standards across demographic groups can be demonstrated and independently verified."

Affirmative team should argue that the ban is necessary and proportionate, drawing on the NIST findings, the wrongful arrest cases, and the civil liberties concerns discussed in Case Study 7.2.

Negative team should argue that the ban is disproportionate and that regulation rather than prohibition is the appropriate response, drawing on the potential benefits of facial recognition in solving serious crimes and the availability of accuracy-improving technical measures.

Both teams should engage substantively with the strongest version of the opposing argument.

Exercise 19 ⭐⭐⭐⭐ Longitudinal Impact Analysis

The predictive policing feedback loop described in Section 7.5 has been operating in several US cities for more than a decade. Design a research study that would measure the long-term demographic impact of predictive policing on crime statistics, arrest rates, and community trust in policing in a specific city. Your design should address:

- Research questions and hypotheses
- Data sources and data access challenges
- Methods for separating the effect of the algorithm from other factors affecting crime and policing patterns
- Ethical considerations in conducting the research
- How findings should be communicated to the affected community and to policymakers

Exercise 20 ⭐⭐⭐⭐ Organizational Culture Assessment

Section 7.9 describes the organizational conditions that enable early bias detection. Design a self-assessment instrument for a technology company to evaluate the strength of its "culture of bias awareness." Your instrument should include:

- At least 15 specific assessment questions across the dimensions of team diversity, process institutionalization, testing standards, post-deployment monitoring, and community engagement
- A scoring framework with behavioral anchors at each level
- Guidance on interpreting results and prioritizing improvement
- An explanation of how you would validate the instrument's predictive validity — that is, how you would test whether organizations that score higher on your instrument actually deploy less biased AI systems


Part E: Reflection and Personal Application (⭐ to ⭐⭐⭐)

Exercise 21 ⭐ Describe a situation from your own experience — professional, educational, or personal — where you believe an algorithmic or rule-based system may have produced unfair or biased outcomes. What type of bias do you think was operating? What would a fairer design have looked like?

Exercise 22 ⭐⭐ Consider a major AI system you interact with regularly — a search engine, a social media algorithm, a recommendation system, or a hiring platform. Based on what you know about how it works and what you have learned in this chapter, identify at least three specific ways in which it might exhibit bias. What data would you need to verify or refute your hypothesis?

Exercise 23 ⭐⭐ † The Bystander Scenario

You are a data scientist at a financial services company. During development of a new loan underwriting model, you discover that a key feature being used by the model — neighborhood name — is highly correlated with race due to residential segregation patterns. Your manager tells you that the feature improves overall model accuracy and that legal review has determined the model does not explicitly use race as a feature. What do you do? Write a structured response addressing: what further analysis you would conduct, whom you would escalate to, what arguments you would make, and what you would do if your concerns were dismissed.

Exercise 24 ⭐⭐⭐ Ethics Washing Identification

Review the following three company announcements related to AI bias and evaluate each on a spectrum from "ethics washing" to "genuine ethics." Support your evaluations with specific reference to the criteria discussed in this chapter — particularly disclosure, remediation, prevention, and community engagement.

a. A company's press release announcing that it has "formed an AI Ethics Board composed of senior executives and academic advisers who will review our AI systems for bias."
b. A company's announcement that it has "paused deployment of our hiring AI pending an independent bias audit, will contact all candidates screened during the past 18 months, and will publish the audit results regardless of findings."
c. A company's response to a press investigation into its credit algorithm: "We take these concerns seriously. Our model does not use race as an input variable, and we comply with all applicable regulations."

Exercise 25 ⭐⭐ † Organizational Priority Setting

You have just been appointed Chief AI Ethics Officer at a mid-sized insurance company that uses AI for claims processing, fraud detection, underwriting, and customer service routing. You have a team of three people and a budget of $500,000 for the first year. Using the frameworks from this chapter, prioritize your first-year agenda. Which systems will you evaluate first and why? What assessment methods will you use? What institutional practices will you try to establish? How will you measure progress? Be specific and justify your prioritization choices.