Chapter 14: Exercises

Difficulty scale: One star (introductory) through four stars (advanced/integrative).

Exercises marked with † are recommended for in-class discussion or small-group work.


Part A: Comprehension and Vocabulary (One Star)

Exercise 1 ★ Define explainability and interpretability in your own words. Give one example of an AI application where interpretability (transparent model logic) is feasible and preferable, and one where you might accept a less interpretable model paired with post-hoc explanation. Justify each choice in two or three sentences.

Exercise 2 ★ Match each term on the left with its definition on the right.

Term Definition
LIME A. The minimum changes to an input that would produce a different model output
SHAP B. A post-hoc method that fits a local linear model to perturbations of the input
Counterfactual explanation C. A feature-attribution method grounded in cooperative game theory
Faithfulness D. The degree to which an explanation accurately represents the model's actual behavior
Proxy variable E. A non-protected characteristic that correlates with a protected one
Global explanation F. An explanation describing how the model behaves across its full input distribution
Local explanation G. An explanation describing why the model made one specific prediction
Algorithmic recourse H. The ability to understand what changes would lead to a different outcome

Exercise 3 ★ List three types of people who might need an explanation of an AI credit decision, and describe what type of explanation would be most useful for each. Use the framework from Section 14.1 (global vs. local, feature importance vs. counterfactual) to structure your answer.

Exercise 4 ★ True or false: If a company deploys SHAP-based explanations for its credit model and provides SHAP-derived adverse action notices to denied applicants, it has guaranteed that its model is fair and non-discriminatory. Explain your answer in three or four sentences.

Exercise 5 ★ In what sense is a LIME explanation a "model of a model"? What does this mean for how we should interpret and trust LIME-generated explanations?
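
The phrase "model of a model" can be made concrete in a few lines. The sketch below is illustrative only (the function names are ours, and real LIME adds proximity weighting and feature selection): it fits a linear surrogate to a black box's behavior near one instance, and it is that surrogate, not the black box itself, that the resulting explanation describes.

```python
import numpy as np

# A "black box" model: we can only query it, not inspect it.
def black_box(X):
    return (0.7 * X[:, 0] + 0.3 * X[:, 1] ** 2 > 1.0).astype(float)

def local_surrogate(f, x, n=500, scale=0.1, seed=0):
    """LIME's core move, stripped down: sample perturbations around one
    instance, query the black box on them, and fit a linear model to the
    responses. The linear coefficients are the 'explanation' -- they
    describe the surrogate, i.e. a model of the model."""
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(0.0, scale, size=(n, x.size))
    y = f(Z)
    A = np.hstack([Z, np.ones((n, 1))])           # add an intercept column
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coefs[:-1]                             # per-feature local weights

x0 = np.array([1.2, 0.9])
weights = local_surrogate(black_box, x0)
# Trusting 'weights' means trusting that the linear surrogate tracks the
# black box well in this neighborhood -- which is exactly what this
# exercise asks you to question.
```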


Part B: Conceptual Analysis (Two Stars)

Exercise 6 ★★ † A logistics company has deployed a machine learning model to determine which drivers receive access to premium delivery routes (and thus higher earnings). When a driver is denied access, the company provides a LIME-based explanation identifying the top three factors. A driver named Priya receives the following explanation: "Top factors affecting your application: (1) your average delivery rating is 4.2, which is below our threshold; (2) your route completion rate is 94%, which falls in the medium-risk range; (3) your vehicle age is 8 years, which exceeds our preferred maximum."

a) Is this a global or local explanation? What is its purpose?
b) What makes this explanation relatively useful compared to a generic rejection notice?
c) What makes this explanation potentially inadequate — what information is missing that Priya might need?
d) Would a counterfactual explanation serve Priya better? Draft one that you think would be genuinely useful.

Exercise 7 ★★ The FICO credit score is described in Section 14.2 as an interpretable model with both advantages and disadvantages. List three specific governance advantages of the FICO score's interpretability. Then list two ways in which interpretability did not prevent problematic outcomes for FICO scoring in practice. What does this illustrate about the relationship between interpretability and fairness?

Exercise 8 ★★ Section 14.3 describes LIME's instability problem: different runs of LIME on the same instance can produce different explanations.

a) Why does this instability arise? (Hint: think about the random perturbation step.)
b) An employment platform uses LIME to generate adverse action explanations for job candidates whose applications were rejected by an AI screening system. What are the specific compliance risks posed by LIME's instability in this context?
c) Propose two ways the platform could manage this instability while still using LIME-based explanations.
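
The instability is easy to reproduce with a stripped-down, LIME-like estimator (a hypothetical sketch, not the lime package itself): only the random seed changes between runs, yet the attribution computed for the same feature of the same instance shifts.

```python
import random

def black_box(x):
    # Opaque model: query-only, as in the exercise.
    return 1.0 if 0.8 * x[0] + 0.2 * x[1] > 0.5 else 0.0

def lime_like_weight(seed, x=(0.55, 0.4), n=50, scale=0.3):
    """Crude LIME-style attribution for feature 0: regress the black box's
    output on random perturbations of the input. The small sampling budget
    n mirrors what makes real LIME explanations jittery."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n):
        d0, d1 = rng.gauss(0, scale), rng.gauss(0, scale)
        y = black_box((x[0] + d0, x[1] + d1))
        num += d0 * y
        den += d0 * d0
    return num / den

# Same instance, same method, same model -- only the seed differs:
weights = [lime_like_weight(seed) for seed in range(5)]
# The five attributions disagree because each run sees a different random
# perturbation sample; with many features, that variance can reorder the
# "top factors" list entirely.
```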

Exercise 9 ★★ SHAP values have a mathematical property called "additivity" — the SHAP values for all features sum to the difference between the prediction and the baseline.

a) In plain English, what does this property mean for an affected individual reading their SHAP-based explanation?
b) Why is this property useful for compliance officers reviewing a model for adverse action notice accuracy?
c) LIME does not have this property. Does this necessarily make LIME explanations wrong? Explain.
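
For a model small enough to enumerate, additivity can be verified exactly. The sketch below (our own toy model and helper, not the shap library) computes Shapley values by averaging each feature's marginal contribution over all orderings; the efficiency axiom guarantees the values sum to f(x) - f(baseline).

```python
from itertools import permutations
from math import factorial

def exact_shapley(f, x, baseline):
    """Exact Shapley values for a tiny model: average each feature's
    marginal contribution over every order in which features are switched
    from their baseline value to their actual value."""
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        z = list(baseline)
        for i in order:
            before = f(z)
            z[i] = x[i]
            phi[i] += f(z) - before
    return [p / factorial(n) for p in phi]

# A toy scoring model with an interaction term (hypothetical).
def model(z):
    return 0.5 * z[0] + 0.2 * z[1] + 0.3 * z[0] * z[2]

x, base = [1.0, 2.0, 1.0], [0.0, 0.0, 0.0]
phi = exact_shapley(model, x, base)
# Additivity: the attributions sum exactly to f(x) - f(baseline).
assert abs(sum(phi) - (model(x) - model(base))) < 1e-9
```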

Exercise 10 ★★ † A healthcare AI system recommends which patients should be flagged for intensive care management. A data scientist runs SHAP on the model and produces a beeswarm plot. The top five features by mean absolute SHAP value are: (1) number of prior hospitalizations, (2) age, (3) number of chronic conditions, (4) insurance type, (5) ZIP code.

a) Which features in this list might raise ethical or legal concerns if included in a healthcare resource allocation model? Why?
b) What additional analysis would you want to conduct on each concerning feature before concluding there is a problem?
c) The model's clinical team argues that prior hospitalizations, age, and chronic conditions are legitimate clinical predictors. Do you agree? Are there circumstances where clinically accurate predictions could still be ethically problematic?

Exercise 11 ★★ Section 14.6 discusses the "stethoscope correlation" case — a chest X-ray AI that partially relied on the presence of a stethoscope in images as a predictive signal. Saliency maps correctly identified the stethoscope region as influential.

a) Is the saliency map faithful in this case — does it accurately reflect what drove the prediction?
b) Is the saliency map useful in this case? Explain the distinction between faithful and useful.
c) Who was in a position to notice and act on this finding? What governance process would have caught this problem before deployment?

Exercise 12 ★★ Jain and Wallace (2019) showed that attention weights in neural language models are not reliable explanations. A legal technology company uses a language model to assist in contract review, and presents attention heatmaps to lawyers as evidence of which clauses the AI "focused on" when identifying risk.

a) Based on Section 14.7, what is the fundamental problem with this presentation?
b) What would a more reliable explanation look like for this application?
c) If a lawyer makes a decision based on the AI's recommendation and the attention-based explanation, and the decision turns out to be wrong, who bears responsibility? Does the misleading explanation affect your answer?


Part C: Application and Analysis (Three Stars)

Exercise 13 ★★★ † Reading a SHAP Waterfall Plot.

The following SHAP waterfall output describes why a rental housing application was denied by an AI screening system. Baseline denial probability: 0.31. Final denial probability: 0.78.

Feature Value SHAP Value
prior_evictions 0 -0.04
credit_score 612 +0.11
income_to_rent_ratio 2.8 +0.09
rental_history_years 1.5 +0.06
employment_stability_score 62 +0.03
num_prior_applications 7 +0.02
zip_code_risk 74 +0.08
age_of_oldest_account 2.1 yrs +0.05
cosigner_present No +0.07

a) Verify that the SHAP values are additive: do they sum to the difference between final prediction and baseline? (Show your arithmetic.)
b) Identify the three features most responsible for this denial (by SHAP value magnitude). Are any of these potentially problematic from a fair housing perspective? Explain.
c) What might "zip_code_risk" be capturing? Why does its presence in this model raise fair housing concerns under the Fair Housing Act?
d) Write a plain-English adverse action notice based on this SHAP output that would be genuinely useful to the applicant — not just technically compliant.
e) Write a counterfactual explanation for this applicant that is specific and actionable. Be realistic about what the applicant can actually change.
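
The additivity check in part (a) becomes mechanical once the table is transcribed; a few lines of Python suffice (a checking aid only — show the arithmetic by hand as well):

```python
# Transcribing the waterfall table to check additivity mechanically.
baseline = 0.31
shap_values = {
    "prior_evictions":            -0.04,
    "credit_score":               +0.11,
    "income_to_rent_ratio":       +0.09,
    "rental_history_years":       +0.06,
    "employment_stability_score": +0.03,
    "num_prior_applications":     +0.02,
    "zip_code_risk":              +0.08,
    "age_of_oldest_account":      +0.05,
    "cosigner_present":           +0.07,
}
total = sum(shap_values.values())
implied_prediction = baseline + total  # additivity: baseline + sum of SHAP values
top3 = sorted(shap_values, key=lambda k: abs(shap_values[k]), reverse=True)[:3]
```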

Exercise 14 ★★★ The Rashomon Set Decision.

A hospital is building an AI model to predict which patients with heart disease are at highest risk of readmission within 30 days, with the goal of providing targeted follow-up care to high-risk patients.

Three candidate models are evaluated:

Model AUC-ROC Accuracy Interpretability
Logistic Regression 0.771 0.72 High (direct coefficient interpretation)
Explainable Boosting Machine (EBM) 0.789 0.74 High (feature shapes visualized)
XGBoost + SHAP explanations 0.801 0.76 Moderate (SHAP post-hoc)
Deep Neural Network + SHAP 0.812 0.77 Low (SHAP approximate, computationally costly)

a) Which model would you recommend for this deployment, and why? Your answer should consider clinical stakes, the faithfulness problem, regulatory context, and the accuracy-interpretability tradeoff explicitly.
b) The clinical informatics team argues that the DNN's higher accuracy means more patients get the follow-up they need — that choosing a less accurate model to maintain interpretability will result in preventable readmissions. How do you respond to this argument?
c) If the hospital chooses XGBoost + SHAP, what additional governance measures should be in place to compensate for the lower inherent interpretability relative to the logistic regression or EBM?

Exercise 15 ★★★ Designing an XAI Governance Framework.

You are the Chief AI Officer of a consumer lending company preparing to deploy a gradient-boosted tree model for personal loan approvals. The model will process approximately 50,000 applications per month. Sketch a complete XAI governance framework that addresses:

a) The explanation types and formats that will be generated (global and local, SHAP and/or LIME, counterfactual).
b) The audience for each type of explanation and how it will be delivered.
c) How adverse action notices will be generated, and what validation will confirm their faithfulness.
d) How the model will be monitored for proxy variable usage on an ongoing basis.
e) What happens if a SHAP audit identifies a potential proxy variable — the escalation and remediation process.
f) How the adversarial explanation risk (Slack et al.) will be addressed.

Your framework should be concrete and implementation-oriented, not just aspirational.

Exercise 16 ★★★ † Counterfactual Design Challenge.

An algorithmic hiring tool has rejected a job application from Marcus, a 34-year-old candidate with 6 years of relevant experience, a bachelor's degree from a public university, and a gap in employment history from 18 months ago (for caregiving reasons). The system provides him with the following SHAP-based adverse action notice: "Your application was not selected due to: (1) employment gap in history; (2) undergraduate institution tier score below threshold; (3) years of relevant experience below ideal range."

a) Identify at least two ways in which this explanation may fail the "actionability" standard.
b) Design a counterfactual explanation that is actionable, free of protected-characteristic references, and genuinely useful for Marcus's future applications.
c) The "undergraduate institution tier score" is based on a ranking system that correlates with applicant wealth and family socioeconomic status. What legal and ethical issues does this feature raise under Title VII of the Civil Rights Act?
d) Marcus discovers that his friend Aisha, who has a 3-year employment gap and similar qualifications from a higher-tier institution, was advanced to the next round. He believes the system is discriminating. What processes should the company have in place for him to challenge this outcome?
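
A counterfactual of the kind part (b) asks for can be generated by brute-force search over features the applicant can actually change. The screening model, feature names, weights, and thresholds below are all invented for illustration; the point is the restriction to mutable features.

```python
# Hypothetical screening model and a brute-force counterfactual search
# restricted to features the applicant can actually change.
def screen(app):
    score = (0.4 * app["experience_years"]
             - 0.3 * app["employment_gap_months"] / 12
             + 0.5 * app["certifications"])
    return score >= 2.5          # True = advance to the next round

def counterfactuals(app, mutable):
    """For each mutable feature, find the smallest listed change that
    flips a rejection into an advance."""
    found = []
    for feat, deltas in mutable.items():
        for d in deltas:         # deltas listed smallest-first
            trial = dict(app)
            trial[feat] = app[feat] + d
            if not screen(app) and screen(trial):
                found.append((feat, d))
                break
    return found

marcus = {"experience_years": 6, "employment_gap_months": 18, "certifications": 0}
mutable = {
    "experience_years": [1, 2, 3],   # changes only with the passage of time
    "certifications":   [1, 2],      # actionable now: complete a certification
}
options = counterfactuals(marcus, mutable)
# Note what is deliberately absent: shrinking a past employment gap,
# which no counterfactual can make actionable.
```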

Exercise 17 ★★★ The Adversarial Explanation Scenario.

Read Case Study 14.2 carefully, then answer the following:

a) The Slack et al. attack requires the adversarial classifier to distinguish between explanation-tool queries and real applicant data. In technical terms, how does the classifier learn to make this distinction?
b) A bank's compliance team reviews monthly SHAP reports and sees that the model's explanations consistently show only non-protected, economically intuitive features (income, debt, employment). They conclude the model is not discriminating. Why is this conclusion potentially unjustified?
c) Design an audit protocol that would be resistant to the adversarial explanation attack. What elements of your protocol do not rely on explanation outputs?
d) If the adversarial explanation attack were discovered after deployment, what legal theories could a harmed applicant use to pursue a claim? What evidence would they need?
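
The mechanism behind part (a) can be sketched in a dozen lines (an illustration of the idea, not Slack et al.'s implementation): LIME- and SHAP-style perturbations tend to fall off the data manifold, so even a crude out-of-distribution test lets a deployer show one decision function to the explainer and another to real applicants.

```python
# Sketch of the scaffolding idea: route off-manifold probe queries to an
# innocuous model while real applicants hit the biased one.
def looks_synthetic(x, training_data, radius=0.15):
    """Crude out-of-distribution test: is x far (in max-norm) from every
    training point?"""
    return all(max(abs(a - b) for a, b in zip(x, t)) > radius
               for t in training_data)

def biased_model(x):       # what real applicants experience
    return x[0] < 0.5      # decision driven by a proxy feature

def innocuous_model(x):    # what the explainer is shown
    return x[1] < 0.5      # decision appears driven by a benign feature

def scaffolded_model(x, training_data):
    return (innocuous_model(x) if looks_synthetic(x, training_data)
            else biased_model(x))

train = [(0.2, 0.8), (0.3, 0.7), (0.8, 0.2)]
# An applicant near the data manifold gets the biased decision:
assert scaffolded_model((0.25, 0.75), train) == biased_model((0.25, 0.75))
# An off-manifold perturbation sees a different decision function entirely:
assert scaffolded_model((0.4, 0.95), train) != biased_model((0.4, 0.95))
```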


Part D: Synthesis and Critical Analysis (Four Stars)

Exercise 18 ★★★★ The Rudin Argument.

Cynthia Rudin's 2019 paper "Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead" is one of the most cited and most debated papers in AI ethics and ML. Read the argument presented in Section 14.2 carefully, then:

a) Summarize Rudin's argument in your own words (150-200 words).
b) Identify the strongest empirical claim in the argument and evaluate its strength based on what you know about the Rashomon set concept.
c) Identify the most significant objection to Rudin's argument — the best case for why black-box models with post-hoc explanation are sometimes appropriate in high-stakes settings.
d) Take a position: do you agree with Rudin that the default for high-stakes decisions should be interpretable models, with black-box models requiring explicit justification? Defend your position with specific reference to at least two domains (e.g., credit, healthcare, criminal justice, hiring).
e) Rudin wrote in 2019. How, if at all, does the development of large language models since 2019 affect her argument?

Exercise 19 ★★★★ † Ethics Washing Analysis.

The chapter introduces the concept of the "explanation placebo": using the availability of explanations to create the appearance of accountability without the substance. This is an instance of the broader pattern of ethics washing (introduced in Chapter 3).

a) Identify three specific organizational behaviors — not hypothetical, but realistic practices you could imagine a real company engaging in — that would constitute using XAI as ethics washing.
b) For each behavior, identify who would be harmed, how the harm would manifest, and what governance mechanism could detect or prevent it.
c) The concept of ethics washing implies that there is a meaningful distinction between genuine accountability and the appearance of accountability. In the context of XAI, how would you operationalize this distinction — what criteria would you use to distinguish genuine transparency from transparency theater?
d) Some scholars argue that ethics washing, while problematic, is better than nothing: that even imperfect explanations create some accountability, some possibility of challenge, some norm of transparency that can be strengthened over time. Evaluate this argument. Do you agree? What evidence from this chapter supports or undermines it?

Exercise 20 ★★★★ Global Variation in XAI Requirements.

The chapter references multiple regulatory frameworks: GDPR Article 22 (EU), the EU AI Act, the CFPB/OCC guidance on ML in credit (US), and SR 11-7 (US banking). A multinational financial services firm operates in the United States, the European Union, and three Southeast Asian jurisdictions with less developed AI-specific regulation.

a) Research the specific explainability requirements that apply to automated credit decisions in the EU under GDPR and the AI Act. How specific are these requirements? Do they mandate any particular explanation methods?
b) How do these requirements compare to US ECOA/Regulation B adverse action notice requirements and SR 11-7 model risk management guidance?
c) The firm's data science team wants to deploy a single global explanation methodology for all jurisdictions, for operational simplicity. What are the risks of this approach? What are the benefits?
d) Design a jurisdiction-differentiated XAI governance framework that satisfies the most stringent requirements in each jurisdiction while maintaining operational feasibility. Identify where genuine conflicts exist between jurisdictions' requirements.

Exercise 21 ★★★★ † XAI for Large Language Models.

Section 14.7 identifies large language models (LLMs) as a special challenge for explainability: they have hundreds of billions of parameters, exhibit emergent behaviors, and current XAI tools cannot provide reliable global explanations for their behavior.

Despite this, LLMs are being deployed in high-stakes contexts: legal research assistants, medical information systems, hiring tools, financial advisory chatbots.

a) For each of the four application domains listed above, identify: (i) what the primary harm of an unexplainable error would be; (ii) what type of explanation would most matter to the person affected by that error; and (iii) whether current XAI methods can provide that explanation.
b) Given the current limitations of LLM explainability, should these applications be permitted at all? Develop a principled framework for deciding which LLM applications are acceptable given current explainability limitations and which are not.
c) What would have to be true — what technical advances would have to occur, or what governance mechanisms would have to be in place — for LLM deployment in high-stakes decisions to be ethically defensible in your view?

Exercise 22 ★★★★ The Discrimination Legacy Problem.

Case Study 14.1 describes a scenario where a bank's neighborhood risk score encoded the legacy of redlining — historical discrimination that produced demographic disparities in neighborhood financial health that still persist. The remediation reduced but did not eliminate the demographic disparity in denial rates (from 1.8x to 1.2x).

a) Is a residual 1.2x denial rate disparity ethically acceptable if it reflects "genuine" differences in financial profile distributions across neighborhoods? How should we evaluate this question when those differences are themselves the product of historical discrimination?
b) Some scholars argue that genuine equity requires not just non-discrimination but active remediation of historical disadvantage — affirmative action in lending decisions. Under this view, a 1.2x disparity is not a success but a failure. Is this argument consistent with current anti-discrimination law? Is it ethically compelling regardless of its legal status?
c) The chapter suggests that interpretable models are more easily audited for this kind of discriminatory impact than black-box models. But even a fully interpretable logistic regression would encode this historical pattern if trained on historical data. At what point in the data pipeline must remediation occur, and who is responsible for each stage?
d) Design a research methodology that would allow a bank to answer the question: "Is the residual denial rate disparity in our remediated model the result of genuine creditworthiness differences, or the result of historical discrimination that our model continues to perpetuate?" What data would you need? What statistical methods would you apply? What findings would you consider actionable?

Exercise 23 ★★★★ Legislative Drafting Exercise.

You have been retained as a consultant to advise a legislative committee drafting a comprehensive AI transparency bill that applies to AI systems used in consequential individual decisions (credit, employment, housing, healthcare, benefits, and criminal justice). The committee wants the bill to require meaningful explanations while avoiding requirements that can be satisfied through ethics washing.

Drawing on everything in this chapter, draft the core provisions of an AI transparency and explanation section of this bill. Your draft should:

a) Define what constitutes a "meaningful" explanation for each audience type (affected individual, auditor, regulator).
b) Specify which explanation methods are acceptable, which are unacceptable, and what validation is required.
c) Include provisions that address the adversarial explanation vulnerability (Slack et al.).
d) Include provisions for ongoing monitoring, not just point-in-time explanation.
e) Specify enforcement mechanisms and who has standing to bring a claim.

Annotate each provision with a brief justification explaining why it is designed the way it is, what problem it addresses, and what tradeoffs it involves.

Exercise 24 ★★★★ † Cross-Chapter Integration: Accountability and Explanation.

This chapter concludes by noting that explanation does not equal accountability — that Chapter 18 will address the organizational structures needed for genuine accountability. Drawing on this chapter and Chapter 3 (Ethics Washing), Chapter 6 (Bias and Fairness Metrics), and any other relevant chapters from this textbook:

a) Map the relationship between explainability, fairness, and accountability. What does each contribute to responsible AI governance, and where does each fall short on its own?
b) Some scholars propose that the appropriate unit of accountability for AI harms is the organization that deployed the system, regardless of whether any individual within the organization intended the harm. Others argue that accountability requires identifying specific responsible individuals. Evaluate both positions in the context of the Meridian Bank case study. Who should be accountable for the discriminatory impact of the neighborhood risk score?
c) Design an organizational governance structure — board-level oversight, internal audit functions, documentation requirements, escalation protocols — that would create genuine accountability for AI-caused harm in a regulated financial institution. Be specific about roles, reporting lines, and consequences.

Exercise 25 ★★★★ Original Research Proposal.

You have observed that the existing literature on XAI focuses primarily on tabular data and computer vision. Very little research addresses explainability for time-series models (predicting patient deterioration from continuous vital sign monitoring, predicting loan default from transaction history streams) or for reinforcement learning systems (adaptive pricing systems, resource allocation algorithms).

Design a research proposal — suitable for submission to an academic conference on AI ethics — that addresses XAI for one of these underexplored domains. Your proposal should include:

a) A clear research question and its practical governance significance.
b) A review of why existing methods (LIME, SHAP, counterfactuals) are inadequate for this domain.
c) A proposed methodological approach.
d) A plan for validating whether your proposed explanation method is faithful.
e) An assessment of who the primary beneficiary of your research would be (data scientists, affected individuals, regulators) and how the research would reach them.