Chapter 26 Exercises: Fairness, Explainability, and Transparency


Section A: Fairness Definitions and Tradeoffs

Exercise 26.1 — Mapping Fairness Definitions

A health insurance company uses a machine learning model to predict which customers will file high-cost claims in the next year. The model is used to set premium prices. Two demographic groups have different base rates of high-cost claims: 12 percent for Group A and 7 percent for Group B, due to differences in average age and pre-existing conditions.

a) If the model satisfies demographic parity, what would the flagging rate need to be for each group? Explain why enforcing demographic parity in this context could lead to mispricing risk.
b) If the model satisfies equalized odds, describe what that means in terms of the model's true positive rate and false positive rate for each group. Would the overall flagging rates be different?
c) If the model satisfies calibration, what does a predicted probability of 0.15 mean for a member of Group A versus Group B? Could the flagging rates differ?
d) Which fairness definition would you recommend for this use case? Justify your choice, considering the stakeholders (customers, the insurer, regulators) and the nature of the decision (pricing, not denial of service).
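To ground parts (a) through (c), the quantities being compared can be computed directly. The sketch below is illustrative only: the function name and the deterministic toy population are assumptions, not code or data from the chapter. It builds two groups with the stated 12 percent and 7 percent base rates and scores a hypothetically perfect predictor, which satisfies equalized odds trivially yet fails demographic parity.

```python
import numpy as np

def group_rates(y_true, y_pred, group):
    """Per-group flagging rate, TPR, and FPR for binary predictions."""
    out = {}
    for g in np.unique(group):
        m = group == g
        yt, yp = y_true[m], y_pred[m]
        out[g] = {
            "flag_rate": yp.mean(),     # demographic parity compares these
            "tpr": yp[yt == 1].mean(),  # equalized odds compares TPR ...
            "fpr": yp[yt == 0].mean(),  # ... and FPR across groups
        }
    return out

# Deterministic toy population with the base rates from the exercise
group = np.array(["A"] * 1000 + ["B"] * 1000)
y_true = np.zeros(2000, dtype=int)
y_true[:120] = 1          # Group A: 12% file high-cost claims
y_true[1000:1070] = 1     # Group B: 7%
y_pred = y_true.copy()    # a hypothetical perfect predictor

rates = group_rates(y_true, y_pred, group)
print(rates)  # equal TPR and FPR, but flag rates of 0.12 vs. 0.07
```

A perfect predictor equalizes TPR and FPR but reproduces the base-rate gap in flagging rates, which is exactly the tension part (a) asks you to explain.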

Exercise 26.2 — The Impossibility Theorem in Practice

Consider a university admissions model that predicts whether an applicant will graduate within four years. Historical data shows that applicants from high-income backgrounds graduate at a rate of 82 percent, while applicants from low-income backgrounds graduate at a rate of 64 percent (due to financial pressures, work obligations, and other systemic factors).

a) Explain why, according to the impossibility theorem, this admissions model cannot simultaneously satisfy calibration and equalized odds.
b) If the university prioritizes calibration, what are the likely consequences for low-income applicants? Be specific about admission rates.
c) If the university prioritizes equalized odds (specifically, equal false positive rates), what are the likely consequences for overall graduation rates?
d) A university administrator says: "We should just use a model that satisfies all fairness definitions." Write a paragraph explaining why this is not possible and what the administrator should consider instead.
e) Propose a framework (3-5 steps) for how the university should decide which fairness definition to prioritize. Your framework should include stakeholder consultation, legal review, and impact analysis.
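For intuition on part (a), a small simulation makes the conflict concrete. Everything below is an illustrative assumption (the Beta-distributed scores, the function name, the threshold of 0.5), not material from the chapter: scores that are calibrated by construction, applied to pools with the exercise's 82 percent and 64 percent base rates, yield different false positive rates at a common threshold.

```python
import numpy as np

rng = np.random.default_rng(3)

def admit_fpr(base_rate, threshold=0.5, n=20_000):
    """Scores are calibrated by construction: each applicant's score IS
    their true graduation probability, drawn from a Beta distribution
    with the given mean. Returns the false positive rate: admitted
    applicants who do not graduate, as a share of all non-graduates."""
    a = 8 * base_rate
    p = rng.beta(a, 8 - a, n)   # risk scores with mean = base_rate
    y = rng.random(n) < p       # outcomes drawn from the scores themselves
    admit = p >= threshold
    return admit[~y].mean()

fpr_high = admit_fpr(0.82)  # high-income pool
fpr_low = admit_fpr(0.64)   # low-income pool
print(fpr_high, fpr_low)    # calibrated everywhere, yet unequal FPRs
```

Because the two pools have different score distributions, a single threshold on calibrated scores cannot equalize error rates unless prediction is perfect or base rates are equal, which is the content of the impossibility theorem.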

Exercise 26.3 — Fairness Definition Selection Matrix

For each of the following AI applications, identify which fairness definition (demographic parity, equalized odds, predictive parity, calibration, or individual fairness) is most appropriate. Justify your choice in 2-3 sentences, explaining why the alternatives are less suitable.

a) A predictive policing model that assigns patrol resources to neighborhoods
b) A resume screening model used for the first round of hiring at a Fortune 500 company
c) A medical triage model in an emergency room that prioritizes patients by urgency
d) A content recommendation algorithm on a social media platform
e) A credit scoring model used to set interest rates (not for approval/denial)
f) A recidivism prediction model used for parole decisions


Section B: Legal Frameworks

Exercise 26.4 — The 4/5ths Rule

A technology company uses an AI model to screen job applicants for software engineering roles. The following data represents the pass rates for the initial screening across four demographic groups:

| Group | Applicants | Passed Screening | Pass Rate |
|---------|------------|------------------|-----------|
| Group A | 1,200 | 480 | 40.0% |
| Group B | 800 | 256 | 32.0% |
| Group C | 600 | 180 | 30.0% |
| Group D | 400 | 100 | 25.0% |

a) Calculate the impact ratio for each group relative to the group with the highest pass rate. Show your work.
b) Which groups, if any, fail the 4/5ths rule? What is the impact ratio for each?
c) Does failing the 4/5ths rule mean the company is discriminating? Explain the legal distinction between adverse impact (a statistical finding) and discrimination (a legal conclusion).
d) If the company wants to defend its screening model against a disparate impact claim, what must it demonstrate? (Hint: think about "business necessity" and "less discriminatory alternatives.")
e) Write the Python code using the check_adverse_impact function from the chapter to verify your manual calculations.
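The arithmetic in parts (a) and (b) can be checked in a few lines. The chapter's check_adverse_impact function is not reproduced here; the standalone sketch below implements the same rule of thumb, using exact fractions so that a borderline ratio of exactly 0.80 is not lost to floating-point rounding.

```python
from fractions import Fraction

def impact_ratios(counts, threshold=Fraction(4, 5)):
    """counts maps group -> (passed, applicants). The impact ratio is each
    group's pass rate divided by the highest group's pass rate; a ratio
    below 4/5 indicates adverse impact under the EEOC rule of thumb."""
    rates = {g: Fraction(p, n) for g, (p, n) in counts.items()}
    best = max(rates.values())
    return {g: {"ratio": float(r / best), "adverse_impact": r / best < threshold}
            for g, r in rates.items()}

table = {"A": (480, 1200), "B": (256, 800), "C": (180, 600), "D": (100, 400)}
results = impact_ratios(table)
for g, res in results.items():
    print(g, res)  # B sits exactly at 0.80; C (0.75) and D (0.625) fall below
```

Note that Group B's ratio is exactly 4/5, so it clears the rule by the narrowest possible margin; part (c)'s distinction between a statistical finding and a legal conclusion matters most in borderline cases like this one.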

Exercise 26.5 — Disparate Treatment vs. Disparate Impact

Classify each of the following scenarios as primarily raising disparate treatment concerns, disparate impact concerns, or both. Explain your reasoning.

a) A hiring model uses "years of continuous employment" as a key feature, which tends to disadvantage women who took parental leave.
b) A loan approval model explicitly uses gender as an input variable to comply with an insurance regulation that permits gender-based pricing.
c) A facial recognition system trained primarily on lighter-skinned faces has a 12 percent error rate for darker-skinned faces compared to a 1 percent error rate for lighter-skinned faces.
d) A model uses zip code as a feature. The development team is aware that zip code is correlated with race but decides to keep it because removing it reduces accuracy.
e) A customer service routing algorithm assigns high-value customers to human agents and low-value customers to chatbots. Customer value is calculated from purchase history, which correlates with income.

Exercise 26.6 — GDPR Right to Explanation

An online retailer uses a machine learning model to determine which customers receive a premium delivery offer (free next-day delivery). A customer in Germany, where GDPR applies, asks: "Why didn't I receive the premium delivery offer?"

a) Under the narrow interpretation of GDPR Article 22, what must the retailer disclose?
b) Under the moderate interpretation, what additional information must the retailer provide?
c) Under the broad (counterfactual) interpretation, what would an ideal response look like? Write a sample 3-4 sentence response to the customer.
d) Which interpretation do you believe European data protection authorities will converge on? Justify your prediction.
e) How would using the ExplainabilityDashboard's explain_prediction method help the retailer satisfy these requirements?


Section C: Explainability Techniques

Exercise 26.7 — Interpreting SHAP Values

A customer churn model produces the following SHAP values for Customer #3847 (predicted churn probability: 0.78, base value: 0.35):

| Feature | Feature Value | SHAP Value |
|---------|---------------|------------|
| purchase_frequency_30d | 0.5 | +0.18 |
| days_since_last_purchase | 42 | +0.14 |
| support_tickets_90d | 3 | +0.09 |
| loyalty_points_balance | 0 | +0.07 |
| avg_order_value | $85 | -0.03 |
| account_age_months | 36 | -0.05 |
| email_open_rate | 0.12 | +0.03 |

a) Verify that the SHAP values are approximately additive: does the sum of SHAP values plus the base value approximately equal the predicted probability? Show your calculation.
b) Write a plain-language narrative explaining this customer's churn risk, suitable for a customer service representative. Your narrative should include the top three risk factors and one protective factor.
c) If the customer's purchase frequency increased from 0.5 to 3.0 per month, would you expect the SHAP contribution of that feature to change from +0.18 to a negative value? Explain why or why not, and what additional information you would need.
d) A colleague says: "The SHAP value for avg_order_value is -0.03, so average order value doesn't matter." Is this a correct interpretation? Why or why not?

Exercise 26.8 — SHAP vs. LIME

You are explaining a neural network model that predicts whether a customer will respond to an email marketing campaign. You run both SHAP (using KernelExplainer) and LIME on the same instance. The results differ:

| Feature | SHAP Value | LIME Coefficient |
|---------|------------|------------------|
| email_frequency | +0.22 | +0.15 |
| last_email_opened_days | +0.18 | +0.31 |
| total_purchases | -0.11 | -0.08 |
| account_age | -0.05 | +0.02 |
| avg_session_duration | +0.08 | +0.04 |

a) For which features do SHAP and LIME substantially agree? For which do they disagree?
b) The account_age feature has opposite signs in the two explanations. What could cause this discrepancy? (Hint: consider LIME's local approximation approach.)
c) If you had to present one explanation to a regulator, which method would you choose and why?
d) If you had to quickly debug why a specific prediction seems wrong, which method would you choose and why?

Exercise 26.9 — Partial Dependence Interpretation

A partial dependence plot for a real estate price prediction model shows the following relationship between "distance to city center (km)" and predicted price:

  • 0-5 km: predicted price declines steeply ($800K → $550K)
  • 5-15 km: predicted price declines gradually ($550K → $450K)
  • 15-25 km: predicted price is flat (~$440K)
  • 25-40 km: predicted price increases slightly ($440K → $480K)

a) Explain the pattern in business terms. Why might prices increase slightly beyond 25 km?
b) A colleague argues that the PDP proves that moving a house from 30 km to 5 km would increase its value by $320K. Why is this interpretation incorrect? What does a PDP actually show?
c) What additional analysis would you perform to validate whether the PDP accurately represents the model's behavior? (Hint: consider individual conditional expectation plots.)
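A brute-force PDP is only a few lines, which helps with part (b): the curve is an average over the whole dataset with one feature overridden, not a causal prediction for any single house. The toy model below is an assumption built to roughly mimic the shape described above; none of it comes from the chapter.

```python
import numpy as np

def partial_dependence(predict, X, feature_idx, grid):
    """For each grid value, override the feature for EVERY row and
    average the predictions. The result marginalizes over all other
    features; it says nothing about moving one specific house."""
    curve = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature_idx] = v
        curve.append(predict(Xv).mean())
    return np.array(curve)

def toy_price(X):
    """Piecewise-linear price (in $K) vs. distance to city center (km),
    shaped like the plot described in the exercise."""
    d = X[:, 0]
    return 800 - 50 * np.clip(d, 0, 5) - 10 * np.clip(d - 5, 0, 10) + 2 * np.clip(d - 25, 0, 15)

X = np.random.default_rng(1).uniform(0, 40, size=(200, 1))
pdp = partial_dependence(toy_price, X, 0, grid=np.array([0.0, 5.0, 15.0, 30.0]))
print(pdp)  # [800. 550. 450. 460.]: steep drop, gradual drop, flat, slight rise
```

For part (c), plotting one such curve per row instead of the average gives individual conditional expectation (ICE) curves, which reveal whether the averaged PDP hides divergent behavior across subgroups of houses.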


Section D: Model Documentation

Exercise 26.10 — Writing a Model Card

You have built a sentiment analysis model for a hotel chain that classifies customer reviews as "positive," "neutral," or "negative." The model is a fine-tuned BERT classifier trained on 50,000 labeled hotel reviews from TripAdvisor, Booking.com, and Google Reviews. The model achieves 89 percent accuracy overall, but accuracy varies by language: 92 percent for English reviews, 84 percent for Spanish reviews, and 71 percent for French reviews (the model was trained primarily on English data). The model will be deployed to automatically categorize incoming reviews and route negative reviews to a customer service team for response.

Write a complete model card following the framework from Section 26.9. Include all seven sections: model details, intended use, training data, evaluation data, performance metrics (broken down by language), ethical considerations, and caveats/recommendations. Pay special attention to the ethical considerations and limitations sections.

Exercise 26.11 — Datasheet for a Dataset

You are leading a team that has assembled a dataset of 200,000 customer service chat transcripts to train a chatbot. The transcripts were collected over 18 months from your company's customer service platform. Customers consented to data collection through the platform's terms of service (a checkbox during account creation) but were not specifically informed that their transcripts would be used for AI training.

a) Write the "Motivation" and "Collection Process" sections of a datasheet for this dataset.
b) Identify three ethical concerns with using this dataset for chatbot training, focusing on consent, representation, and privacy.
c) What preprocessing steps would you recommend before the dataset is used for training? Consider both technical preprocessing (deduplication, cleaning) and ethical preprocessing (PII removal, sensitive content filtering).
d) Under GDPR, would the existing consent mechanism (terms of service checkbox) be sufficient for using this data to train an AI model? Explain, referencing the concept of "freely given, specific, informed, and unambiguous" consent.


Section E: Applied Scenarios

Exercise 26.12 — The Zip Code Problem

This exercise mirrors Athena's experience in the chapter. You are a data scientist at a retail company. Your churn prediction model uses the following features, and SHAP analysis reveals these importance rankings:

| Rank | Feature | Mean \|SHAP\| | Potential Proxy? |
|------|---------|---------------|------------------|
| 1 | purchase_frequency_30d | 0.142 | No |
| 2 | days_since_last_purchase | 0.098 | No |
| 3 | zip_code (encoded) | 0.071 | Race, income |
| 4 | avg_basket_size | 0.058 | No |
| 5 | loyalty_tier | 0.052 | No |
| 6 | channel_preference | 0.044 | Age (weak) |
| 7 | return_rate | 0.039 | No |

a) Explain why simply removing zip code is not necessarily the best solution. What useful signal might be lost?
b) Propose three alternative features that could capture the geographic purchasing signal of zip code without directly encoding demographic information. For each, explain why it is less likely to serve as a demographic proxy.
c) Design an experiment to test whether your alternative features maintain model accuracy while reducing demographic bias. Specify the metrics you would compare and the threshold for acceptable accuracy loss.
d) Your VP of Marketing argues: "Zip code is just geography. It's not a protected characteristic. We shouldn't drop it." Write a 200-word response explaining why this argument is insufficient.
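A minimal harness for the experiment in part (c) might look like the following. The function name, the 1-percentage-point accuracy-loss threshold, and the tiny hand-picked arrays are all illustrative assumptions. It compares baseline predictions (with zip_code) against the alternative feature set on both accuracy and the demographic parity gap.

```python
import numpy as np

def compare_feature_sets(y, pred_base, pred_alt, group, max_acc_loss=0.01):
    """Accept the alternative feature set only if accuracy drops by at
    most max_acc_loss AND the demographic parity gap (max minus min
    per-group flag rate) shrinks."""
    def acc(p):
        return (p == y).mean()
    def dp_gap(p):
        rates = [p[group == g].mean() for g in np.unique(group)]
        return max(rates) - min(rates)
    report = {
        "acc_base": acc(pred_base), "acc_alt": acc(pred_alt),
        "gap_base": dp_gap(pred_base), "gap_alt": dp_gap(pred_alt),
    }
    report["acceptable"] = (report["acc_base"] - report["acc_alt"] <= max_acc_loss
                            and report["gap_alt"] < report["gap_base"])
    return report

# Toy illustration with hand-picked predictions
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])
group = np.array(["A"] * 4 + ["B"] * 4)
pred_base = np.array([1, 1, 1, 0, 1, 0, 0, 0])  # model with zip_code
pred_alt = np.array([1, 0, 1, 0, 1, 1, 1, 0])   # alternative features
report = compare_feature_sets(y, pred_base, pred_alt, group)
print(report)
```

In a real experiment, the predictions would come from models trained and evaluated on held-out data, and the comparison would be repeated across seeds or cross-validation folds before drawing any conclusion.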

Exercise 26.13 — Fairness Audit Design

You are hired as an external consultant to audit the AI hiring system of a large corporation. The system screens resumes, scores candidates, and ranks them for recruiter review. The company wants to ensure the system is fair before a regulatory audit.

a) Design a comprehensive fairness audit plan (10-15 steps). Your plan should include data collection, metric selection, analysis, reporting, and remediation.
b) Which fairness definitions would you test? Justify the inclusion of each.
c) What data would you need from the company to conduct the audit? List at least eight specific data requirements.
d) How would you handle a situation where the model satisfies equalized odds but fails the 4/5ths rule? What would you recommend to the company?
e) Draft the executive summary section of your audit report (200-300 words) for a scenario where you find moderate adverse impact against one group but no evidence of disparate treatment.

Exercise 26.14 — Stakeholder Communication

A mortgage lender has deployed a gradient boosting model that predicts default risk. An applicant has been denied a mortgage. The SHAP analysis for this applicant shows:

  • Debt-to-income ratio (0.58): SHAP = +0.24
  • Credit score (620): SHAP = +0.19
  • Employment length (8 months): SHAP = +0.15
  • Down payment percentage (3%): SHAP = +0.11
  • Number of credit inquiries (6 in 12 months): SHAP = +0.08
  • Annual income ($72,000): SHAP = -0.06
  • Savings balance ($15,000): SHAP = -0.04

a) Write a denial letter to the applicant (200-300 words) that explains the decision in plain language, identifies the top factors, and provides actionable guidance. The letter should comply with the Equal Credit Opportunity Act (ECOA), which requires lenders to provide specific reasons for denial.
b) Write a 3-sentence explanation suitable for a customer service representative's script when the applicant calls to discuss the denial.
c) Write a summary for the lender's internal compliance dashboard that includes the SHAP values, the fairness status, and a recommendation for human review.
d) For each of the three audiences (applicant, customer service rep, compliance team), explain why the level of technical detail differs.


Section F: Python Implementation

Exercise 26.15 — Extending the ExplainabilityDashboard

The ExplainabilityDashboard from the chapter provides core functionality. Extend it with the following capabilities:

a) Add a counterfactual_explanation method that, given an instance and a target prediction, identifies the minimum feature changes needed to flip the prediction. (Hint: iterate through features in order of SHAP magnitude and test the effect of changing each to the population median.)

def counterfactual_explanation(
    self, instance_index: int, target_class: int = 0,
    max_features_to_change: int = 3
) -> dict:
    """
    Find the minimum changes needed to flip a prediction.
    Returns a dict with the original prediction, target,
    and suggested changes.
    """
    # Your implementation here
    pass

b) Add a compare_models class method that takes two ExplainabilityDashboard instances (e.g., one for the original churn model and one for the revised model without zip code) and generates a comparison report showing changes in feature importance, fairness metrics, and accuracy.

c) Add a temporal_stability method that checks whether the model's SHAP-based feature importance rankings are stable across different time periods in the test data (e.g., month by month). If the top-5 features change significantly over time, the method should flag this as a stability concern.

Exercise 26.16 — Building a Fairness-Aware Classifier

Using scikit-learn, build a logistic regression model on a lending dataset (you may use a synthetic dataset) and implement the following fairness workflow:

a) Train the model and calculate all five fairness metrics (demographic parity, equalized odds, predictive parity, calibration error, and individual fairness approximation) across two demographic groups.
b) Implement threshold adjustment: find different classification thresholds for each group that satisfy equalized odds (equal TPR and FPR across groups). Compare the resulting thresholds and discuss the tradeoffs.
c) Use the ExplainabilityDashboard to generate a model card for both the unadjusted and adjusted models.
d) Write a 300-word memo to the Chief Risk Officer comparing the two approaches (single threshold vs. group-specific thresholds), covering accuracy, fairness, legal risk, and implementation complexity.
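For part (b), a grid search over per-group thresholds is often enough to get started. The sketch below is a deliberate simplification and entirely an assumption on my part: it equalizes only the true positive rate (one half of equalized odds) on synthetic scores. Matching TPR and FPR simultaneously in general requires randomized decision rules.

```python
import numpy as np

def tpr_fpr(y, scores, thr):
    """True positive rate and false positive rate at a given threshold."""
    pred = scores >= thr
    return pred[y == 1].mean(), pred[y == 0].mean()

def threshold_for_tpr(y, scores, target_tpr, grid=np.linspace(0, 1, 101)):
    """Pick the grid threshold whose empirical TPR is closest to target."""
    return min(grid, key=lambda t: abs(tpr_fpr(y, scores, t)[0] - target_tpr))

# Synthetic lending scores: group B's score distribution is shifted down
rng = np.random.default_rng(7)
chosen = {}
for name, shift in [("A", 0.0), ("B", -0.1)]:
    y = (rng.random(5000) < 0.3).astype(int)
    scores = np.clip(rng.normal(0.3 + 0.3 * y + shift, 0.15, 5000), 0, 1)
    thr = threshold_for_tpr(y, scores, target_tpr=0.80)
    chosen[name] = (thr, *tpr_fpr(y, scores, thr))
print(chosen)  # group B needs a lower threshold to reach the same TPR
```

The group-specific thresholds achieve approximately equal TPRs at the cost of different score cutoffs per group, which is precisely the accuracy, fairness, legal-risk, and complexity tradeoff the memo in part (d) should weigh.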


These exercises cover fairness definitions (Exercises 26.1-26.3), legal frameworks (26.4-26.6), explainability techniques (26.7-26.9), model documentation (26.10-26.11), applied scenarios (26.12-26.14), and Python implementation (26.15-26.16). Exercises that include Python code require programming; all others can be completed analytically. For selected answers, see Appendix B.