Exercises: Transparency, Explainability, and the Black Box Problem

These exercises progress from concept checks to challenging applications. Estimated completion time: 3-4 hours.

Difficulty Guide:

  • ⭐ Foundational (5-10 min each)
  • ⭐⭐ Intermediate (10-20 min each)
  • ⭐⭐⭐ Challenging (20-40 min each)
  • ⭐⭐⭐⭐ Advanced/Research (40+ min each)


Part A: Conceptual Understanding ⭐

Test your grasp of core concepts from Chapter 16.

A.1. Section 16.1.1 defines a black box as "a system whose internal workings are opaque." Explain the difference between the two types of black boxes identified in Section 16.1.2: the "locked room" (technically opaque) and the "locked safe" (deliberately opaque). For each type, provide an example from the chapter and explain what kind of solution is required.

A.2. Section 16.2.1 distinguishes between global and local explainability. Define each in your own words. Then identify: (a) a stakeholder who would primarily need global explainability and why, and (b) a stakeholder who would primarily need local explainability and why.

A.3. Explain the difference between model-agnostic and model-specific explanation methods (Section 16.2.2). Why might a regulator prefer model-agnostic methods? Why might a model developer prefer model-specific methods?

A.4. Describe how LIME works at a conceptual level (Section 16.3.1). You do not need to discuss the mathematics — explain the five-step process in plain language. Then identify one strength and one limitation of the method.
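The five-step intuition can be made concrete in a few lines of NumPy. The sketch below is not the `lime` library itself, and the `black_box` function is an invented stand-in for an opaque model; it only illustrates the mechanism: pick an instance, perturb it, query the black box, weight samples by proximity, and fit a weighted linear surrogate whose coefficients serve as the local explanation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: pick a prediction to explain. black_box stands in for an
# opaque model (in practice, the trained ensemble or network).
def black_box(X):
    logit = 3.0 * X[:, 0] - 2.0 * X[:, 1] + X[:, 0] * X[:, 1]
    return 1.0 / (1.0 + np.exp(-logit))

x0 = np.array([0.5, -0.5])                 # instance to explain

# Step 2: sample perturbed neighbours around the instance.
Z = x0 + rng.normal(scale=0.3, size=(500, 2))

# Step 3: query the black box for predictions on the neighbours.
y = black_box(Z)

# Step 4: weight each neighbour by its proximity to the instance.
dist = np.linalg.norm(Z - x0, axis=1)
w = np.exp(-dist**2 / (2 * 0.3**2))

# Step 5: fit a weighted linear surrogate; its coefficients are the
# local explanation (feature importances valid only near x0).
A = np.column_stack([np.ones(len(Z)), Z])  # intercept + features
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
print("local feature weights:", coef[1], coef[2])
```

Here the surrogate recovers a positive local weight for feature 0 and a negative one for feature 1, even though the underlying function is nonlinear; this is exactly the "locally faithful, globally invalid" character of LIME explanations.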

A.5. Describe how SHAP works at a conceptual level (Section 16.3.2). Explain the connection to cooperative game theory and Shapley values. What question does SHAP answer for each feature?
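The game-theoretic idea is easiest to see on a problem small enough to solve exactly. The sketch below (not the `shap` library; the worth function and its numbers are invented for illustration) computes exact Shapley values for a three-feature "game" by enumerating every coalition. Each feature's value is its marginal contribution averaged over all orderings, and by the efficiency property the attributions sum to the full prediction.

```python
import itertools
import math

# Toy worth function v(S): the model's output when only the features
# in coalition S are "present". Numbers are invented for illustration.
BASE = {0: 4.0, 1: 2.0, 2: 1.0}

def v(S):
    total = sum(BASE[i] for i in S)
    if 0 in S and 1 in S:
        total += 2.0          # interaction between features 0 and 1
    return total

FEATURES = (0, 1, 2)

def shapley(i):
    """Exact Shapley value: weighted average marginal contribution
    of feature i over all coalitions of the other features."""
    n = len(FEATURES)
    others = [f for f in FEATURES if f != i]
    phi = 0.0
    for r in range(len(others) + 1):
        for S in itertools.combinations(others, r):
            weight = (math.factorial(r) * math.factorial(n - r - 1)
                      / math.factorial(n))
            phi += weight * (v(set(S) | {i}) - v(set(S)))
    return phi

phi = {i: shapley(i) for i in FEATURES}
print({i: round(p, 6) for i, p in phi.items()})  # {0: 5.0, 1: 3.0, 2: 1.0}
```

Note how the 2.0 interaction bonus is split evenly between features 0 and 1 (5.0 = 4.0 + 1.0 and 3.0 = 2.0 + 1.0): Shapley values answer, for each feature, "what is your fair share of the prediction?" SHAP approximates this computation for real models, where exact enumeration is intractable.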

A.6. Section 16.4 discusses GDPR Article 22 and the debate over whether it creates a "right to explanation." Summarize the two sides of this debate. What does Article 22 explicitly guarantee, and what is disputed?

A.7. Define transparency theater as presented in Section 16.5. Provide two examples — one from the chapter and one of your own — of practices that create the appearance of transparency without its substance. For each example, explain what makes the transparency performative rather than meaningful.


Part B: Applied Analysis ⭐⭐

Analyze scenarios, arguments, and real-world situations using concepts from Chapter 16.

B.1. Consider the following scenario:

A bank denies a customer's mortgage application. The bank uses a machine learning model (a gradient-boosted ensemble of 5,000 decision trees) to assess creditworthiness. When the customer asks why they were denied, the bank provides the following explanation: "Your application was assessed by our advanced credit evaluation system, which considers multiple factors including credit history, income, and financial obligations. Based on a holistic assessment of these factors, your application did not meet our approval threshold."

Evaluate this explanation using the concepts from Chapter 16. Is this meaningful transparency or transparency theater? What specific information is missing? What would a meaningful explanation look like? What would LIME or SHAP produce for this case?
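For the final sub-question, it may help to see the shape of the output such tools produce. The block below is a hypothetical SHAP-style additive breakdown for a denial; every feature name and number is invented for illustration. The key property is additivity: a base value plus per-feature contributions equals the applicant's score, which is concrete in a way the bank's "holistic assessment" language is not.

```python
# Hypothetical SHAP-style explanation for the denial; all feature
# names and numbers here are invented for illustration only.
base_value = 0.62                       # average approval score
contributions = {
    "debt_to_income_ratio":  -0.15,
    "recent_missed_payment": -0.12,
    "credit_history_length": -0.08,
    "income":                +0.05,
}

# Additivity: base value plus contributions equals the final score.
score = base_value + sum(contributions.values())
print(f"applicant score: {score:.2f} (approval threshold: 0.50)")
for feature, c in sorted(contributions.items(), key=lambda kv: kv[1]):
    print(f"  {feature:25s} {c:+.2f}")
```

An explanation in this form tells the customer which factors pushed the score below the threshold and by how much, and therefore what would need to change for a different outcome.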

B.2. Section 16.1.3 argues that the black box problem undermines accountability, fairness, consent, trust, and democracy. Choose two of these five values and, for each, construct a specific scenario in which an opaque algorithmic system undermines the value. Be concrete — identify the system, the stakeholder, and the specific way opacity causes harm.

B.3. The chapter discusses the explainability-accuracy trade-off — the observation that the most accurate models (deep neural networks, large ensembles) are often the hardest to explain, while the most interpretable models (linear regression, decision trees) are often less accurate. Consider the following domains:

  • (a) Medical diagnosis (detecting cancer in radiology images)
  • (b) Loan approval for consumer credit
  • (c) Spam filtering in email
  • (d) Bail/pretrial detention decisions in criminal justice

For each domain, assess: Is accuracy or explainability more important? Why? Is the trade-off acceptable? Should there be a regulatory requirement for explainability in each domain?

B.4. Mira is asked by VitraMed's management to present the patient risk model's results to a hospital ethics committee. The model is a gradient-boosted ensemble that she cannot fully interpret. She can use LIME to generate local explanations for individual patients. Advise Mira: What should she present? What limitations of LIME should she disclose? How should she communicate the tension between the model's accuracy and its opacity?

B.5. Eli argues in class that the black box problem in criminal justice is not primarily a technical problem but a power problem: "The system is opaque because opacity serves the interests of the companies that sell it and the institutions that use it. If transparency were profitable, these systems would be transparent." Evaluate Eli's argument. Is he right? To what extent is algorithmic opacity a choice rather than a technical limitation? What role does trade secret law play?

B.6. Section 16.5 discusses "transparency theater" in algorithmic systems. Some companies publish "model cards" — documents that describe their model's purpose, performance metrics, and limitations. Evaluate model cards as a transparency mechanism. Are they meaningful transparency or transparency theater? What would make a model card genuinely useful versus merely performative?


Part C: Real-World Application Challenges ⭐⭐-⭐⭐⭐

These exercises ask you to investigate your own algorithmic environment.

C.1. ⭐⭐ Explanation Request. Apply for a service that uses algorithmic decision-making (a credit card, a loan pre-qualification, or a job application through an automated system). If you are denied or receive a score, request an explanation. Document: (a) what explanation, if any, you received, (b) whether it was meaningful or generic, (c) whether you could identify specific factors that influenced the decision, and (d) whether the explanation was sufficient for you to take action to change the outcome. Write a one-page analysis connecting your experience to the chapter's concepts.

C.2. ⭐⭐ Platform Explanation Audit. Select a platform that provides algorithmic recommendations (YouTube, Instagram, Spotify, Amazon). Investigate what explanation the platform provides for its recommendations. Does it say "Because you watched X" or "Popular in your area" or nothing at all? Document at least 10 recommendations and the explanations provided. Assess: Are the explanations meaningful? Do they help you understand the algorithm's logic? What information is missing?

C.3. ⭐⭐⭐ GDPR Subject Access Request. If you are in the EU (or interacting with an EU-regulated service), exercise your right under GDPR Article 15 to request information about how your personal data is processed, including any automated decision-making under Article 22. Document the process: How easy was it? What did you learn? Was the information provided meaningful or boilerplate? Write a one-page analysis. (If you are not in the EU, research published examples of Article 15/22 requests and analyze the responses.)

C.4. ⭐⭐⭐ Explainability Comparison. Choose a publicly available dataset (e.g., the UCI Adult Income dataset or any classification dataset from Kaggle). Train two models on the data: one interpretable model (e.g., logistic regression or decision tree) and one black box model (e.g., random forest with 500 trees or a neural network). Compare: (a) the accuracy of each model, (b) the ease of explaining individual predictions, and (c) the trade-off between accuracy and explainability. Write a one-page report.
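A minimal starting point for this exercise might look like the sketch below. It assumes scikit-learn is installed and uses a synthetic dataset as a stand-in for UCI Adult (swap in your chosen data); the model choices and hyperparameters are illustrative, not prescriptive.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a tabular dataset such as UCI Adult.
X, y = make_classification(n_samples=2000, n_features=10,
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# (a) Interpretable model: one readable coefficient per feature.
lin = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# (b) Black box: 500 trees with no single inspectable decision path.
forest = RandomForestClassifier(n_estimators=500,
                                random_state=0).fit(X_tr, y_tr)

acc_lin = lin.score(X_te, y_te)
acc_forest = forest.score(X_te, y_te)
print(f"logistic regression accuracy: {acc_lin:.3f}")
print(f"random forest accuracy:       {acc_forest:.3f}")

# The interpretable model's global explanation is just its weights;
# the forest has no comparably compact summary.
print("logistic coefficients:", lin.coef_[0].round(2))
```

For part (b) of the exercise, contrast how you would explain one test prediction from each model: the logistic regression's answer is a weighted sum you can read off, while the forest requires a post-hoc tool like LIME or SHAP.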


Part D: Synthesis & Critical Thinking ⭐⭐⭐

These questions require you to integrate multiple concepts from Chapter 16 and think beyond the material presented.

D.1. The chapter presents the explainability-accuracy trade-off as a genuine dilemma. Some researchers argue that this trade-off is overstated — that interpretable models can achieve accuracy comparable to black box models for many practical tasks (Rudin, 2019). Others argue that for the most complex tasks (natural language understanding, image recognition), accuracy requires complexity and opacity. Evaluate both positions. Under what conditions is the trade-off real? Under what conditions might it be solvable?

D.2. Consider the concept of a "right to explanation" for algorithmic decisions. Write a short essay (300-500 words) addressing: Who should have this right? For what kinds of decisions? What would a meaningful explanation look like? Should the right be to an explanation of the algorithm's general logic (global explainability) or to an explanation of the specific decision (local explainability) — or both?

D.3. The chapter identifies several reasons black boxes exist: complexity-accuracy trade-offs, dimensionality, learned representations, ensemble complexity, proprietary secrecy, and liability avoidance (Section 16.1.2). Not all of these reasons are equally legitimate. Rank these reasons from most to least legitimate, and defend your ranking. Which reasons should be accepted as unavoidable constraints, and which should be challenged through regulation, institutional design, or technical innovation?

D.4. Dr. Adeyemi asks: "Can you meaningfully consent to a decision you cannot understand?" (paraphrased from the chapter's discussion of the Consent Fiction in Section 16.1.3). Write a response that connects this question to at least three concepts from the chapter and from earlier chapters in Part 3 (algorithmic decision-making from Chapter 13, bias from Chapter 14, and fairness from Chapter 15). How does opacity compound the problems of bias and unfairness?


Part E: Research & Extension ⭐⭐⭐⭐

These are open-ended projects for students seeking deeper engagement.

E.1. The Right to Explanation in Practice. Research the implementation of GDPR Article 22 across at least three EU member states. How have different countries interpreted the "right to explanation"? Have there been enforcement actions? What have courts ruled? Write a 1,000-word comparative analysis.

E.2. Explainable AI in Healthcare. Research a specific XAI application in healthcare (e.g., IBM Watson for Oncology, Google's DeepMind health projects, or PathAI for pathology). Write a 1,200-word analysis covering: What does the system do? How is explainability implemented? How do clinicians interact with the explanations? What evidence exists about whether the explanations improve or harm clinical decision-making?

E.3. The LIME vs. SHAP Debate. Research the technical and practical differences between LIME and SHAP. When do they produce different explanations for the same prediction? What are the implications for stakeholders who rely on these explanations? Write a 1,000-word analysis that is accessible to a non-technical audience.


Solutions

Selected solutions are available in appendices/answers-to-selected.md.