Quiz: Transparency, Explainability, and the Black Box Problem

Test your understanding before moving to the next chapter. Target: 70% or higher (20 of 28 points) to proceed.


Section 1: Multiple Choice (1 point each)

1. A "black box" system, as defined in this chapter, is one whose:

  • A) Code is written in a programming language the user does not understand.
  • B) Internal workings are opaque — you can observe inputs and outputs but cannot inspect or understand the process connecting them.
  • C) Predictions are always inaccurate.
  • D) Results are always biased.
Answer **B)** Internal workings are opaque — you can observe inputs and outputs but cannot inspect or understand the process connecting them. *Explanation:* Section 16.1.1 defines a black box as a system whose internal workings are opaque. The definition focuses on *understandability*, not on coding language (A), accuracy (C), or bias (D). A black box can be highly accurate and unbiased while still being unexplainable.

2. The chapter identifies two kinds of black boxes. A deep neural network whose "reasoning" is distributed across millions of weights is an example of a:

  • A) Locked safe — deliberately opaque because the builder could explain but chooses not to.
  • B) Locked room — technically opaque because even the builder cannot fully explain the system's behavior.
  • C) Open box — transparent but complex.
  • D) Glass box — interpretable by design.
Answer **B)** Locked room — technically opaque because even the builder cannot fully explain the system's behavior. *Explanation:* Section 16.1.2 distinguishes between "locked room" systems (technically opaque — the complexity of the model exceeds human interpretive capacity) and "locked safe" systems (deliberately opaque — the builder could explain but chooses secrecy for commercial or strategic reasons). A deep neural network with millions of parameters is a locked room; COMPAS (a proprietary risk tool) is a locked safe.

3. Global explainability aims to explain:

  • A) Why the model made a specific prediction for a particular individual.
  • B) The model's overall behavior — which features are most important across all predictions and what patterns the model has learned.
  • C) The geographic distribution of the model's users.
  • D) Why the model was built in the first place.
Answer **B)** The model's overall behavior — which features are most important across all predictions and what patterns the model has learned. *Explanation:* Section 16.2.1 defines global explainability as explaining the model's general behavior. It answers the question "How does this model work?" and is most useful for regulators, auditors, and developers. Local explainability (A) explains individual predictions. Options C and D describe unrelated concepts.

4. LIME (Local Interpretable Model-agnostic Explanations) explains a prediction by:

  • A) Reading the model's source code and translating it into natural language.
  • B) Generating perturbed variations of the input, feeding them through the model, and fitting a simple interpretable model to the local behavior.
  • C) Asking the model's developers to write a justification for each prediction.
  • D) Comparing the prediction to similar cases in a database of past decisions.
Answer **B)** Generating perturbed variations of the input, feeding them through the model, and fitting a simple interpretable model to the local behavior. *Explanation:* Section 16.3.1 describes LIME's five-step process: choose a prediction, generate perturbed samples by varying features, get predictions for each variation, fit a simple model (typically linear) to approximate the black box's behavior locally, and read the simple model's coefficients to identify the most influential features. LIME does not require access to source code (A), developer justification (C), or a case database (D).
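The five-step process above can be sketched in a few lines of code. This is a minimal illustration of LIME's perturb-and-fit idea, not the actual `lime` library: the black-box scoring function, the instance, and all numeric values below are invented for demonstration, and a plain (unweighted) least-squares fit stands in for LIME's distance-weighted local model.

```python
import numpy as np

# Hypothetical black box: we may only call it, never inspect it.
# (This nonlinear scoring function is invented for illustration.)
def black_box_predict(X):
    z = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.2 * X[:, 0] * X[:, 1]
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Step 1: choose a prediction to explain
x0 = np.array([1.0, 0.5])

# Steps 2-3: generate perturbed samples around x0 and query the black box
perturbed = x0 + rng.normal(scale=0.1, size=(500, 2))
preds = black_box_predict(perturbed)

# Step 4: fit a simple interpretable model (linear) to the local behavior
A = np.column_stack([perturbed, np.ones(len(perturbed))])
coefs, *_ = np.linalg.lstsq(A, preds, rcond=None)

# Step 5: read the coefficients as local feature influences
print({"feature_0": round(float(coefs[0]), 3),
       "feature_1": round(float(coefs[1]), 3)})
```

With these assumed values the first coefficient comes out positive and the second negative, matching the sign of the invented function's local gradient at `x0` — which is exactly the kind of directional insight LIME's coefficients are meant to provide.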

5. SHAP values are derived from:

  • A) Neural network attention weights.
  • B) Cooperative game theory, specifically the concept of Shapley values for fairly distributing a game's payout among players.
  • C) The model's gradient with respect to each feature.
  • D) Statistical significance tests on feature coefficients.
Answer **B)** Cooperative game theory, specifically the concept of Shapley values for fairly distributing a game's payout among players. *Explanation:* Section 16.3.2 explains that SHAP applies Shapley values from cooperative game theory. Each feature is treated as a "player" in a game whose "payout" is the prediction. The Shapley value of each feature is its average marginal contribution to the prediction across all possible combinations of features. This provides a mathematically grounded framework for feature attribution.
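The Shapley computation can be made concrete with a toy three-feature "game". The value function below is invented for illustration (a small additive credit-style model with one interaction); real SHAP implementations approximate this enumeration, since it grows exponentially in the number of features.

```python
from itertools import combinations
from math import factorial

# Toy "game": the payout is the prediction produced by a coalition of features.
# All numbers below are assumptions for illustration, not from any real model.
def value(coalition):
    v = 0.0
    if "income" in coalition:
        v += 30
    if "debt" in coalition:
        v -= 20
    if "history" in coalition:
        v += 10
    if "income" in coalition and "debt" in coalition:
        v += 5  # interaction term
    return v

features = ["income", "debt", "history"]
n = len(features)

def shapley(feature):
    """Average marginal contribution of `feature` over all coalitions of the others."""
    others = [f for f in features if f != feature]
    total = 0.0
    for k in range(n):
        for subset in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += weight * (value(set(subset) | {feature}) - value(set(subset)))
    return total

phi = {f: shapley(f) for f in features}
print(phi)
```

Note that the efficiency property holds here: the three attributions (32.5 for income, -17.5 for debt, 10.0 for history) sum exactly to the full coalition's value of 25.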

6. GDPR Article 22, as discussed in Section 16.4, guarantees individuals the right to:

  • A) A complete explanation of any algorithmic decision in plain language.
  • B) Not be subject to a decision based solely on automated processing that produces legal or similarly significant effects, with certain exceptions.
  • C) Access the source code of any algorithm that makes decisions about them.
  • D) Override any algorithmic decision with their own judgment.
Answer **B)** Not be subject to a decision based solely on automated processing that produces legal or similarly significant effects, with certain exceptions. *Explanation:* Section 16.4 explains that Article 22 creates a right not to be subject to *solely* automated decisions with significant effects — but the scope of the right is debated. The article does not explicitly create a "right to explanation" (A is disputed), does not guarantee access to source code (C), and does not create an override right (D). The exceptions include explicit consent, contractual necessity, and authorization by law.

7. "Transparency theater" refers to:

  • A) Public demonstrations of AI systems at technology conferences.
  • B) Disclosure practices that create the appearance of transparency without providing meaningful understanding — performative rather than substantive transparency.
  • C) The use of theatrical metaphors to explain algorithmic concepts.
  • D) Government regulations requiring companies to disclose their algorithms.
Answer **B)** Disclosure practices that create the appearance of transparency without providing meaningful understanding — performative rather than substantive transparency. *Explanation:* Section 16.5 defines transparency theater as practices that look like transparency but fail to produce genuine understanding. Examples include publishing privacy policies that no one reads, providing generic explanations for algorithmic decisions ("your application was assessed holistically"), and releasing model cards with high-level statistics that do not enable meaningful scrutiny. The term parallels "security theater" — measures that create the *feeling* of security without the substance.

8. The explainability-accuracy trade-off refers to:

  • A) The observation that explanations of algorithmic decisions are always inaccurate.
  • B) The tendency for the most accurate models (deep neural networks, large ensembles) to be the hardest to explain, while the most interpretable models (linear regression, decision trees) are often less accurate.
  • C) The trade-off between explaining a system quickly and explaining it thoroughly.
  • D) The inverse relationship between the number of features and the model's accuracy.
Answer **B)** The tendency for the most accurate models (deep neural networks, large ensembles) to be the hardest to explain, while the most interpretable models (linear regression, decision trees) are often less accurate. *Explanation:* Section 16.1.2 discusses this trade-off as a consequence of how modern machine learning works. Complex models can capture subtle, nonlinear patterns that simpler models miss — but the price of this complexity is opacity. This creates a genuine dilemma in high-stakes domains: should we sacrifice some accuracy for interpretability, or accept opacity for better predictions?

9. Which of the following is a limitation of LIME as an explanation method?

  • A) It can only be applied to linear models.
  • B) Different perturbation strategies can produce different explanations for the same prediction, raising consistency concerns.
  • C) It requires access to the model's training data.
  • D) It can only provide global explanations, not local ones.
Answer **B)** Different perturbation strategies can produce different explanations for the same prediction, raising consistency concerns. *Explanation:* Section 16.3.1 identifies several LIME limitations: the explanation depends on how perturbations are generated, the local approximation may not be faithful if the decision boundary is complex, and explaining the same prediction twice with different random seeds can produce different results. Option A is wrong (LIME is model-agnostic). Option C is wrong (LIME only needs the model's prediction function). Option D is wrong (LIME specifically provides *local* explanations).

10. The chapter argues that the black box problem is "a concentrated expression of the Power Asymmetry" because:

  • A) Black box algorithms are always more powerful than interpretable ones.
  • B) The entity that builds the algorithm understands or controls its logic, while the entity subject to the algorithm does not — creating an informational imbalance that is a form of power.
  • C) Only powerful corporations can afford to build black box systems.
  • D) Black box systems always favor powerful individuals over marginalized ones.
Answer **B)** The entity that builds the algorithm understands or controls its logic, while the entity subject to the algorithm does not — creating an informational imbalance that is a form of power. *Explanation:* Section 16.1.3 frames the black box problem as a Power Asymmetry: the algorithm's builder has knowledge of (or at least control over) how it works, while the person affected by its decisions does not. This informational asymmetry is "the power to decide without explaining, to sort without justifying, to judge without being questioned." The problem is about the *relationship* between builder and subject, not about absolute power (A, C) or outcome direction (D).

Section 2: True/False with Justification (1 point each)

11. "All algorithms are black boxes."

Answer **False.** *Explanation:* Section 16.1.1 explicitly states that not all algorithms are black boxes. A linear regression model is highly interpretable — each feature has a coefficient, and the model's logic can be directly read. Decision trees, rule-based systems, and simple statistical models are also interpretable. Black boxes are specifically those systems whose complexity, dimensionality, or proprietary nature makes their internal workings opaque.

12. "SHAP values have a stronger mathematical foundation than LIME explanations because they satisfy desirable properties from cooperative game theory (consistency, local accuracy, efficiency)."

Answer **True.** *Explanation:* Section 16.3.2 notes that SHAP is grounded in cooperative game theory and satisfies properties including consistency (if a feature's contribution increases, its attribution should not decrease), local accuracy (the attributions sum to the actual prediction), and efficiency (all of the prediction is distributed among features). LIME, while intuitive and useful, does not guarantee these properties — its explanations depend on perturbation strategy and may vary between runs.

13. "The right to explanation under GDPR Article 22 is clearly defined and universally agreed upon by legal scholars."

Answer **False.** *Explanation:* Section 16.4 discusses the significant scholarly debate about whether GDPR Article 22 creates a genuine "right to explanation." Article 22 grants the right not to be subject to solely automated decisions with significant effects, and Recital 71 mentions "meaningful information about the logic involved." But whether this constitutes a legally enforceable right to a specific explanation of an individual decision remains contested. Some scholars (Goodman and Flaxman, 2017) argue it does; others (Wachter, Mittelstadt, and Floridi, 2017) argue it creates only a right to general information about the system's logic.

14. "Attention weights in neural networks reliably indicate which input features caused the model's prediction."

Answer **False.** *Explanation:* Section 16.2.2 includes a "Common Pitfall" warning that high attention does not necessarily mean causal importance. Research has shown that a model might attend to a feature without relying on it for the final decision. Attention weights show what the model "looked at," but not necessarily what it "decided based on." Model-agnostic methods like LIME and SHAP, while less direct, may provide more reliable explanations of actual decision factors.

15. "The black box problem undermines the fairness auditing tools developed in Chapter 14 because you cannot fully audit a system for bias if you cannot inspect its logic."

Answer **True.** *Explanation:* Section 16.1.3 makes this point directly: "You cannot audit a system for bias if you cannot inspect its logic." The BiasAuditor from Chapter 14 can detect *that* a system produces disparate outcomes, but not *why*. Without explainability, identifying the specific features, proxies, or interactions that produce bias — and therefore addressing them — is significantly harder. Opacity compounds the bias problem.

Section 3: Short Answer (2 points each)

16. Explain, using a concrete example, the difference between global and local explainability. Why does the chapter argue that both are necessary? For what purpose is each most useful?

Sample Answer: Global explainability tells you how a model works in general. For example, a global explanation of a credit-scoring model might say: "The three most important features across all predictions are payment history (contributing 30% of the score's variance), credit utilization (25%), and length of credit history (15%)." This tells you the model's overall logic and priorities. Local explainability tells you why a specific prediction was made. For the same credit model, a local explanation for a denied application might say: "This application was denied primarily because of three recent late payments (-18 points) and high credit utilization at 89% (-12 points)."

Both are necessary because they serve different stakeholders and different purposes. A regulator needs global explainability to assess whether the model is appropriate for its intended use — are the features reasonable? Are any features potential proxies for protected characteristics? An affected individual needs local explainability to understand their specific decision, identify errors, and take action to change the outcome (pay down credit card balances, correct a misreported late payment).

*Key points for full credit:*
  - Provides a concrete example for each type
  - Explains the different audiences/purposes
  - Explains why both are needed
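For a linear model, both kinds of explanation can be computed directly, which makes the contrast easy to see in code. The sketch below uses an invented three-feature credit model; the coefficients, feature names, and applicant values are assumptions for illustration, not the chapter's.

```python
import numpy as np

# Hypothetical linear credit model: score_contribution = coefs . x
# (coefficients and feature names are assumed for illustration)
coefs = np.array([0.5, -0.3, 0.2])
names = ["payment_history", "utilization", "history_length"]

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))  # standardized features for 1000 applicants

# Global explanation: typical magnitude of each feature's contribution
# across the whole applicant population ("how does this model work?")
global_importance = np.abs(coefs) * X.std(axis=0)

# Local explanation: one applicant's score decomposed feature by feature
# ("why was *this* application scored this way?")
x = np.array([-1.2, 1.5, 0.3])  # poor payment history, high utilization
local_contrib = coefs * x

print("global:", dict(zip(names, global_importance.round(2))))
print("local: ", dict(zip(names, local_contrib.round(2))))
```

The global view ranks features by their importance across all predictions (useful to a regulator or auditor), while the local decomposition shows which features pushed this one applicant's score down (useful to the affected individual).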

17. Explain what "transparency theater" means and why it is harmful. Provide a specific example from the chapter or your own experience and explain what would make the transparency meaningful rather than performative.

Sample Answer: Transparency theater refers to practices that create the appearance of openness and accountability without actually providing meaningful understanding. It is harmful because it gives stakeholders (users, regulators, the public) a false sense that they understand and can scrutinize a system, when in reality the disclosure is too vague, too technical, or too incomplete to enable genuine oversight.

Example: A bank denies a loan and provides the explanation: "Your application was assessed using our advanced credit evaluation system, which considers multiple factors. Based on a holistic assessment, your application did not meet our approval threshold." This tells the applicant nothing actionable. They cannot identify which factors mattered, what the threshold is, or what they could change. The explanation satisfies a formal requirement (providing a response to the customer) without providing substantive information.

Meaningful transparency would include: the specific features that most influenced the denial (e.g., "Your debt-to-income ratio of 45% exceeded the 40% threshold; your credit history of 8 months is below the 24-month minimum"), the relative importance of each factor, and concrete steps the applicant could take to improve their chances. Meaningful transparency enables understanding and action; transparency theater enables only compliance.

*Key points for full credit:*
  - Defines transparency theater
  - Explains why it is harmful (false sense of oversight)
  - Provides a specific example
  - Explains what meaningful transparency would look like

18. The chapter discusses the "explainability-accuracy trade-off." Is this trade-off equally important in all domains, or are there domains where it matters more? Choose two domains discussed in the chapter (or from earlier chapters) and argue which side of the trade-off should be prioritized in each, and why.

Sample Answer: The trade-off's importance varies by domain, depending on the stakes, the availability of recourse, and the degree to which human oversight is feasible.

In **criminal justice** (bail, sentencing, parole), explainability should be prioritized over marginal accuracy gains. The stakes are extreme — liberty, incarceration, separation from family. Defendants have constitutional rights to due process, which require understanding the basis for decisions. A slightly less accurate but fully interpretable model (e.g., a logistic regression) enables judges, defense attorneys, and defendants to scrutinize, challenge, and understand the decision. The social cost of opacity in criminal justice — undermining due process, fragmenting accountability — outweighs the potential accuracy improvement from a black box model.

In **medical imaging** (e.g., detecting tumors in radiology scans), accuracy may warrant higher priority. A missed cancer diagnosis has life-or-death consequences. If a deep neural network detects tumors with 95% accuracy and the best interpretable model achieves 88%, the 7-point gap represents lives saved. However, even here, explainability matters — clinicians need to understand *why* the system flagged an image to verify the result and maintain trust. The solution may be a hybrid: use the accurate black box model, but supplement it with LIME/SHAP explanations and maintain mandatory human review.

*Key points for full credit:*
  - Chooses two specific domains
  - Argues for different priorities in each
  - Justifies based on stakes, rights, and oversight needs

19. Eli argues that proprietary secrecy — companies refusing to disclose their algorithms because of trade secret claims — is the most problematic form of the black box problem because it is a choice, not a technical limitation. The chapter notes that COMPAS is opaque not because risk scoring is inherently incomprehensible but because Northpointe/Equivant refuses to disclose its methodology. Evaluate Eli's position. When, if ever, are trade secret claims legitimate grounds for algorithmic opacity?

Sample Answer: Eli's position has considerable merit. When a system makes consequential decisions about individuals — criminal sentencing, healthcare allocation, employment — the individuals affected have a strong claim to understanding how those decisions are made. Trade secret protection exists to promote innovation by protecting competitive advantages. But when the "competitive advantage" is a system that determines whether a person goes to jail, the balance should shift toward disclosure.

Trade secret claims may be legitimate when: (a) the algorithm operates in a low-stakes, commercial domain (e.g., a recommendation engine for e-commerce) where individuals are not significantly harmed by opacity, and (b) full disclosure would enable competitors to replicate the system, undermining the company's business without benefiting the affected individuals. In these cases, the cost of disclosure exceeds the benefit.

Trade secret claims are not legitimate when: (a) the system makes decisions about liberty, health, employment, or access to essential services, (b) the affected individuals have no alternative and cannot meaningfully consent to opacity, and (c) the opacity prevents auditing for bias, accuracy, and fairness. In these cases, the individual's right to understand and challenge the decision outweighs the company's commercial interest.

A middle ground — such as disclosure to independent auditors under NDA, or court-ordered disclosure in specific cases — may balance both interests.

*Key points for full credit:*
  - Engages with Eli's argument substantively
  - Distinguishes between high-stakes and low-stakes domains
  - Identifies conditions under which trade secret claims are/aren't legitimate
  - Considers middle-ground solutions

Section 4: Applied Scenario (5 points)

20. Read the following scenario and answer all parts.

Scenario: MediScore AI

A large hospital network deploys "MediScore AI," a deep neural network that predicts which emergency room patients are at high risk of deterioration within 24 hours. The system analyzes vital signs, lab results, medical history, medications, and free-text clinical notes. It assigns each patient a risk score from 0 to 100.

MediScore AI has an AUC (area under the ROC curve) of 0.92 — significantly better than the previous rule-based system (AUC 0.78) and comparable to experienced clinicians. However, the neural network has over 2 million parameters, and even the data science team cannot explain why a specific patient received a particular score.

The hospital provides clinicians with the risk score and a SHAP-based explanation: "Top contributing factors: (1) heart rate trend over past 6 hours (+15 points), (2) creatinine level (+12 points), (3) text in clinical notes mentioning 'confusion' (+8 points)."

A patient's family member asks: "Why was my father labeled high-risk? What exactly does that mean? Can we challenge the score?" The nurse consulting MediScore says: "The system flagged your father based on his vital signs and lab work. I trust the system — it's very accurate."

(a) Classify MediScore AI as a black box type (locked room, locked safe, or both). Justify your classification. (1 point)

(b) Evaluate the SHAP-based explanation provided to clinicians. Is it meaningful transparency or transparency theater? What does it provide, and what does it fail to provide? (1 point)

(c) Evaluate the nurse's response to the family member. Does it satisfy the requirements for meaningful transparency? What should the nurse have said instead? (1 point)

(d) Apply the explainability-accuracy trade-off to this scenario. MediScore AI is more accurate than the previous rule-based system (AUC 0.92 vs. 0.78). Is the accuracy gain worth the loss of explainability? What factors should the hospital consider in making this decision? (1 point)

(e) The family member asks whether they can challenge the score. Under what circumstances should patients or their families have the right to challenge an algorithmic risk assessment in healthcare? Propose a specific mechanism for contestation and explain what it would require. (1 point)

Sample Answer:

**(a)** MediScore AI is primarily a **locked room** — the neural network's 2 million parameters create technical opacity that even the data science team cannot fully penetrate. The model is not deliberately hidden; it is genuinely too complex for humans to interpret at the level of individual weights and interactions. However, if the hospital refuses to disclose the model architecture, training data composition, or performance metrics to external auditors, it becomes partly a **locked safe** as well — adding deliberate opacity to technical complexity.

**(b)** The SHAP explanation is **meaningful but incomplete**. It provides: specific factors (heart rate trend, creatinine, clinical notes), their relative contribution (+15, +12, +8 points), and directional information (these factors *increased* the score). This is genuinely useful for clinicians — it tells them what the model "focused on" and allows them to verify whether the flagged factors are clinically relevant (e.g., is the creatinine level actually elevated? is the heart rate trend real or an artifact?). However, it fails to provide: (a) how the remaining ~65 points of the score were distributed, (b) what factors would need to change to *lower* the score (counterfactual explanation), (c) the confidence or uncertainty of the prediction, and (d) whether the model has known limitations for patients with this specific profile (e.g., different performance for elderly patients or patients with certain conditions). It is meaningful transparency, not theater — but it is partial transparency that could be improved.

**(c)** The nurse's response is **insufficient**. "I trust the system — it's very accurate" conveys confidence but no information. The family member asked three questions: why (what caused the score), what it means (clinical implications), and whether it can be challenged (recourse). The nurse answered none of them. A better response: "Your father's risk score was elevated primarily because the system detected a concerning trend in his heart rate over the past several hours, along with a creatinine level that may indicate kidney stress, and a note from the physician about some confusion. The score is a tool that helps us prioritize attention — it doesn't mean anything bad will definitely happen, but it means we're watching him more closely. If you have concerns about the assessment, I can ask the attending physician to review it. The physician always makes the final clinical decisions, not the algorithm."

**(d)** The accuracy gain (AUC 0.92 vs. 0.78) is substantial and likely translates to lives saved — better identification of patients at risk of deterioration means earlier intervention. However, the hospital should consider: (a) whether the accuracy gain holds equally across demographic groups (is MediScore biased?), (b) whether clinicians understand the SHAP explanations well enough to act on them appropriately (a score without understanding may lead to either blind trust or blind dismissal), (c) whether the hospital has protocols for overriding the algorithm when clinical judgment disagrees, and (d) whether the opacity undermines patient trust and the therapeutic relationship. A reasonable approach: deploy MediScore AI as a *decision support tool* that supplements rather than replaces clinical judgment, with mandatory human review for high-risk scores and regular bias audits.

**(e)** Patients and families should have the right to challenge algorithmic risk assessments when the assessment influences consequential clinical decisions — treatment plans, monitoring intensity, or resource allocation. A contestation mechanism should include: (1) the ability to request a human clinical review of the algorithmic assessment by an attending physician who can override the score based on clinical judgment, (2) access to the SHAP explanation and the underlying data (vital signs, lab results) that informed the score, so the patient/family can identify potential errors (e.g., a misrecorded lab value), (3) a documented process for recording and learning from cases where the algorithm was overridden, and (4) an independent review mechanism (patient advocate or ethics committee) for disputes that cannot be resolved at the clinical level. This requires: training clinicians to communicate about algorithmic tools, establishing clear override protocols, and investing in patient-facing communication about how AI is used in their care.

Scoring & Review Recommendations

| Score Range | Points | Assessment | Next Steps |
| --- | --- | --- | --- |
| Below 50% | 0-13 pts | Needs review | Re-read Sections 16.1-16.3 carefully, redo Part A exercises |
| 50-69% | 14-19 pts | Partial understanding | Review LIME/SHAP, GDPR Art. 22, and transparency theater |
| 70-85% | 20-23 pts | Solid understanding | Ready to proceed to Chapter 17; review any missed topics |
| Above 85% | 24-28 pts | Strong mastery | Proceed to Chapter 17: Accountability and Audit |

| Section | Points Available |
| --- | --- |
| Section 1: Multiple Choice | 10 points (10 questions x 1 pt) |
| Section 2: True/False with Justification | 5 points (5 questions x 1 pt) |
| Section 3: Short Answer | 8 points (4 questions x 2 pts) |
| Section 4: Applied Scenario | 5 points (5 parts x 1 pt) |
| **Total** | **28 points** |