Chapter 13: Quiz

The Black Box Problem

Instructions: Complete all sections. Total: 100 points.


Part I: Multiple Choice (8 questions, 4 points each = 32 points)

1. Which of the following best describes "institutional opacity" in AI systems?

A) The mathematical complexity of deep neural networks that prevents easy explanation of decisions
B) The experience of users who do not know that AI is involved in decisions affecting them
C) A deliberate organizational choice not to explain AI decisions, even when explanation is technically possible
D) The difficulty of explaining AI behavior to non-technical stakeholders

Answer: C

Rationale: Institutional opacity is distinct from technical opacity (A) and opacity to users (B). It arises from organizational choices — like trade secrecy claims — rather than inherent technical limitations.


2. In State v. Loomis (2016), the Wisconsin Supreme Court held:

A) That COMPAS was unconstitutional because it could not be independently verified
B) That defendants have a constitutional right to examine any proprietary algorithm used in their sentencing
C) That using COMPAS in sentencing was permissible as one factor among many, even without access to the algorithm
D) That the trade secrecy protection for COMPAS was invalid because public safety concerns outweigh commercial interests

Answer: C

Rationale: The court held that the sentence did not rely solely on COMPAS and that the tool was one factor among many. The court did not require disclosure of the algorithm.


3. The "Rashomon effect" in statistical modeling refers to:

A) The phenomenon where models produce different outputs for similar inputs, revealing instability
B) The existence of many models with similar predictive accuracy but very different internal logic
C) The tendency of more complex models to overfit to training data while underperforming on new data
D) The misleading effect of post-hoc explanations that misrepresent what a model actually does

Answer: B

Rationale: Breiman (2001) introduced the term "Rashomon effect" to statistical modeling to describe the multiplicity of equally accurate but logically distinct models. This phenomenon is central to the argument that black-box models are not always necessary for high accuracy.
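The multiplicity Breiman described can be shown with a toy example. The data and rules below are entirely made up for illustration: two "models" that rely on disjoint features achieve identical accuracy on the same records.

```python
# Toy illustration of the Rashomon effect: two models with different
# internal logic achieve identical accuracy on the same data.
# Hypothetical records: (income_high, prior_defaults_low, repaid).
records = [
    (1, 1, 1), (1, 1, 1), (1, 0, 1), (0, 1, 1),
    (0, 0, 0), (0, 0, 0), (1, 0, 0), (0, 1, 0),
]

def model_a(income_high, prior_defaults_low):
    # Model A's logic: predict repayment from income alone.
    return income_high

def model_b(income_high, prior_defaults_low):
    # Model B's logic: predict repayment from default history alone.
    return prior_defaults_low

def accuracy(model):
    hits = sum(model(x1, x2) == y for x1, x2, y in records)
    return hits / len(records)

print(accuracy(model_a), accuracy(model_b))  # both 0.75
```

Both models score 75% on these records, yet one would attribute outcomes to income and the other to default history, so which model is deployed determines which explanation a decision subject hears.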


4. Which of the following correctly describes the purpose of SHAP values in AI explainability?

A) They retrain a simpler, interpretable model on the same data as the original complex model
B) They create a locally faithful approximation of the model's behavior around a specific prediction
C) They assign each input feature a contribution value for a specific prediction, based on game-theoretic principles
D) They audit a model's outputs across demographic groups to identify patterns of disparate impact

Answer: C

Rationale: SHAP (SHapley Additive exPlanations) draws on Shapley values from cooperative game theory to assign each feature a contribution to a specific prediction. LIME (B) creates local approximations.
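The game-theoretic idea can be made concrete with an exact Shapley computation on a toy model. This is a sketch of the underlying principle, not the shap library; the model, baseline, and feature names are hypothetical.

```python
from itertools import combinations
from math import factorial

# Sketch of the Shapley-value idea behind SHAP (not the shap library).
# A feature "absent" from a coalition is replaced by a baseline value.
baseline = {"income": 40, "debt": 10}   # hypothetical population averages
instance = {"income": 80, "debt": 20}   # the prediction being explained

def model(income, debt):
    return 0.5 * income - 1.0 * debt    # a toy linear scoring model

def value(coalition):
    # Model output when only the coalition's features take the
    # instance's values; the rest stay at the baseline.
    args = {f: (instance[f] if f in coalition else baseline[f])
            for f in baseline}
    return model(**args)

def shapley(feature):
    others = [f for f in baseline if f != feature]
    n, total = len(baseline), 0.0
    for k in range(len(others) + 1):
        for subset in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += weight * (value(set(subset) | {feature})
                               - value(set(subset)))
    return total

contributions = {f: shapley(f) for f in baseline}
# Contributions sum to value(all features) - value(empty coalition).
print(contributions)  # {'income': 20.0, 'debt': -10.0}
```

The additivity property shown in the final comment is what makes Shapley-based attributions attractive for explanation: the per-feature contributions exactly account for the gap between the prediction and the baseline.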


5. The Equal Credit Opportunity Act (ECOA) requires lenders to:

A) Use only interpretable models for credit decisions
B) Provide adverse action notices stating the principal reasons for credit denial
C) Disclose the full algorithm used for credit scoring upon applicant request
D) Submit all credit scoring models to the CFPB for review before deployment

Answer: B

Rationale: ECOA and Regulation B require adverse action notices with specific factors. ECOA does not require model disclosure or mandate interpretable models.


6. Which statement about the accuracy-interpretability trade-off is most consistent with Rudin's (2019) argument?

A) The trade-off is real and unavoidable in almost all domains; opacity should be accepted as the cost of AI performance
B) The trade-off is real in complex perception tasks but is often exaggerated in high-stakes tabular data applications where interpretable models can match black-box accuracy
C) Interpretable models are always equally accurate to black-box models, and there is no genuine trade-off
D) Post-hoc explanation methods eliminate the practical significance of the trade-off in all domains

Answer: B

Rationale: Rudin argues the trade-off is particularly exaggerated in tabular data applications like credit and recidivism prediction. She does not claim interpretable models always match black-box accuracy in all domains (C).


7. The EU Digital Services Act (DSA) requires "Very Large Online Platforms" to:

A) Open-source their recommendation algorithms
B) Conduct annual algorithmic risk assessments and provide researcher data access
C) Optimize their recommendation algorithms for user-defined metrics rather than engagement
D) Submit recommendation algorithm changes to regulatory review before deployment

Answer: B

Rationale: The DSA requires risk assessments, independent audits, and structured researcher access, among other requirements. It does not require open-sourcing (A) or pre-deployment review of algorithm changes (D).


8. The "accountability gap" described in Chapter 13 refers to:

A) The difference in AI governance standards between developed and developing countries
B) The gap between what AI systems can do and what their developers understand about their behavior
C) The space that opens when no human is in a position to take responsibility for an AI-driven decision
D) The discrepancy between an organization's stated AI ethics commitments and its actual practices

Answer: C

Rationale: The accountability gap specifically refers to the diffusion or evaporation of human responsibility that occurs when AI systems make consequential decisions that no individual human can fully explain or own.


Part II: True / False (5 questions, 4 points each = 20 points)

9. GDPR Article 22 provides an absolute right for any person to prevent any automated decision from being made about them by any organization.

Answer: FALSE

Rationale: Article 22 provides a right not to be subject to automated decisions that significantly affect the individual, but this right has exceptions: when the individual has explicitly consented, when the decision is necessary for a contract, or when it is authorized by Union or Member State law. It is not an absolute right.


10. The Public Safety Assessment (PSA) developed by the Arnold Foundation uses a fully published algorithm, meaning the specific variables and weights used in scoring are publicly available.

Answer: TRUE

Rationale: The PSA's methodology, including the nine criminal history variables and their weights, is published in peer-reviewed literature and on the Arnold Ventures website. This contrasts with COMPAS, whose precise algorithm is proprietary.


11. LIME (Local Interpretable Model-agnostic Explanations) provides an exact mathematical explanation of why a deep neural network made a specific prediction.

Answer: FALSE

Rationale: LIME provides a local approximation of model behavior around a specific prediction using a simpler, interpretable model. It is an approximation, not an exact explanation of the neural network's internal computation.
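The local-approximation idea can be sketched in a few lines. This is an illustration of the principle, not the lime library: a hypothetical nonlinear function stands in for the black box, and a proximity-weighted linear model is fitted to perturbations around one instance.

```python
import numpy as np

# Sketch of the local-surrogate idea behind LIME (not the lime library):
# approximate a nonlinear "black box" near one instance with a
# proximity-weighted linear model.
rng = np.random.default_rng(0)

def black_box(x):
    # A hypothetical nonlinear model standing in for a neural network.
    return np.sin(x[:, 0]) + x[:, 1] ** 2

instance = np.array([0.5, 1.0])

# 1. Sample perturbations around the instance of interest.
samples = instance + rng.normal(scale=0.1, size=(500, 2))
preds = black_box(samples)

# 2. Weight each sample by proximity to the instance (RBF kernel).
dists = np.linalg.norm(samples - instance, axis=1)
weights = np.exp(-(dists ** 2) / 0.02)

# 3. Fit a weighted linear surrogate via least squares.
X = np.column_stack([np.ones(len(samples)), samples - instance])
sw = np.sqrt(weights)
beta, *_ = np.linalg.lstsq(X * sw[:, None], preds * sw, rcond=None)

# beta[1:] approximates the local slopes [cos(0.5), 2.0] -- a faithful
# description near the instance, not of the model globally.
print(beta)
```

The surrogate's coefficients describe the black box only in the neighborhood of this one instance, which is exactly why LIME's output is an approximation rather than an exact account of the network's computation.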


12. Trade secrecy protections under US law have never been successfully invoked to prevent defendants from scrutinizing the algorithms used in their criminal cases.

Answer: FALSE

Rationale: The Loomis case is precisely an instance where trade secrecy was effectively invoked to prevent full scrutiny of the COMPAS algorithm, and the Wisconsin Supreme Court upheld the use of the tool despite the defendant's inability to examine it.


13. According to internal Facebook research revealed by the Frances Haugen disclosure, Facebook was unaware that its engagement-optimization algorithm tended to amplify divisive content until this research was published externally.

Answer: FALSE

Rationale: The Haugen documents showed that Facebook's own internal researchers had documented the algorithm's tendency to amplify divisive content and had presented these findings to senior leadership. The company was aware of the phenomenon internally; the opacity of the platform prevented external observers from documenting it.


Part III: Short Answer (4 questions, 8 points each = 32 points)

14. Explain the distinction between "technical opacity" and "institutional opacity" in AI systems. Why is this distinction important for governance and policy? Provide one original example of each type (not from the chapter text). (4–6 sentences)

Model Answer: Technical opacity arises from the inherent mathematical complexity of models like deep neural networks, whose internal computations cannot easily be translated into human-readable reasoning even by their developers. Institutional opacity arises not from technical necessity but from organizational choice — a company or government agency that could explain AI decisions but chooses not to, typically through trade secrecy or competitive sensitivity claims. The distinction matters for governance because the appropriate response to each is different: technical opacity may require investments in interpretable model design or post-hoc explanation methods, while institutional opacity primarily requires legal and regulatory interventions compelling disclosure. An example of technical opacity: a hospital's deep learning model for predicting sepsis risk produces predictions through complex feature interactions that even the model's developers cannot fully trace for a specific patient. An example of institutional opacity: a credit scoring company uses a logistic regression model whose formula could readily be explained to applicants, but declines to do so to protect its proprietary scoring methodology.


15. What is the "accountability gap" in AI governance, and what organizational practices does Chapter 13 recommend for closing it? Be specific. (4–6 sentences)

Model Answer: The accountability gap is the space that opens when AI systems make consequential decisions that no human being can fully explain, and as a result no individual or organizational unit can be meaningfully held responsible for outcomes. It arises because AI diffuses responsibility across multiple actors — the vendor, the deploying organization, the frontline worker, the supervisor — each of whom can point to others when harm occurs. Chapter 13 recommends closing this gap through several specific practices: requiring that human reviewers have the information and time to genuinely evaluate AI recommendations rather than rubber-stamping them; maintaining documentation sufficient to reconstruct the basis for consequential decisions; designating specific individuals and units as accountable for AI-influenced decisions, with real consequences for failures; and preserving audit trails that enable retrospective review when decisions are challenged. The key principle is that organizational accountability should be treated as a design requirement for AI-driven decision systems, not an afterthought.


16. Describe the key findings of ProPublica's analysis of COMPAS scores in Broward County, Florida. What can external audit of AI systems through output data establish, and what does it leave unanswered? (4–6 sentences)

Model Answer: ProPublica analyzed COMPAS scores for more than 7,000 people in Broward County, comparing predicted risk with actual two-year recidivism outcomes, and found that Black defendants were approximately twice as likely as white defendants to be falsely labeled high-risk (high false positive rate), while white defendants who went on to reoffend were more likely to have been labeled low-risk (high false negative rate for white defendants). This finding sparked a national debate about racial bias in algorithmic criminal justice tools. The ProPublica investigation demonstrates what external audit through output data can establish: it can document disparate outcomes across demographic groups with statistical confidence. However, output-based audit leaves the causal mechanism unanswered: it cannot determine whether the disparity originates in the algorithm's logic, in biased training data, in the correlation of the algorithm's inputs with race, or some combination. Without access to the model's internals, external audit can describe the symptom but cannot reliably diagnose the cause or prescribe effective remediation.
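The kind of output-based audit described in the model answer can be sketched with a few records. The data below are made-up illustrative values, not ProPublica's; the point is that false positive rates per group are computable from outcomes alone, without any access to the model.

```python
# Sketch of an output-based audit: given only (group, predicted_high_risk,
# reoffended) outcome records, compute false positive rates by group.
# These records are made-up illustrative data, not ProPublica's.
records = [
    ("A", True, False), ("A", True, False), ("A", True, True),
    ("A", False, False), ("A", False, True),
    ("B", True, False), ("B", False, False), ("B", False, False),
    ("B", True, True), ("B", False, True),
]

def false_positive_rate(group):
    # Among people in the group who did NOT reoffend, the share
    # wrongly labeled high risk.
    non_reoffenders = [r for r in records if r[0] == group and not r[2]]
    flagged = [r for r in non_reoffenders if r[1]]
    return len(flagged) / len(non_reoffenders)

for g in ("A", "B"):
    print(g, false_positive_rate(g))
```

In this toy data, group A's false positive rate is twice group B's. The audit documents the disparity, but, as the model answer notes, nothing in this computation reveals whether the cause lies in the model's logic, its training data, or its inputs.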


17. Chapter 13 describes five reasons why opacity in AI matters — beyond the simple fact that the model cannot be read. Choose any three of these reasons, explain each in your own words, and give a brief, concrete example of each. (5–7 sentences)

Model Answer (sample — multiple valid answers exist): First, opacity undermines due process: when a government uses an algorithm to impose a consequence on a person, and the algorithm cannot be scrutinized, the person cannot challenge the basis for the decision. A concrete example is the Loomis case, where a defendant could not contest the logic behind his COMPAS score even though it influenced his sentence. Second, opacity obstructs error correction: systems that cannot be understood cannot be effectively fixed when they fail. If a Medicaid eligibility algorithm incorrectly denies benefits to a class of applicants but the algorithm is opaque, administrators cannot identify the source of the error, assess its scope, or design a correction — they can only observe that something is wrong. Third, opacity undermines trust: when institutions make decisions through processes that neither the institutions nor affected individuals can explain, the legitimacy of those institutions is eroded. If a hospital patient learns that her treatment plan was shaped by an algorithm that no one can explain, her trust in the hospital's care is reasonably diminished, regardless of whether the algorithm's recommendation was correct.


Part IV: Applied Scenario (3 questions, 16 points total — 4+6+6)

Scenario: Meridian County uses a commercial AI system called "CareScore" to determine eligibility and care levels for Medicaid-funded home health services. CareScore was developed by a private vendor and is used in 12 states. The algorithm is proprietary. Meridian County's Department of Social Services has received complaints from several advocacy organizations that elderly and disabled recipients have had their care hours significantly reduced since CareScore was introduced, often without clear explanation. When caseworkers are asked why a recipient's care hours were reduced, they report that "the system determined the level" but cannot explain how.


18. (4 points) What type(s) of opacity does this scenario exhibit? Identify the specific forms of opacity present and explain why each applies.

Model Answer: This scenario exhibits at least two forms of opacity. Institutional opacity is present because CareScore's algorithm is proprietary — the vendor has chosen not to disclose its methodology. The county may or may not be able to obtain more information than it currently shares, but the vendor's trade secrecy claim is a deliberate organizational choice. Opacity to users is also present: the affected Medicaid recipients do not have meaningful information about how AI is determining their care levels; they receive a decision but not the reasoning behind it. There may also be elements of technical opacity if the underlying model is complex enough that even the vendor cannot easily explain specific decisions — though the scenario does not establish this.


19. (6 points) What due process concerns does this scenario raise, and under what legal framework might a recipient challenge the care reduction? Reference specific legal principles or precedents from Chapter 13.

Model Answer: This scenario raises serious due process concerns. Recipients of Medicaid home health services have a protected property interest in those benefits, and the government cannot reduce or terminate such benefits without procedurally adequate notice and an opportunity to be heard — requirements established in Goldberg v. Kelly (1970) and related cases. The Arkansas automated benefits case discussed in Chapter 13 is directly analogous: Arkansas's Medicaid algorithm reduced care hours without providing recipients with meaningful explanations, and a federal court found this violated due process. The same argument applies here: if the county cannot explain why CareScore reduced a particular recipient's hours, that recipient cannot meaningfully contest the reduction. An argument modeled on the Loomis due process claim could be raised: that using a proprietary algorithm to reduce government benefits without providing a basis for the reduction that the recipient can challenge is procedurally inadequate. Unlike Loomis, where the court was reluctant to impose transparency requirements on criminal sentencing, the benefits context may be more receptive to due process challenges, given existing doctrine on adequate notice in benefits termination cases.


20. (6 points) As a consultant brought in by the County, recommend three specific, actionable steps the county should take immediately to address the transparency and accountability problems in this scenario. Justify each step with reference to Chapter 13.

Model Answer:

Step 1: Require vendor disclosure of algorithm documentation as a condition of continued contract. The county should immediately invoke any audit rights in its vendor contract and demand documentation sufficient to explain how CareScore converts inputs into care-level determinations. If the current contract does not include adequate audit rights, the county should make this a condition of any contract renewal. This addresses institutional opacity by requiring the vendor to justify its claims of trade secrecy and provide at minimum a documented proprietary explanation available to county technical staff. Chapter 13's Section 13.5 establishes that institutional opacity is often a governance choice, and that procurement is the appropriate moment to require transparency.

Step 2: Establish a meaningful human review process for contested care reductions. CareScore's recommendations should be treated as recommendations, not final decisions. Caseworkers should be equipped with enough information about the factors CareScore considered to explain reductions in plain language to recipients, and a supervisor with appropriate technical knowledge should be designated to review contested cases. This addresses the accountability gap (Section 13.8) by ensuring that a human being — not the algorithm — is the formal decision-maker and can be held accountable for the outcome.

Step 3: Suspend CareScore reductions pending an independent fairness audit. The county should commission an external audit of CareScore's outcomes, examining whether care hour reductions are distributed equitably across demographic groups (age, disability type, race, language, geographic area). This audit should be conducted using the output-based methods described in Section 13.6, even in the absence of model access, to identify whether the algorithm is producing disparate outcomes for identifiable populations. If disparities are found, they should be disclosed and remediation required before the system is redeployed.


End of Chapter 13 Quiz


Answer Key Summary

Q    Answer    Q     Answer
1    C         11    FALSE
2    C         12    FALSE
3    B         13    FALSE
4    C         14    See model answer
5    B         15    See model answer
6    B         16    See model answer
7    B         17    See model answer
8    C         18    See model answer
9    FALSE     19    See model answer
10   TRUE      20    See model answer