
Chapter 13: The Black Box Problem


Opening: The Man Whose Score Was a Secret

In 2013, a man named Eric Loomis was arrested in La Crosse, Wisconsin, following a drive-by shooting. He had not fired the weapon, but he was convicted of attempting to flee an officer and operating a vehicle without the owner's consent. At sentencing, Judge Scott Horne consulted a risk assessment generated by a software tool called COMPAS — Correctional Offender Management Profiling for Alternative Sanctions. The COMPAS system had assigned Loomis a high risk score. The judge, citing both the presentence report and the COMPAS score, sentenced Loomis to six years in prison — the maximum available.

Loomis's lawyers appealed. They argued that sentencing him partly on the basis of a proprietary algorithm whose workings he could not inspect or challenge violated his constitutional right to due process. The Wisconsin Supreme Court disagreed. In State v. Loomis (2016), the court held that while COMPAS had been considered, the sentence was not based solely on the score, and that using the tool did not violate due process. The decision left a troubling question unanswered: if a government imposes consequences on a person partly based on a secret calculation, what does "due process" mean?

The company that makes COMPAS, then called Northpointe and now Equivant, has refused to disclose its full algorithm, citing trade secrecy. Defendants have no way to examine whether the inputs are accurate, whether the algorithm's logic is legally defensible, or whether it contains biases that affect their scores. They receive a number and a category label — high, medium, or low risk — and that is all.

This is the black box problem in its most consequential form: an algorithm that determines, at least in part, whether a person goes to prison, and no one outside the company — not the defendant, not the defense attorney, not the judge, not independent researchers — can fully audit how it works.

This chapter is about that problem. It is not only a problem in criminal justice. The same dynamic operates in healthcare, credit, employment, insurance, public benefits, content moderation, and many other domains where AI-driven decisions shape human lives. The black box problem is not merely a technical puzzle. It is a governance challenge that goes to the heart of accountability, due process, and the ability of democratic institutions to oversee consequential decisions made about citizens.


Learning Objectives

By the end of this chapter, students will be able to:

  1. Define "black box AI" and distinguish between the three primary forms of opacity: technical opacity, institutional opacity, and opacity to users.
  2. Explain why interpretability matters in high-stakes AI applications, using examples from criminal justice, healthcare, credit, and employment.
  3. Describe the spectrum of AI model interpretability and assess where different types of models fall on that spectrum.
  4. Analyze the legal and ethical dimensions of State v. Loomis and explain its implications for due process in the age of algorithmic governance.
  5. Differentiate between internal model explanations and post-hoc explanation methods, and explain the limits of each.
  6. Evaluate the business case for AI transparency and explain how opacity creates organizational liability and risk.
  7. Identify sector-specific transparency obligations for AI systems in financial services, healthcare, criminal justice, and employment.
  8. Apply the concept of the "accountability gap" to real-world scenarios where AI opacity diffuses human responsibility.

13.1 What Is a "Black Box" in AI?

The term "black box" comes from engineering and systems theory. It describes a device or system whose internal workings are not visible to the observer. You see what goes in and you see what comes out, but the transformation that occurs in between is hidden. Engineers studying airplane crashes use flight data recorders, which are colloquially called black boxes, to understand what happened inside the system when external observation was impossible.

In the context of artificial intelligence, a black box refers to any AI model or system where the reasoning process — the path from input to output — is not transparent, interpretable, or legible to stakeholders. An AI system is a black box when you can observe that a loan applicant was denied, but you cannot trace precisely why: which features mattered, how they were weighted, and what causal logic the model applied.

The Three Kinds of Opacity

Opacity in AI is not a single phenomenon. It is useful to distinguish among three distinct types, which have different causes and require different solutions.

Technical opacity arises from the inherent mathematical complexity of modern machine learning models. Deep neural networks, for instance, may contain hundreds of millions of parameters organized into dozens of layers of computation. When such a network classifies an image, identifies a fraudulent transaction, or assigns a recidivism risk score, the output emerges from an extraordinarily intricate sequence of matrix multiplications and non-linear transformations. Even the engineers who built the model cannot easily trace the reasoning from a specific input to a specific output. The behavior emerges from the model's structure, but that structure is not a set of human-readable rules — it is a numerical artifact, optimized through exposure to training data, whose internal logic resists straightforward explanation.

This is not a bug in the usual sense. These models are often extraordinarily capable precisely because they can capture complex patterns in high-dimensional data that simpler, more interpretable models cannot. A logistic regression model that predicts hospital readmissions on the basis of five clinical variables is easy to understand — and may be far less accurate than a deep learning model trained on thousands of variables including imaging, notes, lab values, and medication histories. The complexity that makes the system powerful is also what makes it opaque.

Institutional opacity arises not from technical necessity but from organizational choice. This is the opacity of companies and governments that could explain their AI decisions but choose not to. Trade secrecy is the most common shield. Northpointe's refusal to disclose the COMPAS algorithm is a paradigm case: the company can explain how the system works; it simply won't, because doing so would expose intellectual property. Similarly, many large technology companies use algorithmic systems to make consequential decisions about users — ranking content, determining advertising eligibility, flagging accounts for suspension — without disclosing the criteria or logic involved. This is opacity as competitive strategy, or in some cases as a way to avoid accountability for decisions that stakeholders might object to if they understood how they were made.

It is critical to understand that institutional opacity and technical opacity often coexist and are sometimes conflated to the company's advantage. A firm may argue that its system is too technically complex to explain when, in fact, the relevant decisions could be explained in terms a layperson could understand — the firm simply prefers not to explain them.

Opacity to users is perhaps the most pervasive and underappreciated form. This is the opacity experienced by people who are subject to AI-driven decisions without knowing it. A social media user who sees certain posts and not others may not know that an algorithm has curated their feed based on engagement predictions. A job applicant whose resume was screened by an automated system before any human ever saw it may not know that AI had a role. A patient whose treatment path is shaped by a clinical decision support tool may not know the tool is influencing their care. In each case, the person cannot meaningfully engage with, question, or contest the AI's role because they do not know it is playing one.

Why Modern Machine Learning Is More Opaque

Rule-based AI systems — sometimes called "expert systems" or "GOFAI" (Good Old-Fashioned AI) — were often interpretable by design. They encoded explicit human knowledge in the form of logical rules: IF the patient has elevated troponin AND reports chest pain THEN flag for cardiac evaluation. These systems were transparent because a human expert wrote down the rules, and any stakeholder could read them.

Modern machine learning takes a fundamentally different approach. Rather than encoding explicit rules, these systems learn patterns from data. The model's logic is not written — it is induced. What the model "knows" is distributed across thousands or millions of numerical parameters. No one wrote a rule that says "a criminal history of more than three arrests combined with an unstable residential history and a certain zip code adds 2.3 points to the risk score." The model arrived at whatever weighting it uses through an optimization process across training examples, and the resulting logic — if it can be called logic — is not legible as human reasoning.
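The contrast between written and induced rules can be made concrete in a toy example: no one writes the weight below; an optimizer finds it. This is a minimal pure-Python sketch on synthetic data — every number is illustrative, and real systems have millions of such parameters rather than two.

```python
import math

# Toy training data: no human writes a rule mapping x to y;
# the relationship must be induced from the examples themselves.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = 0.0, 0.0   # parameters start at arbitrary values
lr = 0.5          # learning rate

# Plain gradient descent on the logistic loss.
for _ in range(2000):
    gw = gb = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        gw += (p - y) * x
        gb += (p - y)
    w -= lr * gw / len(data)
    b -= lr * gb / len(data)

# The "rule" the model applies is just a number the optimizer found.
print(f"learned weight {w:.2f}, bias {b:.2f}")
```

With two parameters the induced logic is still legible; the legibility crisis begins when this same process produces millions of weights whose joint behavior no one can read.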

The shift from symbolic to statistical AI has produced remarkable performance gains. It has also produced a fundamental legibility crisis, with profound implications for governance and accountability.

The Accuracy-Interpretability Trade-Off

It is commonly argued that there is an inherent trade-off between model accuracy and model interpretability — that to get the best predictions, you must accept opacity. This claim has significant policy implications, so it deserves scrutiny.

There are certainly domains where the trade-off appears real: image recognition, protein structure prediction, and some complex natural language processing tasks. In these domains, deep learning models substantially outperform simpler alternatives, and no interpretable model has come close to matching their performance.

But in many high-stakes decision-making domains — credit scoring, recidivism prediction, clinical risk stratification — the evidence that complex black-box models substantially outperform well-designed interpretable models is weaker than commonly assumed. Cynthia Rudin at Duke University has argued forcefully that in these domains, the accuracy-interpretability trade-off is largely a myth perpetuated by practitioners who reach for off-the-shelf black-box models rather than investing in the more painstaking work of building interpretable alternatives. We will return to this important argument in Section 13.3.


Vocabulary Builder

  • Black box: An AI system whose internal decision-making process is not visible or interpretable to external stakeholders.
  • Interpretability: The degree to which a human can understand the internal mechanisms of a model — why it produces the outputs it does.
  • Explainability: Broader than interpretability; refers to the ability to describe AI behavior in terms humans can understand, including through post-hoc approximation methods.
  • Opacity: The state of being non-transparent; the opposite of legibility.
  • Transparency: The quality of being open and understandable; in AI, refers to accessible documentation of how systems work and how decisions are made.
  • Model complexity: The number and structure of parameters in a model; higher complexity models are generally less interpretable but may be more accurate.

13.2 Why Opacity Matters — The Stakes

Opacity in AI is not merely a technical inconvenience. It produces concrete harms and creates structural deficits in accountability that ripple across the institutions that use AI systems. To understand why opacity matters, it helps to work through the specific values it threatens.

Due Process

Due process is the legal principle that the government must not deprive a person of life, liberty, or property without fair procedures. In the algorithmic age, this principle faces a new challenge: what does "fair procedure" mean when the decision-maker is an algorithm whose reasoning cannot be inspected or challenged?

The challenge is not hypothetical. Criminal sentencing, parole determinations, child welfare investigations, immigration adjudications, and public benefits determinations — all domains where the state wields enormous power over individuals — are being increasingly influenced by algorithmic tools. If a person cannot know how an algorithm reached its conclusion about them, how can they challenge errors in the inputs, contest biased assumptions embedded in the training data, or argue that the algorithm's logic is inappropriate for their circumstances?

The Loomis case illustrates the problem starkly. Loomis could challenge the facts in the presentence report — the documented record of his history and circumstances — but he could not challenge the algorithmic logic that transformed those facts into a risk score. The score was treated as authoritative, and its derivation was a trade secret. That is not due process in any meaningful sense.

Equal Protection

The equal protection principle holds that similarly situated individuals must be treated similarly by the law, and that government may not discriminate on the basis of race, sex, or other protected characteristics. Opaque AI systems make equal protection violations extraordinarily difficult to detect and prove.

If an algorithm systematically assigns higher risk scores to Black defendants than to similarly situated white defendants — as ProPublica's 2016 analysis of COMPAS data suggested — the opacity of the algorithm makes it nearly impossible to determine why. Is the algorithm using race directly? (Most claim not to.) Is it using proxies for race — zip codes, school quality indicators, neighborhood characteristics — that correlate with race but appear facially neutral? Are the biases embedded in the historical criminal justice data that trained the model? Without access to the model's internals, researchers can observe disparate outcomes but often cannot trace the mechanism producing them. You can document that the system produces racially disparate results; you cannot easily explain why, or fix it.
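The proxy mechanism can be illustrated with entirely synthetic numbers: a feature that never mentions group membership can still track it closely. In the sketch below, "zip_risk" is a hypothetical, facially neutral neighborhood feature, and the data are invented for illustration only.

```python
import math
import statistics

# Entirely synthetic illustration: "zip_risk" is a made-up,
# facially neutral neighborhood feature; "group" is the protected
# attribute that the model is never shown.
group    = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
zip_risk = [1, 2, 1, 2, 3, 6, 7, 6, 8, 7]

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(group, zip_risk)
print(f"correlation between withheld attribute and 'neutral' feature: {r:.2f}")
# A model trained on zip_risk alone can reproduce group disparities
# without ever being shown the protected attribute.
```

This is why "we don't use race as an input" does not settle the equal protection question: the information can enter through the features the model does see.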

Informed Consent

Informed consent — the principle that people should understand and agree to consequential decisions affecting them — is rendered largely meaningless when AI is involved. How can a patient meaningfully consent to a care pathway shaped by an opaque clinical algorithm? How can a credit applicant meaningfully assess whether to provide personal information if they don't know how it will be used? The concept of consent presupposes the ability to understand what one is consenting to. Opacity severs that connection.

Error Correction

Systems that cannot be understood cannot be effectively corrected. If a model makes errors — and all models do — understanding why requires understanding how. The black box structure of complex AI systems makes it difficult to identify the source of errors, assess their scope, and design corrections. A data scientist who knows that a model is producing inaccurate risk scores for a specific demographic subpopulation but cannot examine how the model processes that group's features is in a very difficult position: they can observe the problem but cannot easily fix it.

Trust and Institutional Legitimacy

Public trust in institutions — courts, hospitals, financial institutions, government agencies — depends in part on the legibility of how those institutions make decisions. When institutions outsource decisions to systems that neither the institutions nor the affected parties can explain, they undermine the basis of that trust. Research consistently shows that people are more willing to accept AI-assisted decisions when those decisions can be explained, even if the explanation is not perfectly complete. Opacity is not merely a legal or ethical problem; it is a practical threat to institutional legitimacy.

The Professional Responsibility Problem

Physicians, attorneys, and judges bear professional and ethical obligations to exercise independent judgment on behalf of the people they serve. Each of these professions has developed extensive norms about when and how to rely on external tools and opinions. Can a physician responsibly rely on a clinical decision support tool that she cannot evaluate or challenge? Can a judge responsibly impose a sentence that is influenced by an algorithm whose logic she cannot scrutinize? Can a defense attorney adequately represent a client whose risk score was generated by a black box?

These questions are not rhetorical. The answer that these professions are beginning to work toward — usually in the form of professional guidelines and ethics opinions — is that professionals have an obligation to understand, to a reasonable degree, the tools they use in ways that affect the people they serve.

Domain-by-Domain Stakes

The stakes of AI opacity vary in severity across domains, but they are significant in every high-stakes application area:

Criminal justice: Opaque risk assessment tools influence pretrial detention, sentencing, and parole decisions. The stakes are liberty deprivation — a fundamental interest protected by the Constitution. Errors or biases can result in people being imprisoned who should not be, or released who pose genuine public safety risks.

Healthcare: Clinical decision support tools and diagnostic algorithms inform decisions about diagnosis, treatment, and resource allocation. Opaque systems may embed biases based on historical health disparities, produce errors that go undetected because physicians trust the algorithm's authority, and make it difficult to identify why certain patient populations receive inferior care recommendations.

Credit and insurance: Opaque scoring models determine access to credit, the interest rates charged, and insurance premium levels. Because wealth-building depends heavily on credit access, systematic bias hidden behind opacity in these systems can perpetuate or amplify economic inequality across generations.

Employment: AI-driven applicant tracking systems screen resumes and rank candidates before any human review. Hiring and promotion algorithms determine whose careers advance. Workers denied opportunities by opaque systems have no way to understand or contest those decisions.

Public benefits: Automated systems in some states determine eligibility for Medicaid, food assistance, and housing support. When these systems deny or terminate benefits — sometimes in error — the opacity of the system's logic makes it difficult for caseworkers to explain the decision, and difficult for affected individuals to appeal effectively.

Content moderation: Social media platforms make hundreds of millions of decisions daily about which content is amplified, reduced, or removed. These decisions shape public discourse, affect political speech, and determine whether communities are able to communicate freely. The opacity of content moderation systems makes meaningful accountability nearly impossible and enables inconsistent, unexplained censorship.


13.3 Degrees of Interpretability

Opacity is not binary. AI systems exist on a spectrum of interpretability, and understanding where different systems fall on that spectrum is essential for making policy and procurement decisions.

Fully Interpretable Models

At the transparent end of the spectrum are models whose reasoning can be directly read and understood:

Linear regression models express the predicted outcome as a weighted sum of input features. Each feature's weight represents its marginal contribution to the prediction, holding all other features constant. A model predicting hospital readmission risk from patient age, number of prior admissions, and primary diagnosis category is immediately legible: for each additional prior admission, the readmission probability increases by X percentage points. The logic is transparent, mathematically precise, and fully auditable.
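The legibility claim can be made concrete. In the sketch below, the explanation of a prediction is simply the model's own terms — the coefficients are invented for illustration and are not a validated clinical model.

```python
# Every coefficient is a readable marginal effect. All numbers
# here are invented for illustration, not a validated clinical model.
INTERCEPT = 0.05
COEFFS = {
    "age_over_65":       0.04,  # +4 points of risk if over 65
    "prior_admissions":  0.06,  # +6 points per prior admission
    "chronic_diagnosis": 0.10,  # +10 points for a chronic primary diagnosis
}

def readmission_risk(patient):
    """Prediction = intercept + sum of (weight * feature value)."""
    return INTERCEPT + sum(w * patient[k] for k, w in COEFFS.items())

def explain(patient):
    """The explanation IS the model: one term per feature."""
    return {k: w * patient[k] for k, w in COEFFS.items()}

p = {"age_over_65": 1, "prior_admissions": 3, "chronic_diagnosis": 1}
print(readmission_risk(p))   # 0.05 + 0.04 + 0.18 + 0.10
print(explain(p))
```

Note that explain() requires no approximation machinery: the model's prediction and its explanation are the same arithmetic.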

Decision trees represent the model's logic as a series of if-then rules organized in a tree structure. At each node, the algorithm asks a question about an input feature; the answer determines which branch to follow; the final leaf node delivers the prediction. A decision tree for loan approval might follow: IF income > $60,000 AND debt-to-income ratio < 0.4 AND no delinquencies in last 24 months THEN approve. Every path through the tree is a readable rule.
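The quoted approval path translates directly into code. The thresholds below come from the text's example and are illustrative; the point is that every decision carries its reason.

```python
# The approval path quoted above, written out directly; thresholds
# come from the text's example and are illustrative.
def approve_loan(income, debt_to_income, delinquencies_24mo):
    """Every branch is a readable rule, so every decision
    can be quoted back to the applicant with its reason."""
    if income <= 60_000:
        return False, "income at or below $60,000"
    if debt_to_income >= 0.4:
        return False, "debt-to-income ratio at or above 0.4"
    if delinquencies_24mo > 0:
        return False, "delinquencies within the last 24 months"
    return True, "all conditions met"

print(approve_loan(72_000, 0.31, 0))
print(approve_loan(72_000, 0.55, 0))
```

An applicant denied by this tree receives exactly the rule that fired — the kind of adverse-action reason that opaque models cannot produce directly.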

Rule lists and scoring systems are ordered sets of conditions — sometimes built by methods such as SLIM (Supersparse Linear Integer Models) — that assign explicit priorities or small integer point values. They are often more practically interpretable than decision trees because their outputs resemble the checklists and point-based clinical scores that physicians and other experts are already accustomed to reading.
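A scoring system of this kind fits in a few lines. The conditions and point values below are invented for illustration — this is not a validated instrument.

```python
# A SLIM-style integer scorecard. The conditions and point values
# are invented for illustration -- this is not a validated instrument.
RULES = [
    ("more than 3 prior arrests", lambda d: d["prior_arrests"] > 3,  2),
    ("age under 25",              lambda d: d["age"] < 25,           1),
    ("unstable housing",          lambda d: d["unstable_housing"],   1),
    ("employed full-time",        lambda d: d["employed"],          -1),
]
THRESHOLD = 2   # total points at or above this => flag for review

def score(person):
    fired = [(name, pts) for name, cond, pts in RULES if cond(person)]
    total = sum(pts for _, pts in fired)
    return total, total >= THRESHOLD, fired

d = {"prior_arrests": 5, "age": 22, "unstable_housing": False, "employed": True}
total, flagged, fired = score(d)
print(total, flagged)   # 2 + 1 - 1 = 2 points, at threshold
print(fired)            # exactly which conditions fired, and their points
```

Because the score is a sum of small integers, anyone can verify it by hand — which is precisely what makes such instruments contestable in ways a proprietary score is not.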

These fully interpretable models have an important advantage beyond legibility: when they make errors, analysts can usually understand why. The interpretability of the model makes diagnosis and correction tractable.

Partially Interpretable Models

In the middle of the spectrum are models that are more complex but for which explanation tools can provide meaningful insight:

Logistic regression with many features retains its core interpretability (linear log-odds relationship between features and outcome) but becomes harder to interpret as the number of features grows, especially if features interact.

Gradient boosted trees (such as XGBoost and LightGBM) are ensemble methods that combine many decision trees. No single tree is the "model," but explanation methods like SHAP (SHapley Additive exPlanations) can decompose the model's prediction for a specific observation into the contribution of each feature. This provides meaningful, locally interpretable explanations.

Post-Hoc Explainable Models

Deep neural networks — the architectures behind most state-of-the-art AI performance in perception, language, and complex prediction tasks — are not inherently interpretable. However, a set of post-hoc explanation methods has been developed that attempts to approximate the model's reasoning:

LIME (Local Interpretable Model-agnostic Explanations) creates a locally faithful approximation of the model around a specific prediction, using a simpler model that is interpretable. It tells you approximately which features were most influential for this particular prediction, even if it cannot explain the model's global logic.

SHAP values draw on cooperative game theory to assign each feature a contribution value for a specific prediction. SHAP values are theoretically grounded and can be applied to any model, providing both local (instance-level) and global (model-level) explanations.
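The game-theoretic idea behind SHAP can be computed exactly for tiny models by averaging a feature's marginal contribution over every ordering. The sketch below uses an invented three-feature value function; real SHAP libraries approximate this average efficiently rather than enumerating orderings.

```python
from itertools import permutations

# Exact Shapley attribution for a toy three-feature "model".
# v(S) is the model's output when only the features in S are known;
# the function itself is invented for illustration.
def v(present):
    out = 10.0                                   # baseline prediction
    if "income" in present:
        out += 5.0
    if "debt" in present:
        out -= 3.0
    if "income" in present and "debt" in present:
        out += 2.0                               # interaction term
    if "age" in present:
        out += 1.0
    return out

FEATURES = ["income", "debt", "age"]

def shapley(feature):
    """Average marginal contribution of `feature` over all orderings."""
    total, perms = 0.0, list(permutations(FEATURES))
    for order in perms:
        before = set()
        for f in order:
            if f == feature:
                total += v(before | {f}) - v(before)
                break
            before.add(f)
    return total / len(perms)

phi = {f: shapley(f) for f in FEATURES}
print(phi)   # attributions sum to v(all features) - v(baseline)
```

The interaction term is why averaging over orderings matters: a feature's contribution depends on which other features are already "known," and the Shapley value splits that interaction fairly between them.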

These methods are valuable, but they have important limitations. They provide approximations, not exact explanations. They may be unstable — providing different explanations for very similar inputs. And they explain what the model does, not why it learned to do it that way.

Fully Opaque Systems

Large modern neural networks — particularly large language models, image recognition systems, and complex multi-modal architectures — operate in a regime where even post-hoc methods provide limited insight. These models are so large and their internal representations so complex that explanation methods can only tell you about the model's surface behavior, not its internal structure. For a model with billions of parameters, SHAP values may explain the prediction, but they cannot explain whether the model has learned a spurious correlation, whether it relies on protected-class proxies, or whether its behavior will generalize appropriately to new populations.

The Rashomon Effect and the Trade-Off Question

In 2001, statistician Leo Breiman described what he called the "Rashomon effect" in statistical modeling: for most real-world prediction problems, there exist many models with very similar predictive accuracy but very different internal logic. The name comes from Kurosawa's film, in which the same events are described in incompatible ways by different witnesses — all of which are locally consistent but collectively irreconcilable.

Breiman's observation has profound implications for the accuracy-interpretability debate. If there are many models with similar accuracy, the choice among them is not determined by accuracy alone — and there may be excellent interpretable models among those near the accuracy frontier. The question becomes: are practitioners choosing complex models because they perform better, or because they are easier to implement with off-the-shelf tools and libraries?

Cynthia Rudin's landmark 2019 paper "Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead" made this argument with particular force. Examining several high-stakes domains, Rudin showed that carefully designed interpretable models could match the performance of black-box alternatives in many settings. The paper argued that the common practice of using a black-box model and then explaining it with post-hoc methods is unnecessary and potentially harmful — unnecessary because interpretable models can achieve similar accuracy, and harmful because post-hoc explanations can be misleading, may not accurately represent what the model actually does, and can create a false sense of understanding.

This is not a settled debate. In image recognition, natural language processing, and biological prediction tasks, the performance advantages of large neural networks over interpretable alternatives are substantial. But in high-stakes tabular data domains — where most consequential AI decisions about people are made — the case for insisting on interpretability before accepting opacity is much stronger than is often acknowledged.

The policy implication: For high-stakes decisions affecting human welfare, organizations should require evidence that black-box models provide substantial accuracy gains over well-designed interpretable alternatives before accepting opacity. The burden of proof should fall on those who propose opaque systems, not on those who demand interpretability.


13.4 The COMPAS Opacity Problem in Depth

The COMPAS system has become the central case study of AI opacity in criminal justice, and it warrants careful examination.

What COMPAS Is

COMPAS was developed by Northpointe (now Equivant) in the early 2000s as a case management and decision support tool for criminal justice agencies. It has been used for pretrial risk assessment, sentencing, and post-conviction supervision decisions in Wisconsin, California, New York, Florida, and many other jurisdictions. The system produces several risk scores — general recidivism, violent recidivism, and pretrial risk — along with a series of "criminogenic needs" subscores covering factors like criminal involvement history, substance abuse, and residential stability.

What Inputs COMPAS Uses

COMPAS draws on information from two primary sources: official records (criminal history, prior incarcerations, prior failures to appear) and a questionnaire administered to the defendant. The questionnaire covers items related to residential history, employment, family and social support, substance abuse, and social environment. Importantly, COMPAS does not directly input race as a variable. The company has stated explicitly that race is not used. However, many of the variables it does use — neighborhood stability, employment history, educational attainment — correlate with race because of historical patterns of racial inequality in employment, housing, and education.

What Northpointe Has Disclosed

Northpointe has published limited documentation about COMPAS. The company's "Practitioner's Guide" describes the conceptual domains the instrument covers and provides aggregate validation statistics — how well the scores predict recidivism in various populations. This guide explains the general logic of the scoring instrument in qualitative terms.

What Northpointe has not disclosed is the specific algorithm: the precise mathematical formula or decision logic by which responses to the questionnaire are converted into scores. The precise weights assigned to each item, the interactions among items, and the threshold logic for assigning individuals to risk categories remain proprietary.

The Due Process Argument in Loomis

Eric Loomis argued that his sentencing was unconstitutional because it relied on a proprietary tool he could not challenge. His attorneys raised several specific constitutional claims: that using COMPAS violated his right to be sentenced based on accurate information (because he couldn't verify the score's accuracy); that it violated his right to individualized sentencing (because the score was a population-level prediction applied to an individual); and that it violated equal protection because COMPAS scores may vary by race even when other factors are equal.

The Wisconsin Supreme Court rejected these arguments, holding that the judge had not relied exclusively on COMPAS, had access to extensive other information, and was aware of the instrument's limitations. The court held that COMPAS was permissible as one factor among many.

The court's reasoning is understandable — but it sidesteps the core problem. The question is not whether the judge used COMPAS as the sole determinant. The question is whether a government actor can impose a consequence on a person using a calculation that the person, and indeed the court itself, cannot fully scrutinize. The answer the Wisconsin Supreme Court gave is, effectively, yes — a troubling precedent.

The ProPublica Investigation

In 2016, ProPublica published an analysis of COMPAS scores for more than 7,000 people arrested in Broward County, Florida, comparing their COMPAS-predicted risk against their actual recidivism over two years. The investigation found that COMPAS was particularly likely to falsely flag Black defendants as future criminals (false positive rate approximately twice as high for Black defendants as for white defendants) and to incorrectly label white defendants as lower risk when they went on to reoffend (false negative rate roughly twice as high for white defendants). These findings ignited a national debate about algorithmic bias in criminal justice.
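The two error rates at the center of the ProPublica findings are straightforward to compute once outcomes are known — which is why an external outcome audit was possible without any access to the model. The counts below are synthetic, chosen only to make the metrics concrete; they are not the Broward County figures.

```python
# Group-wise error rates of the kind at issue in the ProPublica
# analysis. The counts below are synthetic, chosen only to make the
# two metrics concrete -- they are NOT the Broward County data.
groups = {
    "A": {"tp": 30, "fp": 40, "fn": 10, "tn": 20},
    "B": {"tp": 30, "fp": 20, "fn": 30, "tn": 20},
}

rates = {}
for name, c in groups.items():
    fpr = c["fp"] / (c["fp"] + c["tn"])   # flagged high risk, did not reoffend
    fnr = c["fn"] / (c["fn"] + c["tp"])   # labeled low risk, did reoffend
    rates[name] = (fpr, fnr)
    print(f"group {name}: false positive rate {fpr:.0%}, "
          f"false negative rate {fnr:.0%}")
# An audit of this kind needs only individual scores and outcomes --
# no access to the model's internals.
```

Note what the audit cannot do: it documents the disparity but cannot trace its mechanism, which is exactly the limitation the chapter describes.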

Northpointe disputed the analysis, and a subsequent academic debate established that the appropriate definition of fairness — and which statistical criteria a risk assessment instrument should satisfy — involves genuine trade-offs that cannot all be simultaneously satisfied. This is an important finding that we examine in depth in Chapter 15. But the ProPublica investigation was significant for another reason: it was only possible because journalists obtained individual-level data and conducted an external audit of outcomes. Without access to the model itself, that was the only audit available.

The Reform Movement

Several jurisdictions have moved to require greater transparency in the use of algorithmic risk assessment tools. New Jersey's bail reform movement, which largely eliminated cash bail in favor of risk-based release decisions, uses the Public Safety Assessment (PSA) developed by the Arnold Foundation — a tool that is significantly more transparent than COMPAS. Louisiana passed legislation in 2017 requiring disclosure of the algorithm used in any risk assessment tool deployed in criminal justice. California's Judicial Council has developed guidelines for the use of risk assessment instruments that emphasize transparency and limitations. These efforts represent meaningful progress, though they remain incomplete.


13.5 Institutional Opacity — When Transparency Is a Choice

One of the most important distinctions in the black box debate is between opacity that arises because a system genuinely cannot be explained and opacity that arises because an organization chooses not to explain it.

Trade Secrecy as an Accountability Shield

Trade secret law provides powerful protection for proprietary algorithms. In principle, this protection serves important economic functions: it encourages innovation by allowing companies to recoup their investments in developing valuable AI systems. In practice, trade secrecy has become a powerful tool for avoiding accountability. When Northpointe invokes trade secrecy to protect COMPAS, it is not merely protecting a competitive asset — it is preventing scrutiny of a system that the state uses to deprive people of liberty.

This creates a tension that existing law has not adequately resolved. Trade secret law and due process law exist in different legal regimes, and the courts have been reluctant to force a confrontation between them. The result is that proprietary vendors can effectively outsource government decision-making to systems that are immunized from accountability by intellectual property law.

Government AI Opacity

Institutional opacity is not only a problem with private vendors. Government agencies themselves often deploy AI systems without adequate transparency. Freedom of Information Act (FOIA) requests for documentation about government AI systems often return heavily redacted documents or generic procurement records that say little about how the systems actually work. Immigration enforcement, Social Security disability determinations, Medicaid eligibility, and tax enforcement have all incorporated algorithmic tools whose workings are not readily available to the public or to affected individuals.

In a constitutional democracy, government decision-making is subject to a set of transparency norms — notice-and-comment rulemaking, public records laws, judicial review — that are designed to keep the governed informed about how they are being governed. Algorithmic government largely circumvents these norms, because the "rules" are encoded in models rather than written in regulations, and because existing transparency mechanisms were designed for a pre-algorithmic era.

Automated Benefits Systems

Perhaps the most troubling cases of institutional opacity involve automated systems that deny people access to public benefits. In 2016, the state of Arkansas deployed an algorithm to determine the number of hours of home care that Medicaid recipients would receive. The algorithm significantly reduced care hours for many beneficiaries. When beneficiaries sought explanations, caseworkers couldn't provide them — they didn't know why the algorithm had produced the result it did. An investigation and subsequent lawsuit found that the algorithm contained errors that produced incorrect reductions in care, and that the state's failure to provide meaningful explanations violated due process. The Arkansas case illustrates how institutional opacity can cascade: when the organization deploying the AI system cannot explain its decisions, neither can the frontline workers who are supposed to communicate those decisions to affected individuals.

Social Media Opacity

Social media platforms operate vast algorithmic systems that determine what billions of people see, read, and believe. These systems make consequential decisions about political speech, public health information, commercial advertising, and community formation — decisions that affect democratic processes and social cohesion. Yet they are almost entirely opaque to users, to researchers, and often to regulators.

When Facebook's algorithm changes and political content suddenly receives more or less reach, the effect is felt by millions of users and political campaigns — but the cause is a black box. When a small business loses its organic reach because of a platform algorithm change, there is typically no explanation and no recourse. The platforms control an extraordinarily consequential communications infrastructure and are accountable to essentially no one for how that infrastructure is operated.

The European Approach: GDPR Article 22

The European Union's General Data Protection Regulation, which took effect in 2018, provides a meaningful contrast to the US approach. Article 22 of the GDPR establishes a right not to be subject to a decision "based solely on automated processing" that produces legal effects or similarly significantly affects the individual, unless the individual has consented, the decision is necessary for a contract, or it is authorized by law. Where such automated decisions are made, the GDPR's transparency provisions (Articles 13 through 15) require the data controller to provide "meaningful information about the logic involved," and Article 22(3) requires safeguards that allow individuals to "obtain human intervention" and to "contest the decision."

This is far from a complete solution — the "meaningful information" requirement has been interpreted variably across EU member states, and the "solely automated" qualifier creates a significant loophole (add a human rubber-stamp and the requirement may not apply). But it establishes a baseline: automated decision-making that significantly affects individuals carries an obligation to explain.

The United States has no federal equivalent to Article 22. Sector-specific requirements exist (adverse action notices in credit, for example), but there is no general right to explanation for consequential automated decisions.


13.6 The Audit Problem — You Can't Audit What You Can't See

Third-party auditing is a cornerstone of accountability in many industries. Financial audits verify that companies' financial statements accurately represent their position. Environmental audits verify compliance with emissions standards. In the AI context, algorithmic auditing — independent examination of AI systems to assess accuracy, fairness, and compliance — has emerged as a critical accountability mechanism. But it faces a fundamental obstacle: you cannot audit what you cannot see.

What a Proper Audit Requires

A genuine algorithmic audit requires access to the model itself (its architecture, parameters, and decision logic), the training data (to assess representativeness and identify biases), the feature engineering process (to understand what inputs are used and how they are constructed), the validation process (how the model's accuracy and fairness were assessed before deployment), and the deployment context (the population the model is applied to and how its outputs are used in decision-making). Without these elements, an auditor can examine the surface behavior of a system — its inputs and outputs — but cannot understand its internals or provide reliable assurance about its properties.

The Access Problem

Most AI-deploying organizations resist providing audit access at this level. Companies point to trade secrecy, competitive concerns, security risks, and the complexity of defining what "audit access" means for a modern ML system. The result is that even when regulations require algorithmic auditing — as New York City's Local Law 144 now does for hiring algorithms — the auditing that actually occurs is often quite limited, examining aggregate statistical outputs rather than the model itself.

External Audit Through API Probing

Because access to model internals is so often unavailable, researchers have developed techniques for "external" auditing — studying a system's behavior by probing it through its public-facing interface and analyzing the pattern of outputs. This approach is limited but can be revealing.

Researchers have used API probing to study ad targeting systems (finding, for example, that Facebook's advertising algorithm delivered job ads in ways that tracked gender and race stereotypes even when advertisers specified no demographic targeting), to study predictive policing systems, and to study automated content moderation tools. This research has been valuable, but it has important limitations: it can reveal that a system produces disparate outcomes, but it cannot usually explain why — because the model internals are not visible.
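The paired-probe idea behind this kind of external audit can be sketched in a few lines: submit inputs that are identical except for one protected attribute and compare the average outputs. The `query_system` function below is a hypothetical stand-in for a real public-facing API (its toy rule deliberately leaks the protected attribute so the demonstration is visible); a real audit would call the deployed system without seeing its source.

```python
# Sketch of a paired-probe external audit: submit profiles identical except
# for one protected attribute, and compare the system's average outputs.

def query_system(profile):
    # Hypothetical stand-in for an opaque deployed system. This toy rule
    # leaks the protected attribute so the demo produces a visible gap.
    base = 0.5 + 0.01 * profile["years_experience"]
    return base - (0.1 if profile["gender"] == "F" else 0.0)

def paired_probe(base_profile, attribute, values, n_variants=50):
    """Average output per attribute value, holding everything else fixed."""
    results = {}
    for v in values:
        scores = []
        for yrs in range(n_variants):
            p = dict(base_profile, years_experience=yrs)
            p[attribute] = v
            scores.append(query_system(p))
        results[v] = sum(scores) / len(scores)
    return results

print(paired_probe({"years_experience": 0, "gender": "M"}, "gender", ["M", "F"]))
```

Note what the sketch can and cannot show: a stable gap between the paired averages is evidence of disparate treatment of the probed attribute, but nothing in the outputs reveals which internal feature or training artifact produced it.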

ProPublica's COMPAS Analysis

ProPublica's 2016 investigation into COMPAS is perhaps the most prominent example of external auditing from output data. The investigation used a dataset of COMPAS scores and two-year recidivism outcomes for Broward County defendants, obtained through public records requests. By analyzing the relationship between scores, recidivism outcomes, and race, ProPublica documented the differential false positive and false negative rates that sparked a national debate.

This was powerful journalism and revealed genuine problems. But it was an audit of outcomes, not of the model itself. ProPublica could document that Black defendants were more likely to be falsely classified as high risk than white defendants, but could not explain whether the disparity originated in the algorithm's logic, the training data, or some combination. The opacity of the model limited what the external audit could establish about causes — and therefore what remedies might be effective.
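The core of an outcome audit like ProPublica's is a simple error-rate comparison across groups. The sketch below computes false positive and false negative rates per group from individual-level records; the toy data is illustrative, not the Broward County dataset.

```python
# Sketch of an outcome audit in the style of ProPublica's COMPAS analysis.

def error_rates(records):
    """Compute false positive and false negative rates per group.

    Each record is (group, flagged_high_risk, reoffended).
    FPR = flagged among those who did not reoffend.
    FNR = not flagged among those who did reoffend.
    """
    rates = {}
    for g in {r[0] for r in records}:
        rs = [r for r in records if r[0] == g]
        neg = [r for r in rs if not r[2]]   # did not reoffend
        pos = [r for r in rs if r[2]]       # reoffended
        fpr = sum(r[1] for r in neg) / len(neg) if neg else 0.0
        fnr = sum(not r[1] for r in pos) / len(pos) if pos else 0.0
        rates[g] = {"FPR": fpr, "FNR": fnr}
    return rates

# Toy records: (group, flagged high risk, reoffended within two years)
toy = [
    ("A", True, False), ("A", True, False), ("A", False, False), ("A", True, True),
    ("B", False, False), ("B", False, False), ("B", False, True), ("B", True, True),
]
print(error_rates(toy))
```

Everything here operates on scores and outcomes alone, which is precisely the limitation described above: the computation can document that group A's false positive rate exceeds group B's, but it contains no information about why.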

The Markup's Algorithmic Lending Investigation

The Markup, an investigative news organization focused on algorithmic accountability, conducted a landmark investigation into racial disparities in mortgage lending, published in 2021. The investigation used publicly available Home Mortgage Disclosure Act (HMDA) data — detailed records of loan applications and outcomes that lenders are required to report — to analyze whether lenders approved home loans at different rates for similarly situated applicants of different races.

The investigation found that major lenders were significantly more likely to deny home loan applications from Black and Latino applicants than from similarly situated white applicants, even after controlling for income, debt levels, and other financial factors. The analysis suggests racial disparities in algorithmic lending decisions but, again, could not reveal the internal mechanisms producing them, because the lending models are proprietary.
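One simple way to approximate a "similarly situated" comparison of this kind is stratification: group applicants into bands on a financial covariate and compare denial rates within each band. The sketch below is a deliberately minimal version of that idea (real analyses control for many covariates at once, typically with regression); the records and the $25,000 income bands are illustrative, not HMDA data.

```python
# Sketch of a stratified denial-rate comparison in the spirit of The Markup's
# HMDA analysis: compare outcomes within strata matched on a covariate.

def denial_rates_by_stratum(apps, stratum_key, group_key):
    """Within each stratum, denial rate per group: {stratum: {group: rate}}."""
    counts = {}
    for a in apps:
        s, g = stratum_key(a), a[group_key]
        cell = counts.setdefault(s, {}).setdefault(g, [0, 0])  # [denied, total]
        cell[0] += a["denied"]
        cell[1] += 1
    return {s: {g: d / t for g, (d, t) in cells.items()}
            for s, cells in counts.items()}

def income_band(a):
    return a["income"] // 25_000   # hypothetical $25k income bands

toy = [
    {"race": "white", "income": 60_000, "denied": 0},
    {"race": "white", "income": 62_000, "denied": 0},
    {"race": "black", "income": 61_000, "denied": 1},
    {"race": "black", "income": 63_000, "denied": 0},
]
print(denial_rates_by_stratum(toy, income_band, "race"))
```

As with the other external audits, a within-stratum gap in denial rates is evidence of disparity among comparable applicants, but it cannot identify the mechanism inside the proprietary lending model.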

The Limits of External Audit

The pattern across these examples is consistent: external audit can detect disparate outcomes with reasonable reliability, but cannot explain their causes, cannot identify whether biases are embedded in the model or the training data, and cannot provide the foundation for specific remediation. In the language of medicine, external audit is like examining a patient's symptoms without being able to perform any diagnostic tests. You can document that something is wrong, but you often cannot prescribe an effective cure.


13.7 Sector-Specific Transparency Obligations

Different sectors have developed different regulatory frameworks for AI transparency, reflecting the different risks and regulatory histories of each domain.

Financial Services

In the United States, financial services regulation provides some of the most developed AI transparency requirements. The Federal Reserve's SR 11-7 guidance on model risk management, though predating the modern AI era, establishes that financial models should be transparent enough for independent validation — a principle that applies to credit scoring models and other algorithmic tools used by banks.

The Equal Credit Opportunity Act (ECOA) and its implementing regulation, Regulation B, require that lenders who deny credit provide applicants with adverse action notices explaining the principal reasons for denial. These notices must give specific factors — not just the algorithmic output — that contributed to the denial decision. Regulators have made clear that algorithmic models do not exempt lenders from this requirement. But in practice, generating meaningful adverse action notices from complex model outputs is technically challenging, and many notices remain formulaic and minimally informative.

The Consumer Financial Protection Bureau has increasingly focused on AI explainability in credit decisions, and the Office of the Comptroller of the Currency has emphasized model risk management requirements. Financial services is an area where regulatory pressure for AI transparency is relatively well-developed compared to other sectors.

Healthcare

The FDA has regulatory authority over clinical decision support software and AI-enabled medical devices. Requirements vary depending on whether a tool constitutes a "device" under the regulatory definition, and the FDA has been developing a regulatory framework for AI-based software as a medical device (SaMD). FDA clearance processes require documentation of model development, validation, and intended use — but the clearance dossier is generally not public, limiting external scrutiny.

Clinical decision support tools that inform clinical judgment but do not make decisions autonomously are subject to lighter-touch requirements, which has created a large category of AI tools in healthcare that have limited regulatory transparency obligations despite being widely used.

The healthcare sector is also notable for the growing role of clinical AI in high-stakes decisions about life and health — and for the relative immaturity of frameworks for ensuring that physicians understand and can critically evaluate the AI tools they use.

Criminal Justice

As described above, criminal justice has seen the emergence of state-level transparency requirements for algorithmic risk assessment tools. Louisiana, New Jersey, and other states have enacted or considered legislation requiring disclosure of the algorithms used in criminal risk assessment. These efforts are meaningful but incomplete — disclosure requirements are often limited to general documentation rather than the full mathematical specification of the model, and enforcement mechanisms are weak.

Employment

New York City's Local Law 144, which took effect in 2023, is a landmark in employment AI regulation. The law requires employers and employment agencies using "automated employment decision tools" to conduct annual bias audits and disclose the results publicly. It also requires that applicants be notified when such tools are used. The law has been criticized for the limited scope of auditing it requires (focused on demographic parity in outcomes rather than the model itself), but it represents the first significant municipal regulatory requirement for algorithmic transparency in employment.
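The outcome-focused auditing that Local Law 144-style rules require centers on selection rates per demographic group and the ratio of each group's rate to the most-selected group's rate, often called an impact ratio. A minimal version of that computation, with illustrative counts:

```python
# Sketch of the selection-rate comparison reported in Local Law 144-style
# bias audits: each group's selection rate divided by the highest group's
# rate (an "impact ratio"). Counts below are illustrative.

def impact_ratios(selected, total):
    """Given per-group selected/total counts, return impact ratio per group."""
    rates = {g: selected[g] / total[g] for g in total}
    best = max(rates.values())
    return {g: r / best for g, r in rates.items()}

print(impact_ratios(
    selected={"men": 60, "women": 40},
    total={"men": 100, "women": 100},
))
```

This illustrates the criticism noted above: the entire audit operates on aggregate outputs. Nothing in it examines the model, its features, or its training data.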

Government AI

Federal agencies are subject to OMB guidance on AI use, which has increasingly emphasized transparency and accountability requirements. The AI in Government Act and various proposed bills have sought to impose documentation, registration, and impact assessment requirements on federal agencies using AI. Progress has been incremental, but the direction of travel is toward greater transparency requirements for government AI.

The EU AI Act

The European Union's AI Act, which entered into force in 2024, establishes a comprehensive risk-based framework for AI regulation in Europe. For "high-risk" AI applications — which explicitly include AI used in criminal justice, employment, credit, education, and critical infrastructure — the Act requires technical documentation, logging, transparency to users, and human oversight. Providers of high-risk AI must conduct conformity assessments and maintain detailed records. The Act represents the most comprehensive AI transparency regulation in force anywhere in the world.

The global picture is one of dramatic variation: some jurisdictions have meaningful transparency requirements for some AI applications; most of the world has little or nothing. Multinational organizations face the challenge of navigating this patchwork of requirements while building governance frameworks that are coherent and effective.


13.8 The Accountability Gap — When No One Can Explain AI Decisions

One of the most dangerous consequences of AI opacity is what might be called the accountability gap: the space that opens up when no human being is in a position to take responsibility for an AI-driven decision because no human being understands how the decision was made.

Responsibility Diffusion

Traditional organizational accountability works through a chain of delegated responsibility. A caseworker makes a decision; she is accountable to her supervisor; the agency is accountable to its director; the director is accountable to the governor or legislature; the legislature is accountable to voters. When a decision causes harm, there is a traceable chain of responsibility that allows accountability to be assigned.

Algorithmic decision-making tends to dissolve this chain. The algorithm is not a legal person — it cannot be held accountable. The vendor that built it says they only provided a tool. The organization that deployed it says they relied on the vendor's representations. The frontline worker who applied the output says she was following the system's recommendation. The supervisor says the algorithm was approved at a higher level. No one is responsible because everyone can point to someone else.

This is not hypothetical. When automated benefits systems deny claims incorrectly, the accountability gap is immediately apparent: caseworkers don't know why the system produced the result it did, supervisors can't override what they don't understand, and affected individuals can't obtain explanations. Harm occurs, but accountability evaporates.

Content Moderation

Social media content moderation provides another vivid illustration. Platforms make hundreds of millions of moderation decisions daily, mostly through AI systems that flag, reduce, or remove content. When a post is incorrectly removed — a journalist's reporting on a public health crisis, a community organization's outreach, a small business's advertising — the victim of the error typically receives a generic automated message. Appeals processes are frequently cursory and under-resourced. The people operating the appeals process often have no better insight into why the algorithm flagged the content than the person whose content was removed.

This is accountability theater: a process that creates the appearance of accountability without its substance. The algorithm made the decision; no one who could explain or override it is meaningfully engaged.

The Organizational Response

The appropriate organizational response to the accountability gap is not to accept it as inevitable, but to design systems that maintain human accountability even when AI makes the initial recommendation. This requires:

Meaningful human review: Humans in decision chains must be capable of genuinely reviewing AI recommendations — which requires that the AI's reasoning be explicable, that reviewers have relevant expertise, and that they have time to exercise judgment rather than simply ratify algorithmic outputs.

Documentation obligations: Organizations should maintain records sufficient to reconstruct why consequential decisions were made, including what AI recommendations were received, what humans reviewed them, and what reasoning informed the final decision.

Clear lines of responsibility: Regardless of how extensively AI is used, specific human beings and specific organizational units should be designated as responsible for the outcomes of consequential decisions, and that responsibility should be real — with consequences for failures.

Audit trails: Systems should maintain logs that enable retrospective review when decisions are challenged.
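The documentation and audit-trail requirements above can be made concrete as a structured record attached to every consequential decision: the recommendation received, the inputs it was based on, who reviewed it, and the reasoning behind the final call. The field names below are illustrative, not a standard schema.

```python
# Sketch of a minimal audit-trail record for an AI-influenced decision.
import json
from dataclasses import dataclass, asdict, field

@dataclass
class DecisionRecord:
    decision_id: str
    model_version: str
    model_recommendation: str
    model_inputs: dict
    reviewer: str
    final_decision: str
    reviewer_rationale: str
    overridden: bool = field(init=False)

    def __post_init__(self):
        # Derived flag: was the model's recommendation overridden by a human?
        self.overridden = self.final_decision != self.model_recommendation

rec = DecisionRecord(
    decision_id="2024-000123",
    model_version="risk-model-v3.2",
    model_recommendation="deny",
    model_inputs={"hours_requested": 40, "assessed_need": 32},
    reviewer="caseworker_17",
    final_decision="approve",
    reviewer_rationale="Assessment missed documented mobility limitation.",
)
print(json.dumps(asdict(rec), indent=2))
```

A record like this does not by itself make the model explainable, but it does make responsibility traceable: when a decision is later challenged, there is a named reviewer, a stated rationale, and an explicit indication of whether human judgment was actually exercised.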


13.9 The Business Case for Transparency

Transparency in AI is sometimes framed as a regulatory burden or a concession to stakeholder pressure. This framing misses the substantial business case for explainability.

The Trust Dividend

Research on AI acceptance consistently finds that people are more willing to trust and rely on AI systems whose decisions can be explained. A physician who understands why a clinical algorithm recommends a particular diagnosis is more likely to integrate the tool effectively into her practice — and more likely to exercise appropriate critical judgment when the algorithm may be wrong. A credit officer who can trace an algorithmic risk score to specific, understandable factors is better positioned to evaluate whether the score is appropriate for a specific applicant. Explainability is not merely an ethical nicety; it is a precondition for effective human-AI collaboration.

Risk Management

Opacity creates liability exposure. An organization that deploys an opaque AI system cannot effectively validate it, cannot monitor it for bias or error, and cannot quickly diagnose or fix problems when they emerge. When the system causes harm — and all AI systems eventually produce errors — the organization may face significant legal and reputational consequences. Explainable AI enables continuous monitoring, early problem detection, and more targeted remediation. The cost of investing in interpretability is almost certainly lower than the expected cost of undetected AI failures.

Regulatory Compliance

Regulatory pressure for AI explainability is increasing across sectors and jurisdictions. Organizations that build explainability into their AI development and governance frameworks now will be better positioned for compliance as these requirements expand. Organizations that have relied on opacity may face expensive retrofitting — or may face enforcement actions before they can comply.

Technical Benefits

Building interpretable models is not merely an ethical or regulatory exercise — it often produces better models. The discipline of designing models whose behavior can be explained and validated tends to surface data quality problems, feature engineering errors, and training data issues that can corrupt black-box models without detection. Interpretable models are also easier to maintain, update, and debug over their operational lifetime. The investment in interpretability has real returns in model quality.

The Regulatory Arbitrage Risk

Some organizations treat AI opacity as an opportunity for regulatory arbitrage — doing things through algorithmic systems that they could not do through explicit, human-legible policies. This is a dangerous strategy. As regulators develop greater sophistication in algorithmic oversight, and as tools for external audit improve, the value of opacity as a shield will diminish. Organizations that have built accountability and transparency into their AI governance will be in a much better position than those that relied on opacity to avoid scrutiny.


13.10 Toward More Transparent AI Systems

Moving toward more transparent AI in practice requires interventions at multiple levels: individual models, organizational processes, and regulatory frameworks.

Model Selection and Design

The first intervention is at the model selection stage. Before adopting a complex, opaque model, organizations should require a systematic assessment of whether interpretable alternatives can achieve adequate accuracy for the decision task. This is not always possible — some tasks genuinely require complex models — but it should be the starting question, not an afterthought.

Where interpretable models are viable, they should be preferred in high-stakes domains. The burden of proof should be on complexity: explain why a black-box model is necessary, rather than assuming it is acceptable.

Where opaque models are genuinely necessary, post-hoc explanation methods (LIME, SHAP, and their successors) should be employed — with the understanding that they provide approximations, not true explanations, and should be accompanied by extensive validation to ensure that the explanations are reliable.
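The flavor of these post-hoc methods can be conveyed with a deliberately simplified perturbation sketch: estimate each feature's local influence on a black-box model by nudging the feature and measuring the change in output. This is a finite-difference simplification, not LIME or SHAP themselves, and `black_box` is a hypothetical opaque model whose source a real auditor would not see.

```python
# Simplified perturbation sketch in the spirit of post-hoc explanation:
# estimate each feature's local effect on an opaque model's output.

def black_box(x):
    # Hypothetical stand-in for an opaque model (a real audit sees only
    # inputs and outputs, never this source).
    return 3.0 * x["income"] - 2.0 * x["debt"] + 0.0 * x["zip_digit"]

def local_sensitivities(model, point, eps=1e-4):
    """Finite-difference estimate of each feature's local effect."""
    base = model(point)
    sens = {}
    for f in point:
        bumped = dict(point)
        bumped[f] += eps
        sens[f] = (model(bumped) - base) / eps
    return sens

print(local_sensitivities(black_box, {"income": 1.0, "debt": 0.5, "zip_digit": 4.0}))
```

The sketch also makes the caveat in the paragraph above tangible: the sensitivities describe the model's behavior near one input point only, and for a nonlinear model they can differ sharply from its global behavior, which is why such explanations require extensive validation before being relied upon.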

Procurement Standards

Organizations procuring AI systems from vendors should require as a condition of procurement: full documentation of the model's development and validation process; disclosure of the types of features used (even if specific weights remain proprietary); evidence of fairness testing across relevant demographic groups; and commitment to cooperate with authorized third-party audits. These requirements should be contractually enforceable, with meaningful remedies for non-compliance.

Process Transparency

Even where the AI model itself remains complex and partially opaque, the process around the model can be made transparent. Process transparency means: informing individuals when AI is being used to make decisions about them; providing meaningful explanations of the factors that contributed to AI-influenced decisions, to the extent possible; establishing clear procedures for appeal and human review; and publishing aggregate information about AI decision-making patterns and outcomes.

Process transparency does not substitute for model interpretability — but it is a meaningful improvement over the current norm of silently deploying opaque AI without any disclosure.

The Explainability Interface

Chapter 15 examines in depth how to design effective explainability interfaces — communications to affected individuals about how AI-influenced decisions were made. The core principles are that explanations should be specific (not generic boilerplate), actionable (telling the affected person what they could do differently), accurate (genuinely reflecting the model's reasoning rather than a post-hoc rationalization), and honest about uncertainty.

These principles are demanding. Meeting them requires genuine investment in explanation design, testing with real users, and ongoing refinement. But they are the standard against which meaningful transparency should be measured.

Organizational Culture and Governance

Perhaps the most important interventions are cultural and governance-level: creating organizations where questions about model interpretability are taken seriously, where "the model said so" is never an acceptable final answer, and where accountability for AI-influenced decisions is taken as seriously as accountability for human decisions. This requires leadership commitment, appropriate resources, and incentive structures that reward careful AI governance rather than penalizing the time investment it requires.


Discussion Questions

  1. The Wisconsin Supreme Court held in State v. Loomis that a criminal defendant does not have a constitutional right to know the precise algorithm behind his risk score, only to know the score itself and the factors it considers. Do you find this argument persuasive? What does "due process" mean in an era of algorithmic governance, and does the Loomis holding adequately protect that value?

  2. Cynthia Rudin argues that for high-stakes tabular data applications — including criminal risk assessment and credit scoring — the trade-off between interpretability and accuracy is largely a myth, and that practitioners should build interpretable models rather than using black-box models with post-hoc explanations. Consider the counterarguments: are there legitimate reasons why an organization might deploy an opaque model even when an interpretable alternative of comparable accuracy is available? What would those reasons have to look like to be ethically defensible?

  3. Consider the three types of opacity: technical opacity (can't explain), institutional opacity (won't explain), and opacity to users (don't know AI is involved). In your view, which type creates the most serious ethical problems, and why? Should different types of opacity be addressed by different governance mechanisms?

  4. Trade secrecy and due process are both legally protected interests in the United States. In the criminal justice context, these interests have come into direct conflict. How should the legal system resolve this conflict? Are there models from other contexts — pharmaceutical regulation, financial reporting, environmental compliance — that offer useful precedents for handling the tension between proprietary AI and public accountability?

  5. Social media platforms argue that their recommendation algorithms are editorial choices protected by the First Amendment, and that requiring transparency about their algorithms would infringe on their free speech rights. Evaluate this argument. How, if at all, does a platform algorithm differ from a traditional editorial decision? Does the scale and consequence of algorithmic editorial choices change the analysis?

  6. The EU AI Act establishes comprehensive documentation and transparency requirements for high-risk AI, while the United States relies on a patchwork of sector-specific requirements. What are the advantages and disadvantages of each approach? If you were advising a multinational organization deploying AI in both jurisdictions, what governance framework would you recommend?

  7. The "accountability gap" describes the diffusion of responsibility that occurs when AI makes consequential decisions that no human can explain. Is this gap inherent in the use of AI for consequential decisions, or is it a governance failure that better-designed systems could address? What specific organizational practices would most effectively close the accountability gap?


Chapter 13 continues in the case studies and exercises that follow. Case Study 01 examines the constitutional challenge to algorithmic risk assessment in criminal justice in depth. Case Study 02 examines the opacity of social media recommendation algorithms and the role of the Frances Haugen disclosures. The key takeaways, exercises, and quiz consolidate and deepen the chapter's core arguments.