Case Study 2: The Boeing 737 MAX MCAS — What Happens Without AI Governance
Introduction
On October 29, 2018, Lion Air Flight 610 crashed into the Java Sea thirteen minutes after takeoff from Jakarta, killing all 189 people on board. On March 10, 2019, Ethiopian Airlines Flight 302 crashed six minutes after takeoff from Addis Ababa, killing all 157 people on board. Both crashes were caused by the same automated system: the Maneuvering Characteristics Augmentation System (MCAS), a software system that Boeing designed to compensate for aerodynamic changes introduced by larger engines on the 737 MAX.
MCAS is not artificial intelligence in the machine learning sense. It is a deterministic software system — a set of programmed rules that respond to sensor inputs. But the failure of MCAS, and the organizational failures that allowed it to reach passengers, illustrate with devastating clarity what happens when automated decision-making systems are deployed without adequate governance. Every governance mechanism described in Chapter 27 — risk assessment, independent validation, transparent documentation, stakeholder oversight, incident response, and organizational accountability — was absent or compromised in the 737 MAX program.
This case study examines the MCAS failure not as an aviation engineering problem, but as a governance failure with direct parallels to the challenges organizations face in governing AI systems.
The Design Decision
The Boeing 737 MAX was developed as a fuel-efficient update to the 737 NG, Boeing's best-selling aircraft. The primary innovation was the installation of larger, more fuel-efficient CFM LEAP-1B engines. But the larger engines changed the aircraft's aerodynamic characteristics — specifically, they created a tendency for the nose to pitch up under certain flight conditions, a potentially dangerous behavior that could lead to an aerodynamic stall.
Boeing had two options:
Option A: Redesign the aircraft. A comprehensive aerodynamic redesign would address the pitch-up tendency through structural changes. This would be more expensive, take longer, and — critically — would likely require pilots to complete a new type rating, a costly and time-consuming certification process. Airlines that had ordered the 737 MAX expected it to require no additional pilot training beyond a brief differences course on an iPad. A new type rating would undermine one of the aircraft's key selling points.
Option B: Fix it in software. An automated system could detect the pitch-up condition and push the nose down, making the aircraft handle like the previous 737 NG despite its different aerodynamics. Pilots would not need to know about the system. They would not need additional training. The aircraft would feel the same.
Boeing chose Option B. The system it designed was MCAS.
Business Insight: The build-vs-fix decision Boeing faced has a direct analog in AI governance. Organizations frequently face a choice between addressing a problem at the architectural level (expensive, slow, thorough) and addressing it at the application level (cheaper, faster, riskier). When a machine learning model exhibits bias, the architectural fix might be to redesign the data pipeline, collect more representative training data, and retrain the model. The application-level fix might be to add a post-processing layer that adjusts outputs to meet fairness thresholds. Both can work. But the application-level fix is more fragile, harder to validate, and — if it fails — can fail in ways that are difficult to detect and difficult to explain.
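The fragility of the application-level fix can be made concrete with a minimal sketch of a post-processing layer. The group labels, thresholds, and function name below are hypothetical illustrations, not any specific organization's implementation:

```python
# Hypothetical application-level fix: adjust model outputs with
# group-specific decision thresholds so approval rates meet a fairness
# target. Note what it does NOT do: touch the model, the data pipeline,
# or the training process -- which is exactly why it is fragile.

def postprocess_decisions(scores, groups, thresholds, default=0.5):
    """Return approve/deny decisions using a per-group threshold.

    scores:     model output probabilities in [0, 1]
    groups:     group label for each score (e.g. a protected attribute)
    thresholds: dict mapping group label -> decision threshold
    """
    return [
        score >= thresholds.get(group, default)
        for score, group in zip(scores, groups)
    ]

# If the threshold table drifts out of sync with a retrained model, the
# "fix" silently stops working -- the failure mode the text warns about.
decisions = postprocess_decisions(
    scores=[0.62, 0.48, 0.55, 0.70],
    groups=["a", "b", "a", "b"],
    thresholds={"a": 0.60, "b": 0.50},
)
```

The sketch also shows why such fixes are hard to validate: correctness depends on a side table of thresholds that lives outside the model and can change independently of it.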
What MCAS Did
In its original design, MCAS received input from a single angle-of-attack (AoA) sensor on the outside of the aircraft. When that sensor indicated that the nose was pitched too high — suggesting an approaching stall — MCAS activated automatically, pushing the nose down by adjusting the aircraft's horizontal stabilizer.
The system had several critical design characteristics:
Single point of failure. MCAS relied on input from one AoA sensor. The 737 MAX had two AoA sensors, but MCAS used only one — alternating between the left and right sensors on successive flights. If that sensor malfunctioned and provided erroneous data indicating a high angle of attack, MCAS would activate even when the aircraft was flying normally, pushing the nose down inappropriately.
Repeated activation. MCAS did not activate once and stop. If the erroneous AoA reading persisted, MCAS would activate repeatedly, each time pushing the nose further down. The cumulative effect of multiple activations could put the aircraft into a dive that was extremely difficult to recover from.
Limited pilot awareness. Pilots were not adequately informed about MCAS. The system was not described in the pilot operating manual for the 737 MAX. Pilots were not trained on it. The differences training that pilots received when transitioning from the 737 NG to the 737 MAX — a course that could be completed on a tablet in under an hour — did not mention MCAS.
Difficult override. MCAS could theoretically be overridden by pilots through a specific procedure: deactivating the electric stabilizer trim using cutout switches. But the procedure was not intuitive, and it required the pilot to recognize what was happening in a high-stress, time-critical situation. It also had to compete with the pilot's natural instinct to pull back on the control column, which provided only temporary relief before MCAS reactivated.
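The first two design flaws — the single point of failure and unbounded repeated activation — can be expressed as the guard logic the original system lacked. This is an illustrative sketch only; the disagreement threshold, activation limit, and function names are invented for exposition and are not the actual 737 MAX software or its later fix:

```python
# Illustrative guard logic for an automated nose-down command:
# cross-check redundant sensors and cap cumulative authority.
# All thresholds are invented for illustration.

MAX_DISAGREEMENT_DEG = 5.5   # left/right AoA disagreement that disables the system
MAX_ACTIVATIONS = 1          # at most one activation per high-AoA event

def should_activate(aoa_left, aoa_right, stall_threshold, activations_so_far):
    """Decide whether the automated trim system may activate."""
    # Redundancy check: if the two sensors disagree, trust neither.
    # (Original MCAS read only one sensor, so this check did not exist.)
    if abs(aoa_left - aoa_right) > MAX_DISAGREEMENT_DEG:
        return False
    # Authority limit: never re-activate within the same event.
    # (Original MCAS re-activated as long as the bad reading persisted.)
    if activations_so_far >= MAX_ACTIVATIONS:
        return False
    # Both sensors must agree the angle of attack is dangerously high.
    return min(aoa_left, aoa_right) > stall_threshold

# A failed left sensor reading 21 degrees while the right reads 3:
# the cross-check blocks activation instead of commanding nose-down.
blocked = should_activate(21.0, 3.0, stall_threshold=14.0, activations_so_far=0)
```

The point of the sketch is governance, not avionics: each guard corresponds to a design-review question ("what if the sensor is wrong?", "what is the cumulative authority?") that an adequate risk assessment would have forced onto the table.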
The Governance Failures
The MCAS disaster was not a single failure. It was a cascade of governance failures at every level — design, validation, documentation, oversight, and accountability.
Failure 1: Inadequate Risk Assessment
Boeing's internal risk assessment for MCAS classified the system as a relatively minor flight control augmentation. This classification determined the level of design scrutiny, testing rigor, and regulatory review the system received. But the classification was based on MCAS's original, limited design parameters. As the system was modified during development — gaining more authority to move the stabilizer and relying on a single sensor — the risk assessment was not updated to reflect the increased potential for harm.
Governance parallel: This mirrors the failure to reassess risk when AI systems change. A model deployed as a low-risk internal analytics tool may evolve into a medium-risk or high-risk application as its outputs are used for increasingly consequential decisions. Without ongoing risk reassessment — a core component of the NIST AI RMF's Manage function — the governance framework fails to keep pace with the system's actual risk profile.
Failure 2: Insufficient Independent Validation
Boeing conducted testing of MCAS, but the testing was neither comprehensive nor independent. The system was tested against single-sensor failure scenarios, but the cumulative effect of repeated MCAS activations in response to a persistent sensor failure was not adequately analyzed. More critically, the testing was conducted by Boeing — the same organization that designed the system and had a powerful economic incentive for it to work.
Governance parallel: This is the violation of the "three lines of defense" principle from model risk management. The first line (developers) tested the system. But the second line (independent validation) was inadequate, and the third line (independent audit) was compromised by regulatory capture (discussed below). In AI governance, independent validation — conducted by people who did not build the model and who do not report to the same management chain — is essential precisely because developers have blind spots about their own work and incentives that can bias their assessments.
Failure 3: Regulatory Capture
The Federal Aviation Administration (FAA), the regulatory body responsible for certifying the 737 MAX as safe to fly, had progressively delegated more of its certification authority to Boeing itself through a program called Organization Designation Authorization (ODA). Under ODA, Boeing employees — who reported to Boeing management, were paid by Boeing, and were evaluated on Boeing's objectives — conducted safety assessments and certification activities on the FAA's behalf.
This arrangement created a structural conflict of interest. The people responsible for validating the system's safety worked for the company that had an overwhelming economic interest in the system being approved. The FAA, understaffed and facing budget constraints, relied on Boeing's assessments rather than conducting independent evaluation.
Governance parallel: This is the governance equivalent of asking the data science team that built a model to also conduct the fairness audit. It is why the chapter on AI governance emphasizes independence in validation and oversight. Internal governance functions — ethics committees, model risk analysts, internal audit — must have genuine independence from the teams they oversee. When the oversight function is captured by the function it is supposed to oversee, governance becomes performative.
Failure 4: Inadequate Documentation and Transparency
MCAS was not adequately documented for the people who most needed to understand it: the pilots who flew the aircraft. The system was not described in the pilot operating manual. It was not covered in differences training. Boeing's rationale was that MCAS operated in the background and pilots did not need to know about it — the system would handle the aerodynamic issue transparently.
This decision was catastrophic. When MCAS activated erroneously, pilots did not know what was happening. They did not know that an automated system was pushing the nose down. They did not know why their control inputs were being overridden. They did not know the specific procedure to disable the system. They were fighting an automated system they did not know existed.
Governance parallel: This is the transparency and explainability failure applied to automated systems. The OECD AI Principles require that people affected by AI-driven decisions should be able to understand the basis for those decisions. AI impact assessments require documentation of how the system works, what it does, and what its limitations are. When automated systems operate opaquely — when the people affected by them cannot understand or override them — the potential for harm is maximized.
Failure 5: Inadequate Incident Response
After the Lion Air crash in October 2018, more than four months passed before the 737 MAX was grounded worldwide — and that grounding came only after the Ethiopian Airlines crash in March 2019. During that interval, Boeing, the FAA, and airlines around the world continued to fly the 737 MAX with an MCAS system that had demonstrably caused a fatal crash.
Boeing issued an advisory to pilots after the Lion Air crash describing the runaway stabilizer procedure. But it did not ground the fleet. It did not mandate additional training. It did not disable MCAS. It did not rush an MCAS software fix. The organizational response to a catastrophic failure was a memo.
Governance parallel: This is the failure of incident response. Chapter 27 describes a four-tier incident severity system. The Lion Air crash — a fatal failure of an automated system — would have been a Level 4 (Critical) incident under any reasonable classification. The appropriate response would have included immediate suspension of the system pending investigation, root cause analysis, and remediation before resumption. Instead, the system continued to operate, and 157 more people died.
Failure 6: Accountability Gaps
In the aftermath of the crashes, the question of accountability proved vexingly difficult. Who was responsible? The engineers who designed MCAS? The managers who approved the single-sensor design? The certification engineers who classified the system's risk level? The FAA officials who delegated certification authority? Boeing's CEO? Its board of directors?
The accountability question revealed that responsibility was so diffused across individuals, teams, and organizations that no single party could be identified as the decision-maker who chose to accept the risk that killed 346 people. Everyone had contributed to the outcome. No one owned it.
Governance parallel: This is precisely why the chapter emphasizes RACI matrices and clear accountability assignments. When accountability is diffuse — when everyone is partially responsible and no one is fully accountable — the organization loses its ability to prevent and respond to failures. AI governance frameworks must specify, for every consequential decision, who is accountable for the outcome. Not who is responsible for the work. Who is accountable for the result.
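A RACI assignment can be checked mechanically for exactly the failure this case illustrates: decisions with zero or multiple accountable owners. The decision names and roles below are hypothetical examples, not a prescribed org chart:

```python
# Minimal RACI sketch. The invariant worth enforcing: every consequential
# decision has exactly ONE Accountable owner. Roles and decisions here
# are hypothetical examples.

RACI = {
    "approve_deployment": {
        "Responsible": ["ml_team"],
        "Accountable": ["head_of_model_risk"],
        "Consulted": ["legal", "ethics_committee"],
        "Informed": ["executive_sponsor"],
    },
    "suspend_system_after_incident": {
        "Responsible": ["incident_response_team"],
        "Accountable": ["chief_risk_officer"],
        "Consulted": ["ml_team"],
        "Informed": ["board_risk_committee"],
    },
}

def check_accountability(raci):
    """Return the decisions with diffuse accountability: zero owners
    (no one accountable) or multiple owners (everyone partially so)."""
    return [
        decision for decision, roles in raci.items()
        if len(roles.get("Accountable", [])) != 1
    ]

gaps = check_accountability(RACI)  # empty when every decision has one owner
```

Boeing's failure mode, in these terms, is a matrix where "Accountable" for the decision to accept MCAS's risk would have listed many parties, or none.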
The Aftermath
The consequences of the MCAS governance failure were staggering:
- 346 people killed in two crashes
- 20 months of grounding for the 737 MAX fleet worldwide (March 2019 to December 2020 in the US; longer in some countries)
- $20 billion in estimated costs to Boeing — including compensation to airlines, settlements with victims' families, regulatory penalties, production disruptions, and order cancellations
- Criminal charges — Boeing agreed to a $2.5 billion deferred prosecution agreement in January 2021, acknowledging that two former employees had deceived the FAA. In 2024, the Department of Justice moved to revoke the deferred prosecution agreement, potentially exposing Boeing to additional criminal liability.
- Regulatory reform — The Aircraft Certification, Safety, and Accountability Act of 2020 reformed the FAA's delegation process, reducing manufacturers' ability to self-certify safety-critical systems
- Reputational damage — Boeing's reputation for safety and engineering excellence, built over a century, suffered lasting damage
- Leadership changes — Boeing's CEO was forced out in December 2019; subsequent leadership changes continued through 2024
Lessons for AI Governance
The 737 MAX MCAS failure is not an AI story. But it is a governance story — and the lessons translate directly:
1. Risk assessments must be updated as systems change. MCAS's risk classification was not updated when the system's authority was expanded. AI systems that evolve — through retraining, expanded use cases, or changing data — require risk reassessment at each significant change.
2. Single points of failure are governance failures. MCAS relied on a single sensor. AI systems that rely on a single data source, a single model, or a single human reviewer for consequential decisions have single points of failure that governance should identify and address.
3. Independence in validation is non-negotiable. When the people validating a system work for the same organization that built it, conflicts of interest are structural, not hypothetical. Independent validation — whether by internal audit, external auditors, or cross-functional review — is essential for high-risk systems.
4. The people affected by automated systems must understand them. Pilots did not know about MCAS. Users of AI systems often do not know that AI is making or influencing decisions about them. Transparency is not a nice-to-have — it is a safety requirement.
5. Incident response must be proportional to incident severity. The response to the Lion Air crash was inadequate — a memo instead of a grounding. Organizations must have the willingness and the authority to suspend AI systems when evidence of harm emerges, even when the economic costs are significant.
6. Accountability must be specific, not diffuse. When everyone is partially responsible, no one is accountable. AI governance frameworks must assign clear accountability for consequential decisions — not just the work of building the system, but the outcome of deploying it.
7. Economic incentives can corrupt governance. Boeing's incentive to avoid a new type rating, to maintain its delivery timeline, and to compete with Airbus influenced every governance decision — from risk classification to testing to documentation to incident response. Organizations must design governance structures that are resistant to economic pressure — through independence, separation of duties, and escalation mechanisms that reach beyond the immediate business unit.
Professor Okonkwo, who uses the 737 MAX case in her course every year, frames the lesson with characteristic directness: "Boeing did not have a technology problem. MCAS could have been made safe with a second sensor, better software limits, and adequate pilot training. Boeing had a governance problem. The organization's structures, incentives, and culture conspired to suppress the information, analysis, and decisions that would have prevented 346 deaths. When we talk about AI governance, this is what we are trying to prevent. Not this specific outcome — but this specific type of organizational failure."
Discussion Questions
1. The chapter defines six common governance pitfalls: governance theater, one-size-fits-all, the last mile problem, innovation antagonism, static governance, and under-resourcing. Which of these pitfalls best describes Boeing's governance failures with MCAS? Can you identify more than one?
2. Boeing faced a classic speed-versus-safety tradeoff. The 737 MAX was developed under intense competitive pressure from the Airbus A320neo. How should governance frameworks account for competitive pressure without being undermined by it?
3. The FAA's delegation of certification authority to Boeing is analogous to allowing AI developers to self-certify their own systems. Under what circumstances, if any, is self-certification appropriate for AI systems? What safeguards would be necessary?
4. After the Lion Air crash, Boeing had five months to ground the fleet, fix MCAS, or at minimum provide comprehensive pilot training. It did none of these things. What organizational factors might explain this inaction? How could governance structures be designed to prevent similar inaction in AI-related incidents?
5. The chapter discusses the distinction between compliance-driven and culture-driven governance. Based on the evidence in this case study, would you characterize Boeing's pre-crash safety governance as compliance-driven, culture-driven, or something else entirely? What evidence supports your assessment?
6. Compare Boeing's MCAS governance failures to the governance framework Athena established after the HR screening crisis. For each of the six governance failures identified in this case study (inadequate risk assessment, insufficient independent validation, regulatory capture, inadequate documentation, inadequate incident response, and accountability gaps), identify the specific element of Athena's framework that addresses it.
This case study draws on the US House Committee on Transportation and Infrastructure final report on the 737 MAX (September 2020), the Joint Authorities Technical Review (JATR) report, the Indonesian National Transportation Safety Committee (KNKT) final report on Lion Air Flight 610, the Ethiopian Aircraft Accident Investigation Bureau preliminary report on Ethiopian Airlines Flight 302, congressional testimony, and published investigative journalism including work by the Seattle Times, the New York Times, and Bloomberg.