Case Study 3.1: Utilitarianism and the Autonomous Vehicle Trolley Problem
The MIT Moral Machine Experiment and the Limits of "Best Outcomes"
Estimated reading time: 35–45 minutes
Primary framework: Consequentialism (Utilitarianism)
Secondary frameworks: Deontology, Capabilities Approach
Themes: Innovation vs. Harm Prevention; Power and Accountability; Global Variation
Introduction: A Number That Demands a Philosophy
Every year, approximately 1.35 million people die in road traffic accidents globally. In the United States alone, the figure approaches 40,000 annually. The human cost — in lives, in grief, in disability, in economic devastation — is staggering and largely preventable.
Autonomous vehicles are frequently presented as a solution to this catastrophe. The argument is compelling: human error causes approximately 94 percent of serious crashes. Machines do not drink and drive. They do not text at the wheel. They do not experience road rage, fatigue, or distraction. A fully autonomous vehicle fleet, even one that is only modestly better than the average human driver, would prevent millions of deaths per decade.
This is consequentialism at its most persuasive: a technology that saves enormous numbers of lives, reducing human suffering on a massive scale. The utilitarian case for autonomous vehicles is, on its face, overwhelming.
But autonomous vehicles are also, necessarily, systems that will sometimes make lethal decisions. When a collision is unavoidable — when the vehicle must choose between hitting the pedestrian who has stepped into traffic and swerving into the barrier, between the child who has run into the road and the elderly couple in the crosswalk — the algorithm will execute a choice. Not a human choice made in a panicked fraction of a second, but a pre-programmed choice, made deliberately, encoded into software, and executed at scale across millions of vehicles.
Who makes that choice? On what basis? And who has the legitimacy to make it?
These questions are not merely philosophical. They are engineering specifications waiting to be written, legal liabilities waiting to be determined, and political controversies waiting to erupt. The Moral Machine experiment, conducted by a team of MIT and Harvard researchers and published in Nature in 2018, represents the most ambitious attempt to gather systematic evidence about how human beings across the globe think these choices should be made. What the data showed was both fascinating and deeply troubling for anyone hoping that utilitarian calculus could provide a clean answer.
Part 1: The Classic Trolley Problem and Its Limitations
In 1967, the philosopher Philippa Foot published a thought experiment that would become one of the most discussed problems in moral philosophy. Imagine a runaway trolley heading toward five people tied to the tracks. You are standing next to a lever that, if pulled, will divert the trolley to a side track, where only one person is tied. If you do nothing, five people die. If you pull the lever, one person dies. What do you do?
Most people — across cultures, ages, and philosophical backgrounds — say they would pull the lever. The arithmetic is straightforward: one death is better than five. Trolley cases have been replicated in psychology experiments, cross-cultural surveys, and philosophical seminars with consistent results. The utilitarian calculation is intuitive.
But philosopher Judith Jarvis Thomson noted something important: the intuition changes when the mechanism changes. In another version of the scenario, you are on a footbridge above the tracks. The only way to stop the trolley is to push a large man off the bridge — his body will derail it, saving the five. Same arithmetic, one life for five. But most people say they would not push the man. The action seems monstrous in a way that pulling the lever does not.
This divergence puzzles strict utilitarians. The outcomes are identical: one life for five. Why should the means matter if the ends are the same? But decades of research in moral psychology suggest that the divergence is robust, cross-cultural, and driven by something deeper than confusion. The physical directness of pushing someone to their death activates a different moral response than the mechanical indirection of lever-pulling. Whether this represents a morally relevant distinction or merely a cognitive quirk has been debated ever since.
The trolley problem's relevance to autonomous vehicles is not metaphorical — it is literal. Autonomous vehicles will encounter unavoidable collision scenarios. Unlike human drivers responding in milliseconds, autonomous vehicle algorithms must have pre-specified decision protocols for such cases. The question of how those protocols should be specified is precisely the trolley problem, repeated millions of times, across diverse contexts, encoded in software.
The key differences from the original trolley problem are worth noting. First, scale: we are talking about decisions embedded in millions of vehicles operating continuously, not one-off choices. Second, deliberateness: the decision is made in advance, by engineers and executives, not in the heat of the moment by a bystander. Third, accountability: the decision-maker is identified and potentially liable. These differences matter morally, and they are not captured by the original thought experiment.
Part 2: How the Moral Machine Experiment Worked
In 2018, Edmond Awad, Sohan Dsouza, Richard Kim, Jonathan Schulz, Joseph Henrich, Azim Shariff, Jean-François Bonnefon, and Iyad Rahwan published "The Moral Machine Experiment" in Nature — one of the most widely discussed empirical papers in AI ethics history.
The research team created an online platform, Moral Machine, that presented visitors with a series of unavoidable-collision scenarios for autonomous vehicles. In each scenario, the brakes had failed. The vehicle could continue straight, killing one group of characters, or swerve, killing another. Visitors chose which action they preferred the vehicle to take.
The scenarios varied across nine binary dimensions:
- Saving more lives vs. fewer lives
- Saving passengers vs. pedestrians
- Upholding traffic law (pedestrians who have the signal to cross vs. jaywalkers)
- Saving women vs. men
- Saving younger people vs. older people
- Saving pedestrians of higher vs. lower social status (doctors, athletes, executives vs. homeless people, criminals)
- Saving humans vs. animals
- Saving more fit/healthy vs. less fit/healthy individuals
- Inaction vs. swerving (moral action type)
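To make the scenario structure concrete, the following is a minimal, purely illustrative sketch of how a single dilemma might be represented in code. The `Group` and `Dilemma` classes and every field name are assumptions for exposition, not the experiment's actual data schema.

```python
# Illustrative sketch only: one unavoidable-collision dilemma, described along
# a few of the nine binary dimensions, plus a respondent's choice.
from dataclasses import dataclass

@dataclass
class Group:
    size: int                 # number of characters in this outcome group
    passengers: bool = False  # True if the group is inside the vehicle
    lawful: bool = True       # crossing with the signal vs. jaywalking
    species: str = "human"    # "human" or "animal"
    ages: tuple = ()          # e.g. ("child", "adult", "elderly")

@dataclass
class Dilemma:
    stay: Group    # who dies if the vehicle continues straight (inaction)
    swerve: Group  # who dies if the vehicle swerves (action)

# One dilemma: continue straight into two jaywalking adults,
# or swerve into a single lawfully crossing elderly pedestrian.
d = Dilemma(
    stay=Group(size=2, lawful=False, ages=("adult", "adult")),
    swerve=Group(size=1, lawful=True, ages=("elderly",)),
)

# A respondent's decision is simply which action they prefer the vehicle take;
# here the respondent chose to spare the larger group.
decision = "swerve"
print(d.stay.size, d.swerve.size, decision)  # 2 1 swerve
```

Each completed session contributes a handful of such binary choices; the analysis then aggregates millions of them per attribute.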
The platform was launched in 2016 and attracted more than two million participants from 233 countries and territories, generating nearly 40 million individual decisions. It is among the largest empirical studies of human moral judgment ever conducted, and the data is freely available for secondary research.
The design of the experiment has been criticized — and the criticisms are important, as we will see — but the scale and diversity of the sample are extraordinary. This was not a WEIRD (Western, Educated, Industrialized, Rich, Democratic) population of college students. It included participants from across the income spectrum, from rural and urban areas, from countries with very different legal, cultural, and religious traditions.
What did the data show?
Part 3: What the Data Showed
Near-Universal Preferences
Across the full global sample, certain patterns were surprisingly consistent. Participants showed strong preferences for:
- Saving more lives over fewer lives (the basic utilitarian calculus)
- Saving humans over animals (even for dogs vs. criminals, human life generally prevailed)
- Saving children over adults, and adults over elderly individuals
- Avoiding inaction (i.e., if the vehicle must kill someone, the choice to swerve felt more acceptable than the choice to continue straight)
These near-universal preferences are useful for AI designers, as they suggest a floor of shared moral intuition that might provide some basis for programming decisions. But they are also insufficient, because real collision scenarios rarely present clean trade-offs between symmetric groups.
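One way to see how a "preference" is extracted from millions of binary choices: for a given attribute, compare how often the group carrying that attribute was spared, restricted to dilemmas where the two groups actually differ on it. The sketch below is a toy version of this idea; the function name, data shape, and numbers are all hypothetical, and the paper's actual analysis used a more sophisticated conjoint-analysis estimator.

```python
# Toy preference estimate: fraction of dilemmas in which the group carrying a
# given attribute (e.g. "larger", meaning more lives) was the one spared.
def sparing_rate(decisions, attribute):
    """`decisions` is a list of (attrs_of_spared_group, attrs_of_killed_group)
    tuples; each attrs value is a set of string labels."""
    # Only dilemmas where exactly one of the two groups has the attribute
    # are informative about it.
    relevant = [(s, k) for s, k in decisions
                if (attribute in s) != (attribute in k)]
    if not relevant:
        return None
    spared = sum(1 for s, _ in relevant if attribute in s)
    return spared / len(relevant)

# Hypothetical data: in 3 of 4 dilemmas where group sizes differed,
# respondents spared the larger group.
decisions = [
    ({"larger"}, set()),
    ({"larger"}, set()),
    (set(), {"larger"}),
    ({"larger"}, set()),
]
print(sparing_rate(decisions, "larger"))  # 0.75
```

A rate near 0.5 would indicate indifference to the attribute; the global "more lives" rate in the real data sits well above that.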
Major Cross-Cultural Variation
The more significant and troubling finding was the dramatic variation across cultural clusters. The researchers identified three major clusters of countries based on the similarity of their moral preferences:
The Western cluster (North America and much of Europe, encompassing Protestant, Catholic, and Orthodox Christian cultural groups) tracked the global averages most closely, with comparatively strong preferences for sparing larger groups and for inaction.

The Eastern cluster (Confucian societies such as Japan and Taiwan, together with Islamic countries such as Indonesia, Pakistan, and Saudi Arabia) showed a markedly weaker preference for sparing the young over the old and a stronger preference for sparing pedestrians who were crossing lawfully — a comparatively rule-respecting and age-egalitarian pattern.

The Southern cluster (Latin American countries, together with France and countries under historical French influence) diverged most sharply: a much weaker preference for sparing humans over pets, and notably stronger preferences for sparing women, the young, the fit, and individuals of higher social status.

The magnitude of these differences is large enough to matter for policy. The preference for sparing higher-status individuals — which the researchers found correlates with national economic inequality — is, from an egalitarian perspective, morally alarming: it suggests that a calculus coded to match local preferences would systematically spare the wealthy and well-employed at the expense of the poor and unemployed. Likewise, the finding that the preference for sparing women weakens in countries with larger gender gaps is troubling from a gender equality perspective.
Young vs. Old: A Universal But Contested Preference
The near-universal preference for saving younger over older individuals deserves particular attention. Across all cultural clusters, participants on balance preferred saving younger characters over older ones, though the strength of the preference varied (it was weakest in the Eastern cluster). This is, in one sense, a life-years utilitarian calculation — a younger person has more life-years remaining and thus more aggregate welfare at stake.
But it is also, from several other perspectives, ethically unacceptable. Older people do not have less right to life than younger people. A preference encoded in autonomous vehicle software to spare younger over older pedestrians would constitute a form of age discrimination with no parallel in any other domain of civil rights. The fact that the preference is widespread does not make it ethically permissible — popular moral judgments have been wrong before.
The near-universal preference for sparing children over adults is even more dramatically expressed. The intuition that a child's death is especially tragic is widely shared and can be given various justifications — life-years, vulnerability, potential, parental and social bonds. But it is also, as a programming principle, a discriminatory input that requires explicit justification, not just empirical prevalence.
Part 4: The Business and Legal Problem
The Moral Machine data created an immediate problem for autonomous vehicle manufacturers. If the goal is to program vehicles to match human moral preferences, whose preferences should be programmed? The three cultural clusters have genuinely different, sometimes contradictory preferences. A vehicle programmed for the Western cluster would behave differently in a collision in Japan than a vehicle programmed for the Eastern cluster. A vehicle programmed for the Southern cluster would make gender-differential decisions that would be illegal in many jurisdictions.
The legal dimensions are equally complex. In the United States, programming a vehicle to make decisions based on the age, gender, social status, or disability of pedestrians would raise serious anti-discrimination concerns. The Equal Credit Opportunity Act prohibits using protected characteristics in credit decisions, and although no statute yet speaks directly to collision algorithms, analogous principles under the Civil Rights Act and the ADA would plausibly be invoked against them. "We're just following human moral preferences" is not a legal defense for discrimination.
This creates what might be called the democratic legitimacy problem of algorithmic ethics. If moral preferences genuinely vary across individuals and cultures, who has the authority to select the preferences to be encoded? The engineers writing the code? The executives approving the product? The shareholders? The regulators? The affected communities? The users who buy the cars?
Each answer generates different problems. Engineers lack democratic mandate and are not a representative population. Executives have commercial interests that may not align with ethical optimality. Regulators are slow, jurisdiction-bound, and technically limited. Affected communities — including the pedestrians who will be killed or spared by the algorithm — were never consulted. Users have a conflict of interest: they want the car to protect them, not the pedestrian.
Part 5: The Utilitarian Approach and Its Critics
The most natural first response to the Moral Machine problem is to apply straightforward utilitarianism: program the vehicle to minimize deaths. Not to minimize deaths of specific types of people, but simply to minimize the number of lives lost in unavoidable collision scenarios.
This approach has several advantages. It is simple, transparent, and consistent. It does not discriminate among lives on the basis of age, gender, social status, or other characteristics. It is defensible in most legal frameworks. And it aligns with the intuition that all lives count equally.
But the "minimize deaths" utilitarian approach generates its own problems.
First, it requires solving a complex prediction problem: in a given emergency scenario, which course of action will actually minimize deaths? This depends on the physics of the collision, the characteristics of the parties involved, and predictions about injury severity — all of which are uncertain, often in milliseconds.
Second, even a death-minimizing algorithm implicitly weights some lives over others in many scenarios. A vehicle that swerves to avoid a group of five pedestrians and kills the sole passenger is "minimizing deaths" — but it is also making the passenger's death a certainty in exchange for making the pedestrians' deaths probabilistic. Whether the passenger would consent to this bargain — and whether their non-consent is morally relevant — is exactly the trolley problem.
Third, and most importantly, the utilitarian approach does not actually resolve the question of who encodes the preference. "Minimize deaths" seems like a neutral, preference-free rule — but it embeds philosophical choices. It assumes that all lives are equal in value (a choice not all ethical traditions share). It assumes that death is the only relevant welfare consideration (ignoring injury severity, psychological trauma, family impacts). It assumes a specific decision procedure under uncertainty. None of these assumptions is neutral.
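These embedded choices can be made visible in code. The sketch below shows the simplest possible "minimize deaths" rule under uncertainty: pick the maneuver with the lowest expected fatalities. All names and probabilities are hypothetical, and the comments flag where each contested assumption enters.

```python
# Simplest "minimize deaths" rule: each candidate maneuver carries, for each
# person involved, an estimated probability of death; pick the maneuver with
# the lowest expected fatalities. Every number below is hypothetical.
def expected_deaths(maneuver):
    # Assumption 1: every life counts equally (not all traditions agree).
    # Assumption 2: death is the only welfare consideration (injury severity,
    # trauma, and consent are ignored).
    # Assumption 3: expected value is the right way to aggregate uncertainty.
    return sum(p for _, p in maneuver["death_probs"])

maneuvers = [
    {"name": "brake_straight",
     # (person, estimated probability this maneuver kills them)
     "death_probs": [("pedestrian_A", 0.9), ("pedestrian_B", 0.9)]},
    {"name": "swerve_to_barrier",
     "death_probs": [("passenger", 0.7)]},
]

best = min(maneuvers, key=expected_deaths)
print(best["name"])  # swerve_to_barrier: 0.7 expected deaths vs. 1.8
```

Note that the rule trades two probabilistic pedestrian deaths for one near-certain passenger death, which is exactly the consent problem raised above: the arithmetic is neutral in appearance only.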
The most pointed critique of utilitarian autonomous vehicle ethics comes from the marketing dimension. If consumers know that the car they are buying is programmed to sacrifice them for others in certain scenarios, they may be reluctant to buy the car. This is not merely a commercial problem — it is an ethical one. A utilitarian mandate that produces a world without autonomous vehicles (because no one buys them) produces worse outcomes than a utilitarian mandate that produces an autonomous vehicle fleet with some moral impurities. The second-order consequences of ethical rules for autonomous vehicles include the adoption rate of the technology, which determines the aggregate welfare impact.
Part 6: Alternative Frameworks
The Deontological Alternative: Never Program to Kill
A deontological alternative to the utilitarian approach has been articulated by several philosophers, among them Sven Nyholm and Jilles Smids, who have also questioned how far trolley-style reasoning should govern accident algorithms at all. On this view, autonomous vehicles must never be programmed to deliberately kill a specific person or class of persons. Killing may be a foreseen outcome of the vehicle's evasive actions, but it must not be the intended means.
This is a Kantian constraint: people may not be used merely as means, even in service of life-saving ends. A vehicle that is programmed to swerve away from a larger group, foreseeing that it will hit a smaller group, is not deliberately killing the smaller group — it is choosing the lesser harm. A vehicle that is programmed to calculate whether the pedestrian's or the passenger's life is worth more and kill the less valuable one is using a person merely as a means.
Critics argue that this distinction — between intended and foreseen harm — is metaphysically unstable in the autonomous vehicle context, because the algorithm's "intentions" are human-programmed in advance. But the deontological constraint has practical value: it prohibits the most disturbing implications of utilitarian autonomous vehicle ethics — the programming of social status discriminators, the age-weighting of lives, the gender-differential protection — without requiring a clean answer to every collision scenario.
The Capabilities Alternative: Protect the Most Vulnerable
A capabilities approach to autonomous vehicle ethics focuses not on aggregate welfare or individual rights but on the protection of human capabilities, with particular attention to the most vulnerable. This approach asks: which potential victims of autonomous vehicle collisions have the fewest alternative protections, the least ability to avoid harm, and the greatest dependency on the system for their safety?
The answer points toward pedestrians over passengers (passengers consented to the vehicle's risk profile; pedestrians did not), toward people with disabilities who may be less able to move out of harm's way, and toward people in communities with less political power to shape the regulatory environment governing autonomous vehicles.
A capabilities approach does not generate a complete decision algorithm, but it does generate a design principle: when uncertain, prioritize those with the least power to protect themselves. This is a different utilitarian calculus than "maximize lives saved" — it is a distribution-sensitive calculus that weights the welfare of the most vulnerable more heavily.
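The difference between plain death-minimization and a distribution-sensitive rule can be shown in a few lines. The vulnerability weights below are illustrative placeholders: choosing their values is precisely the contested normative decision described above, not an engineering detail.

```python
# Distribution-sensitive sketch: multiply each person's death risk by a
# vulnerability weight before aggregating. The weights are illustrative
# placeholders, not recommendations.
VULNERABILITY = {
    "passenger": 1.0,        # consented to the vehicle's risk profile
    "pedestrian": 1.5,       # did not consent; fewer protections
    "wheelchair_user": 2.0,  # least able to move out of harm's way
}

def weighted_risk(maneuver):
    return sum(VULNERABILITY[role] * p for role, p in maneuver["death_probs"])

maneuvers = [
    {"name": "brake_straight",
     "death_probs": [("wheelchair_user", 0.6)]},
    {"name": "swerve",
     "death_probs": [("passenger", 0.9)]},
]

# Plain death-minimization would brake straight (0.6 vs. 0.9 expected deaths);
# the vulnerability-weighted rule swerves instead (1.2 weighted vs. 0.9).
best = min(maneuvers, key=weighted_risk)
print(best["name"])  # swerve
```

The example shows the design principle in action: the same physical situation yields a different decision once the calculus weights the welfare of the least protected party more heavily.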
Part 7: What Car Manufacturers Actually Do — And Why They're Not Telling Us
The autonomous vehicle industry has been remarkably reticent about its ethical decision protocols. Companies including Tesla, Waymo, and General Motors have publicly deflected questions about trolley-problem scenarios on multiple grounds: the scenarios are unrealistic, the technology's primary goal is to avoid all collisions rather than optimize among them, and the companies lack the legal and ethical authority to make these decisions unilaterally.
There is some validity in each of these responses. Genuinely unavoidable collision scenarios are rare — the vast majority of autonomous vehicle ethical cases involve detection accuracy, system reliability, and edge case handling rather than the clean moral dilemmas of philosophy seminars. And companies are correct that they do not have unilateral authority to make life-and-death ethical decisions for society.
But the deflections are also evasions. Autonomous vehicles do have decision protocols for emergency scenarios, even if those protocols have not been publicly disclosed or deliberately designed with explicit ethical principles. The absence of deliberate ethical design is itself an ethical choice — it means that whatever preferences are encoded are the incidental products of engineering decisions made without moral oversight.
The industry's preference for opacity on this question has a self-interested explanation: explicit disclosure of ethical decision protocols creates legal liability. If a manufacturer declares that its vehicle will always protect passengers over pedestrians, it has created an explicit standard against which its behavior in any accident will be measured. Opacity protects against liability even as it prevents democratic deliberation about choices that affect public safety.
This opacity is itself an ethics problem. It represents exactly the kind of accountability gap that AI ethics frameworks are designed to address.
Part 8: Discussion Questions
1. The Moral Machine data shows genuine cross-cultural variation in moral preferences about autonomous vehicle decisions. Should vehicle manufacturers program different ethical rules for vehicles sold in different countries? What are the arguments for and against cultural customization of AI ethical rules?

2. The utilitarian argument for autonomous vehicles — they will save millions of lives — is compelling. Does this aggregate benefit justify deploying vehicles with imperfect, potentially discriminatory decision algorithms in the short term, on the grounds that the long-term welfare benefit is enormous? What conditions, if any, would make this argument acceptable?

3. The deontological position that vehicles should never be programmed to deliberately kill a specific person seems clear — but in the Moral Machine scenarios, all choices involve foreseeable deaths. Is there a meaningful moral distinction between "intending" a death and "foreseeing but not intending" a death in an algorithmically pre-programmed system?

4. The capabilities approach suggests weighting the welfare of the most vulnerable more heavily in autonomous vehicle design. Concretely, what would this mean? Who counts as "most vulnerable" in traffic scenarios — children, elderly people, pedestrians, cyclists, people with disabilities? And who should decide?

5. The case notes that manufacturers have been deliberately opaque about their ethical decision protocols in order to limit legal liability. Should regulators require public disclosure of autonomous vehicle ethical decision rules? What are the likely consequences — intended and unintended — of mandatory disclosure?
This case study connects to Section 3.2 (Consequentialism) and Section 3.9 (Combining Frameworks) of the main chapter. The Moral Machine experiment data is available at moralmachine.net. The original paper is: Awad, E., Dsouza, S., Kim, R., Schulz, J., Henrich, J., Shariff, A., Bonnefon, J.-F., and Rahwan, I. (2018). The Moral Machine Experiment. Nature, 563(7729), 59–64.