
Chapter 2: A Brief History of AI and Its Ethical Concerns


Opening: A Warning That Was Ignored

In 2016, Microsoft launched Tay — an AI chatbot designed to learn from Twitter conversations. The idea seemed clever: deploy a system that could pick up the slang, humor, and conversational patterns of young people through real-time interaction, making it feel fresh and responsive rather than scripted. Within 24 hours, internet users had trained it to produce racist, antisemitic, and sexist content. Tay declared that the Holocaust had not happened, expressed admiration for Adolf Hitler, and produced a stream of targeted harassment. Microsoft shut it down. The company issued an apology and called the episode "a coordinated attack." The phrase was technically accurate but deeply misleading. What happened to Tay was not a bolt from the blue. It was the entirely predictable result of deploying a learning system, in an adversarial environment, without adequate safeguards — a failure mode that researchers had identified and discussed for decades.

Tay was not an anomaly. It was a warning that had been ignored since the earliest days of AI research.

The story of artificial intelligence and ethics is not a story of progress interrupted by occasional accidents. It is a story in which the same failures recur, in new technical clothing, at increasing scale. The cast changes. The underlying dynamic — systems built without adequate attention to who will use them, who will be harmed, and who will be held accountable — does not. Understanding that pattern is the purpose of this chapter.

For business professionals, this history is not merely academic context. It is a record of organizational decisions and their consequences: of engineers who warned and were overruled, of executives who deployed without testing, of lawyers who treated harm as a liability problem rather than an ethical one, and of regulators who arrived years too late. The organizations that navigate AI responsibly in the coming decades will be those that can recognize these patterns quickly enough to avoid repeating them.

This chapter moves through AI history not as a triumphalist narrative of technical achievement — there are plenty of those — but as an ethical audit of the field's choices. We ask: Who held the power? Who bore the costs? Who answered for the outcomes? The answers are consistent enough to be disturbing, and consistent enough to be useful.


Learning Objectives

By the end of this chapter, you should be able to:

  1. Describe the major phases of AI development from 1950 to the present, and identify the ethical concerns that emerged in each phase.
  2. Explain why the AI winters matter for understanding how hype cycles create governance failures.
  3. Trace the origins of algorithmic bias as a documented, studied problem, naming key investigations and their findings.
  4. Distinguish between ethics washing — the deployment of ethics language without substantive commitment — and genuine organizational ethics practice.
  5. Analyze the pattern of harm, outcry, investigation, and delayed response that characterizes AI governance failures.
  6. Evaluate the role of civil society, journalism, and academic researchers in holding AI systems accountable when organizations fail to do so.
  7. Connect historical AI ethics failures to contemporary organizational decisions about AI deployment.
  8. Recognize why generative AI's speed of deployment poses governance challenges that are qualitatively different from earlier AI waves.

Section 2.1: The Birth of AI and Early Ethical Intuitions (1950–1965)

Turing's Questions

Alan Turing's 1950 paper "Computing Machinery and Intelligence," published in the philosophy journal Mind, is famous for introducing what Turing called the imitation game — a test for whether a machine can exhibit intelligent behavior indistinguishable from a human. It has been retold so many times as a founding myth of artificial intelligence that readers sometimes miss what Turing was actually doing in the paper: systematically anticipating objections.

Turing devoted a substantial portion of the paper to working through nine different objections to the possibility of machine intelligence — the theological objection, the mathematical objection, the argument from consciousness, the argument from various disabilities, and several others. This structure was not incidental. Turing understood that the question of machine intelligence was not merely technical; it was philosophical, social, and, implicitly, ethical. He was building an argument about what it would mean for a machine to think, and he knew that argument would face resistance rooted in prior commitments about what made human beings special.

Two of Turing's responses to anticipated objections are particularly prescient for an ethics discussion. First, his handling of the "argument from various disabilities" — the claim that machines could never truly do X, where X was filled in with creativity, humor, kindness, moral judgment — led Turing to observe that such arguments were frequently not arguments at all, but expressions of discomfort with an unfamiliar category. This observation anticipates by decades the literature on how human beings anthropomorphize AI systems in ways that distort their judgment about those systems' capabilities and limitations.

Second, and more directly ethical, Turing raised the question of what it would mean for a machine to learn — to be exposed to educational inputs and develop capabilities that were not explicitly programmed. He was gesturing toward what we now call machine learning, and his gesture included an awareness that learning systems would be shaped by their inputs in ways that could be problematic. He did not develop this into a full critique. But the intuition was there.

Turing was also aware that his work would have social consequences beyond the laboratory. He noted in a 1951 radio broadcast that "it seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers" — a statement often quoted for its technological optimism but which, in context, was accompanied by uncertainty about what such outstripping would mean for human society and human employment.

The Dartmouth Confidence

The 1956 Dartmouth Summer Research Project on Artificial Intelligence is typically celebrated as the founding moment of AI as a field — the event that gave "artificial intelligence" its name and coalesced a community of researchers around a shared ambition. The proposal written by John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon contains a sentence that is worth sitting with: "The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it."

Note the confidence of "in principle" and the ambition of "every aspect." The proposal was written on the assumption that human-level artificial intelligence was roughly a decade away. The researchers were not being deliberately dishonest; they were enthusiastic about a new field and genuinely believed the problem was more tractable than it turned out to be. But this confidence had consequences that stretched far beyond 1956.

The Dartmouth optimism established a pattern: AI researchers tended to announce timelines and capabilities that significantly outpaced what their systems could actually deliver. This pattern mattered ethically not simply because it was inaccurate, but because it shaped investment, policy, and public expectation in ways that created accountability gaps. When a capability is promised and not delivered, the gap is visible and can be corrected. When a capability is deployed as if it were more robust than it actually is — when a medical diagnosis system is treated as reliable when it is not, or when a predictive policing algorithm is treated as objective when it encodes bias — the consequences are invisible and often fall on those with least power to object.

Wiener's Warning

The most important ethical text produced in the early AI era was not written by an AI researcher. Norbert Wiener's The Human Use of Human Beings (1950), a popularization of his earlier technical work on cybernetics, is arguably the first book-length treatment of what we would now call AI ethics. Wiener was concerned with feedback systems — how machines could regulate their own behavior by taking in information from their environment — and he saw, clearly and early, that such systems would have profound social implications.

Wiener's central concern was what he called "the proper use of machines" — a phrase that places ethics at the center of technology deployment rather than treating it as an afterthought. He argued that the capacity of feedback systems to replace human judgment in complex environments was not inherently good or bad, but that its consequences depended entirely on the social context in which it was deployed. A machine that optimized for the wrong objective, or that operated in a context its designers had not adequately anticipated, could cause serious harm — and the harm would be worse precisely because the machine would pursue its objective efficiently and without the moral hesitation a human actor might experience.

Wiener also raised what is now called the "alignment problem" in embryonic form: the risk that a machine optimizing for a specified objective might achieve that objective in ways that violated human values, because the specification was incomplete or because the designers had not fully thought through the machine's operating environment. He used the example of a genie that grants wishes in technically literal but substantively harmful ways — if you ask for a stock market prediction and the genie crashes the market to make the prediction accurate, it has succeeded by any narrow technical standard while causing catastrophic harm.

For business professionals, Wiener's framing is worth internalizing: the question is never simply "does the AI system work?" The question is "what is it optimizing for, in what environment, with what constraints, and what are the consequences if it achieves its objective in unexpected ways?"

The Philosophical Debates

The period from the late 1950s into the early 1980s produced a rich philosophical literature challenging AI researchers' assumptions about what intelligence was and whether machines could genuinely possess it. The most influential single contribution was John Searle's 1980 "Chinese Room" thought experiment, published in the journal Behavioral and Brain Sciences.

Searle asked us to imagine a person who does not understand Chinese locked in a room with a rulebook for responding to Chinese symbols. The person receives Chinese symbols through a slot, follows the rules to produce Chinese symbol responses, and passes those responses back. From outside the room, the system appears to understand Chinese. But the person inside understands nothing. Searle's argument was that syntax — the manipulation of symbols according to rules — is not sufficient for semantics — genuine understanding and meaning. A machine running an AI program might produce outputs indistinguishable from those of a Chinese speaker without understanding anything.

The Chinese Room generated enormous controversy and has never been definitively resolved. But its ethical relevance extends beyond the narrow question of machine consciousness. The thought experiment highlights a practical concern that matters enormously for AI deployment: an AI system can produce outputs that look like understanding without actually having the context, the values, or the common sense to handle edge cases responsibly. When a language model generates a confident-sounding medical diagnosis, a legal argument, or a business decision, it is doing something much closer to the Chinese Room's rule-following than to genuine comprehension. The outputs can be compelling. They can also be wrong in ways that a genuinely knowledgeable human would instantly recognize.

Early Automation Anxiety

The 1960s saw the first wave of serious public concern about automation and employment — concern that was in many ways more grounded than the hype about AI capabilities. The President's Commission on Technology, Automation, and Economic Progress, established by Lyndon Johnson in 1964, produced a 1966 report that attempted to assess the actual pace of automation and its labor market consequences.

The commission's findings were nuanced in a way that the public debate often was not: automation was displacing workers, particularly in manufacturing, but new jobs were also being created. The net effects on employment depended heavily on factors like education, geographic mobility, and the policy environment. The ethical question was not whether automation was happening — it clearly was — but who bore the costs of the transition and whether those costs were fairly distributed.

This question remains live today, and the 1960s debate prefigures contemporary arguments about AI and labor in ways that should make business professionals humble about claims that this time will be different. The broad outlines of the argument — AI creates some jobs, destroys others, and the distribution of gains and losses is shaped by power — were visible in the 1960s. They remain visible now.


Section 2.2: The AI Winters and Their Lessons (1974–1993)

The First Winter

The first AI winter began around 1974 and lasted until approximately 1980. It was precipitated by a series of critical assessments of the field's progress — most notably the 1973 Lighthill Report in the United Kingdom, commissioned by the Science Research Council and authored by mathematician James Lighthill.

Lighthill's assessment was sharp: AI had failed to deliver on its promises. Machine translation, which researchers had confidently predicted would be solved within a few years, remained far beyond the field's capabilities. Problem-solving systems worked in narrow, carefully controlled environments but collapsed when applied to real-world complexity. Lighthill argued that the combinatorial explosion — the way the number of possible states in a realistic problem grows too fast for any search algorithm to handle — was a fundamental barrier that AI researchers had consistently underestimated.

The government funding cuts that followed in the UK, and similar reductions in the US (where the DARPA funding that had sustained much early AI research was substantially curtailed), had a sobering effect on the field. But they also produced an important lesson that the field would later struggle to apply: the gap between laboratory performance and real-world deployment is not a minor engineering problem to be solved by more compute or more data. It is a fundamental challenge related to the complexity and unpredictability of real environments.

For ethics, the first winter matters because it established a pattern of hype followed by disillusionment. The researchers who had made confident promises about AI timelines had not merely been optimistic; they had attracted resources, shaped policy, and created expectations that the technology could not meet. When funding dried up, the reputational damage was substantial. But more important for our purposes is what did not happen: there was no systematic reckoning with why the promises had been overstated, what organizational incentives had driven the overstatement, and how future promises might be made more carefully.

The Second Winter

The second AI winter (roughly 1987–1993) had a more specific cause: the failure of expert systems. Expert systems were AI programs that encoded the knowledge of human specialists — doctors, geologists, financial analysts — in rule-based systems that could replicate their decision-making. They were commercially successful enough that by the mid-1980s, companies like Digital Equipment Corporation were using expert systems for configuration management and claiming significant cost savings.

The problems appeared when organizations tried to scale and maintain these systems. Expert knowledge, it turned out, was far harder to encode in explicit rules than the theory suggested. Experts themselves often could not articulate their reasoning clearly enough to permit accurate encoding. Rules that worked in one context failed in adjacent contexts. Maintenance was extremely expensive: every change in the domain required human experts to update the rule base, making the systems brittle and costly over time. By the late 1980s, the expert systems bubble had burst, and AI funding contracted again.

The second winter reinforced the lesson of the first while adding a new dimension. Expert systems failed not because the underlying ideas were entirely wrong — rule-based reasoning is a legitimate tool — but because their proponents had deployed them in high-stakes environments without adequate understanding of their limitations. The systems were represented to customers and decision-makers as more reliable than they actually were. When they failed — occasionally causing real operational problems in industries like financial services — the failures were often quietly absorbed rather than publicly documented.

This pattern of quiet failure-absorption is important. It meant that the lessons of expert system failures were not widely shared. Organizations that had invested in and been burned by expert systems often had strong incentives to conceal those failures rather than publicize them. The result was that the broader ecosystem did not learn from the errors in an organized way, making similar errors more likely in future AI deployments.

What the Winters Teach

The AI winters teach several lessons that remain directly relevant to contemporary AI deployment:

First, overpromising creates governance gaps. When AI is promised to solve problems that it cannot actually solve, organizations make decisions based on capabilities that do not exist. Those decisions create dependencies and vulnerabilities that are difficult to unwind when the limitations become apparent.

Second, the gap between controlled demonstration and real-world deployment is not a detail. Every AI system that performs impressively in a laboratory or a pilot program faces a reckoning when it encounters the full complexity of real-world use. Organizations that do not test thoroughly for edge cases, adversarial conditions, and distribution shift will be surprised — often unpleasantly.

Third, failure absorption is a governance failure. When organizations quietly absorb AI failures rather than documenting and sharing them, the broader field does not learn. This creates conditions for the same mistakes to recur in new contexts. Business professionals who oversee AI deployments have an organizational and ethical interest in honest post-mortems, even when honesty is uncomfortable.


Section 2.3: The Rise of Machine Learning and the First Data Ethics Problems (1990–2012)

Statistical Learning Arrives

The late 1980s and 1990s saw a gradual shift in AI from symbolic approaches — rule-based systems that represented knowledge explicitly — to statistical approaches that learned patterns from data. This shift, which culminated in the deep learning revolution of the 2010s, fundamentally changed the character of AI systems and, consequently, the character of their ethical risks.

Symbolic AI systems encoded bias in their rules — rules written by human experts that reflected human assumptions, blind spots, and values. Statistical AI systems encoded bias in their data — training examples that reflected historical patterns of human decision-making, including all the inequities those patterns contained. The locus of ethical risk shifted, but the risk did not diminish; in many respects, it intensified.

Statistical learning systems had two properties that made their ethical risks particularly challenging. First, their reasoning was not easily inspectable: unlike a rule-based system where you could read the rules and evaluate their fairness, a statistical model's "reasoning" was distributed across millions of parameters in ways that were difficult to interpret even for technical experts. Second, their outputs were expressed in the language of probability and statistics, which gave them a surface appearance of objectivity even when the underlying data was deeply inequitable.

FICO and Disparate Impact

The Fair Isaac Corporation's credit scoring system — FICO scores — was not an AI system in the modern sense. Fair Isaac began selling statistical credit scoring models in the late 1950s, and the general-purpose FICO score it introduced in 1989 used various financial data points to predict creditworthiness. But the system illustrates with particular clarity a problem that would become central to AI ethics: the translation of historical inequality into algorithmic outputs treated as objective.

FICO scores depended on data that reflected decades of discriminatory lending practices. Redlining — the practice of denying mortgage loans to residents of neighborhoods marked on maps as "hazardous," typically Black neighborhoods — had systematically excluded Black Americans from wealth-building through homeownership. This exclusion showed up in FICO scores as lower creditworthiness, not because of individual financial irresponsibility, but because an entire population had been systematically denied access to the financial instruments that build credit history. The algorithm then used that credit history as if it were an objective measure of individual character rather than a reflection of structural exclusion.

The Fair Housing Act of 1968 and the Equal Credit Opportunity Act of 1974 prohibited discrimination in lending, but they defined discrimination in ways that algorithmic systems could technically satisfy while perpetuating discriminatory outcomes. A lender who did not explicitly consider race but used variables that were heavily correlated with race — ZIP code, credit history, employment in certain sectors — could produce racially disparate lending outcomes while maintaining plausible deniability. This is what lawyers call disparate impact, and it is a problem that would recur in virtually every domain where AI systems were applied to decisions about people.
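
The mechanism behind disparate impact is easy to demonstrate. The following toy simulation (all numbers are invented for illustration) sketches a "group-blind" lending model that never sees an applicant's group but relies on a ZIP-code risk flag that historical redlining has correlated with group membership:

```python
import random

random.seed(0)

# Toy illustration, not real lending data. The model never consults
# the applicant's group; it relies only on a ZIP-derived risk flag
# that historical redlining has correlated with group membership.

def applicant(group):
    # Redlining concentrated group B in ZIP codes that the historical
    # data marks as "high risk", independent of individual behavior.
    zip_high_risk = random.random() < (0.8 if group == "B" else 0.2)
    return {"group": group, "zip_high_risk": zip_high_risk}

def model_approves(a):
    # The "group-blind" model: approval depends only on the ZIP flag.
    return not a["zip_high_risk"]

pool = [applicant("A") for _ in range(10_000)] + \
       [applicant("B") for _ in range(10_000)]

for g in ("A", "B"):
    members = [a for a in pool if a["group"] == g]
    rate = sum(model_approves(a) for a in members) / len(members)
    print(f"group {g}: approval rate {rate:.0%}")
```

Nothing in the model references group membership, yet the approval rates differ by roughly a factor of four: disparate impact without disparate treatment.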

Early Facial Recognition and Exclusion

In 1991, Matthew Turk and Alex Pentland published a landmark paper on "Eigenfaces" — a mathematical technique for recognizing faces in images. The paper demonstrated a system that could identify individuals from photographs with reasonable accuracy in controlled conditions. It was technically impressive and foundationally important for computer vision. It was also built on datasets that were almost entirely white, largely male, and photographed under laboratory lighting conditions.

The choice of training data was not perceived as an ethical problem in 1991; it was simply the data that was available and easy to collect. But the consequence was a system that worked far better on faces that resembled the training data than on faces that did not. Dark-skinned faces, faces with non-European features, faces photographed under uneven lighting — all of these were handled less well by systems trained on homogeneous datasets. This was not a theoretical concern; it showed up in error rates in deployment.

What makes the early facial recognition story ethically significant is not that researchers in 1991 were malicious. They were not. What is significant is that the absence of diversity in training data was not identified as a problem requiring attention. The implicit assumption was that the available data was representative, or that the system's performance on that data was an adequate indicator of its performance on all faces. Both assumptions were wrong, and the wrongness had consequences that fell disproportionately on people who were already marginalized.

The Data Exhaust Economy

The growth of the commercial internet in the late 1990s and early 2000s created an entirely new category of data: behavioral data generated as a byproduct of online activity. Every search, every click, every purchase, every social network interaction left a digital trace. Technology companies discovered that these traces could be aggregated, analyzed, and used to predict behavior — for advertising, but also for a vast range of other purposes.

This "data exhaust" — a term that circulated widely in early-2010s critiques of big data, including danah boyd and Kate Crawford's 2012 analysis — was valuable precisely because it was generated without conscious intention by its subjects. People sharing vacation photos on social media, searching for medical symptoms, or reading news articles were not thinking of themselves as producing a data product. They were simply living their lives. The companies capturing that data were accumulating, largely without public awareness or regulatory attention, the raw material for systems that could make consequential inferences about people's health, finances, political views, sexual orientation, and other aspects of identity.

The ethical problem with data exhaust is not simply privacy in the narrow sense of keeping secrets. It is the asymmetry of power created when one party has detailed behavioral data about another party and can use it to make consequential decisions — about what advertisements to show, what content to surface, what products to offer, what prices to charge — while the subject of that data has no visibility into how it is being used. This asymmetry would become one of the defining ethical issues of the deep learning era.

The Filter Bubble

Eli Pariser's 2011 book The Filter Bubble: What the Internet Is Hiding from You brought algorithmic curation into mainstream public discourse. Pariser coined the term to describe the way personalization algorithms — the systems that decided what appeared in your Facebook news feed, what Google results you saw, what YouTube videos were recommended — created individualized information environments that showed users content consistent with their existing views and preferences.

The filter bubble was an unintended consequence of optimization for engagement. If an algorithm learned that you clicked more on content that confirmed your political views than on content that challenged them, it would show you more confirming content — not because anyone had programmed it to reinforce your biases, but because that was what maximizing your engagement implied. The algorithm was doing exactly what it was designed to do, in ways that had broad social consequences its designers had not adequately considered.
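
The feedback loop can be sketched in a few lines. In this deterministic toy simulation (the categories and click probabilities are invented), a recommender greedily shows whichever content category has the higher observed click-through rate; nothing in the code mentions beliefs or bias:

```python
# A deterministic sketch of engagement optimization. The (invented)
# user clicks confirming content 70% of the time and challenging
# content 30%. The recommender greedily shows whichever category has
# the higher observed click-through rate, updating with expected
# clicks so the run is exactly reproducible.

CLICK_PROB = {"confirming": 0.7, "challenging": 0.3}
clicks = {"confirming": 1.0, "challenging": 1.0}  # neutral priors
shows = {"confirming": 2.0, "challenging": 2.0}

history = []
for _ in range(5_000):
    # Greedy choice: the category with the best observed
    # click-through rate so far (ties broken by insertion order).
    item = max(clicks, key=lambda k: clicks[k] / shows[k])
    shows[item] += 1
    clicks[item] += CLICK_PROB[item]  # expected clicks from showing it
    history.append(item)

share = history.count("confirming") / len(history)
print(f"confirming content filled {share:.0%} of 5,000 slots")
```

Starting from a dead-even prior, the first confirming impression nudges its estimated click-through rate above the tie, and from then on challenging content is never shown again. No one programmed the system to reinforce the user's views; maximizing clicks was sufficient.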

Pariser's analysis was controversial — researchers debated whether filter bubbles were as severe as he claimed, and some empirical studies found more limited effects than the popular narrative suggested. But the underlying mechanism he identified was real and would become more important, not less, as algorithms became more powerful and more widely used for information curation. The question of what information people see, and therefore what reality they inhabit, is not a minor technical footnote. It is one of the central questions of democratic governance in an algorithmic age.


Section 2.4: The Deep Learning Revolution and the Acceleration of Harm (2012–2020)

AlexNet and the Inflection Point

In September 2012, a deep neural network called AlexNet, developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton at the University of Toronto, won the ImageNet Large Scale Visual Recognition Challenge by a margin that shocked the computer vision community. Previous winners had error rates around 25–26%. AlexNet achieved 15.3% — more than 10 percentage points better than the runner-up's 26.2%. The deep learning era had begun.

What AlexNet demonstrated was that very deep neural networks, trained on very large datasets, with sufficient computational power, could learn visual representations far more powerful than anything achievable through hand-engineered features. The technique generalized: within a few years, deep learning was transforming not just computer vision but natural language processing, speech recognition, drug discovery, and dozens of other fields. Investment poured into AI. The major technology companies built enormous AI research and development operations. Graduate students who might have gone into finance or consulting were suddenly in high demand.

The deep learning revolution created systems powerful enough to cause large-scale harm, deployed quickly enough to avoid the deliberate governance processes that might have caught those harms earlier. This is not an argument against deep learning as a technical approach. It is an observation about the relationship between technical capability and organizational responsibility. More powerful systems require more careful deployment. The culture of rapid iteration that had served software development well in relatively low-stakes domains — consumer web applications, mobile games — was applied to domains where mistakes had real costs to real people.

The Great Bias Reckoning: Four Cases

Between 2015 and 2018, a series of documented cases of algorithmic bias brought AI ethics from academic conferences into mainstream business and policy conversations. Four deserve particular attention.

Google Photos and Gorilla Labels (2015). In June 2015, a software developer named Jacky Alciné discovered that Google Photos had automatically tagged photographs of himself and a friend, both of whom are Black, with the label "gorillas." The error was not random noise; it reflected a pattern in how Google's image recognition system had been trained. Training datasets for computer vision had historically overrepresented lighter-skinned faces and underrepresented darker-skinned ones, leading to degraded performance on images of Black individuals. Google apologized and removed "gorillas" as a label category — an inadequate fix that addressed the symptom while leaving the underlying data problem unaddressed. Three years later, reporting by Wired found that "gorillas," "chimps," and related labels had simply been blocked rather than the underlying recognition system improved.

COMPAS and Machine Bias (2016). In May 2016, ProPublica published an investigative report titled "Machine Bias" that examined the COMPAS recidivism prediction tool, used in courts across the United States to assess the likelihood that a defendant would reoffend. ProPublica's analysis found that the tool was biased against Black defendants: Black defendants were nearly twice as likely as white defendants to be falsely flagged as high risk for future crime, while white defendants were more likely to be incorrectly flagged as low risk. The company that made COMPAS, Northpointe, disputed ProPublica's methodology, and subsequent academic analyses debated the appropriate statistical definition of fairness. But the core finding — that an algorithm used in consequential criminal justice decisions was producing racially disparate outcomes — was not seriously contested. The case became foundational for the algorithmic accountability literature because it illustrated the impossibility of simultaneously satisfying multiple intuitive fairness criteria when base rates differ across groups.

Amazon's Hiring Algorithm (2018). In October 2018, Reuters reported that Amazon had abandoned a machine learning tool designed to automate the screening of job applicants. The system had been trained on resumes submitted to Amazon over a 10-year period — a period during which Amazon's workforce was predominantly male. The system learned, accordingly, to downgrade resumes that contained words like "women's" (as in "women's chess club") and to penalize graduates of all-women's colleges. The bias was not programmed explicitly; it emerged from the pattern recognition the system applied to historical data that reflected a historical hiring pattern Amazon was ostensibly trying to change. Amazon disbanded the team working on the tool when the bias became apparent, but the case illustrated a principle that was becoming hard to avoid: you cannot train a system on biased historical data and expect it to produce unbiased outputs.
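
The way a label pattern in historical data becomes a learned penalty can be shown with a toy scorer. The resume data below is invented for illustration; the scoring rule is a deliberately crude frequency model, not Amazon's actual system:

```python
from collections import Counter

# Toy sketch with invented data: a screening model trained on
# historical hiring decisions. In this history, resumes containing
# the token "womens" were hired less often, so a simple
# frequency-based scorer learns to penalize the token even though
# gender is never coded explicitly anywhere.

history = (
    [({"chess", "club"}, 1)] * 60 +            # hired
    [({"chess", "club"}, 0)] * 40 +            # not hired
    [({"womens", "chess", "club"}, 1)] * 20 +  # hired less often
    [({"womens", "chess", "club"}, 0)] * 80    # not hired
)

hired = Counter()
total = Counter()
for words, label in history:
    for w in words:
        total[w] += 1
        hired[w] += label

def word_score(w):
    # Fraction of historical resumes containing w that were hired.
    return hired[w] / total[w]

print(f"score('chess'):  {word_score('chess'):.2f}")   # 0.40
print(f"score('womens'): {word_score('womens'):.2f}")  # 0.20
```

The scorer has faithfully learned the historical pattern — which is exactly the problem, since the pattern was the thing the organization was ostensibly trying to change.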

Gender Shades (2018). Joy Buolamwini, a researcher at the MIT Media Lab, published research with Timnit Gebru — then at Microsoft Research — demonstrating that commercial facial analysis systems from IBM, Microsoft, and Face++ performed significantly worse on darker-skinned faces and female faces than on lighter-skinned and male faces. Error rates for darker-skinned women reached 34.7 percent, compared with less than 1 percent for lighter-skinned men. The research was technically careful, using a dataset specifically constructed to be balanced across skin tone and gender, and the results were striking. IBM initially disputed the findings; Microsoft acknowledged them and committed to improving its systems.

Buolamwini's work was significant not only for its technical findings but for its advocacy implications. She had founded the Algorithmic Justice League in 2016, an organization dedicated to raising awareness of algorithmic bias and centering the voices of those most affected by it. The Gender Shades paper demonstrated that the organizations selling facial analysis systems commercially had not conducted the kind of systematic evaluation of their systems' performance across demographic groups that basic due diligence would require. They were selling systems as capable when those systems failed, dramatically and consistently, on a large portion of the human population.

Algorithmic Accountability as a Field

The years from 2016 to 2020 saw the emergence of algorithmic accountability — both as an academic field and as a practical demand. The term encapsulates a set of related questions: How do we evaluate whether an algorithmic system is producing fair outcomes? Who has the right to know how an algorithm affecting them works? Who is responsible when an algorithm causes harm?

Academic conferences focused on fairness, accountability, and transparency in machine learning systems — most notably the ACM Conference on Fairness, Accountability, and Transparency (FAccT, originally FAT*) — grew rapidly in this period. The field produced important theoretical work on the mathematics of fairness: demonstrating, for example, that under certain conditions, it is mathematically impossible to simultaneously satisfy multiple intuitive definitions of fair classification. This result — known informally as the "impossibility theorem" for fairness — is not a counsel of despair, but it does mean that choices between fairness criteria are value choices, not purely technical ones. Organizations cannot hide behind "the algorithm said so" when the algorithm embodies a choice about whose interests to prioritize.
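The tradeoff behind the impossibility result can be made concrete with a few lines of arithmetic. The sketch below is illustrative only — the function name and the specific numbers are hypothetical, not drawn from this chapter — but the identity it uses is a standard consequence of the definitions of base rate, precision (PPV), and true positive rate (TPR). It shows that if a classifier achieves identical precision and sensitivity for two groups whose base rates differ, its false positive rates for the two groups cannot be equal.

```python
def fpr_given(base_rate, ppv, tpr):
    """False positive rate implied by a group's base rate (p), the
    classifier's precision (PPV), and its true positive rate (TPR).

    From TP = TPR * p * n, FP = FPR * (1 - p) * n, and
    PPV = TP / (TP + FP), solving for FPR gives:
        FPR = (p / (1 - p)) * ((1 - PPV) / PPV) * TPR
    """
    return (base_rate / (1 - base_rate)) * ((1 - ppv) / ppv) * tpr

# Hypothetical numbers: hold precision and sensitivity identical
# across two groups, but let the groups' base rates differ.
ppv, tpr = 0.6, 0.7
fpr_a = fpr_given(0.5, ppv, tpr)    # group A: 50% base rate
fpr_b = fpr_given(0.3, ppv, tpr)    # group B: 30% base rate
print(f"group A FPR: {fpr_a:.3f}")  # ≈ 0.467
print(f"group B FPR: {fpr_b:.3f}")  # = 0.200
```

Equalizing the false positive rates instead would force the precision or the sensitivity apart across groups — which is precisely the structure of the COMPAS dispute, in which ProPublica measured error-rate balance while Northpointe measured calibration and predictive parity.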


Section 2.5: Organized AI Ethics — Institutions, Principles, and Politics (2016–2022)

The Proliferation of Principles

Between 2016 and 2020, there was an explosion of AI ethics principles documents. Technology companies, governments, academic institutions, and civil society organizations all produced frameworks articulating the values that AI systems should embody. By 2019, a systematic analysis by Anna Jobin, Marcello Ienca, and Effy Vayena, published in Nature Machine Intelligence, had identified 84 such documents from 36 countries. The analysis found apparent consensus on five principles: transparency, justice and fairness, non-maleficence, responsibility, and privacy.

The apparent consensus was, however, deceptive. Jobin and colleagues found that beneath the shared vocabulary, the documents disagreed substantially about what these principles meant, how they should be prioritized when they conflicted, who was responsible for implementing them, and what mechanisms would ensure compliance. A principle like "fairness" means different things to an organization focused on individual rights and one focused on group equity. A principle like "transparency" means something different in a commercial context, where trade secrets are at stake, than in an academic one.

Several specific documents warrant attention.

The Asilomar AI Principles (2017) were produced at a conference organized by the Future of Life Institute in Pacific Grove, California. The conference brought together AI researchers, ethicists, and policy thinkers to discuss long-term risks from advanced AI. The resulting 23 principles addressed research ethics, values, and longer-term issues related to superintelligent AI. The Asilomar Principles were influential in establishing a vocabulary for AI safety discussions, but they were criticized for focusing heavily on speculative future risks while giving less attention to present harms — the bias, surveillance, and labor issues already documented in deployed systems.

Google's AI Principles (2018) were announced in June 2018, in the context of significant internal controversy. The announcement came shortly after employee protests over Project Maven — a contract with the US Department of Defense to use Google's AI capabilities for analyzing drone footage. Several thousand Google employees had signed a letter protesting the contract, arguing that military drone applications were incompatible with Google's stated values. The publication of AI Principles was widely read as a response to this internal pressure. The principles themselves were relatively anodyne — benefiting society, avoiding bias, being safe and accountable, being transparent, protecting privacy — though they were accompanied by a list of applications Google said it would not pursue, including AI for weapons, while explicitly leaving room for other work with governments and the military.

The Google Principles episode illustrated the politics of corporate AI ethics with unusual clarity. The principles were real enough as statements of intent. But they were also a strategic communication tool, deployed in a specific political context within the company and vis-a-vis the public. The question of whether principles without enforcement mechanisms constitute genuine ethical commitment or sophisticated public relations — ethics washing — was raised explicitly by employees and commentators and has never been fully resolved.

Microsoft's Responsible AI Principles, developed and formalized over several years beginning in 2018, articulated six principles: fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability. Microsoft established internal governance structures, including an Office of Responsible AI, to help product teams implement these principles. Microsoft's approach was somewhat more operationally oriented than a pure statement of principles — it included internal tools, checklists, and review processes. Researchers who studied corporate AI ethics governance found Microsoft's internal implementation to be among the more developed, though still significantly limited.

The EU Ethics Guidelines for Trustworthy AI (2019), produced by the High-Level Expert Group on Artificial Intelligence, set out a framework for trustworthy AI with seven key requirements: human agency and oversight; technical robustness and safety; privacy and data governance; transparency; diversity, non-discrimination, and fairness; societal and environmental well-being; and accountability. The EU document was notable for its attempt to be operational rather than merely aspirational — it included an assessment list for organizations to use in evaluating their systems.

The UNESCO Recommendation on the Ethics of Artificial Intelligence (2021) was the first global normative instrument on AI ethics, adopted by all 193 UNESCO member states. It articulated 10 principles and addressed implementation across policy action areas including data governance, environment and ecosystems, gender, and education. The UNESCO Recommendation was significant for its global scope and for centering concerns that had often been secondary in Western AI ethics documents — environmental sustainability, cultural diversity, and the rights of marginalized groups — but like all non-binding international instruments, its implementation depended entirely on the political will of member states.

Ethics Washing: The Critique

The most penetrating critique of the AI ethics principles proliferation came from researchers who examined not just what the principles said, but what they did — or failed to do. Ben Green's 2021 paper "The Contestation of Tech Ethics" argued that corporate ethics initiatives functioned primarily to forestall regulation and legitimate the companies' continued operation, not to produce substantive change in how AI systems were developed and deployed.

The logic of ethics washing runs as follows: A company faces reputational or regulatory risk due to documented harms from its AI systems. Rather than changing its development or deployment practices in costly ways — which would require resources, slow down release cycles, and potentially foreclose profitable use cases — the company articulates ethics principles, establishes an ethics board or ethics team, and communicates these initiatives publicly. The reputational and regulatory pressure diminishes without the underlying practices changing substantially.

The evidence for ethics washing is difficult to evaluate precisely because the counterfactual — what would have happened without the principles documents — is unavailable. But several observable patterns are suggestive. Ethics principles documents were frequently produced by organizations that had recently faced specific ethics controversies, suggesting reactive rather than proactive motivation. Ethics teams at major technology companies were typically small relative to product teams, with limited authority over deployment decisions. Ethics principles that conflicted with profitable business models — privacy principles that conflicted with advertising revenue models, fairness principles that conflicted with efficient targeting — were typically interpreted narrowly or quietly set aside. When researchers or employees raised concerns that were substantively addressed, it often required public pressure, investigative journalism, or legal threat, not the internal ethics infrastructure that had been established.

Civil Society Organizing

Against the background of corporate ethics washing and government hesitation, civil society organizations played a critical role in holding AI systems accountable. The Algorithmic Justice League, founded by Joy Buolamwini in 2016, combined technical research — producing empirical evidence of bias in deployed systems — with advocacy for affected communities and public communication aimed at non-technical audiences. The AI Now Institute, founded by Kate Crawford and Meredith Whittaker at New York University in 2017, produced annual reports documenting AI ethics failures across sectors and making policy recommendations.

Investigative journalism was also crucial. ProPublica's COMPAS analysis, Reuters's reporting on Amazon's hiring algorithm, and The New York Times's coverage of facial recognition surveillance in New York City all required technical sophistication combined with a journalistic commitment to accountability. These investigations mattered not just for the specific harms they documented, but for creating the public awareness and political pressure without which governance responses would have been even slower.

The role of civil society and journalism in AI accountability is important for business professionals to understand. It means that the accountability infrastructure for AI systems in the period 2016–2022 was largely external to the organizations deploying those systems. The implication is not simply that external accountability is better than internal accountability, but that internal accountability structures, to be credible, must be designed with enough independence and authority to surface and act on concerns that run counter to organizational interests.


Section 2.6: Regulation Arrives — Slowly (2018–Present)

GDPR: A Regulatory Turning Point

The European Union's General Data Protection Regulation, which came into force in May 2018, represented a significant shift in the regulatory landscape for AI. GDPR was not an AI-specific regulation; it was a comprehensive data protection framework. But several of its provisions had direct relevance to AI systems: requirements for transparency about automated decision-making; rights for individuals to contest algorithmic decisions that significantly affected them; and requirements for data minimization and purpose limitation that constrained the "collect everything" approach characteristic of large AI training datasets.

More important than GDPR's specific provisions was its enforcement mechanism. Unlike the guidelines and principles documents that had characterized AI governance until then, GDPR came with real penalties: up to 4% of global annual revenue for major violations. Google was fined €50 million in France in 2019, and Amazon was fined €746 million in Luxembourg in 2021. These penalties were large enough to matter to major corporations and to signal that regulatory compliance was a genuine business risk.

GDPR's impact on AI development was contested. Some practitioners argued that its requirements for transparency and consent were difficult to reconcile with the opacity of complex AI models and the scale of data processing they required. Others argued that the regulation had driven genuine improvements in data governance practice. The debate was productive because it forced organizations to think concretely about how regulatory requirements would apply to their AI systems — a more rigorous exercise than drafting principles documents.

AI-Specific Regulation

The EU AI Act, provisionally agreed in 2023 and formally adopted in 2024, represents the first comprehensive regulatory framework specifically designed for AI systems. It takes a risk-based approach: classifying AI systems by their potential for harm and imposing requirements proportional to that risk. Systems used in biometric surveillance, critical infrastructure, education, employment, essential services, law enforcement, migration, and administration of justice are classified as high-risk and subject to extensive requirements: technical documentation, transparency, human oversight, accuracy and robustness standards, and conformity assessments before deployment.

The EU AI Act is significant as a regulatory model, but its impact will depend heavily on implementation. The technical standards that define compliance are still being developed. Enforcement mechanisms require adequately resourced national authorities. The extraterritorial reach of the regulation — how it applies to systems developed outside the EU but used within it — involves complex legal questions. And the pace of AI development raises the question of whether a regulation written in 2023 will remain adequate for AI systems deployed in 2027 or 2030.

In the United States, the Biden administration issued an Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence in October 2023. The executive order required federal agencies to develop AI governance frameworks, established reporting requirements for AI developers, and directed agencies to address AI risks in their domains. It was a significant step toward federal AI governance, but executive orders lack the durability of legislation and can be reversed by subsequent administrations.

The Capture Problem and Jurisdiction Gaps

Regulation of AI faces several structural challenges that explain why it has moved more slowly than the pace of harm would seem to warrant. The first is the capture problem: regulatory agencies that oversee technology sectors tend, over time, to become influenced by the industries they regulate. Regulators hire staff from industry, defer to industry expertise, and sometimes move on to industry jobs themselves. The result is regulatory frameworks that protect incumbents, require less of powerful companies than they might otherwise, and are slow to respond to new harms.

The second is jurisdiction gaps. AI systems are developed by companies in one country, trained on data from many countries, deployed through cloud infrastructure in several countries, and used by people everywhere. No single regulatory authority has comprehensive jurisdiction. Regulatory frameworks are national or regional; AI systems are global. Coordination among regulatory authorities across jurisdictions is possible but slow and politically difficult.

The pattern that emerges from this analysis is consistent: harm occurs. Documentation accumulates. Public outrage is generated, often through investigative journalism or advocacy research. Regulatory investigations are opened. Proposed regulations are drafted, debated, amended, and eventually adopted — often years after the initial harm documentation. The adopted regulation addresses the specific harms that generated the political pressure but may not address the next generation of harms, which are already being generated by the time the regulation takes effect.


Section 2.7: The Generative AI Turn (2022–Present)

The Mass-Market Moment

The launch of ChatGPT by OpenAI in November 2022 brought AI from a topic of specialist concern to everyday household conversation with extraordinary speed. ChatGPT reached an estimated 100 million users within two months — the fastest consumer application adoption in history at the time. Image generation systems — DALL-E, Midjourney, Stable Diffusion — saw similarly rapid adoption in the same period, and Google, Microsoft, Meta, and Amazon all accelerated their generative AI deployments.

Generative AI — systems that produce text, images, audio, video, and code in response to natural language instructions — represents a qualitative shift in AI capability and therefore in AI's ethical landscape. Earlier AI systems were largely classifiers: they took an input and produced a categorical output (this email is spam, this loan applicant is high risk, this image contains a cat). Generative AI systems are creators: they produce novel outputs that did not previously exist. This shift expands the scope of potential harm in important ways.

The capabilities that make generative AI useful for legitimate purposes — fluent natural language generation, the ability to synthesize information from many sources, the capacity to produce realistic images and audio — are the same capabilities that make it useful for harmful purposes: generating disinformation at scale, creating synthetic media of real people without their consent, automating fraud and phishing attacks, producing propaganda customized to individual psychological profiles.

New Ethical Concerns

Several new ethical concerns emerged specifically from generative AI capabilities.

Hallucination — the tendency of large language models to generate confident-sounding but factually incorrect statements — became a significant documented problem. In 2023, attorney Steven Schwartz filed a legal brief in federal court that cited several cases as legal precedents; the cases had been generated by ChatGPT and did not exist. The court sanctioned Schwartz and his firm. The incident illustrated the danger of treating generative AI output as reliable without verification — a danger that is not obvious to users accustomed to the factual reliability of internet search, and that AI developers had not adequately communicated.

Deepfakes — synthetic media that place real people in fabricated situations — had existed before the generative AI era, but the dramatic reduction in the technical skill and resources required to create them brought the technology within reach of far more actors. Non-consensual synthetic intimate images became a documented problem affecting primarily women. Synthetic audio and video of political figures making statements they never made were used in disinformation campaigns in multiple countries' elections.

Copyright and intellectual property presented a new legal and ethical challenge. Generative AI systems were trained on vast datasets of copyrighted text, images, music, and code. The legal question of whether this training constituted infringement was being litigated in courts in multiple jurisdictions as this book was written. The ethical question was in some respects clearer: artists, writers, musicians, and programmers whose work was used to train systems that could then produce outputs in their style, potentially replacing their commercial value, had not consented to this use and received no compensation.

Labor displacement concerns intensified. Unlike previous automation waves that primarily affected routine manual and clerical tasks, generative AI demonstrated impressive capabilities in knowledge work: writing, coding, legal analysis, basic financial modeling, customer service. The breadth of potential disruption was wider than any previous wave, even if the depth in any particular domain was uncertain.

Old Concerns Intensified

Generative AI also intensified concerns that had been present throughout AI history.

Bias in generative AI systems reflected and sometimes amplified biases present in training data. Large language models trained on internet text learned and reproduced the stereotypes and prejudices present in that text. Image generation systems, when prompted to generate "a doctor," produced predominantly white male images. When prompted to generate "a criminal," they produced predominantly dark-skinned images. These outputs both reflected and reinforced social stereotypes, at a scale — millions of generated images and billions of generated words — that made their aggregate social impact potentially significant.

Surveillance capabilities were enhanced by generative AI's ability to process and synthesize unstructured data at scale. A person's writing, recordings of their speech, images of their face and surroundings — all became more potent raw material for surveillance when combined with generative AI's synthesis capabilities. The surveillance capitalism model that had emerged in the early 2000s now had more powerful tools.

Manipulation at scale became technically accessible in new ways. Generative AI could produce personalized persuasive content — tailored political messages, customized advertising, individualized phishing attacks — at low marginal cost per target. The combination of large-scale behavioral data and generative AI synthesis created the infrastructure for manipulation campaigns of unprecedented sophistication and reach.

The Speed Problem

Perhaps the most fundamental ethical challenge posed by generative AI is the speed of deployment relative to governance capacity. The time from capability demonstration to mass deployment shrank dramatically in the generative AI era. Systems went from research paper to product with millions of users in months. Governance processes — regulatory development, legal interpretation, organizational policy — operate on timescales of years. The result is a persistent governance gap: by the time adequate guardrails are developed for one generation of capability, the next generation has already been deployed.

This speed problem is not accidental. It reflects deliberate competitive choices by AI developers who faced strong incentives to deploy before competitors. It reflects investment dynamics that rewarded rapid growth over careful deployment. And it reflects a cultural disposition within the technology industry that framed caution as timidity and speed as boldness.

For business professionals, the speed problem poses a specific organizational challenge: how do you make responsible deployment decisions when the external governance environment is lagging far behind the capabilities you are deploying? The answer, which this book develops across multiple chapters, involves developing internal governance capacity that does not depend on external regulation to function — ethics infrastructure that is embedded in development and deployment processes, not appended to them after the fact.


Section 2.8: Lessons from History

The Recurring Pattern

Seventy years of AI ethics history reveals a pattern consistent enough to be predictive. The pattern has five stages:

  1. Capability deployment: A new AI capability is developed and deployed, often rapidly and with inadequate testing for harms beyond the primary use case.
  2. Harm documentation: Harms emerge and are documented, typically by people who are affected, by investigative journalists, or by independent researchers — not by the deploying organization.
  3. Organizational denial or minimization: The deploying organization disputes the harm documentation, argues that the methodology is flawed, or frames the harm as an edge case rather than a systemic failure.
  4. Public pressure: Investigative journalism, advocacy organizing, and political attention create pressure for response.
  5. Partial, delayed response: Regulatory action, organizational policy changes, or technical fixes are implemented — typically addressing the specific documented harm while leaving adjacent harms unaddressed.

This pattern is not the inevitable result of bad intentions. It is the predictable result of organizational structures that externalize the costs of AI deployment, inadequate investment in pre-deployment harm assessment, and cultural norms that treat speed of deployment as a measure of success.

The Recurrence of Core Failures

Several specific failure modes recur across the history traced in this chapter. Understanding them by name makes them recognizable when they appear in new contexts.

Homogeneous development teams produce AI systems calibrated to the experiences and assumptions of their developers. When those developers do not include women, people of color, people with disabilities, or people from non-Western cultural contexts, the systems they produce will reliably fail in ways that affect those populations more than the populations represented by developers.

Training data that reflects historical inequity produces systems that perpetuate and often amplify that inequity. Any dataset that captures human decisions made in an unequal society will reflect the inequality in those decisions. Using such data to train systems that make similar decisions in the future launders historical inequality as algorithmic objectivity.

Opacity as a business strategy creates accountability gaps. When AI systems are designed so that their reasoning is not inspectable — whether for genuine technical reasons or for competitive reasons — it becomes impossible for affected parties or regulators to evaluate whether the systems are operating fairly.

Consequential deployment without adequate testing exposes vulnerable populations to harm from systems that have not been validated for their specific circumstances. Adequate testing requires diverse test populations, adversarial evaluation, and explicit attention to edge cases involving vulnerable groups.

Ethics infrastructure without authority — ethics teams, principles documents, review boards — provides the appearance of governance without the substance. Ethics infrastructure must have authority to delay or stop deployment of harmful systems; otherwise it functions as legitimacy washing rather than genuine accountability.

Why Historical Knowledge Matters

For business professionals, the history traced in this chapter is not context-setting preamble to the "real" content of AI ethics. It is the real content. The specific technical details of current AI systems will change rapidly; the organizational dynamics that produce AI ethics failures are remarkably stable. Knowing this history means being able to recognize the patterns when they appear — to ask, when an AI deployment is being proposed, what the analogous cases from history suggest about the risks.

Historical knowledge also provides a response to the most dangerous claim in AI ethics: that this time is different. The claim recurs in every phase of AI development. It was made about expert systems in the 1980s, about machine learning in the 2000s, about deep learning in the 2010s, and about generative AI in the 2020s. Sometimes the claim points to genuine novelty that deserves attention. But it is also routinely used to argue that precautions developed for earlier AI systems need not apply to current ones — that the lessons of Tay need not inform the deployment of ChatGPT, that the lessons of COMPAS need not inform the deployment of AI in hiring decisions.

The argument from historical novelty is not always wrong. But it requires evidence, not assertion. Until evidence of genuine structural novelty is produced, the prudent default is to assume that the patterns of AI ethics history are operating, and to design governance accordingly.

The Danger of "This Time Is Different"

The phrase "this time is different" has been documented by economic historians as a recurring feature of financial crises — the belief, held sincerely by sophisticated actors immediately before crises, that the conditions that produced past crises no longer applied. Carmen Reinhart and Kenneth Rogoff used it as the ironic title of their 2009 study of eight centuries of financial crises.

AI's relationship to this syndrome is direct. Each new AI capability wave has been accompanied by confident claims that the concerns raised about earlier systems do not apply. Large language models are not like biased classifiers — they are much more general, much more capable, much more aligned with human values through reinforcement learning from human feedback. Therefore, one might conclude, the lessons of COMPAS and Amazon's hiring algorithm and facial recognition bias do not apply.

The historical record suggests extreme caution about this reasoning. The mechanisms that produced bias in earlier systems — homogeneous development teams, training data that reflects historical inequity, consequential deployment without adequate testing, opacity as a business strategy, ethics infrastructure without authority — are not artifacts of specific technical approaches. They are organizational and economic dynamics that persist across technical transitions. Until an organization can demonstrate specifically how it has addressed each of these mechanisms, the historical pattern should be assumed to be operative.

This is the lesson that Tay should have taught Microsoft — and that Microsoft, to its credit, subsequently applied more rigorously. It is the lesson that the history traced in this chapter is intended to make impossible for you to ignore.


Discussion Questions

  1. Norbert Wiener published The Human Use of Human Beings in 1950, articulating concerns about automated decision-making that remain central to AI ethics today. Why do you think his warnings were not more influential in shaping how AI systems were developed and deployed over the following decades? What organizational, economic, or cultural factors explain the gap between early ethical awareness and actual practice?

  2. The AI winters demonstrated that overpromising creates governance gaps — decisions are made based on capabilities that do not exist. How do you see this dynamic operating in current AI deployment? What specific claims about current AI capabilities might prove, in retrospect, to have been overstated in ways that created analogous governance gaps?

  3. The COMPAS case demonstrated that it is mathematically impossible to simultaneously satisfy multiple intuitive definitions of fair classification when base rates differ across groups. This is a technical result with direct ethical implications. If you were advising a court system on whether to use a recidivism prediction tool, and you were told that any tool would necessarily satisfy some fairness criteria while violating others, how would you approach the decision about which fairness criteria to prioritize?

  4. The "ethics washing" critique argues that corporate AI ethics initiatives function primarily to forestall regulation rather than produce substantive change. What would genuine — rather than performative — corporate AI ethics commitment look like? How would you distinguish between the two in practice, from the outside and from the inside of an organization?

  5. The proliferation of AI ethics principles documents between 2016 and 2020 produced apparent consensus on principles like transparency, fairness, and accountability, while masking deep disagreement about what these principles mean and how to implement them. What would you need to know about an organization's AI principles document before you could evaluate whether it represented a genuine ethical commitment?

  6. Generative AI has been deployed at a speed that has dramatically outpaced governance capacity. What organizational structures or decision-making processes might allow an individual company to make more responsible deployment decisions in the absence of adequate external regulation? What are the limits of this approach?

  7. The history traced in this chapter shows that civil society organizations and investigative journalism have played a larger role in AI accountability than internal ethics infrastructure at deploying organizations. What does this suggest about the design of AI governance systems going forward? Should we expect internal governance to become more effective, or should governance design assume that external accountability will continue to be necessary?


Chapter 2 continues with Case Study 2.1: From Turing to Tay and Case Study 2.2: The Hidden Workers.