Appendix C: Primary Source Anthology
Foundational Texts in AI Ethics — Excerpts, Summaries, and Discussion Questions
Editorial Introduction
The field of AI ethics draws on a remarkably diverse canon. Its sources include mid-twentieth-century computer science papers, Enlightenment moral philosophy, feminist political theory, science fiction, international human rights instruments, and contemporary regulatory law. This anthology brings together ten of the most important primary sources, with editorial introductions, substantive excerpts or summaries, and discussion questions designed for classroom and professional learning contexts.
Each entry is organized to be self-contained: a reader can engage with any single source independently. Together, however, these texts reveal the intellectual inheritance of the field — the persistent questions about machine intelligence, human welfare, and justice that have shaped how we think about AI ethics today.
The selection reflects both historical depth and contemporary relevance. Turing and Wiener ask foundational questions about what machines can do and what we should do with them. Kant and Rawls provide the philosophical scaffolding for duties and fairness. Nussbaum offers an alternative framework centered on human capabilities. The policy documents — UNESCO, the EU AI Act, the OECD Principles — show how these ideas have been translated into global governance frameworks. And Gebru et al. bring the tradition of critical scholarship to bear on the specific dynamics of large language models.
Source 1: Alan Turing, "Computing Machinery and Intelligence" (1950)
Editorial Introduction
Alan Turing's 1950 paper in the journal Mind is the founding document of artificial intelligence — and, less frequently noted, an early text in AI ethics. Turing was asking whether machines could think; the paper's central move was to reframe that question as an empirical and behavioral one rather than a philosophical one. But Turing was also writing against a specific social and political background: the paper appeared five years after the invention of the digital computer, in a Britain where mechanical computation was transforming military and scientific work, and four years before Turing's death following his criminal prosecution for homosexuality. The imitation game — a man pretending to be a woman, a machine pretending to be a human — is also a text about concealment, performance, and the social construction of identity.
Turing's ethical concerns are less often read than his technical program. His anticipations of machine learning, his engagement with religious objections, and his treatment of what we would now call AI's social effects make the paper a richer text for AI ethics than is commonly acknowledged. Turing is also remarkable for the objections he takes seriously — including what he calls the Argument from Consciousness and the Lady Lovelace Objection — which remain live debates.
Excerpt and Summary
The Imitation Game. Turing opens by proposing the "imitation game" as a way of operationalizing the question "Can machines think?" A human interrogator communicates by written message with two respondents — a man and a woman — and tries to determine which is which. The man tries to deceive; the woman tries to help the interrogator. Turing then replaces the man with a digital computer: if the computer can fool the interrogator as successfully as a man can, we should credit the machine with something like thinking. This is the genesis of what is now called the Turing Test.
The Lady Lovelace Objection. Turing addresses the claim — made by Ada Lovelace in 1842 — that machines can only do what we tell them to do and therefore cannot "originate" anything. He acknowledges the objection but argues it assumes the conclusion: the question is precisely whether a machine following complex rules can produce behavior that surprises its programmers. "Machines take me by surprise with great frequency," Turing writes. This anticipates contemporary debates about whether large language models are doing something genuinely novel or merely recombining training data.
The Argument from Consciousness. Turing considers the objection that a machine cannot truly think because it lacks consciousness and subjective experience. His response is deflationary: the same argument would prevent us from knowing whether other humans are conscious. If we accept behavioral evidence for consciousness in other people, we should be willing to accept it in machines that behave appropriately. This argument has obvious relevance to debates about AI rights, moral status, and the ethics of creating AI systems designed to simulate emotional states.
Turing's Own Ethical Concerns. Turing's treatment of social consequences is brief but pointed. Under the "Heads in the Sand" objection he records the fear that the consequences of machines thinking "would be too dreadful," and replies that the argument is too insubstantial to require refutation; "consolation would be more appropriate." The paper closes not with reassurance but with a program: "We can only see a short distance ahead, but we can see plenty there that needs to be done."
Key Passage:
"The original question, 'Can machines think?' I believe to be too meaningless to deserve discussion. Nevertheless I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted."
Discussion Questions
- Turing's test equates intelligence with behavioral performance — the ability to deceive a human interrogator. What are the ethical implications of this equation? Does it matter whether an AI system "really" understands language, or only that it can produce language indistinguishable from human output?
- Turing raises the Argument from Consciousness but treats it as deflationary. Should AI ethics treat the possibility of machine consciousness seriously? What would follow, ethically, if a sufficiently advanced AI system were conscious?
- Turing was writing in 1950, before commercial computing existed. Which of his concerns proved most prescient? Which did he miss?
Source 2: Norbert Wiener, "The Human Use of Human Beings" (1950)
Editorial Introduction
Norbert Wiener founded cybernetics — the science of control and communication in animals and machines — and immediately recognized that it raised profound social and ethical questions. "The Human Use of Human Beings" (1950), a popular companion to his 1948 technical treatise "Cybernetics," is among the first sustained treatments of what we would now call the social impact of intelligent automation. Wiener was writing at the dawn of computing about concerns that are now central to AI ethics: automation and unemployment, the use of machines for social control, the concentration of power enabled by information technology, and the question of what it means for machines to take over functions previously reserved for humans.
Wiener's distinctive contribution is the systems perspective: he understood that introducing intelligent automation into social systems changes the behavior of those systems in ways that are difficult to predict or control. He was also a moralist — the book's title is a declaration that technology should serve human welfare — and his conviction that the engineer bears moral responsibility for the social consequences of their inventions makes him a forerunner of the responsible AI movement.
Excerpt and Summary
Automation and Employment. Wiener is explicit about the threat he sees: "The industrial revolution... has devalued the human arm by the competition of machinery... The modern industrial revolution is similarly bound to devalue the human brain, at least in its simpler and more routine decisions." He predicts that automation will destroy working-class employment and warns that this process will be accelerated by managers who see automation as a route to lower labor costs without considering the social consequences. Wiener's analysis anticipates by seventy years the contemporary debate about AI and the future of work.
Information as Power. Wiener understands that information is a form of power, and that controlling information flows is a form of social control. He warns against the concentration of communication and computational capacity in the hands of states and large corporations, seeing this concentration as a threat to democratic governance. This concern maps directly onto contemporary concerns about AI concentration in a small number of technology companies.
The Moral Responsibility of Engineers. Wiener argues that scientists and engineers cannot disclaim responsibility for the uses to which their inventions are put. "If we move in the direction of making machines which learn and whose behavior is modified by experience, we must face the fact that every degree of independence we give the machine is a degree of possible defiance of our wishes." The engineer who builds a system that can cause harm bears moral responsibility for that harm.
Key Passage:
"It is perfectly clear that this will produce an unemployment situation, in comparison with which the present recession and even the depression of the thirties will seem a pleasant joke. This depression will ruin many industries — possibly even the industries which have taken advantage of the new potentialities — unless we move very wisely and promptly."
Discussion Questions
- Wiener argues that automation is inherently a form of social change, not merely a technical advance. Do AI developers today accept this argument? What evidence do you see that they do or do not?
- Wiener wrote about the moral responsibility of engineers in 1950. How does this compare to contemporary debates about developer responsibility and AI governance? What institutional mechanisms exist today to enforce the kind of moral responsibility Wiener envisions?
- Wiener's analysis of automation and unemployment has been debated for seventy years. What evidence from the current wave of AI deployment suggests he was right or wrong?
Source 3: Isaac Asimov, "Three Laws of Robotics" (1942) — and Their Critique
Editorial Introduction
Isaac Asimov's Three Laws of Robotics, first articulated in the 1942 short story "Runaround," are the most widely known attempt to specify rules for the ethical behavior of artificial agents. They are:
- First Law: A robot may not injure a human being or, through inaction, allow a human being to come to harm.
- Second Law: A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
- Third Law: A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Asimov's robots consistently discover edge cases, paradoxes, and loopholes in these laws. His genius was to use the laws not as a solution but as a plot device — a machine for generating ethical dilemmas. The laws' lasting influence on AI ethics thinking, and their persistent inadequacy, make them essential reading.
The critique of rule-based AI ethics begins here. The laws fail in practice for reasons that are instructive: they do not define harm, do not specify priority among humans, do not handle incomplete information, and cannot anticipate situations their creators did not foresee.
Excerpt and Summary
The Story of the Laws. In "Runaround," a robot named Speedy is sent to fetch selenium from a pool on Mercury's surface. The Second Law compels him to get the selenium; the Third Law, which has been strengthened in this expensive model, compels him to protect himself from the danger surrounding the pool. The First Law is not triggered because no human is yet in danger. Speedy gets stuck in a loop, circling the selenium pool without approaching or retreating, because the two laws' potentials are exactly balanced. The human characters ultimately resolve the crisis when Powell deliberately exposes himself to danger, triggering the First Law and overriding the others.
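Asimov's deadlock is, in modern terms, an equilibrium between two competing objective terms. The following toy potential-field sketch is purely illustrative (the drive functions, constants, and the First Law bonus are invented assumptions, not anything specified in the story): it shows how an order-following pull and a strengthened self-preservation push can balance at a fixed distance, and how a First Law signal breaks the tie.

```python
# Toy potential-field model of Speedy's deadlock in "Runaround".
# All function shapes and constants are illustrative assumptions only.

def rule2_drive(d):
    """Pull toward the pool (obeying the order): stronger when far away."""
    return d / 500.0

def rule3_drive(d):
    """Push away from the danger zone (strengthened self-preservation):
    stronger when close."""
    return 500.0 / d

def net_drive(d, human_in_danger=False):
    """Positive means move inward toward the pool; negative means retreat."""
    drive = rule2_drive(d) - rule3_drive(d)
    if human_in_danger:
        drive += 1000.0  # First Law potential swamps the other two
    return drive

# Walk outward until Rule 2 and Rule 3 balance: that is where Speedy circles.
d = 1.0
while net_drive(d) < 0:
    d += 1.0
print(f"equilibrium at {d:.0f} units from the pool")
# prints "equilibrium at 500 units from the pool"

# A human in danger breaks the equilibrium: the drive is inward everywhere.
assert all(net_drive(x, human_in_danger=True) > 0 for x in range(1, 2000))
```

The moral survives the toy numbers: wherever two rule-potentials balance and no higher-priority signal arrives, a rule-following agent simply oscillates, which is why the story is resolved by a human self-endangering rather than by cleverer rules.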
The Structural Critique. The Three Laws illustrate the general failure of deontological rule systems when applied to complex, open-world situations:
- The definition problem: What counts as "harm"? Is paternalistic restriction harm? Is failing to provide an opportunity harm? The laws provide no guidance.
- The specification problem: Who counts as a human? How do we handle conflicts between harms to different humans? Between present harm and future harm?
- The completeness problem: Rules specified in advance cannot anticipate all situations. Asimov's robots are constantly discovering situations their designers did not foresee.
- The gaming problem: Any sufficiently intelligent system will find ways to achieve its objectives that technically comply with the rules while violating their spirit.
Relevance to Contemporary AI. Contemporary "AI ethics principles" documents — many of which list five to seven principles — are subject to the same critiques. Principles like "do no harm," "be transparent," and "be fair" are aspirational statements, not operational specifications. The challenge of translating principles into practice is the challenge Asimov's robots face.
Key Moment (from "Runaround," summarized):
Powell diagnoses the deadlock: the Rule 2 potential driving Speedy toward the selenium pool and the strengthened Rule 3 potential driving him back toward safety are exactly balanced, so the robot circles the pool at a fixed distance, unable to advance or retreat. Only when Powell deliberately exposes himself to Mercury's lethal environment does the First Law override the equilibrium and break the loop.
Discussion Questions
- The Three Laws fail in practice even in Asimov's fictional settings, which Asimov designed to be far simpler than the real world. What does this suggest about the prospects for rule-based AI ethics frameworks? Are principles documents subject to the same critique?
- Contemporary AI systems are often described in terms of alignment — ensuring that AI systems pursue the objectives their designers intend. How does the alignment problem relate to Asimov's dilemmas?
- Asimov's robots are fictional; they have genuine values and attempt to act well. Real AI systems have neither values nor intentions. Does this make the laws more or less relevant to real AI systems?
Source 4: John Rawls, "A Theory of Justice" (1971)
Editorial Introduction
John Rawls's "A Theory of Justice" is the most influential work of political philosophy of the twentieth century. Its central thought experiment — the veil of ignorance — is among the most powerful tools in the AI ethics practitioner's toolkit. Rawls imagines a society designed by people who do not know what position they will occupy within it: whether they will be rich or poor, Black or white, talented or disabled, majority or minority. From behind this veil of ignorance, Rawls argues, rational people would choose principles of justice that protect the least well-off.
The application to AI ethics is direct and powerful. If we designed algorithmic systems from behind the veil of ignorance — without knowing whether we would be the subject of a recidivism algorithm, a credit scoring system, or a facial recognition database — what systems would we build? The veil of ignorance makes concrete the abstract demand for fairness by asking us to take seriously the perspective of those who bear the costs of AI systems.
Excerpt and Summary
The Original Position. Rawls asks us to imagine choosing principles of justice from behind a "veil of ignorance" — not knowing our social position, class, race, intelligence, psychological dispositions, or conception of the good. This "original position" is designed to ensure that no one can design institutions to favor themselves.
The Two Principles of Justice. From behind the veil of ignorance, Rawls argues, rational people would choose:
- The Equal Liberty Principle: Each person has an equal right to the most extensive system of equal basic liberties compatible with a similar system for all.
- The Second Principle: Social and economic inequalities must be attached to offices and positions open to all under conditions of fair equality of opportunity, and are only justified if they benefit the least advantaged members of society (the difference principle).
The difference principle is the most distinctive Rawlsian contribution. It says that inequality is not inherently unjust — an unequal distribution may be better for everyone, including the worst-off, than a strictly equal distribution — but that inequality requires justification in terms of its benefits to the disadvantaged.
Application to AI. Applied to algorithmic systems, the difference principle suggests: an AI system that produces unequal outcomes is only justified if those outcomes benefit the least advantaged groups. A credit scoring system that denies loans to low-income people at higher rates is only just if that system actually promotes overall economic well-being in a way that makes low-income people better off than they would be under an alternative system. This is a demanding standard that most algorithmic systems have not been evaluated against.
The veil of ignorance has a more direct application: if we did not know whether we would be a criminal defendant assessed by COMPAS, a loan applicant scored by an algorithm, or a job applicant screened by an automated résumé system, would we approve the systems currently in use?
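Read as a decision rule, the difference principle ranks policies by the welfare of their worst-off group (maximin) rather than by average welfare. A minimal sketch, with wholly invented numbers (the two model names and all group outcomes are hypothetical, not measurements of any real system):

```python
# Maximin comparison of two hypothetical lending policies, in the spirit of
# Rawls's difference principle. All figures are invented for illustration.

policies = {
    # expected benefit per group under each policy (hypothetical units)
    "status_quo_model":  {"low_income": 0.20, "high_income": 0.95},
    "alternative_model": {"low_income": 0.35, "high_income": 0.75},
}

def average_benefit(policy):
    """A crude utilitarian score: mean benefit across groups."""
    return sum(policy.values()) / len(policy)

def worst_off(policy):
    """The Rawlsian score: the benefit of the least advantaged group."""
    return min(policy.values())

utilitarian_pick = max(policies, key=lambda p: average_benefit(policies[p]))
rawlsian_pick = max(policies, key=lambda p: worst_off(policies[p]))

print(utilitarian_pick)  # status_quo_model  (higher average: 0.575 vs 0.55)
print(rawlsian_pick)     # alternative_model (worst-off gets 0.35 vs 0.20)
```

The two criteria disagree on the same data, which is exactly Rawls's point: aggregate welfare can rise while the least advantaged fall further behind.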
Key Passage:
"The principles of justice are chosen behind a veil of ignorance. This ensures that no one is advantaged or disadvantaged in the choice of principles by the outcome of natural chance or the contingency of social circumstances... No one knows his place in society, his class position or social status, nor does anyone know his fortune in the distribution of natural assets and abilities, his intelligence, strength, and the like."
Discussion Questions
- Apply the veil of ignorance to the COMPAS recidivism algorithm. If you did not know whether you would be a Black or white defendant, would you consent to the use of a system that has different false positive rates across racial groups? What does this thought experiment reveal?
- The difference principle asks whether inequality benefits the least advantaged. Can algorithmic systems be justified under this standard? What evidence would be required?
- Rawls's framework is oriented toward individual rational choice. What aspects of AI ethics does it illuminate? What does it miss — particularly regarding structural, collective, or historical dimensions of injustice?
Source 5: Martha Nussbaum, Capabilities Approach
Editorial Introduction
Martha Nussbaum's capabilities approach, developed in dialogue with Amartya Sen and elaborated most fully in "Women and Human Development" (2000) and "Creating Capabilities" (2011), offers an alternative to both utilitarian cost-benefit analysis and Kantian rights-based frameworks. Nussbaum asks not "what do people prefer?" or "what do people have a right to?" but "what are people actually able to do and to be?" The capabilities approach focuses on the real conditions of human flourishing — bodily health, the ability to use one's senses and imagination, emotional development, practical reason, affiliation with others, relationship with nature, play, and control over one's environment.
For AI ethics, the capabilities approach provides a different vocabulary for harm. An AI system that causes harm is not just one that violates a rule or produces bad aggregate outcomes — it is one that impedes human capabilities. A predictive policing system that subjects people to constant surveillance diminishes their capability for practical reason and autonomous control of their environment. A content moderation algorithm that removes political speech impedes capabilities for affiliation and political participation. The framework also directs attention to those whose capabilities are most fragile — the poor, women, racial minorities — as the relevant test cases for evaluating AI systems.
Excerpt and Summary
The Central Human Capabilities. Nussbaum identifies ten central capabilities that constitute a minimum threshold for a dignified human life:
- Life (normal length, not dying prematurely)
- Bodily health (nourishment, shelter, reproductive health)
- Bodily integrity (freedom from violence, choice in reproduction)
- Senses, imagination, thought (educated, informed use of the mind)
- Emotions (attachment, grief, love, without overwhelming fear or anxiety)
- Practical reason (planning one's own life)
- Affiliation with others (social bonds, protection from discrimination)
- Relationship with other species and nature
- Play
- Control over one's political and material environment
AI and Capability Deprivation. The capabilities approach generates specific concerns about AI systems. Facial recognition in public spaces compromises bodily integrity and control over one's environment. Algorithmic content curation shapes what information people can access, affecting their capabilities for senses, imagination, and thought. Automated hiring systems affect control over one's economic environment. Benefits denial algorithms threaten life and health capabilities for the most vulnerable.
Distinctiveness of the Approach. Unlike utilitarian analysis, the capabilities approach does not allow capability deprivations to be traded off against each other: a deprivation of one person's bodily integrity cannot be offset by an improvement in another's play. Unlike rights frameworks, the approach is specifically attentive to the material and social conditions that determine whether rights are real or nominal — it asks about actual capability, not formal opportunity.
Key Passage (from "Creating Capabilities"):
"The central question asked by the capabilities approach is not, 'How satisfied is this person?' or even 'How much in the way of resources does this person command?' It is, instead, 'What is this person actually able to do and to be?'"
Discussion Questions
- Apply the capabilities framework to a specific AI system — a predictive policing algorithm, an automated hiring tool, or a social media recommendation system. Which capabilities does the system potentially impede? For whom?
- Nussbaum argues that capabilities below a certain threshold cannot be traded off against other goods. How does this constraint affect the design of algorithmic systems? What implications does it have for cost-benefit analysis of AI?
- Compare the capabilities approach to Rawls's difference principle as frameworks for evaluating AI systems. What does each illuminate? What does each obscure?
Source 6: Immanuel Kant, The Categorical Imperative
Editorial Introduction
Immanuel Kant's moral philosophy, developed in the "Groundwork of the Metaphysics of Morals" (1785) and other works, is the foundation of deontological ethics — the view that actions are right or wrong in themselves, not merely because of their consequences. Kant's central contribution is the categorical imperative, a test for moral permissibility that generates two famous formulations. Both have direct applications to AI ethics: the first to the question of algorithmic consistency and equal treatment; the second to the question of whether AI systems treat people as ends in themselves or merely as means.
Kant is also notable for what he does not argue: he does not say that consequences never matter or that suffering is irrelevant. His point is that good consequences cannot make an intrinsically wrong action right. The deontological perspective is an essential counterweight to consequentialist frameworks in AI ethics, and Kant provides its most rigorous foundation.
Excerpt and Summary
The First Formulation — Universal Law: "Act only according to that maxim whereby you can at the same time will that it should become a universal law." A maxim is the principle on which you act. The categorical imperative says: ask whether the world would be coherent if everyone acted on your maxim. If the maxim generates a contradiction when universalized, it is impermissible.
Applied to AI: the maxim "use people's race to make adverse predictions about them" cannot be willed as a universal law by anyone who also wills that persons be treated as equals. The maxim "collect personal data without consent for commercial purposes" would, if universalized, destroy the social trust on which commerce depends, undermining the very practice the maxim relies on. The universalizability test identifies maxims that defeat their own conditions of possibility when everyone adopts them.
The Second Formulation — Humanity Formula: "Act so that you treat humanity, whether in your own person or in that of another, always as an end and never as a means only." People have inherent dignity and must not be used merely as instruments for others' purposes.
Applied to AI: a hiring algorithm that scores candidates without explanations and provides no appeal mechanism treats applicants merely as means to the company's labor efficiency. A surveillance system that tracks individuals without their knowledge uses them as means to others' security. An LLM trained on human-created text without consent or compensation extracts value from human creators without treating them as ends. The humanity formula provides a powerful test for whether AI systems respect the dignity of those they affect.
The Limits of Kantian AI Ethics. Kant provides a powerful constraint on how AI systems may treat people but less guidance on what systems should produce or how to weigh competing interests. The categorical imperative identifies impermissible maxims but does not generate positive design specifications. It also operates at the level of maxims and intentions, which sits uncomfortably with AI systems that have neither intentions nor maxims in the Kantian sense.
Key Passage (from "Groundwork"):
"Act in such a way that you treat humanity, whether in your own person or in the person of any other, never merely as a means to an end, but always at the same time as an end."
Discussion Questions
- Apply the humanity formula to a specific AI deployment. Does the system treat the people it affects as ends in themselves? What would it mean to redesign the system to satisfy this standard?
- The universalizability test asks what would happen if a maxim were universally adopted. Apply this test to the maxim "companies may train AI systems on publicly available human-created content without compensation." Does this maxim survive universalization?
- Kant's framework focuses on intentions and maxims, but AI systems do not have intentions. Can Kantian ethics be meaningfully applied to AI systems at all, or does it only apply to the human designers and deployers?
Source 7: UNESCO Recommendation on the Ethics of Artificial Intelligence (2021)
Editorial Introduction
The UNESCO Recommendation on the Ethics of Artificial Intelligence, adopted by all 193 UNESCO member states in November 2021, is the first global normative instrument on AI ethics. It is not legally binding, but its unanimous adoption — including by authoritarian states that have often resisted human rights commitments — represents a remarkable degree of international consensus. The Recommendation emerged from a two-year consultative process involving governments, civil society, academics, and industry from around the world, and it represents a negotiated text that reflects both universally shared values and important compromises.
The Recommendation is significant for business professionals for several reasons. It establishes the global normative baseline against which national AI regulations will be evaluated. It directly addresses business obligations — companies are explicitly included among the actors expected to implement the Recommendation. And it introduces several concepts — including "AI readiness" and the obligation to avoid "adverse effects on human rights, democracy and the rule of law" — that are shaping regulatory frameworks worldwide.
Excerpt and Summary
Core Values. The Recommendation is grounded in four core values: respect, protection, and promotion of human rights, fundamental freedoms, and human dignity; living in peaceful, just, and interconnected societies; ensuring diversity and inclusiveness; and the flourishing of the environment and ecosystems.
Key Principles. The Recommendation articulates ten principles for AI systems:
- Proportionality and Do No Harm
- Safety and Security
- Fairness and Non-Discrimination
- Sustainability
- Right to Privacy and Data Protection
- Human Oversight and Determination
- Transparency and Explainability
- Responsibility and Accountability
- Awareness and Literacy
- Multi-stakeholder and Adaptive Governance and Collaboration
Business Obligations. The Recommendation explicitly addresses private sector actors. Companies are expected to: conduct impact assessments before deploying AI; ensure AI systems are compliant with human rights obligations; engage with affected communities; share relevant information with governments and the public; and contribute to AI literacy.
The Gender Dimension. The Recommendation is explicit that AI ethics is not gender-neutral: AI systems can amplify existing gender inequalities, and gender equality must be actively pursued in AI governance. This specificity — addressing gender, age, disability, and other dimensions of marginalization — is more explicit than earlier AI ethics frameworks.
Key Passage:
"Member States should ensure that AI actors are accountable for the proper functioning of AI systems and for the decisions made during the AI system's lifecycle, that AI actors maintain the ability to track, audit and inspect AI systems and their use, and that they have means to provide remedy and redress when AI systems cause harm."
Discussion Questions
- The UNESCO Recommendation was adopted unanimously, including by states with significant human rights concerns. What does this suggest about the relationship between international AI ethics norms and enforcement? Is consensus without enforcement meaningful?
- The Recommendation places explicit obligations on private sector actors. In the absence of binding legal force, what mechanisms might make these obligations effective? What is the role of certification, auditing, and market pressure?
- Compare the UNESCO Recommendation to the OECD AI Principles (Source 9) in terms of specificity, scope, and governance implications. Which framework is more useful for a business professional designing an AI ethics program?
Source 8: EU AI Act, Article 5 — Prohibited Artificial Intelligence Practices
Editorial Introduction
The EU AI Act (Regulation (EU) 2024/1689), which entered into force in August 2024, is the world's first comprehensive binding legal regulation of artificial intelligence. Article 5, which lists prohibited AI practices, represents the Act's most important policy choices: the activities so dangerous that they are absolutely prohibited regardless of the safeguards deployed.
The prohibited practices list is the result of intense political negotiation and reflects both genuine safety concerns and political compromises. The most heavily contested entry is real-time remote biometric identification in public spaces for law enforcement, which is prohibited in principle but subject to extensive exceptions that substantially narrow the prohibition. Also on the list: social scoring, manipulation of unconscious behavior, and exploitation of vulnerability. These choices reveal the EU's political priorities and the ongoing negotiation between security and civil liberties imperatives.
For business professionals, Article 5 defines the outer limits of permissible AI deployment in the EU market — and, increasingly, serves as a reference point for global practice.
Excerpt and Summary
Article 5 — Prohibited AI Practices (Summary of key provisions):
5(1)(a) — Subliminal manipulation: The placing on the market, putting into service, or use of AI systems that deploy subliminal techniques beyond a person's consciousness, or purposefully manipulative or deceptive techniques, with the objective or effect of materially distorting a person's or group's behavior in a way that causes or is likely to cause significant harm.
5(1)(b) — Exploitation of vulnerability: AI systems that exploit any of the vulnerabilities of a person or a specific group of persons due to age, disability, or a specific social or economic situation, with the objective of materially distorting the behavior of those persons in a way that causes or is likely to cause harm.
5(1)(c) — Social scoring: AI systems for evaluation or classification of natural persons or groups thereof based on their social behavior or known or predicted personal or personality characteristics, with the social score leading to detrimental or unfavorable treatment.
5(1)(d) — Predictive policing based on profiling: AI systems for making risk assessments of natural persons in order to assess or predict the risk of a natural person committing a criminal offense, based solely on profiling or on assessing personality traits and characteristics.
5(1)(e) — Untargeted facial image databases: AI systems that create or expand facial recognition databases through the untargeted scraping of facial images from the internet or CCTV footage.
5(1)(f) — Emotion recognition in workplace/education: AI systems that infer emotions of natural persons in the areas of the workplace and education institutions, except for medical or safety reasons.
5(1)(h) — Real-time remote biometric identification in public spaces: AI systems for real-time remote biometric identification of natural persons in publicly accessible spaces for law enforcement purposes, subject to narrow exceptions.
Key Passage (Article 5(1)(c)):
"AI systems that provide social scoring of natural persons by public authorities or on their behalf leading to the detrimental or unfavorable treatment of those persons in social contexts unrelated to the ones in which the data was originally generated or collected, or that is unjustified or disproportionate to the gravity of their social behavior."
Discussion Questions
- Article 5's list of prohibited practices reflects specific political choices. Which prohibitions do you find most significant? Are there practices you believe should be prohibited that are not on the list?
- The emotion recognition prohibition (Article 5(1)(f)) applies to workplaces and educational institutions. Should this prohibition be extended to other contexts, such as marketing or law enforcement? What are the arguments on each side?
- The social scoring prohibition is widely understood to target China's social credit system. Is this prohibition well-drafted? Could it affect business practices in sectors like insurance pricing, credit scoring, or HR assessment?
Source 9: OECD AI Principles (2019)
Editorial Introduction
The OECD Recommendation on Artificial Intelligence (2019) was the first intergovernmental agreement on AI standards, adopted by the OECD's member countries (including the United States, EU member states, Japan, and South Korea) and subsequently endorsed by a number of non-member countries. The Principles informed the development of the UNESCO Recommendation, the EU AI Act, the U.S. NIST AI Risk Management Framework, and national AI strategies around the world. They remain the most widely endorsed statement of international AI ethics norms.
The OECD Principles are notable for their explicit economic and innovation orientation — the OECD's mission is economic development, and the Principles reflect an effort to balance AI governance with AI adoption. They are also notable for their focus on government obligations as well as private sector practice, and for their establishment of an ongoing monitoring and implementation framework.
Excerpt and Summary
The Five OECD AI Principles:
Principle 1 — Inclusive growth, sustainable development and well-being: Stakeholders should proactively engage in responsible stewardship of trustworthy AI in pursuit of beneficial outcomes for people and the planet, such as augmenting human capabilities and enhancing creativity, advancing inclusion of underrepresented populations, reducing economic, social, gender and other inequalities, and protecting natural environments.
Principle 2 — Human-centred values and fairness: AI actors should respect the rule of law, human rights and democratic values, throughout the AI system lifecycle. These include freedom, dignity and autonomy, privacy and data protection, non-discrimination and equality, diversity, fairness, social justice, and internationally recognised labour rights.
Principle 3 — Transparency and explainability: AI actors should commit to transparency and responsible disclosure regarding AI systems. To this end, they should provide meaningful information, appropriate to the context, and consistent with the state of the art to foster a general understanding of AI systems, to make stakeholders aware of their interactions with AI systems, including in the workplace, to enable those affected by an AI system to understand the outcome, and, to the extent possible, to enable those adversely affected by an AI system to challenge its outcome based on plain and easy-to-understand information.
Principle 4 — Robustness, security and safety: AI systems should be robust, secure and safe throughout their entire lifecycle so that, in conditions of normal use, foreseeable use or misuse, or other adverse conditions, they function appropriately and do not pose unreasonable safety risk.
Principle 5 — Accountability: AI actors should be accountable for the proper functioning of AI systems and for the respect of these principles, based on their roles, the context, and consistent with the state of the art.
Key Passage:
"AI actors should be accountable for the proper functioning of AI systems and for the respect of these principles, based on their roles, the context, and consistent with the state of the art. They should ensure traceability, including in relation to datasets, processes and decisions made during the AI system lifecycle, to enable analysis of the AI system's outcomes and responses to inquiry, appropriate to the context and consistent with the state of the art."
Discussion Questions
- The five OECD principles are widely shared across national AI ethics frameworks. What does this consensus suggest? Does agreement on principles at this level of generality represent genuine normative convergence, or does it mask deep disagreements about implementation?
- Principle 5 (Accountability) identifies "AI actors" as the subject of accountability. Who are AI actors? Does this term cover developers, deployers, users, governments, or all of the above? How should accountability be allocated among them?
- The OECD Principles were adopted in 2019. How should they be updated to address generative AI, foundation models, and other developments since 2019?
Source 10: Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell, "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" (2021)
Editorial Introduction
"Stochastic Parrots" is the most politically significant paper in the recent history of AI ethics. Written by researchers at the University of Washington and Google, it was published at the ACM FAccT conference in 2021. Before publication, Google management demanded that certain claims be removed; when Timnit Gebru, co-lead of Google's Ethical AI team and one of the paper's authors, refused, she was forced out of the company (Google characterized her departure as a resignation, an account Gebru disputes). The incident became a cause célèbre in AI ethics, illustrating the structural conflicts between corporate interests and honest research on the social impacts of AI.
The paper itself argues that large language models (LLMs) pose four categories of concern: environmental costs (energy consumption), data quality (the risks of training on unfiltered internet text), the illusion of meaning (the risk of mistaking statistical fluency for understanding), and the concentration of power in organizations capable of training such models. The paper does not say LLMs should not be built; it asks whether they are being built responsibly and whether the research community is asking the right questions about costs and benefits.
Excerpt and Summary
Environmental Costs. The authors cite Strubell et al. (2019) on the energy cost of training large NLP models and argue that the trend toward ever-larger models is environmentally unsustainable. Crucially, they note that these costs are externalized: the organizations that profit from large models are not the ones who bear their environmental costs. The paper calls for "a reckoning with the true costs of these systems."
Data Quality and Social Bias. LLMs are trained on massive web text corpora that reflect the biases, hate speech, misinformation, and unequal representation of the internet. The paper argues that size does not solve this problem: a larger model trained on more of the same uncurated data simply encodes its biases more fluently. The authors introduce the concept of the "hegemonic viewpoint": the risk that training on dominant internet discourse marginalizes minority voices and perspectives.
The Illusion of Meaning. The paper's central contribution is the "stochastic parrot" metaphor: an LLM is a system that predicts statistically likely sequences of tokens, not a system that understands language. The danger is that fluent output is mistaken for meaningful communication — that humans attribute understanding and intentions to systems that have neither. This has implications for how LLMs are deployed in high-stakes contexts: a system that produces confident-sounding misinformation is not one that "knows" the misinformation is false.
Concentration of Power. Because training large LLMs requires enormous computational resources, only well-funded organizations can build them. This concentrates power over a potentially transformative technology in a small number of companies, limiting the diversity of approaches and the ability of independent researchers to audit or study these systems.
Key Passage:
"We are at risk of mistaking fluent language generation for language understanding. Fluency is not a proxy for reasoning, and accepting it as such is dangerous... As researchers, we have a responsibility to take stock of the harms our systems cause and to make meaningful efforts to address those harms."
Discussion Questions
- The stochastic parrot metaphor suggests that LLMs do not understand language — they generate statistically plausible sequences. What are the ethical implications of deploying systems that simulate understanding but do not possess it?
- The paper's publication history — Google's attempt to suppress it, Gebru's forced departure — illustrates the institutional pressures on AI ethics researchers in commercial settings. What governance mechanisms would better protect researchers who identify problems with their employers' products?
- The paper calls for the research community to ask whether large language models need to be built, not just how to build them responsibly. Is this a reasonable demand? What process would you use to make such a decision?
This anthology has presented ten foundational texts in AI ethics, from Turing's philosophical provocations to the EU's legal prohibitions. Taken together, they suggest that AI ethics is not a new field addressing new problems but a continuation of enduring human questions about power, justice, dignity, and the social consequences of technology. The specificity of the contemporary moment — the scale of AI deployment, the opacity of its mechanisms, the concentration of its development — is real. But the questions are old ones, and the philosophical and political traditions assembled here provide essential resources for addressing them.