Learning Objectives
- Trace major eras of AI development from 1950s to present
- Explain AI winter cycles and why they happened
- Identify key breakthroughs and their significance
- Connect historical patterns to current AI hype
- Evaluate whether current excitement is "different this time"
In This Chapter
- Overview
- 2.1 The Dream of Thinking Machines (1940s–1950s)
- 2.2 The Golden Age and the First Winter (1960s–1970s)
- 2.3 Expert Systems and the Second Winter (1980s–1990s)
- 2.4 The Data Revolution and Deep Learning (2000s–2010s)
- 2.5 The Transformer Era (2017–Present)
- 2.6 Patterns in the History: What the Cycles Teach Us
- 2.7 Chapter Summary
- AI Audit Project Checkpoint
- Spaced Review
- What's Next
"We can only see a short distance ahead, but we can see plenty there that needs to be done." — Alan Turing, "Computing Machinery and Intelligence" (1950)
Overview
In Chapter 1, we wrestled with the question of what artificial intelligence actually is. You may have noticed that the answer keeps shifting — what counts as "AI" today would have been science fiction in 1950, and what was considered cutting-edge AI in 1985 now looks like a glorified spreadsheet. That constant redefinition isn't a bug in how we talk about AI. It's a feature of the story itself.
This chapter walks you through the history of artificial intelligence, from the first audacious proposals to build thinking machines through multiple cycles of breathless optimism and crushing disappointment, all the way to the transformer architectures powering today's AI systems. But this isn't just a timeline to memorize. The history of AI is a case study in how technology actually develops — not in a smooth upward curve, but in lurching bursts of excitement, overreach, and painful correction.
Understanding this history gives you something powerful: a framework for evaluating the AI claims you encounter today. When someone tells you that artificial general intelligence is five years away, or that AI will replace all jobs by 2030, you'll know that similar predictions have been made — and broken — before. You'll also know why this moment might genuinely be different in some ways.
Learning Paths
All readers: Sections 2.1 through 2.7 — the full historical arc is essential context for everything that follows.
If you're short on time: Focus on Sections 2.1, 2.4, 2.5, and 2.6. You'll get the bookends (where AI started, where it is now) and the pattern recognition that matters most.
If you're hungry for more: The primary source box in Section 2.1 invites you to read Turing's original 1950 paper, which is surprisingly accessible and entertaining.
2.1 The Dream of Thinking Machines (1940s–1950s)
The idea that we might build machines that think is older than computers themselves. But the story of AI as a real field — with researchers, funding, and a name — begins in a remarkably compressed period in the mid-twentieth century.
The Intellectual Groundwork
In 1943, Warren McCulloch and Walter Pitts published a paper proposing that neurons in the brain could be modeled as simple on-off switches, and that networks of these switches could, in principle, compute anything computable. They weren't building anything yet — this was pure theory. But they planted a seed: the brain might be understandable as a kind of machine, and machines might be buildable as a kind of brain.
Meanwhile, the world had just witnessed the power of actual computing machines. During World War II, Alan Turing and his colleagues at Bletchley Park had used electromechanical devices to crack the Enigma code. Across the Atlantic, ENIAC — one of the first general-purpose electronic computers — was being built at the University of Pennsylvania. For the first time, machines could do things that previously required human calculation.
Turing's Question
In 1950, Alan Turing published what may be the most important paper in AI's history: "Computing Machinery and Intelligence." Rather than getting bogged down in the philosophical question "Can machines think?" — which he considered too vague to be useful — Turing proposed a practical test. Imagine a human judge having a text conversation with two entities hidden behind a screen: one human and one machine. If the judge can't reliably tell which is which, Turing argued, we should be willing to say the machine is exhibiting intelligent behavior.
Primary Source Spotlight
Turing's 1950 paper is remarkably readable for a seventy-five-year-old academic article. He anticipated — and responded to — objections that people still raise today: "Machines can't be creative," "Machines don't have consciousness," "Machines can only do what we program them to do." His responses are witty, rigorous, and sometimes surprising. The paper is freely available online, and it's worth reading at least the first few pages. You'll find that our current debates about AI echo Turing's conversation almost word for word.
This Turing Test (a name Turing himself never used — he called it "the imitation game") became one of the most famous thought experiments in computing. It also set up a tension that runs through the entire history of AI: is the goal to simulate human behavior, or to actually replicate human understanding? We'll see this question resurface in almost every era.
The Dartmouth Conference
In the summer of 1956, a small group of researchers gathered at Dartmouth College in Hanover, New Hampshire, for a workshop that would give the field its name. John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon — names that would become legendary in computing — proposed the workshop with a stunningly confident claim:
"Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it."
Read that again. They weren't saying machines might someday approximate a few aspects of intelligence. They were claiming that every feature of intelligence could be precisely described and then simulated. This was not cautious optimism. This was a declaration of war on the unknown.
The 1956 Dartmouth workshop didn't produce any breakthroughs. What it produced was something arguably more important: a community. Researchers who had been working in isolation on logic, neural networks, language, and problem-solving realized they were working on the same fundamental project. They needed a name for it, and McCarthy's suggestion stuck: artificial intelligence.
Check Your Understanding
Before reading on, consider: The Dartmouth proposal claimed every aspect of intelligence could be precisely described and simulated. What assumptions does that claim rest on? What might it be leaving out?
2.2 The Golden Age and the First Winter (1960s–1970s)
The decade following Dartmouth was electric with optimism. Early AI programs seemed to prove that machines really could think — or at least do things that looked a lot like thinking.
Early Triumphs
In 1966, Joseph Weizenbaum at MIT created ELIZA, a program that simulated a psychotherapist by using pattern matching to turn a user's statements into questions. If you typed "I'm feeling sad," ELIZA might respond, "Why do you say you are feeling sad?" It was, by modern standards, absurdly simple. But people who interacted with ELIZA found themselves pouring out their feelings to it, sometimes insisting it truly understood them — even after Weizenbaum explained how it worked.
This was an early brush with a phenomenon we'll encounter throughout this book: humans are remarkably willing to attribute understanding to systems that produce the right outputs, regardless of what's happening internally. Recall from Chapter 1 the distinction between a system that can do something and a system that understands what it's doing. ELIZA could do therapy-sounding conversation. It understood nothing.
Other programs from this period were more substantive. The General Problem Solver (GPS), developed by Herbert Simon and Allen Newell, could solve logic puzzles and simple mathematical proofs. SHRDLU, created by Terry Winograd at MIT, could understand and respond to English commands about a simulated world of colored blocks — "Pick up the red block and put it on the blue one."
The predictions flowed freely. In 1965, Herbert Simon declared: "Machines will be capable, within twenty years, of doing any work a man can do." In 1967, Marvin Minsky predicted that within a generation, "the problem of creating 'artificial intelligence' will substantially be solved."
The Approach: Symbolic AI
Most AI research in this era followed an approach now called symbolic AI (sometimes called "Good Old-Fashioned AI" or GOFAI). The basic idea was elegant: represent knowledge as symbols and rules, then manipulate those symbols logically. If you want a machine to reason about birds, you give it rules: "Birds have feathers," "Birds can fly," "A robin is a bird." The machine applies logical deduction: "A robin has feathers. A robin can fly."
This approach had a beautiful clarity to it. The reasoning was transparent — you could follow exactly why the machine reached its conclusions. And for well-defined problems in neat, bounded domains, it worked brilliantly.
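To make the bird example concrete, here is a minimal sketch of symbolic-AI-style forward chaining. The facts, the rule, and all the names are invented for illustration; real systems of the era were far more elaborate, but the core move — matching patterns against facts and deriving new facts by rule — looked like this:

```python
# Facts and rules are plain symbols. A rule says: IF something is a bird,
# THEN it has feathers and can fly. "?x" is a variable the rule binds.
facts = {("is_a", "robin", "bird")}
rules = [
    (("is_a", "?x", "bird"), [("has", "?x", "feathers"), ("can", "?x", "fly")]),
]

def forward_chain(facts, rules):
    """Repeatedly apply every rule until no new facts can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (rel, _var, obj), conclusions in rules:
            for (f_rel, f_subj, f_obj) in list(derived):
                if f_rel == rel and f_obj == obj:  # pattern matches; bind ?x
                    for (c_rel, _x, c_obj) in conclusions:
                        new_fact = (c_rel, f_subj, c_obj)
                        if new_fact not in derived:
                            derived.add(new_fact)
                            changed = True
    return derived

conclusions = forward_chain(facts, rules)
# Every derived fact — "robin has feathers", "robin can fly" — traces
# back to an explicit rule and an explicit starting fact.
```

That traceability is exactly the transparency described above: you can read off the reasoning line by line.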
The trouble was the real world.
The Wall
By the early 1970s, researchers began hitting problems that refused to yield. The block-moving program SHRDLU worked beautifully in its tiny simulated world, but that world contained exactly the kinds of objects and relationships it had been programmed to understand. Ask it about anything outside that world and it was helpless. Natural language understanding turned out to be far harder than expected — not because language is complex (though it is), but because understanding language requires understanding the world that language describes.
Machine translation was a vivid example. In the 1950s and 1960s, the U.S. government had poured money into automatic translation of Russian, expecting quick results. The famous (possibly apocryphal) story goes that a system translated "The spirit is willing but the flesh is weak" into Russian and back, producing "The vodka is good but the meat is rotten." Whether or not that specific example is real, the underlying problem was genuine: translation requires understanding context, culture, and meaning, not just swapping words.
In 1973, the British government commissioned a report from mathematician James Lighthill to evaluate AI research. The Lighthill Report was devastating. It concluded that AI had failed to deliver on its grand promises and that most of its successes were limited to toy problems that wouldn't scale to real-world complexity. The report led to severe funding cuts for AI research in the UK.
The First AI Winter
The Lighthill Report wasn't the only cause, but it was a crystallizing moment. By the mid-1970s, the field of AI entered what would later be called its first AI winter — a period of reduced funding, diminished expectations, and widespread skepticism. The term captures both the coldness of the funding climate and the sense of dormancy. AI hadn't died, but it had been forced to hibernate.
What had gone wrong? Looking back, we can identify a pattern that would repeat:
- Overpromise: Researchers made bold predictions based on early successes
- Oversimplify: They underestimated the gap between solving toy problems and real-world ones
- Overspend: Funders invested based on the promises
- Underdeliver: When results didn't materialize at the promised pace, disappointment was severe
- Overcorrect: Funding dried up even for the promising work that was making genuine progress
This pattern — the AI hype cycle — will repeat. Watch for it.
Retrieval Practice
Pause for a moment. In Chapter 1, we discussed several common misconceptions about AI. How does the ELIZA story illustrate one of those misconceptions? Which misconception, specifically?
2.3 Expert Systems and the Second Winter (1980s–1990s)
AI's first winter eventually thawed, and the reason was money — specifically, the money to be made by putting AI research into commercial products.
The Rise of Expert Systems
In the late 1970s, a new approach gained traction: expert systems. The idea was pragmatic. Instead of trying to build general intelligence, why not capture the knowledge of human experts in specific domains and encode it as a set of rules that a computer could apply?
The landmark system was MYCIN, developed at Stanford in the early 1970s. MYCIN could diagnose bacterial infections and recommend antibiotics. It worked by applying roughly 600 rules of the form "IF the patient has fever AND the culture shows gram-positive cocci, THEN consider staphylococcal infection." In controlled tests, MYCIN performed as well as or better than many human physicians.
This was different from the dreamy ambitions of the 1960s. Expert systems didn't claim to think. They claimed to be useful. And for a while, they were spectacularly successful — at least commercially. Companies like Digital Equipment Corporation (DEC) deployed expert systems to configure computer orders, reportedly saving millions of dollars annually. A company called IntelliCorp went public in 1983 and saw its stock price soar. The Japanese government launched the Fifth Generation Computer Project, a $400 million effort to build AI-powered supercomputers. Suddenly, AI was a business.
The process of building an expert system required knowledge engineering — a painstaking process where a specialist (the "knowledge engineer") would interview domain experts, extract their rules and heuristics, and encode them into the system. This process was laborious but systematic.
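The product of all that knowledge engineering was, at bottom, a large catalog of IF-THEN rules with attached confidence values. Here is a toy sketch in that MYCIN-like style — the rules, findings, and certainty numbers below are invented for illustration and are not taken from the real MYCIN system:

```python
# Each rule: if ALL conditions are present, assert a conclusion with a
# certainty value. Everything here is illustrative, not medical advice.
rules = [
    {
        "conditions": {"fever", "gram_positive_cocci"},
        "conclusion": "consider staphylococcal infection",
        "certainty": 0.7,
    },
    {
        "conditions": {"fever", "gram_negative_rods"},
        "conclusion": "consider E. coli infection",
        "certainty": 0.6,
    },
]

def diagnose(findings):
    """Fire every rule whose conditions all appear in the findings,
    keeping the rule itself so each conclusion can be traced back."""
    conclusions = []
    for rule in rules:
        if rule["conditions"] <= findings:  # subset test: all conditions met
            conclusions.append((rule["conclusion"], rule["certainty"], rule))
    return conclusions

results = diagnose({"fever", "gram_positive_cocci"})
# Transparent: each result carries the exact rule that produced it.
# Brittle: findings that match no rule produce no conclusion at all.
```

The two comments at the end preview the trade-off the rest of this section develops: perfect traceability, purchased at the price of brittleness.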
Think about this in terms of our anchor examples
Consider what an expert system version of MedAssist AI might have looked like in 1985. A knowledge engineer would sit down with radiologists for months, extracting rules: "If the shadow on the X-ray is circular and greater than 3 centimeters and the patient is over 50 and a smoker, consider lung cancer with probability 0.7." Hundreds or thousands of such rules, carefully coded. The system would be transparent — you could trace every conclusion back to specific rules. But it would also be brittle. What happens when a case doesn't fit neatly into any rule? What about the subtle visual patterns that experts recognize but can't articulate as explicit rules?
Today's MedAssist AI works fundamentally differently, as we'll see in the next two sections. But the expert system approach highlights a tension that persists: transparency versus capability. The expert system tells you exactly why it made its diagnosis. The modern deep learning system may be far more accurate, but explaining why it flagged a particular scan is much harder.
The Collapse
By the late 1980s, the expert systems boom was unraveling. The systems were expensive to build and even more expensive to maintain. Every time the domain changed — new drugs, new regulations, new products — the rules had to be manually updated. They were brittle: they could handle cases they'd been explicitly programmed for, but unexpected situations produced nonsensical results. And the hardware they required was expensive and proprietary, just as cheaper, more flexible desktop computers were flooding the market.
The Japanese Fifth Generation Project ended in 1992 without achieving its goals. Companies that had invested heavily in AI expert systems wrote off their investments. The AI industry, which had reached a billion dollars annually by 1985, collapsed. The second AI winter had arrived.
Check Your Understanding
Compare the two AI winters. What was similar about the patterns that led to each one? What was different about the specific technologies involved?
What Survived the Winters
Here's something that often gets lost in the dramatic narrative of boom and bust: important research continued during both AI winters. Statistical methods for natural language processing advanced steadily. Probabilistic approaches to reasoning under uncertainty were developed. And a mathematical technique called backpropagation — a method for training neural networks by adjusting their internal connections based on errors — was refined and popularized in the 1980s, even as expert systems grabbed all the headlines.
These quieter lines of research would eventually converge to create the modern AI landscape. The winters killed hype. They didn't kill progress.
2.4 The Data Revolution and Deep Learning (2000s–2010s)
The current era of AI didn't emerge from a single breakthrough. It emerged from a convergence of three forces that had been building for decades: more data, more computing power, and better algorithms. Each was necessary. None was sufficient alone.
The Data Explosion
The internet changed everything. By the mid-2000s, the amount of digital data being generated was growing exponentially. Every Google search, every Facebook photo, every Amazon purchase created data. Suddenly, researchers didn't have to painstakingly hand-craft training datasets. The world was generating training data constantly, at a scale that would have been unimaginable in 1990.
This mattered because the statistical approaches to AI that had been developing quietly during the winters were data-hungry. Symbolic AI needed rules written by humans. Machine learning — the approach that would come to dominate — needed examples. Lots and lots of examples.
The Computing Revolution
At the same time, computing power was catching up with the ambitions of AI researchers. Moore's Law — the observation that computing power roughly doubles every two years — had been steadily delivering faster processors for decades. But the real game-changer was an unexpected one: graphics cards.
Graphics Processing Units (GPUs), originally designed to render video game graphics, turned out to be almost perfectly suited for the kind of parallel mathematical operations that neural networks require. In 2012, a team from the University of Toronto used GPUs to train a deep learning model called AlexNet that dramatically outperformed all other approaches in a major image recognition competition called ImageNet. AlexNet didn't just win. It won by a margin that stunned the field.
What Is Deep Learning?
We'll explore how machine learning works in detail in Chapter 3. But here's the historical context you need now.
A neural network is a computing system inspired (loosely) by the structure of the brain. It consists of layers of interconnected nodes that process information. "Deep" learning just means neural networks with many layers — sometimes dozens or even hundreds. Each layer extracts progressively more abstract features from the data. In image recognition, for instance, early layers might detect edges and colors, middle layers might identify shapes and textures, and later layers might recognize objects like faces or cars.
The key insight of deep learning is that the network learns its own features. Unlike expert systems, where humans had to manually specify what to look for, a deep learning system discovers what's important in the data on its own. This is enormously powerful — but it also means the system's reasoning becomes opaque. The network might become excellent at identifying cancerous tumors in medical images, but explaining which features it's using and why is genuinely difficult.
Retrieval Practice
Think back to the symbolic AI approach from Section 2.2. In what specific way is deep learning the opposite of symbolic AI when it comes to how knowledge is represented?
Landmarks of the Deep Learning Era
The period from 2012 to 2017 produced a series of achievements that steadily eroded the boundary between "things only humans can do" and "things machines can also do":
- 2014: Generative Adversarial Networks (GANs) learned to generate realistic images by pitting two neural networks against each other
- 2015: A deep learning system achieved superhuman performance on the ImageNet visual recognition challenge
- 2016: Google DeepMind's AlphaGo defeated Lee Sedol, one of the world's best Go players — Go being a game so complex that brute-force search (the approach that had conquered chess) was considered impossible. AlphaGo used deep learning combined with reinforcement learning
- 2017: A paper titled "Attention Is All You Need" introduced the transformer architecture. This may be the single most consequential AI paper of the twenty-first century
Each of these achievements was met with a mix of genuine astonishment and overheated predictions. After AlphaGo, headlines declared that AI had conquered the last bastion of human strategic thinking. They neglected to mention that AlphaGo could do exactly one thing: play Go. It couldn't play checkers, hold a conversation, or make a sandwich.
The CityScope Connection
This era is also when systems like CityScope Predict became conceivable. Predictive policing tools emerged in the 2010s, powered by the same machine learning approaches that were recognizing images and playing Go. The pitch was seductive: feed historical crime data into a deep learning system, and it will predict where and when crimes are likely to occur, allowing police to deploy resources more efficiently.
The historical irony should be apparent. Historical crime data reflects historical policing patterns. If police concentrated their presence in certain neighborhoods — whether due to genuine crime rates, racial profiling, or a mix of both — then the data would show more crime in those neighborhoods. Feed that data to a machine learning system, and it will obligingly predict more crime in those same neighborhoods, justifying continued concentrated policing. The machine isn't finding truth in the data. It's finding patterns, and patterns can encode decades of institutional bias.
We'll explore this deeply in Part IV. For now, notice how the historical context matters: understanding when and how these tools were developed helps you evaluate the claims made about them.
2.5 The Transformer Era (2017–Present)
The 2017 paper "Attention Is All You Need," published by researchers at Google, introduced the transformer — an architecture that would reshape the entire AI landscape within just a few years.
What Made Transformers Different
Previous approaches to processing language treated words sequentially — reading text one word at a time, like a person reading aloud. Transformers introduced an "attention mechanism" that allows the model to consider all parts of the input simultaneously, weighing which words are most relevant to each other regardless of their position in the sentence. This made transformers dramatically faster to train and better at capturing long-range relationships in text.
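The "all positions at once" idea can be sketched in a few lines. This is a deliberately stripped-down version of the scaled dot-product attention from "Attention Is All You Need" — real transformers add learned query, key, and value projections, multiple attention heads, and much more, and the toy word vectors below are arbitrary:

```python
import numpy as np

def attention(x):
    """x: (sequence_length, d) array of word vectors.
    Returns each position rewritten as a relevance-weighted mix of ALL
    positions — no left-to-right, one-word-at-a-time scan involved."""
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)                   # pairwise relevance scores
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # softmax: rows sum to 1
    return weights @ x                              # blend of the whole sequence

words = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # three toy "words"
out = attention(words)
# out has the same shape as the input, but each row now mixes in
# information from every position, weighted by similarity.
```

The practical payoff is the one named above: because every position is processed in parallel rather than sequentially, training scales to enormous datasets.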
The details of how attention works are genuinely interesting, and we'll touch on them in Chapter 3. What matters historically is this: transformers made it practical to train language models on enormous datasets, producing models with billions of parameters (the internal settings that the model adjusts during training).
The Rise of Large Language Models
The result was a new class of AI system: the large language model (LLM). OpenAI's GPT-2 (2019) generated text that was often indistinguishable from human writing. GPT-3 (2020) could write essays, answer questions, translate languages, and even write code — all without being explicitly trained for any of those tasks. Google's BERT transformed how search engines understand queries. And in late 2022, ChatGPT brought large language models into mainstream public awareness almost overnight.
The pace of development has been genuinely extraordinary. Consider the timeline: from "Attention Is All You Need" in 2017 to ChatGPT in 2022 is just five years. From ChatGPT to multimodal models that can process text, images, audio, and video is roughly two more years. Each new generation of models has been substantially more capable than the last.
Priya's World
This is the landscape that Priya — our recurring student character — inhabits. When Priya sits down to write a paper and considers using an AI writing tool, she's interacting with the product of this specific historical moment: decades of failed approaches, two AI winters, the data explosion, the deep learning revolution, and the transformer breakthrough all compressed into a tool that produces fluent, authoritative-sounding text.
Knowing this history changes how Priya might think about the tool. It produces text that sounds knowledgeable because it has processed enormous amounts of human-written text and learned to predict what words typically follow what other words. It hasn't understood the material the way Priya understands it when she reads her textbook and thinks critically about it. The system is doing something unprecedented and genuinely impressive. It is also doing something fundamentally different from understanding.
Check Your Understanding
Why does the distinction between "processing patterns in text" and "understanding meaning" matter practically? Think of a specific situation where this distinction would lead to different outcomes.
The Current Moment
As of the writing of this textbook, AI development is moving at a pace that makes any specific claims about current capabilities risky — they may be outdated by the time you read this. But several broad observations seem durable:
- Scale is the driving force. Current AI progress is heavily driven by making models larger, training them on more data, and using more computing power. Whether this scaling approach will continue to yield improvements is one of the most debated questions in the field.
- Generative AI has captured public attention. Systems that generate text, images, code, and video have moved AI from a behind-the-scenes technology to something millions of people interact with daily.
- The gap between demonstration and deployment remains. AI systems that perform impressively in controlled settings often struggle with the messiness of real-world deployment — a pattern that echoes every previous era.
- The stakes are higher. Unlike previous AI booms, the current wave involves systems that are being deployed at massive scale in consequential domains: healthcare, criminal justice, hiring, lending, education.
2.6 Patterns in the History: What the Cycles Teach Us
Step back from the specific technologies and dates, and the history of AI reveals patterns that can help you navigate the present.
Pattern 1: The Hype Cycle Is Real
Every era of AI has followed a recognizable hype cycle: initial breakthrough leads to inflated expectations, which leads to disappointment when reality falls short, which leads to a period of quieter, more realistic progress. This pattern isn't unique to AI — technology analyst firm Gartner has documented it across many technologies — but AI's cycles have been particularly dramatic.
The question isn't whether there's hype in the current AI moment. There obviously is. The question is whether there's also substance underneath the hype. History suggests both are true simultaneously: the most hyped technologies often do eventually transform the world, just not on the timeline or in the way that the early hype predicts.
Pattern 2: The Hard Problems Are Harder Than They Look
Every era has announced the imminent conquest of problems that turned out to be far more difficult than expected. Natural language understanding, common sense reasoning, genuine creativity, general-purpose intelligence — each has been predicted to be "five to ten years away" for approximately seventy years.
This doesn't mean these problems will never be solved. It means you should be skeptical of confident timelines. When someone tells you that human-level AI will arrive by a specific date, remember that equally credentialed experts have been making — and missing — similar predictions since the 1950s.
Pattern 3: The Breakthroughs Come from Unexpected Directions
Nobody in the expert systems era predicted that video game graphics cards would become the engine of AI progress. Nobody working on symbolic AI in the 1970s imagined that the internet would provide the data that machine learning needed. The transformer architecture came not from the most prestigious AI lab but from a team at Google working on machine translation.
This pattern suggests humility about predicting what comes next. The next big AI breakthrough may come from a direction nobody is currently watching.
Pattern 4: Demonstration Is Not Deployment
MYCIN diagnosed infections as well as doctors — and was never used clinically. SHRDLU understood English commands — in a world of colored blocks. AlphaGo conquered the most complex board game — and nothing else. Demonstrations of capability in controlled settings have consistently run ahead of practical deployment in messy, real-world contexts.
When you see an impressive AI demonstration, ask: What happens when the inputs are messy, incomplete, or adversarial? What happens at scale? What happens when the stakes are real?
Evidence Evaluation
Consider this claim: "AI has achieved superhuman performance in medical diagnosis." Now apply the patterns from this section. What questions should you ask before accepting or rejecting this claim? What would you need to know about the specific study, the specific medical task, and the conditions under which performance was measured?
Pattern 5: "Is It Different This Time?" Is Always the Right Question
Here's the tension at the heart of this chapter. Every AI boom has featured people saying, "This time is different." And every time, at least some of those people were right in some ways and wrong in others.
The current moment has features that genuinely distinguish it from previous booms:
- The systems work on real-world problems at genuine scale, not just toy demonstrations
- The commercial applications are generating enormous revenue
- The technology is being used by hundreds of millions of people
- The underlying architecture (transformers) has proven remarkably versatile
It also has features that echo previous booms:
- Predictions of imminent human-level intelligence
- Enormous investment based on expectations that may be unrealistic
- Experts who disagree fundamentally about what's happening and where it's going
- A gap between what systems can do in demonstrations and how they perform in deployment
Your job as an AI-literate person isn't to decide whether it's "different this time" as a yes-or-no question. Your job is to understand enough history to evaluate specific claims with appropriate skepticism and to recognize which patterns are repeating and which might genuinely be breaking.
2.7 Chapter Summary
The history of AI is not a smooth march of progress. It's a story of audacious dreams, genuine breakthroughs, embarrassing failures, and recurring cycles of hype and disappointment. Here's what to carry forward:
- The idea of machine intelligence dates to the 1940s–1950s, crystallizing at the 1956 Dartmouth Conference, which gave the field its name and its ambition.
- Symbolic AI (1960s–1970s) attempted to represent intelligence through rules and logic. It worked brilliantly on small, well-defined problems and failed on the complexity of the real world, leading to the first AI winter.
- Expert systems (1980s) commercialized AI by encoding domain expertise as rules. Their brittleness and maintenance costs led to the second AI winter.
- Machine learning and deep learning (2000s–2010s) shifted AI from hand-coded rules to learning from data, powered by the convergence of big data, GPU computing, and better algorithms.
- Transformers and large language models (2017–present) represent the current state of the art, enabling systems that process and generate text, images, and other media with unprecedented fluency.
- The hype cycle — overpromise, underdeliver, overcorrect — has repeated in every era. Recognizing this pattern is itself a form of AI literacy.
AI Audit Project Checkpoint
Chapter 2 Task: Research the technological history of your chosen AI system.
For the AI system you selected in Chapter 1, investigate:
- Technology genealogy: What type of AI does your system use? Can you trace it back to one of the eras discussed in this chapter? (For example, is it based on deep learning? Does it use an expert-system-like rules engine? Is it powered by a large language model?)
- Timeline: Create a brief timeline of when your system's core technology was developed, when your specific system was first deployed, and any major updates or changes since then.
- Hype vs. reality: Find two claims made about your system — one by its creators or promoters, and one by a critic or independent evaluator. How do these claims compare?
Add these findings to your AI Audit Report. You should have approximately one page of historical context for your chosen system.
Spaced Review
These questions revisit concepts from Chapter 1 to strengthen your retention:
- In Chapter 1, we defined AI as a spectrum of technologies rather than a single thing. How does the history in this chapter illustrate why a spectrum definition is more useful than a single definition?
- Recall the distinction from Chapter 1 between narrow AI and the idea of general AI. Where in this chapter's timeline did researchers first start distinguishing between these two ideas, and what failures prompted that distinction?
- Chapter 1 introduced the idea that AI literacy is a civic skill. Having now read about seventy years of AI hype cycles, how does historical knowledge specifically strengthen someone's AI literacy?
What's Next
You now have a sense of where AI came from and why the current moment looks the way it does. But we've been talking about AI's development at a high level — breakthroughs, failures, eras. In Chapter 3, we'll zoom in on the fundamental question: how do machines actually learn? We'll explore supervised, unsupervised, and reinforcement learning using everyday analogies, demystify neural networks without any math, and start developing your ability to evaluate claims about what AI systems have and haven't learned to do. The history you've just absorbed will help you understand not just the how but the why — why modern machine learning works the way it does, and why it took seventy years of false starts to get here.