Learning Objectives

  • Trace the historical development of political polling from straw polls to modern mixed-mode surveys
  • Explain why the 1936 Literary Digest disaster and the 1948 Dewey-Truman upset are foundational moments in polling history
  • Describe the transition from quota sampling to probability sampling and its significance
  • Identify the causes and consequences of the cell phone crisis in survey research
  • Analyze how the rise of online panels has transformed and challenged polling methodology
  • Connect historical polling failures to the theme of Measurement Shapes Reality

Chapter 2: A Brief History of Polling and Political Measurement

Opening Scene: Vivian Park's First Disaster

In the fall of 2003, Vivian Park was a 33-year-old assistant professor of political science at a mid-tier research university, teaching an introductory survey methods course and quietly building a research agenda on public opinion formation in immigrant communities. She was methodical, precise, and deeply skeptical of the polling industry she studied. Academic survey research, she believed, was rigorous. Commercial polling was sloppy, driven by deadlines and budgets rather than by methodological integrity.

Then her state held a special election for governor, and a consortium of local media organizations hired a well-known national polling firm to track the race. The final poll showed the Democratic candidate leading by 7 points. The Republican won by 3. The error was 10 points---in a poll conducted three days before the election.

Vivian watched the post-election recriminations with a mixture of professional fascination and personal frustration. The pollster blamed "late-breaking movement." The media blamed the pollster. Nobody blamed the methodology, because nobody in the conversation understood the methodology well enough to critique it.

That evening, Vivian sat in her campus office and pulled up the poll's cross-tabs. Within an hour, she had identified three problems: the sample underrepresented rural voters, the likely voter screen was calibrated to a higher-turnout election than the special election turned out to be, and the Spanish-language component of the survey had been conducted with a poorly translated instrument that produced unreliable responses among the state's growing Latino population.

She wrote a detailed methodological critique and submitted it to the state's largest newspaper as an op-ed. The paper published it. The polling firm did not respond publicly, but Vivian's phone rang the next morning. It was a television news director who had been part of the consortium that hired the pollster. "We need someone who can explain this to us," he said. "Could you consult for us on the next election?"

That consulting relationship became a recurring engagement, which became a small survey research operation run out of Vivian's university office, which became---after she took a leave of absence that turned into a permanent departure from academia---Meridian Research Group.

This is a chapter about how we got here: how political measurement evolved from informal tallies at county fairs to a multibillion-dollar industry that shapes campaigns, media coverage, and public understanding of democracy. It is also a chapter about failure, because the history of polling is defined as much by its spectacular mistakes as by its methodological advances. Every major innovation in polling methodology was born from a crisis---a moment when the old methods broke down, the old assumptions proved wrong, and the profession was forced to reckon with the gap between what it thought it was measuring and what it was actually measuring.

The theme of this chapter is Measurement Shapes Reality, and it runs through every era of polling history. The way we measure public opinion does not just reflect that opinion---it constructs it, frames it, amplifies some voices and mutes others. Understanding this history is not merely academic. It is the foundation for everything you will learn about survey design (Chapter 7), sampling (Chapter 8), and data collection (Chapter 9) in the chapters ahead.

2.1 Before Polling: The Prehistory of Political Measurement

Counting Votes, Counting Noses

The idea that you might want to know what ordinary people think about their government is, historically speaking, quite new. For most of human history, rulers did not much care about public opinion, and the concept of systematically measuring it would have been incomprehensible. Monarchs consulted advisers. Generals assessed troop morale through informal observation. Religious authorities claimed to speak for the moral sentiments of the community.

The earliest forms of political measurement were simply vote counts. The Athenian assembly used a show of hands; Roman elections involved complex systems of tribes and centuries casting group votes. But these were measurements of decisions, not opinions. The distinction matters: a vote records what someone chose to do at a specific moment, under specific circumstances. An opinion is something broader, less fixed, and far more difficult to measure.

The idea that public opinion existed as a coherent, measurable phenomenon emerged gradually in the eighteenth and nineteenth centuries, alongside the growth of democracy, literacy, and mass media. Newspapers began conducting informal surveys of their readers. Politicians began paying attention to crowd sizes, petition signatures, and letters to the editor as indicators of public sentiment. But none of this was systematic, and none of it met any standard of representativeness.

The Straw Poll Era (1824-1936)

The first recognizable precursor to modern polling was the straw poll---an informal, unscientific survey conducted by a newspaper or organization, typically by approaching people in public places and asking their candidate preference. The term itself comes from the practice of throwing straw in the air to see which way the wind was blowing.

The earliest recorded straw poll in American politics was conducted by the Harrisburg Pennsylvanian newspaper in 1824, which reported that Andrew Jackson was the preferred candidate among attendees at a public gathering in Wilmington, Delaware. Over the next century, straw polls became a staple of election coverage. Newspapers would station reporters at train stations, county fairs, and factory gates to ask passersby about their voting intentions. The results were published as news, often with confident predictions attached.

Straw polls had obvious problems. They sampled whoever happened to be available, which meant they systematically overrepresented certain groups (people who traveled by train, people who attended fairs, people who worked in accessible industries) and underrepresented others (women before suffrage, racial minorities, the elderly, the homebound). They had no mechanism for weighting or adjusting their samples. And they were easily manipulated: a candidate's supporters could flood a straw poll location and skew the results.

Despite these flaws, straw polls often produced reasonably accurate results in elections where the outcome was lopsided. When one candidate was winning by 10 or 15 points, even a badly constructed sample would pick up the signal. The problems became apparent only in close races or when the biases in the sample were correlated with the biases in the electorate---which, as it turned out, was exactly what happened in the most famous polling disaster in American history.

💡 Intuition: Straw polls worked tolerably well for the same reason that a broken clock is right twice a day: in blowout elections, almost any method of measurement will pick up the signal. It is in close, contested races---the ones that matter most---that methodological rigor becomes essential.

2.2 The Literary Digest Disaster (1936)

The World's Largest Poll

In 1936, the Literary Digest, a popular weekly magazine, conducted what was then the largest opinion survey in history. The Digest had been running presidential straw polls since 1916 and had correctly predicted the winner in every election. Its method was simple and seemingly robust: mail millions of postcards to Americans drawn from telephone directories, automobile registration lists, and the magazine's own subscriber rolls, then tally the responses.

For the 1936 election between the incumbent Democratic president, Franklin D. Roosevelt, and the Republican challenger, Kansas Governor Alf Landon, the Digest mailed out approximately 10 million postcards. About 2.4 million people responded---a sample size that dwarfed anything attempted before or since. The result: the Digest predicted that Landon would win with 57 percent of the popular vote.

Roosevelt won with 61 percent, carrying 46 of 48 states in one of the most lopsided victories in American history. The Digest's prediction was wrong by 19 percentage points---not just wrong, but wrong in a way that humiliated the magazine and effectively ended its credibility. The Literary Digest went out of business within two years.

What Went Wrong

The Digest disaster is a foundational parable of survey methodology, and it is worth understanding in detail because the errors it illustrates are not merely historical. They recur in different forms in every era of polling.

Selection bias. The Digest's sampling frame---telephone directories, automobile registrations, magazine subscribers---systematically overrepresented affluent Americans. In 1936, in the depths of the Great Depression, telephones and automobiles were markers of relative prosperity. The people on the Digest's lists were disproportionately upper-middle-class, and in 1936, upper-middle-class Americans were disproportionately Republican. The Digest was not sampling "Americans." It was sampling "Americans who were well-off enough to own phones and cars"---and then treating the two groups as identical.

Nonresponse bias. Of the 10 million postcards mailed, only 2.4 million were returned---a response rate of 24 percent. The 76 percent who did not respond were not a random subset of the people who received postcards. Research conducted after the election suggested that Landon supporters were more motivated to return their postcards than Roosevelt supporters, possibly because they were more invested in the outcome (they were trying to unseat an incumbent) or because they had more leisure time and resources to complete and mail a survey.

No weighting or adjustment. The Digest simply tallied the responses it received without any attempt to adjust for the known biases in its sample. It treated 2.4 million self-selected respondents as if they were a miniature replica of the American electorate, which they emphatically were not.

⚠️ Common Pitfall: The Digest disaster teaches a lesson that remains relevant today: a large sample does not compensate for a biased sample. Two million biased responses are less informative than two thousand unbiased ones. This is the single most important principle in survey methodology, and it is violated more often than you might think---particularly in the age of online opt-in polls with millions of self-selected respondents.

George Gallup and the Birth of Scientific Polling

The 1936 election was a disaster for the Literary Digest but a triumph for a 35-year-old advertising researcher named George Gallup. Using a sample of only a few thousand respondents---a fraction of the Digest's millions---Gallup correctly predicted that Roosevelt would win. He also, in a bold and risky publicity move, predicted in advance that the Digest's poll would be wrong and explained why.

Gallup's insight was that representativeness matters more than size. Rather than trying to survey millions of people, he focused on constructing a sample that mirrored the demographic composition of the electorate. His method was called quota sampling: interviewers were given quotas specifying how many people to interview in each demographic category (age, sex, race, economic status, geography), and they were allowed to select anyone who fit the quota.

Quota sampling was a massive improvement over the Digest's approach. By ensuring that the sample included the right proportions of different demographic groups, Gallup reduced the most obvious sources of bias. His success in 1936 established him as the most famous pollster in America and launched an industry.

But quota sampling had its own problems, as the next great polling disaster would reveal.

The Gallup Era: Polling Becomes an Institution

Gallup's success in 1936 did more than validate a methodology. It created an institution. The Gallup Poll became a fixture of American political life, reported in newspapers across the country and consulted by politicians, journalists, and ordinary citizens as a barometer of the national mood.

Gallup understood that polling was not just a technical exercise but a democratic one. He argued that polls gave ordinary citizens a voice between elections, allowing the government to hear from the public on issues of the day rather than relying solely on the loudest interest groups and most connected lobbyists. "The common people," Gallup wrote, "have sound judgment on all major issues, provided they have access to the facts."

This democratic vision was sincere but also self-serving: it positioned Gallup as a public servant, not merely a businessman, and gave his enterprise a legitimacy that pure market research would have lacked. It also embedded a particular assumption into the DNA of the polling industry---the assumption that "public opinion" is a coherent, measurable thing that exists prior to and independent of the instruments used to measure it. As you will discover in Chapter 6, this assumption is far more controversial than it might appear.

Between 1936 and 1948, Gallup's organization conducted hundreds of polls on topics ranging from presidential approval to foreign policy to social attitudes. Other firms entered the market: Elmo Roper, who had also outperformed the Literary Digest in 1936, built his own polling operation, and Archibald Crossley developed a rival service. By the late 1940s, polling had become an established part of the American political landscape.

Then came 1948.

2.3 Dewey Defeats Truman (1948)

The Unanimous Prediction

The 1948 presidential election between the incumbent president Harry Truman and the Republican challenger Thomas Dewey is one of the most famous upsets in American political history. It is also the second foundational disaster of modern polling.

Every major polling organization---Gallup, Roper, Crossley---predicted a Dewey victory. Gallup's final poll showed Dewey ahead by 5 points. Roper was so confident that he stopped polling in September, declaring the race effectively over. The Chicago Tribune, relying on these polls, printed its infamous "DEWEY DEFEATS TRUMAN" headline before the results were in.

Truman won by 4.5 points.

What Went Wrong This Time

The causes of the 1948 polling failure were different from 1936 but equally instructive.

Quota sampling bias. While quota sampling ensured the right proportions of demographic groups, it gave interviewers discretion in selecting specific respondents within each quota. Interviewers tended to approach people who looked approachable---well-dressed, friendly, easy to find in public places. This introduced a subtle but systematic bias toward higher-status respondents within each demographic category. A quota for "working-class men aged 30-45" might be filled by approaching workers at a busy intersection, but the workers who were approached tended to be more politically engaged, more likely to read newspapers, and more likely to have opinions about the election than the workers who were missed.

Early cessation of polling. Most pollsters stopped conducting surveys well before Election Day. Gallup's final poll was completed in mid-October, roughly two weeks before the election. Roper stopped polling in September. This meant the polls missed any late movement toward Truman, which post-election analysis suggested was substantial. Undecided voters broke heavily toward the incumbent in the final days.

Third-party complications. Two third-party candidates---Strom Thurmond (States' Rights/Dixiecrat) and Henry Wallace (Progressive)---complicated the race. The polls had difficulty measuring support for minor candidates and even more difficulty predicting which major-party candidate their supporters would ultimately choose if they decided to vote strategically.

Allocation of undecided voters. The polls reported large numbers of undecided voters but did not attempt to allocate them to candidates. In hindsight, most undecided voters in 1948 were Democratic-leaning voters who had not yet committed to Truman. By reporting the horse race numbers without accounting for the probable behavior of undecided voters, the polls overstated Dewey's advantage.

🔴 Critical Thinking: Compare the 1936 and 1948 failures. In 1936, the problem was primarily about who was in the sample. In 1948, the problem was primarily about how the sample was collected and when polling stopped. These are different types of error, but both stem from the same underlying issue: the gap between the sample and the population it is supposed to represent. This gap is the central challenge of all survey research, and it has never been fully closed.

The Probability Sampling Revolution

The 1948 debacle triggered a crisis in the polling industry that led to its most important methodological advance: the transition from quota sampling to probability sampling.

In probability sampling, every member of the target population has a known, non-zero probability of being selected for the survey. The selection is done through a randomized procedure---not by the interviewer's judgment. The most familiar form, developed in the decades that followed, is random digit dialing (RDD), in which telephone numbers are generated randomly, ensuring that every household with a telephone has an equal chance of being contacted.

Probability sampling has a crucial advantage over quota sampling: it allows for the calculation of margins of error and confidence intervals. Because the selection process is random, statisticians can quantify how much the sample estimate might differ from the true population value due to chance alone. This is not possible with quota samples, where the selection is non-random and the sources of error are unknown.

The Social Science Research Council, in its 1949 report on the 1948 polling failure, recommended that pollsters adopt probability sampling and continue polling until immediately before Election Day. Over the next two decades, these recommendations were gradually adopted, and probability sampling became the gold standard of survey research.

📊 Real-World Application: When you see a poll reported with a "margin of error of plus or minus 3 percentage points," that number is derived from probability theory and is valid only if the sample was selected using probability methods. Many modern online polls do not use probability sampling, which means their reported margins of error are, strictly speaking, meaningless---a point we will explore in Chapter 8.
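
The arithmetic behind that familiar figure is simple enough to sketch. Below is a minimal Python calculation of the margin of error for a proportion under simple random sampling, assuming a 95 percent confidence level and the conservative case of an even split; note that it quantifies sampling error only, not the selection and nonresponse bias that sank the Literary Digest.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion from a simple random sample of
    size n, at roughly 95 percent confidence; p = 0.5 is the worst case."""
    return z * math.sqrt(p * (1 - p) / n)

# A probability sample of roughly 1,100 yields the familiar +/- 3 points.
print(round(100 * margin_of_error(1_067), 1))      # ~3.0
# The Digest's 2.4 million responses imply a tiny *nominal* margin of error,
# which says nothing about the bias that actually sank the poll.
print(round(100 * margin_of_error(2_400_000), 2))  # ~0.06
```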

2.4 The Golden Age of Telephone Polling (1970s-2000s)

The Rise of RDD

The period from the mid-1970s through the early 2000s is sometimes called the "golden age" of polling---an era when the combination of probability sampling, random digit dialing, and high response rates produced the most reliable political surveys in history.

The technology was straightforward. Pollsters generated random telephone numbers using computers, ensuring coverage of both listed and unlisted numbers. Trained interviewers called these numbers, typically from centralized call centers, and conducted structured interviews using standardized questionnaires. Callbacks were made to numbers that did not answer, reducing nonresponse bias. And because virtually every American household had a landline telephone by the 1980s, the sampling frame was nearly universal.

Response rates during this period were high by today's standards---typically 30 to 40 percent, and sometimes higher for well-funded academic surveys. This meant that a large proportion of the people selected for the sample actually participated, reducing the risk that respondents differed systematically from nonrespondents.

The golden age produced some of the most important datasets in political science, including the American National Election Studies (ANES), which has been conducted continuously since 1948 and remains one of the primary sources for academic research on voting behavior. It also produced the infrastructure for the modern polling industry: organizations like Gallup, Harris, and the Pew Research Center, along with dozens of smaller firms that conduct public polls for media clients and private polls for campaigns and advocacy organizations.

What Made the Golden Age Work

Several conditions made the golden age possible, and understanding them helps explain why it ended:

Near-universal landline penetration. By the 1980s, more than 95 percent of American households had landline telephones. This meant that an RDD sample came very close to covering the entire population.

Social norms around answering the phone. In an era before caller ID and robocalls, most people answered their phones when they rang. The telephone was a primary means of communication, and ignoring a ringing phone felt unusual.

Trust in institutions. Americans in this period had higher levels of trust in institutions, including survey research organizations. When a pollster called and identified themselves as working for a news organization or research firm, most people were willing to participate.

Relatively stable demographics. The American electorate during this period was less diverse and less mobile than it would later become, making it easier to construct representative samples using a relatively small number of demographic variables.

🌍 Global Perspective: The golden age of telephone polling was primarily a North American and Western European phenomenon. In much of the world, telephone penetration was too low for RDD polling, and political measurement relied on in-person interviews, informal assessments, or no systematic measurement at all. Even today, polling methodology varies enormously across countries, depending on telephone and internet penetration, cultural attitudes toward surveys, and the political environment.

The Craft of Telephone Interviewing

It is worth pausing to appreciate the human element of golden-age polling, because it shaped the data in ways that are often overlooked.

A telephone poll was not just a technological exercise; it was a social interaction. The interviewer's tone, pacing, and ability to build rapport with the respondent affected the quality of the data. Experienced interviewers knew how to put nervous respondents at ease, how to probe vague answers without leading the respondent, and how to handle sensitive topics (race, income, sexual behavior) with appropriate delicacy.

Trish McGovern, Meridian's senior field director, began her career as a telephone interviewer in the late 2000s, during the twilight of the golden age. "People actually talked to you then," she recalls. "You would call someone and they would give you twenty minutes. They were curious about the process. They wanted to share their opinions. Now? You are lucky if someone picks up, and if they do, you have about ninety seconds before they hang up."

The decline of the telephone interview is not just a technical problem. It is a cultural shift with profound implications for how we measure public opinion. When people were willing to spend twenty minutes talking to a stranger about politics, pollsters could ask nuanced questions, explore the reasoning behind opinions, and build a rich picture of public sentiment. When the average interview must be completed in eight minutes or less---because that is all the respondent will tolerate---the questions must be simpler, the topics fewer, and the picture less detailed.

The Rise of Likely Voter Screens

One of the most consequential innovations of the golden age was the development of the likely voter screen---a set of survey questions designed to distinguish respondents who will actually vote on Election Day from those who will not. This distinction matters because registered voters and likely voters can have quite different preferences. If a candidate has strong support among people who are unlikely to vote, that support is electorally meaningless.

Gallup developed one of the first systematic likely voter screens in the 1950s. The screen typically includes questions about past voting behavior, self-reported intention to vote, knowledge of polling place location, and stated interest in the election. Respondents who pass a threshold are classified as "likely voters" and included in the reported results; those who do not pass are excluded.
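
To make the mechanics concrete, here is a minimal sketch of a cutoff-style screen in Python. The items, scoring, and threshold are illustrative assumptions for teaching purposes, not Gallup's actual index or Meridian's.

```python
def is_likely_voter(resp, threshold=4):
    """Score a respondent on a simple additive index of screen items
    (all illustrative) and apply a cutoff."""
    score = sum([
        1 if resp.get("voted_in_last_election") else 0,
        1 if resp.get("intends_to_vote") else 0,
        1 if resp.get("knows_polling_place") else 0,
        1 if resp.get("interest_in_election", 0) >= 4 else 0,   # 1-5 scale
        1 if resp.get("follows_campaign_news") else 0,
    ])
    return score >= threshold

# An engaged first-time voter: fails the "past vote" item but passes the rest.
first_timer = {"voted_in_last_election": False, "intends_to_vote": True,
               "knows_polling_place": True, "interest_in_election": 5,
               "follows_campaign_news": True}
print(is_likely_voter(first_timer))               # True  (score 4 >= 4)
print(is_likely_voter(first_timer, threshold=5))  # False (a stricter screen drops them)
```

Tightening the threshold by a single point reclassifies the engaged first-time voter in the example, which is exactly the kind of definitional choice discussed next.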

The likely voter screen is one of the clearest examples of how measurement shapes reality. The definition of "likely voter" is not a fact about the world; it is a decision made by the pollster. Different screens produce different electorates, which produce different results. A screen that is too inclusive will overrepresent voters who stay home; a screen that is too restrictive will underrepresent voters who are mobilized by an unusually compelling candidate or issue. In 2008, Barack Obama's campaign mobilized millions of first-time voters who would have been screened out by traditional likely voter models. Polls that used standard screens underestimated his support; those that used more inclusive definitions were more accurate.

For Vivian Park at Meridian, the likely voter screen is a source of constant methodological anxiety. "It is the most important decision I make in any poll," she says. "And it is the one I am least confident about. I am estimating who will vote before they vote. I am predicting behavior. And behavior is harder to predict than opinion."

We will explore likely voter screens in depth in Chapters 8 and 9, where they will take on immediate practical significance for the Garza-Whitfield race.

2.5 The Cell Phone Crisis (2000s-2010s)

The Ground Shifts

The golden age of telephone polling did not end with a single dramatic failure, like the Literary Digest or Dewey-Truman. It ended gradually, as the conditions that made it possible eroded one by one.

The most visible change was the rise of the cell phone. In 2003, approximately 5 percent of American adults lived in households with only a cell phone and no landline ("cell-only" households). By 2010, that figure had risen to roughly 27 percent. By 2018, it exceeded 55 percent. Among younger adults (ages 18-34), cell-only rates approached 75 percent.

This was a catastrophe for RDD polling. Federal regulations prohibited the use of autodialers to call cell phones, which meant that cell phone numbers had to be dialed manually---a process roughly two to three times as expensive per completed interview as autodialed landline calls. Pollsters who continued to sample only landlines increasingly missed young adults, renters, low-income households, and racial and ethnic minorities---groups that were disproportionately cell-only and also disproportionately Democratic-leaning.

The industry's response was to adopt "dual-frame" sampling: generating random samples from both landline and cell phone number banks and then weighting the combined sample to match known population parameters. This was better than ignoring cell phones, but it was more expensive, more complex, and introduced new sources of error.
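
To see what the frame adjustment involves, consider that a respondent reachable on both a landline and a cell phone had, roughly speaking, two routes into the sample. The sketch below applies a crude multiplicity adjustment of that kind; it is an illustration only, and a real dual-frame estimator would also account for unequal frame sizes and sampling rates.

```python
def frame_weight(has_landline, has_cell):
    """Crude multiplicity adjustment: halve the base weight of respondents
    reachable on both frames, since they had two chances of selection."""
    return 0.5 if (has_landline and has_cell) else 1.0

sample = [
    {"id": 1, "has_landline": True,  "has_cell": True},
    {"id": 2, "has_landline": False, "has_cell": True},
    {"id": 3, "has_landline": True,  "has_cell": False},
]
for person in sample:
    person["base_weight"] = frame_weight(person["has_landline"], person["has_cell"])

print([person["base_weight"] for person in sample])  # [0.5, 1.0, 1.0]
```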

The demographic skew of cell-only households was not random. Cell-only rates were highest among precisely the populations that were already hardest to reach: young adults, renters, lower-income households, recent immigrants, and racial and ethnic minorities. A poll that excluded these groups was not just technically flawed; it was systematically biased in ways that had clear political implications. Cell-only households leaned Democratic, on average, by several percentage points relative to landline-accessible households. A landline-only poll would therefore overestimate Republican support---a pattern that several pollsters failed to recognize until it was too late.

The cell phone crisis was also unevenly distributed geographically. Urban areas, where cell-only rates were highest, were more affected than rural areas, where landlines persisted longer. This meant that state-level polls in urbanized, diverse states were more vulnerable to cell-phone-related bias than polls in more rural, homogeneous states---creating a pattern of errors that mapped onto the country's political geography in ways that were difficult to anticipate and correct.

The Response Rate Collapse

Simultaneously with the cell phone shift, response rates plummeted. The Pew Research Center, which meticulously tracks its own response rates, reported the following trajectory:

  • 1997: 36 percent
  • 2003: 25 percent
  • 2009: 15 percent
  • 2015: 8 percent
  • 2020: 6 percent
  • 2024: less than 4 percent

These numbers represent the percentage of sampled, eligible individuals who ultimately completed the survey. A response rate below 10 percent means that more than 90 percent of the people a pollster tries to reach either never answer, refuse to participate, or break off the interview before completion.

The causes of the response rate collapse are multiple and reinforcing:

Caller ID and call screening. The same technology that allowed people to avoid telemarketers also allowed them to avoid pollsters. When a call from an unknown number appears on your phone, the rational response in an era of robocalls and scams is to ignore it.

Robocalls and survey fatigue. The explosion of automated telemarketing calls has made Americans deeply suspicious of unsolicited phone calls. Many people assume that any call from an unknown number is a scam, and they are right often enough that this assumption is rational.

Time pressure and lifestyle changes. Americans work longer hours, commute more, and have less unstructured leisure time than in previous decades. A twenty-minute survey interview is a significant time commitment for someone juggling work, childcare, and other obligations.

Declining trust in institutions. Generalized distrust of institutions---government, media, corporations---has made people less willing to cooperate with survey research. Why should you share your political opinions with a stranger who might be working for a campaign, a corporation, or the government?

Partisan nonresponse. There is increasing evidence that willingness to participate in polls varies by political orientation. In some elections, Republican-leaning voters have been less likely to respond to polls, possibly because of distrust of media organizations that commission polls. This creates a nonresponse bias that is particularly difficult to correct because it is directly correlated with the quantity being measured (vote choice).

⚠️ Common Pitfall: A common response to low response rates is to argue that they do not matter, as long as the sample is "balanced" on key demographics. This argument is half right: weighting can correct for observable imbalances (too few young people, too many college graduates). But it cannot correct for unobservable differences between respondents and nonrespondents---like political engagement, trust in institutions, or willingness to share opinions. These unobservable differences are the source of the most dangerous polling errors.

Vivian Park Responds

The cell phone crisis shaped Vivian Park's approach to building Meridian Research Group. While many established firms were slow to adapt---continuing to rely heavily on landline samples supplemented with minimal cell phone interviewing---Vivian built Meridian from the ground up as a dual-frame operation, allocating at least 60 percent of her sample to cell phones by 2012.

This was expensive. Cell phone interviewing cost roughly $40 per completed interview, compared to $15 for landline interviews. It also meant longer field periods, because cell phone respondents were harder to reach and required more callback attempts. But Vivian insisted that methodological integrity trumped cost efficiency.

"You can produce a cheap poll or an accurate poll," she told Carlos Mendez during his first week at Meridian. "Sometimes you can do both. But when you have to choose, you choose accuracy. A cheap poll that is wrong is worse than no poll at all, because people will act on it."

This philosophy cost Meridian some clients---media organizations with tight budgets that went to cheaper competitors---but it also built the firm's reputation. When Meridian's polls consistently outperformed lower-cost alternatives, the clients who valued accuracy came back.

2.6 The Online Revolution (2010s-Present)

A New Paradigm

As telephone polling became more expensive and less reliable, the industry turned to the internet. Online surveys offered dramatic cost advantages: no interviewers to hire, no call centers to operate, no cell phone regulations to navigate. A telephone survey that cost $50,000 could be replicated online for $5,000 or less.

But online polling introduced a fundamental methodological challenge: there is no internet equivalent of random digit dialing. You cannot randomly select people from "the internet" the way you can randomly select telephone numbers. The internet is not a list; it is a network, and the people who are on it are not a representative sample of the population.

The industry's primary response to this challenge has been the online panel: a large pool of people who have agreed to take surveys in exchange for small payments or other incentives. Firms like YouGov, Ipsos, and Morning Consult maintain panels of hundreds of thousands or even millions of members. When a client commissions a poll, the firm selects a subset of panelists who match the desired demographic profile and invites them to complete the survey.

Online panels have several advantages:

Cost. Online surveys are dramatically cheaper than telephone surveys, which makes it economically feasible to poll more frequently, with larger samples, and on a wider range of topics.

Speed. An online survey can be fielded and completed in 24 to 48 hours, compared to a week or more for a telephone survey. This is valuable in fast-moving political environments where opinion can shift rapidly.

Question flexibility. Online surveys can include visual elements (images, videos, maps), complex question formats (ranking, drag-and-drop), and longer questionnaires than telephone surveys, because respondents complete them at their own pace.

Reduced social desirability bias. Some respondents are more willing to express socially unacceptable opinions---racial prejudice, support for controversial policies, admission of not voting---in an anonymous online survey than in a conversation with a live interviewer. This can improve the accuracy of measurements on sensitive topics.

But online panels also have significant limitations:

Nonprobability sampling. Most online panels are recruited through convenience methods---website ads, social media promotions, referrals---rather than through probability sampling. This means that panelists are self-selected and may differ systematically from the broader population in ways that are difficult to observe or correct.

Professional survey-takers. Some panelists complete dozens of surveys per month for the small payments involved. These "professional respondents" may not be representative of the general public in their level of engagement, their familiarity with survey conventions, or their attitudes.

The digital divide. Not everyone has reliable internet access. Low-income Americans, older Americans, rural Americans, and some immigrant communities are less likely to be online, which means they are less likely to be represented in online panels.

Coverage of the electorate. Online panels may underrepresent populations that are crucial for political polling, including very low-propensity voters, non-English speakers, and people who are deeply disengaged from public life.

🔵 Debate: The transition from telephone to online polling represents a shift from probability sampling (where every member of the population has a known chance of selection) to nonprobability sampling (where participation is voluntary and self-selected). Some methodologists argue that this shift is a step backward---a return to the pre-1948 era of convenience sampling. Others argue that modern statistical techniques (weighting, matching, multilevel regression and post-stratification) can make nonprobability samples just as accurate as probability samples, at a fraction of the cost. This debate is one of the most consequential in modern survey methodology, and we will explore it in depth in Chapter 8.

The Hybrid Approach

The most thoughtful polling organizations have not simply abandoned telephone surveys for online panels. Instead, they have moved toward mixed-mode designs that combine multiple data collection methods.

Meridian Research Group, for example, uses a hybrid approach for its public polls of the Garza-Whitfield race. The firm's standard design includes:

  1. Live telephone interviews with a random sample of registered voters, including both landline and cell phone numbers, for approximately 40 percent of the total sample.
  2. Text-to-web surveys, in which randomly selected registered voters receive a text message invitation to complete an online questionnaire, for approximately 30 percent of the sample.
  3. Online panel respondents, recruited from a probability-based panel (one where the panelists were originally selected through a probability method, not self-selection), for the remaining 30 percent.

This mixed-mode approach is more expensive and methodologically complex than any single method, but it addresses some of the limitations of each. Telephone interviews reach people who are not online. Text-to-web surveys reach cell-phone-only households at lower cost than live calls. Probability-based online panels provide large samples at moderate cost with a stronger methodological foundation than convenience panels.
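
As a rough illustration of how results from the three modes might be combined, the sketch below blends mode-level estimates by their design shares. The support figures are invented placeholders, not actual results from the Garza-Whitfield polling; in practice the combination would be done at the respondent level, with weighting and adjustments for mode effects.

```python
# Design shares from the plan above; the support figures are illustrative only.
modes = {
    "live_phone":   {"share": 0.40, "garza": 0.48},
    "text_to_web":  {"share": 0.30, "garza": 0.50},
    "online_panel": {"share": 0.30, "garza": 0.47},
}

blended = sum(m["share"] * m["garza"] for m in modes.values())
print(round(blended, 3))  # 0.483
```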

"There is no perfect method," Vivian Park tells her team. "There are only trade-offs. Our job is to understand the trade-offs and make them honestly."

The Weighting Challenge

Online panels have also intensified the importance of statistical weighting---the process of adjusting survey data so that the sample matches the known characteristics of the target population. In the golden age of telephone polling, weighting was a relatively minor adjustment, because high response rates meant the raw sample was reasonably close to the population. With online panels, weighting does most of the heavy lifting.

A typical online panel survey might start with a raw sample that is 45 percent female and 55 percent male, when the target population is 52 percent female and 48 percent male. It might overrepresent college graduates by 15 percentage points and underrepresent people over 65 by 10 points. Weighting corrects these imbalances by assigning higher weights to underrepresented groups and lower weights to overrepresented groups.
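
Using the sex imbalance from that example, a minimal sketch of single-variable weighting looks like this: each respondent's weight is the population share of their group divided by the group's share of the sample. (Real polls rake over many variables simultaneously; the toy responses below are invented for illustration.)

```python
population_share = {"female": 0.52, "male": 0.48}  # target, e.g. from Census figures
sample_share     = {"female": 0.45, "male": 0.55}  # raw composition of the sample

# Weight for each group = population share / sample share.
weights = {g: population_share[g] / sample_share[g] for g in population_share}
print({g: round(w, 3) for g, w in weights.items()})  # {'female': 1.156, 'male': 0.873}

# Toy responses (1 = supports the candidate); weighted estimate of support:
responses = [("female", 1), ("female", 0), ("male", 1), ("male", 1), ("male", 0)]
weighted_support = (sum(weights[g] * y for g, y in responses)
                    / sum(weights[g] for g, _ in responses))
print(round(weighted_support, 3))  # ~0.589
```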

The danger is that weighting can correct only for characteristics you know about and can measure. If your sample underrepresents college graduates, you can weight by education---but only if you know the correct population proportion of college graduates (from Census data) and have measured education in your survey. If your sample underrepresents people who distrust institutions and are therefore unlikely to take surveys, there is no straightforward way to weight for that, because "distrust of institutions" is not a variable with a known population distribution.

This is the core problem of modern polling: the gap between what weighting can correct and what it cannot. Education weighting, adopted widely after 2016, corrected one important source of bias. But analysts worry that there are other sources---political engagement, social trust, personality characteristics---that correlate with both survey participation and political attitudes and that no amount of weighting can fix.

Carlos Mendez, in his first months at Meridian, spends hours studying the firm's weighting protocol. "It is like balancing a scale with weights you can see and weights you cannot," he tells Vivian. She nods. "Now you understand the problem."

🧪 Try This: Find a recent public poll from a reputable organization (Pew, Gallup, Quinnipiac, or similar) and locate its methodological disclosure. Look for the weighting variables listed. How many variables are used? Do they include education? Race? Age? Gender? Geography? Past vote? What variables are not listed, and how might their omission affect the results?

2.7 Key Turning Points: A Summary Timeline

Let us pause and consolidate the history we have covered into a timeline of key turning points. Each of these moments changed what was possible in political measurement---and what could go wrong.

1824: The First Straw Poll

The Harrisburg Pennsylvanian reports candidate preferences at a public gathering, establishing the straw poll as a journalistic tool. For the next century, informal surveys will be the primary method of measuring public opinion before elections.

1916-1932: The Literary Digest's Winning Streak

The Literary Digest conducts increasingly large mail-in straw polls and correctly predicts the winner in five consecutive presidential elections, building enormous credibility. The magazine's method---mailing postcards to names drawn from telephone directories and automobile registrations---is fundamentally flawed, but the flaws do not matter in elections with clear outcomes.

1936: The Literary Digest Disaster / Gallup's Triumph

The Digest predicts Landon over Roosevelt; Gallup, with a much smaller sample, correctly predicts Roosevelt. The lesson: representativeness matters more than size. Gallup introduces quota sampling as a more systematic approach to survey construction. Scientific polling is born.

1948: Dewey Defeats Truman

Gallup, Roper, and Crossley all predict a Dewey victory; Truman wins by 4.5 points. The failures are attributed to quota sampling bias, early cessation of polling, and mishandling of undecided voters. The profession begins its transition to probability sampling.

1960s-1970s: The Rise of Random Digit Dialing

The development of RDD telephone sampling creates a near-universal sampling frame for the American population. Combined with probability theory, RDD enables the calculation of margins of error and confidence intervals, putting polling on a firmer statistical foundation.

1970s-2000s: The Golden Age

High landline penetration, strong response rates, and established methodologies produce the most reliable era of political polling. Major academic surveys (ANES, General Social Survey) and commercial polling firms (Gallup, Pew, Harris) produce a rich body of data on American public opinion and political behavior.

1980s: The Emergence of Exit Polling

News organizations develop exit polls---surveys of voters conducted as they leave polling places on Election Day. Exit polls enable same-day analysis of who voted, for whom, and why, becoming a staple of election night coverage. They also introduce new challenges, including the difficulty of sampling early and absentee voters.

2000s: The Cell Phone Crisis Begins

The rapid growth of cell-only households undermines the landline-based sampling frame. Response rates begin their dramatic decline. The industry struggles to adapt, with established firms slow to invest in cell phone interviewing.

2008-2012: The Aggregation Revolution

Nate Silver's FiveThirtyEight and similar projects demonstrate that aggregating multiple polls and combining them with non-polling information (economic indicators, historical patterns) produces more accurate forecasts than any individual poll. Poll aggregation becomes a standard feature of election coverage.

2010s: The Online Panel Takes Over

Online panels become the dominant mode of political polling, driven by cost advantages and speed. The shift raises fundamental questions about sampling methodology and the validity of nonprobability approaches.

2016: The Forecasting Crisis

Polls underestimate Trump's support in key states, leading to widespread but partially misplaced criticism of polling. The AAPOR post-mortem identifies nonresponse bias, particularly among non-college voters, as a key factor. Pollsters begin weighting by education.

2020: Polling Errors Persist

Despite methodological adjustments after 2016, polls again underestimate Republican support in 2020, this time by an even larger margin in some states. The error reinforces concerns about systematic partisan nonresponse bias that cannot be corrected by standard weighting methods.

2020s: The Mixed-Mode Era

Leading polling organizations adopt hybrid approaches that combine telephone, text-to-web, and online panel methods. New techniques---including multilevel regression and post-stratification (MRP) and respondent-driven sampling---attempt to address the fundamental challenges of the low-response-rate environment.

The Bigger Picture

This timeline, compressed as it is, reveals something fundamental: the history of polling is not a story of linear progress toward ever-greater accuracy. It is a story of adaptation and crisis, of methods that work until they do not, of innovations born from failures. Each era's "best practice" eventually becomes the next era's "outdated method," and the transition is rarely smooth.

Understanding this history is not merely an academic exercise. It is the foundation for making informed judgments about the polls and forecasts you will encounter as a citizen, a student, or a professional. When you see a poll reported in the news, the history in this chapter should prompt you to ask: What method was used? What assumptions are embedded in that method? Under what conditions might those assumptions fail? And how does this poll fit into the longer story of an industry that has been wrong before and will be wrong again?

🔗 Connection: This timeline is deliberately simplified. Each turning point will be explored in greater depth in later chapters. The sampling methodology debate is covered in Chapter 8. Exit polling is discussed in Chapter 9. Poll aggregation is the subject of Chapter 17. And the 2016 and 2020 polling errors are examined in detail in Chapter 20.

2.8 Patterns in the History

Looking at this history as a whole, several patterns emerge that will guide your thinking throughout this book.

Pattern 1: Every Innovation Creates New Blind Spots

Quota sampling corrected the Literary Digest's bias but introduced interviewer selection bias. Probability sampling corrected interviewer bias but depended on universal telephone coverage. Online panels solve the cost and speed problems of telephone polling but reintroduce nonprobability sampling challenges. Each methodological advance solves one set of problems and creates another.

This is not a counsel of despair. The errors have generally gotten smaller and more subtle over time. But it is a reminder that there is no methodological silver bullet---no approach to measuring public opinion that is permanently and completely reliable.

Pattern 2: Failures Drive Progress

Every major advance in polling methodology was triggered by a failure. The Literary Digest disaster led to scientific sampling. The 1948 debacle led to probability sampling. The cell phone crisis led to mixed-mode designs. The 2016 forecasting failure led to education-weighted samples and greater attention to correlated errors.

This pattern suggests that the polling errors of the 2020s will eventually lead to the next methodological breakthrough---but we do not yet know what it will be.

Pattern 3: Technology Shapes Methodology

Polling methodology has always been constrained by the available technology. Straw polls were limited to face-to-face encounters. Telephone polls were limited to households with phones. Online polls are limited to people with internet access. Each technological shift changes who can be reached, how quickly, at what cost, and with what degree of accuracy.

The next major technological shift---possibly involving artificial intelligence, social media analysis, or some form of passive data collection---will reshape polling methodology again. We explore these possibilities in Chapter 40.

Pattern 4: The Human Element Persists

Despite decades of technical innovation, polling remains fundamentally a social act: one person asking another person what they think. The shift from face-to-face to telephone to online has changed the mode of interaction but not the underlying challenge of persuading people to share their honest opinions with a stranger. Response rates have fallen, trust has eroded, and the social contract between pollster and respondent has frayed.

Trish McGovern puts it simply: "The technology changes every five years. The fundamental problem never changes: how do you get people to talk to you, and how do you know they are telling the truth?"

Pattern 5: The Demand for Polling Always Outstrips the Supply of Quality

At every stage of polling history, the demand for data---from media organizations, campaigns, interest groups, and the public---has exceeded the industry's ability to produce it responsibly. Newspapers wanted straw polls because readers wanted predictions. Television networks wanted instant poll results because viewers wanted election-night drama. Digital media wants hourly tracking polls because audiences want constant updates. And social media wants instant sentiment readings because users want real-time validation of their political beliefs.

This demand creates a structural pressure toward speed and volume at the expense of quality. Cheaper, faster methods will always find a market, even when they are less reliable. This is why understanding methodology matters so much: in a market flooded with polls of varying quality, the ability to distinguish the reliable from the unreliable is not a luxury but a necessity.

It also explains why polling disasters tend to occur in the elections that matter most---competitive, high-stakes, emotionally charged contests where the demand for data is greatest, the pressure to publish is strongest, and the consequences of error are most severe. The 1936, 1948, 2016, and 2020 polling failures all occurred in elections that dominated the national conversation and generated enormous appetite for predictive data. The failures were not random; they were, in part, a product of the very demand they served.

2.9 Vivian Park's Philosophy

Let us return to Vivian Park and the philosophy she built into Meridian Research Group. That philosophy is rooted in the historical lessons of this chapter, and it will guide the methodological discussions throughout this book.

Lesson 1: Respect the uncertainty. Vivian never publishes a poll without a clearly stated margin of error, and she insists that all reporting of Meridian polls include the margin of error in the headline or lead paragraph. "A poll showing Garza at 49 and Whitfield at 46 with a margin of error of 3.5 is not a poll showing Garza ahead," she tells Carlos. "It is a poll showing a race within the margin of error. Those are very different statements."

Lesson 2: Transparency is non-negotiable. Meridian publishes full methodological disclosures for every public poll, including sample size, sampling method, field dates, weighting variables, question wording, and response rate. Vivian believes that a poll whose methodology is not disclosed is a poll that should not be trusted.

Lesson 3: Know your weaknesses. Vivian is acutely aware that Meridian's polls, like all polls, have limitations. She maintains an internal document she calls the "Worry List"---a running catalog of potential sources of error in the firm's methodology, updated after every election cycle. After 2016, she added "education-based nonresponse" to the list. After 2020, she added "partisan nonresponse patterns that may shift between elections."

Lesson 4: History repeats. "Every pollster thinks the next disaster will look like the last one," Vivian says. "In 2020, everyone was looking for another 2016---and in a sense, they found it, because Republican support was again underestimated. But the next failure will be different. It will come from a direction we are not looking. That is why we study history: not to predict the future, but to remind ourselves that we do not know what is coming."

Best Practice: Vivian's approach embodies what we might call "methodological humility"---the recognition that every method has limitations, every sample is imperfect, and every estimate is uncertain. This is not weakness; it is professional integrity. The pollsters who get into trouble are the ones who believe their methods are bulletproof.

2.10 Measurement Shapes Reality, Then and Now

Let us close this chapter by returning to our central theme. Throughout the history of polling, the way public opinion has been measured has shaped what "public opinion" means---and whose opinions count.

In the straw poll era, "public opinion" was the opinion of people who showed up at public gatherings---predominantly white, male, and relatively engaged. In the Literary Digest era, it was the opinion of people who owned telephones and automobiles---the upper middle class. In the golden age of telephone polling, it was the opinion of people who had landlines and answered them---most of the population, but not all. In the online era, it is the opinion of people who are online, have joined survey panels, and choose to complete questionnaires---a self-selected and potentially unrepresentative group.

At every stage, the measurement technology determined who was heard. And at every stage, the people who were not heard---the poor, the marginalized, the disengaged, the distrustful---were rendered politically invisible by their absence from the data.

This is not just a historical observation. It is a present reality with present consequences. When Meridian polls the Garza-Whitfield race, the accuracy of the result depends on whether the firm reaches a truly representative sample of the electorate---including infrequent voters, non-English speakers, people without stable housing, and people who have every reason to distrust strangers asking questions. If these groups are systematically underrepresented, the poll will not just be "inaccurate" in a technical sense. It will misrepresent who the electorate is and what it wants, and the campaigns, media organizations, and citizens who rely on the poll will make decisions based on a distorted picture of reality.

The history of polling is a history of gradually expanding the circle of who counts. From property owners to all white men, from white men to all men, from men to women, from landline households to cell phone households, from English speakers to multilingual respondents, from high-propensity voters to the full electorate---each expansion has required not just political change but methodological innovation. And each expansion has been incomplete, leaving behind populations that the current methods cannot or do not reach.

"The people we cannot reach are not empty space," Vivian Park reminds her team. "They are voters, citizens, human beings with opinions. Our failure to reach them does not mean their opinions do not exist. It means our methods are not good enough."

This is the challenge that the rest of this book will help you understand, and in some cases, address.

A Note on Polling and Democratic Health

There is a broader argument to be made about the relationship between polling and democratic health. Some scholars have argued that public opinion polling is itself a democratic institution---that by giving citizens a collective voice between elections, polls serve as a check on government power and a guide for representative governance. Others have argued that polls distort democracy by substituting statistical representations of public opinion for genuine civic deliberation, reducing complex political judgment to binary approve/disapprove numbers.

Both arguments have merit, and neither is entirely correct. Polls are tools, and like all tools, their value depends on how they are used. A poll that accurately captures the public's priorities on healthcare spending is democratically valuable. A poll that reduces a complex policy debate to a misleading yes/no question is democratically harmful. A poll that reaches a broad cross-section of the population amplifies voices that might otherwise go unheard. A poll that systematically excludes marginalized communities reinforces existing power structures.

As you move through this book, you will develop the ability to distinguish between polls that serve democracy and polls that distort it. This is not a skill that can be reduced to a checklist; it requires judgment, context, and the kind of historical perspective that this chapter has provided. But it is one of the most valuable things you can learn.

Chapter Summary

This chapter has traced the history of political polling from nineteenth-century straw polls through the scientific revolution launched by George Gallup, the golden age of telephone surveys, the cell phone crisis, and the ongoing transition to online and mixed-mode methods. Along the way, you have encountered three landmark failures---the 1936 Literary Digest disaster, the 1948 Dewey-Truman upset, and the 2016 forecasting crisis---each of which triggered significant methodological reform.

You have met Dr. Vivian Park and learned how her journey from frustrated academic to independent pollster shaped the philosophy of Meridian Research Group: respect the uncertainty, demand transparency, know your weaknesses, and remember that history repeats.

The central lesson of this chapter is that measurement shapes reality. Every polling methodology determines who is heard and who is invisible, and the evolution of polling has been a long, ongoing struggle to expand the circle of people whose opinions are counted. That struggle is far from over.

In Chapter 3, you will move from the history of polling to the broader political data ecosystem---the vast network of government agencies, campaigns, media organizations, academic institutions, and civic technology nonprofits that produce, collect, analyze, and publish the data that shapes modern politics. You will meet Adaeze Nwosu and OpenDemocracy Analytics, and you will begin to map the landscape that you will navigate throughout this book.

