> "It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so."
In This Chapter
- Learning Objectives
- Introduction
- Section 28.1: Why Probabilistic Thinking Matters
- Section 28.2: Probability Basics
- Section 28.3: Bayes' Theorem
- Section 28.4: Bayesian Epistemology
- Section 28.5: Base Rate Neglect
- Section 28.6: Superforecasting
- Section 28.7: Calibration and Epistemic Humility
- Section 28.8: Communicating Uncertainty
- Section 28.9: Decision-Making Under Uncertainty
- Key Terms
- Discussion Questions
- Summary
Chapter 28: Probabilistic Thinking and Uncertainty
"It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so." — Attributed to Mark Twain (uncertain provenance — see Chapter 27)
"How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth?" — Arthur Conan Doyle, The Sign of the Four
Learning Objectives
By the end of this chapter, students will be able to:
- Explain why binary (true/false) thinking about uncertain claims is epistemically inadequate and how it enables misinformation.
- Define and apply basic probability concepts: sample spaces, events, conditional probability, and independence.
- State, derive, and apply Bayes' Theorem to real-world cases including medical testing.
- Explain the Bayesian framework for belief as probability and rational belief update as a Bayesian process.
- Identify base rate neglect in reasoning about rare events and in specific psychological cases (Linda problem, prosecutor's fallacy).
- Describe Tetlock's superforecasting research and identify the characteristics that distinguish good forecasters from poor ones.
- Assess their own calibration and identify strategies for improving it.
- Translate between verbal probability expressions and numerical estimates, and critically evaluate communication of uncertainty in scientific reporting.
- Apply expected value reasoning and recognize how uncertainty is strategically weaponized by misinformation actors.
Introduction
When you read that "scientists believe" a new treatment might be effective, or that "experts warn" a risk is elevated, or that a political claim is "mostly true" — what do you actually know? You know something, but not how much. The information is probabilistic: it positions you somewhere between certainty and ignorance, but it does not tell you exactly where.
Most of us are poorly equipped to navigate this probabilistic middle ground. We have strong intuitions for clean cases — something is either dangerous or it isn't, a claim is either true or it isn't — and we are systematically bad at reasoning about degrees of probability, particularly when base rates, conditional probabilities, and small absolute risks are involved. These limitations are not randomly distributed: they cluster in precisely the ways that misinformation exploits.
Consider: A test for a rare disease is 99% accurate. You test positive. What is the probability you actually have the disease? Most people answer "99%." The correct answer is almost always far lower — sometimes below 10% — because the base rate of the disease matters enormously. This failure of probabilistic reasoning has real consequences: it drives unnecessary treatment, inappropriate anxiety, and poor policy decisions.
Or consider: A politician claims that crime has "skyrocketed" in a city because of immigration, citing a statistic that immigrants are responsible for 30% of crimes in a neighborhood where they are 20% of the population. Is this cause for alarm? Probably not — but assessing it requires thinking probabilistically about base rates, confounding variables, and the difference between absolute and relative rates.
Probabilistic thinking is not merely an academic exercise. It is the cognitive infrastructure for rational engagement with an information environment saturated with uncertain, contested, and strategically manipulated claims. This chapter builds that infrastructure.
Section 28.1: Why Probabilistic Thinking Matters
The Cost of Binary Thinking
Human cognition has a powerful pull toward binary categorization. We prefer to know whether a food is safe or dangerous, a person is trustworthy or deceptive, a treatment works or doesn't. This categorical thinking is adaptive in many contexts: when you need to decide quickly, a binary heuristic is faster than a probability calculation.
But the information environment we inhabit is not binary. Claims exist on spectra of evidence quality. Sources have partial credibility. Health risks are probabilistic. Political statements mix truth and distortion in varying proportions. Treating all of this as binary forces an artificial choice: either you accept a claim as true or reject it as false, when the accurate answer is "probably somewhat true, with significant uncertainty about the magnitude."
The consequences of binary thinking for information processing are severe:
- It makes you vulnerable to false certainty. If your cognitive system only has "true" and "false" slots, a confidently stated claim with some supporting evidence gets slotted into "true."
- It makes correction difficult. A belief that was placed in the "true" category resists being moved to "false" because there is no intermediate category to occupy.
- It creates all-or-nothing vulnerability. If you believe a source is "trustworthy," everything from that source gets trusted equally. If you discover one error, the source collapses entirely to "untrustworthy."
- It is exploited by misinformers. Bad actors craft messages that fit binary cognitive categories: "experts are hiding the truth" (forcing a choice between mainstream consensus and contrarian rejection), "it's either natural or dangerous" (a false dichotomy used in vaccine and food debates), and "you either believe the data or you believe in a cover-up" (eliminating the middle ground of "the data are real but the interpretation is contested").
Calibration as the Alternative
The alternative to binary thinking is calibration — the practice of assigning degrees of confidence to beliefs that correspond to how well-supported those beliefs actually are. A well-calibrated person who says they are "80% confident" in a claim is right approximately 80% of the time when they say that. Their expressed uncertainty tracks their actual uncertainty.
Calibration is both an epistemic virtue and a learnable skill. Research by Philip Tetlock and others has shown that some people are significantly better calibrated than average, that calibration can be measured objectively, and that specific practices improve it. This chapter is in part a manual for becoming more calibrated — for having beliefs whose expressed confidence reflects the actual quality of the evidence.
Misinformation and Manufactured Uncertainty
Probabilistic thinking is not just about avoiding overconfidence. It is equally about avoiding manufactured uncertainty — a deliberate strategy in which well-funded actors create the impression of scientific doubt where little genuine doubt exists, specifically to prevent public and regulatory action.
The tobacco industry's internal documents reveal that their primary strategy for decades was not to prove cigarettes were safe, but to manufacture doubt about the evidence that they were harmful. The strategy explicitly aimed to maintain "controversy" — to keep public understanding in a state of apparent uncertainty even as scientific consensus solidified. Similar strategies have been employed by fossil fuel companies regarding climate science and by pharmaceutical companies regarding drug safety data.
Recognizing manufactured uncertainty requires probabilistic sophistication: the ability to distinguish genuine scientific uncertainty (where experts genuinely disagree based on genuine evidential ambiguity) from manufactured uncertainty (where the appearance of disagreement is artificially produced by a small number of industry-funded voices against a strong consensus). The tools for making this distinction — understanding confidence intervals, consensus measures, and the sociology of scientific disagreement — are the tools of probabilistic thinking.
Section 28.2: Probability Basics
Sample Spaces and Events
A probability is a number between 0 and 1 assigned to an event, representing the likelihood that the event will occur. By convention, P(A) = 0 means impossible; P(A) = 1 means certain; P(A) = 0.5 means equally likely to occur or not.
A sample space (Ω) is the set of all possible outcomes. An event is a subset of the sample space — a collection of outcomes we are interested in.
Example: Rolling a fair six-sided die. The sample space is {1, 2, 3, 4, 5, 6}. The event "rolling an even number" is the subset {2, 4, 6}, with probability 3/6 = 0.5.
Basic Probability Rules
Complement rule: P(not A) = 1 - P(A). If the probability of rain today is 0.3, the probability of no rain is 0.7.
Addition rule (OR): For two events A and B:
- If A and B are mutually exclusive (cannot both occur): P(A or B) = P(A) + P(B)
- In general: P(A or B) = P(A) + P(B) - P(A and B)
The subtraction of P(A and B) prevents double-counting events that satisfy both conditions.
Multiplication rule (AND):
- If A and B are independent (occurrence of one doesn't affect the other): P(A and B) = P(A) × P(B)
- In general: P(A and B) = P(A) × P(B|A), where P(B|A) is the conditional probability of B given A
Conditional Probability
Conditional probability P(B|A) is the probability of event B given that event A has already occurred. It is defined as:
P(B|A) = P(A and B) / P(A)
Example: What is the probability of drawing a king given that you already know the card is a face card?
- P(king) = 4/52
- P(face card) = 12/52
- P(king|face card) = P(king and face card) / P(face card) = (4/52) / (12/52) = 4/12 = 1/3
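The conditional probability above can be verified by brute-force enumeration of a standard deck; a minimal Python sketch:

```python
# Enumerate a standard 52-card deck and compute P(king | face card) directly
# from the definition P(B|A) = P(A and B) / P(A).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

face_cards = [c for c in deck if c[0] in ("J", "Q", "K")]      # event A
kings_among_faces = [c for c in face_cards if c[0] == "K"]     # event A and B

p_face = len(face_cards) / len(deck)                     # 12/52
p_king_given_face = len(kings_among_faces) / len(face_cards)  # 4/12 = 1/3
print(p_king_given_face)
```

Counting within the conditioning event (face cards) rather than the whole deck is exactly what the division by P(A) accomplishes.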
Independence
Two events A and B are independent if the occurrence of one provides no information about the occurrence of the other: P(A|B) = P(A), equivalently P(A and B) = P(A) × P(B)
Independence is a special case, not the default. In most real-world contexts, events are correlated — knowing one thing happened changes the probability of related things happening. Assuming independence when events are actually correlated is a common error in both everyday reasoning and formal analysis.
Section 28.3: Bayes' Theorem
Derivation
Bayes' Theorem follows directly from the definition of conditional probability. From the multiplication rule:
P(A and B) = P(A) × P(B|A) = P(B) × P(A|B)
Therefore: P(A|B) = P(A) × P(B|A) / P(B)
This is Bayes' Theorem in its simplest form. It tells us how to compute the probability of A given B, when we know:
- P(A): the prior probability of A
- P(B|A): the probability of B given A (the likelihood)
- P(B): the total probability of B
Expanding the Denominator
Often, P(B) is not directly known. It can be expanded using the law of total probability:
P(B) = P(B|A) × P(A) + P(B|not A) × P(not A)
This accounts for all the ways B can occur: either A is true and B occurs, or A is not true and B occurs anyway.
The full form of Bayes' Theorem is therefore:
P(A|B) = [P(A) × P(B|A)] / [P(A) × P(B|A) + P(not A) × P(B|not A)]
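The full form translates directly into code. A minimal sketch (the example numbers below are illustrative, not from the text):

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Full form of Bayes' Theorem: P(H|E) from the prior and both likelihoods.

    The denominator is the law of total probability:
    P(E) = P(E|H)P(H) + P(E|not H)P(not H).
    """
    numerator = prior * p_e_given_h
    denominator = numerator + (1 - prior) * p_e_given_not_h
    return numerator / denominator

# Illustrative numbers: prior 0.3, evidence four times more likely under H.
print(posterior(0.3, 0.8, 0.2))  # ~0.632
```

Note that the same evidence strengths produce very different posteriors depending on the prior, which is the point of the medical examples that follow.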
The Medical Testing Example: Why Doctors Fail at Conditional Probability
Suppose you receive a positive result on a screening test for a relatively rare condition. The test has:
- Sensitivity = 99% (if you have the condition, the test correctly says positive 99% of the time)
- Specificity = 99% (if you don't have the condition, the test correctly says negative 99% of the time)
- Prevalence of the condition in the general population = 1% (base rate)
What is the probability that you actually have the condition?
Most physicians, when asked this question without access to Bayes' Theorem, estimate the probability at around 95%. The actual answer:
Let A = "having the condition," B = "testing positive" - P(A) = 0.01 (prevalence) - P(not A) = 0.99 - P(B|A) = 0.99 (sensitivity) - P(B|not A) = 0.01 (false positive rate = 1 - specificity)
P(A|B) = (0.01 × 0.99) / (0.01 × 0.99 + 0.99 × 0.01) = 0.0099 / (0.0099 + 0.0099) = 0.0099 / 0.0198 = 0.50 (50%)
A positive test result from a 99%-accurate test for a 1% prevalence condition gives you only a 50% probability of actually having the condition. This counterintuitive result — often called the base rate problem — occurs because the false positive rate, applied to the large population of people who don't have the condition, generates as many false positives as true positives.
Consider 10,000 people tested:
- 100 have the condition (1% prevalence): 99 test positive (true positives), 1 tests negative (false negative)
- 9,900 don't have the condition: 99 test positive (false positives), 9,801 test negative (true negatives)
- Total positive tests: 99 + 99 = 198
- Of those 198 positive tests, only 99 are true positives: 99/198 = 50%
A celebrated study by Gerd Gigerenzer and colleagues found that when 48 physicians were presented with this type of problem, only 1 gave the correct answer. Most gave answers between 85% and 99%. This systematic failure has direct consequences for how doctors communicate risk to patients.
The Mammography Case
The mammography context illustrates the clinical stakes. Breast cancer has a prevalence of roughly 1% in women aged 40 undergoing routine screening. The mammogram has sensitivity around 80% and specificity around 90% (false positive rate 10%).
P(cancer|positive mammogram) = (0.01 × 0.80) / (0.01 × 0.80 + 0.99 × 0.10) = 0.008 / (0.008 + 0.099) = 0.008 / 0.107 ≈ 7.5%
A positive mammogram from routine screening gives approximately a 7.5% probability of cancer — meaning about 92.5% of positive mammograms are false positives. This is not a failure of the test; it is a mathematical consequence of low base rates. Understanding this is essential for informed consent and for avoiding unnecessary anxiety and invasive follow-up procedures.
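The mammography arithmetic can be replayed as natural frequencies; a sketch using the prevalence, sensitivity, and false positive rate given above:

```python
# Mammography numbers from the text, expressed as counts per 100,000 screened.
n = 100_000
prevalence = 0.01            # 1% of women screened have cancer
sensitivity = 0.80           # 80% of cancers produce a positive mammogram
false_positive_rate = 0.10   # 10% of healthy women also test positive

with_cancer = n * prevalence                                # 1,000 women
true_positives = with_cancer * sensitivity                  # 800
false_positives = (n - with_cancer) * false_positive_rate   # 9,900

p_cancer_given_positive = true_positives / (true_positives + false_positives)
print(round(p_cancer_given_positive, 3))  # ~0.075
```

The false positives outnumber the true positives more than twelve to one, purely because the healthy population is ninety-nine times larger than the affected one.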
Likelihood Ratios: A More Intuitive Formulation
An alternative formulation of Bayes' Theorem uses odds rather than probabilities and is often more intuitive for sequential reasoning.
- Prior odds of A = P(A) / P(not A)
- Likelihood ratio (LR) for evidence E = P(E|A) / P(E|not A)
- Posterior odds = prior odds × likelihood ratio
This multiplicative form is particularly useful when multiple pieces of evidence are available: each piece multiplies the odds. Evidence that favors A (LR > 1) increases the odds; evidence that favors not-A (LR < 1) decreases them; the further the ratio departs from 1, the stronger the update.
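The odds form reduces to a sequential multiplication loop. The prior odds and likelihood ratios below are hypothetical illustrations (a 1% prior and two independent tests, each with LR = 99):

```python
def update_odds(prior_odds, likelihood_ratios):
    """Multiply prior odds by each likelihood ratio in turn.

    Valid only if the pieces of evidence are genuinely independent.
    """
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds

def odds_to_prob(odds):
    """Convert odds back to a probability: p = odds / (1 + odds)."""
    return odds / (1 + odds)

# Hypothetical: prior odds 1:99 (a 1% prior), two independent tests, LR = 99 each.
posterior_odds = update_odds(1 / 99, [99, 99])
print(round(odds_to_prob(posterior_odds), 3))  # 0.99
```

One strong test moves a 1% prior only to 50%; a second independent test of the same strength moves it to 99%. That asymmetry is the Sagan maxim in arithmetic form.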
Section 28.4: Bayesian Epistemology
Belief as Probability Distribution
The Bayesian framework applied to epistemology treats beliefs as probability distributions rather than binary accept/reject states. Instead of believing a proposition as true or false, a Bayesian agent assigns a credence (degree of belief) between 0 and 1 to it.
This framework has several advantages for thinking about information and misinformation:
- It naturally accommodates uncertainty. A claim can be believed to degree 0.7 — significantly but not certainly true.
- It makes explicit the role of prior knowledge. Two people who start with different priors should hold different posteriors after seeing the same evidence, and this is rational rather than irrational.
- It provides a normative model for belief update. When new evidence arrives, Bayes' Theorem specifies how much the probability should change.
Prior and Posterior Beliefs
A prior probability (or simply "prior") is the probability assigned to a hypothesis before new evidence is considered. The posterior probability is the probability after incorporating the new evidence.
The gap between prior and posterior is a function of the diagnosticity of the evidence: how much more likely is this evidence to occur if the hypothesis is true than if it is false?
Example: A friend tells you she saw a coyote in Central Park. Your prior probability that coyotes inhabit Central Park might be very low — say, 3%. But after she tells you this, you update. The question is: how much? If your friend is generally reliable and not prone to misidentifying animals, this is diagnostically strong evidence. If she frequently misidentifies animals, it's weaker evidence. Bayes' Theorem gives the formal machinery for quantifying this update.
The Problem of Priors
The Bayesian framework is prescriptive: it tells you how to update beliefs given new evidence. But it does not, by itself, specify what your prior beliefs should be.
This is both a strength and a limitation. The strength is that it can accommodate diverse starting points and tracks evidence correctly from any starting position. The limitation is that someone with an extremely strong prior in a false belief will update that belief very slowly even in the face of substantial evidence — they will describe the evidence as not very diagnostic rather than as disconfirming.
The appropriate Bayesian response to empirically outlandish claims is a very low prior — a level of skepticism proportional to how far the claim departs from established knowledge. The astronomer Carl Sagan's maxim "extraordinary claims require extraordinary evidence" is a Bayesian principle: extraordinary claims are those with very low priors, and updating from a very low prior requires very strong evidence.
Bayesian Reasoning as a Framework for Rational Belief Change
The Bayesian framework provides an ideal-type model for how rational agents should update beliefs in response to evidence. It does not describe how people actually reason (Chapter 5's cognitive bias literature documents the many ways actual reasoning departs from Bayesian ideals), but it provides a normative standard against which actual reasoning can be evaluated.
Key Bayesian norms for evaluating reasoning:
- Proportionality: The strength of belief should be proportional to the strength of evidence.
- Sensitivity to new information: Beliefs should update appropriately when new evidence arrives — neither too quickly (susceptibility to single, unreliable data points) nor too slowly (anchoring to prior beliefs despite overwhelming evidence).
- Prior specification: Priors should be based on background knowledge, not on wishful thinking or tribal affiliation.
- Independence: Evidence gathered from different sources should be treated as independent only if the sources are genuinely independent.
Section 28.5: Base Rate Neglect
The Linda Problem
In 1983, Daniel Kahneman and Amos Tversky published one of the most famous results in cognitive psychology. They presented participants with the following scenario:
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
Participants were then asked to rank various statements about Linda by probability. The critical comparison: - (A) Linda is a bank teller. - (B) Linda is a bank teller and is active in the feminist movement.
More than 85% of participants rated statement B as more probable than statement A. This is logically impossible: B is a conjunction (bank teller AND feminist), and a conjunction can never be more probable than either of its conjuncts alone. The probability of two events both occurring cannot exceed the probability of either one occurring.
This error — the conjunction fallacy — arises because the description of Linda is highly representative of a feminist, making "feminist" feel like it adds plausibility to the overall description rather than reducing its probability. The representativeness heuristic (Chapter 4) overrides proper probability reasoning.
The Linda problem can also be read as a form of base rate neglect: the base rate for "bank teller" (which covers all bank tellers, including feminist ones) is necessarily at least as high as the base rate for "bank teller and feminist," and participants fail to honor this mathematical necessity.
The Prosecutor's Fallacy
The prosecutor's fallacy is a base rate error with potentially catastrophic legal consequences. It involves confusing two different conditional probabilities:
P(evidence | innocent) vs. P(innocent | evidence)
Suppose DNA evidence from a crime scene matches a defendant's DNA with a 1-in-a-million probability of a random match. A prosecutor argues: "The probability that an innocent person would have this DNA match is one in a million — therefore the probability that this defendant is innocent is one in a million."
This is the prosecutor's fallacy. The probability of the DNA match given innocence (1 in a million) is not the same as the probability of innocence given the DNA match.
To compute the latter properly (using Bayes' Theorem), we need the base rate: how many potential suspects are there? If the police have a pool of 10 million potential suspects (e.g., a large city population), then we expect 10 people by chance to have this DNA profile. If the defendant was identified because they were in a DNA database searched after the crime, not because of a prior independent connection to the crime, the base rate matters enormously.
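The correction can be sketched numerically. The code assumes, for illustration, a uniform prior over the 10-million-person pool and that the true perpetrator matches with certainty; both are simplifying assumptions, not claims from the text:

```python
# Base-rate correction for a 1-in-a-million DNA match found by searching
# a pool of 10 million potential suspects (figures from the text).
pool_size = 10_000_000
p_random_match = 1e-6

# Expected number of *innocent* people in the pool who match by chance.
expected_innocent_matches = (pool_size - 1) * p_random_match  # ~10

# Simplifying assumptions: uniform prior over the pool, and the true
# perpetrator is in the pool and matches with certainty.
p_guilty_given_match = 1 / (1 + expected_innocent_matches)
print(round(p_guilty_given_match, 3))  # ~0.091
```

Far from "one in a million," the probability of innocence given the match is roughly 90% under these assumptions, before any other evidence is considered. Independent evidence linking the defendant to the crime changes the prior, and hence the answer, dramatically.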
This is not a theoretical concern. In the UK, Sally Clark was convicted of murdering her two infant sons partly because the expert witness Sir Roy Meadow testified that the probability of two sudden infant deaths in the same family was 1 in 73 million. That figure was obtained by multiplying the single-death probability by itself, which assumes the two deaths were independent; in fact SIDS deaths cluster within families because of shared genetic and environmental factors. It also ignored the relevant base-rate comparison (double infant murder is itself far rarer than double SIDS) and invited the jury to confuse P(evidence|innocent) with P(innocent|evidence). Clark's conviction was eventually overturned, but she died four years later.
Base Rate Neglect in Medical Diagnosis
Returning to the medical testing example from Section 28.3: the 92.5% of positive mammogram results that are false positives is a consequence of base rate neglect at the population level. The rarity of the condition (the base rate) is often the most important piece of information for interpreting a diagnostic test result, yet it is the piece most frequently omitted from patient-doctor conversations about positive tests.
Gigerenzer and colleagues have extensively documented how reformulating probabilistic information as natural frequencies (e.g., "100 out of every 10,000 women screened will have cancer") rather than conditional probabilities ("the probability of cancer given a positive test is X%") dramatically improves both physician and patient comprehension and reduces base rate neglect.
How Misinformation Exploits Base Rate Neglect
Base rate neglect is a primary tool in the misinformer's toolkit:
Relative risk inflation: "Treatment X doubles the risk of heart attack!" might be true while concealing that the absolute risk went from 0.001% to 0.002% — a doubling of a tiny number. Without the base rate, the relative statistic ("doubles the risk") sounds alarming.
Cherry-picked denominators: "In our study, 30% of vaccine recipients experienced adverse events." What is the adverse event rate in unvaccinated controls? Without this base rate, the 30% figure is uninterpretable.
Misused raw counts: "There were 10,000 breakthrough COVID infections among vaccinated people last month, compared to 5,000 infections among unvaccinated people." This appears to show vaccines are harmful. But if 90% of the population is vaccinated, the vaccinated group has 10,000 infections out of 270 million people, while the unvaccinated group has 5,000 out of 30 million — dramatically higher infection rates per capita for the unvaccinated.
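Computing the per-capita rates makes the reversal explicit; a sketch using the figures above:

```python
# Per-capita infection rates for the breakthrough-infection example in the text.
vaccinated_pop, vaccinated_cases = 270_000_000, 10_000
unvaccinated_pop, unvaccinated_cases = 30_000_000, 5_000

rate_vax = vaccinated_cases / vaccinated_pop        # infections per person
rate_unvax = unvaccinated_cases / unvaccinated_pop

print(f"vaccinated:   {rate_vax * 100_000:.1f} per 100,000")
print(f"unvaccinated: {rate_unvax * 100_000:.1f} per 100,000")
print(f"rate ratio:   {rate_unvax / rate_vax:.1f}x higher if unvaccinated")
```

The raw counts (10,000 vs. 5,000) point one way; the rates (about 3.7 vs. 16.7 per 100,000) point the other. The denominator is the base rate, and omitting it inverts the conclusion.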
Section 28.6: Superforecasting
Tetlock's Forecasting Tournaments
Philip Tetlock, a political psychologist at the University of Pennsylvania, spent two decades studying expert political forecasting. His 2005 book Expert Political Judgment published the sobering finding that expert predictions about political events were not significantly better than chance — and that pundits who spoke with more confidence and sweeping generalizations were actually less accurate than those who expressed greater uncertainty.
The Good Judgment Project (GJP), which Tetlock launched as part of a US government intelligence forecasting tournament (IARPA's Aggregative Contingent Estimation program), extended this research with a more optimistic finding: while experts as a whole are not particularly good forecasters, a small subset of people — whom Tetlock called "superforecasters" — are substantially better than average, better than prediction markets, and better than intelligence analysts with access to classified information.
What Makes a Superforecaster
Superforecasters are not identified by domain expertise or raw intelligence alone. They share a cluster of epistemic habits and dispositions:
Probabilistic thinking: Superforecasters naturally think in probabilities rather than binary categories. They express predictions as "I think there's a 65% chance of X" rather than "I think X will happen."
Active open-mindedness: A disposition to actively seek out information that might contradict current beliefs, rather than information that confirms them. This is the deliberate override of confirmation bias.
Calibration focus: Superforecasters care intensely about the accuracy of their probabilistic estimates, not just whether they get the direction right. Being "right" about a 51% probability that came true is not impressive; being wrong about a 95% probability is very informative.
Granularity: Superforecasters make fine-grained probability estimates (distinguishing 63% from 65%) rather than rounding to round numbers (50%, 60%, 70%). This forces them to articulate their reasons more precisely.
Updating: Superforecasters update their forecasts readily when new information arrives, without the excessive anchoring to prior positions that characterizes poor forecasters.
Self-awareness: Superforecasters are aware of their own cognitive biases and actively work to counteract them in specific forecast contexts.
Dragonfly eye view: They integrate multiple independent perspectives — different models, different framings, different base rate comparisons — rather than committing prematurely to a single view.
The Intelligence of Superforecasters
Superforecasters do exist across the ability spectrum, but they tend to cluster in the upper ranges of cognitive ability — not because intelligence alone makes you a good forecaster, but because the metacognitive skills of calibration and updating appear to be partly bottlenecked by general reasoning ability.
However, Tetlock's research identified domain-specific knowledge as less important than general probabilistic skills. A generalist with strong Bayesian habits regularly outperformed domain specialists who reasoned in less calibrated ways. This is important for media literacy: probabilistic reasoning skills transfer across topics, while domain expertise does not.
The Aggregation Advantage
One of the GJP's most striking findings was that aggregating forecasts from multiple superforecasters produced even better predictions than any individual. The mechanism is simple: different people make independent errors, and those errors partially cancel when averaged. The "wisdom of crowds" operates best when forecasters are genuinely independent — not when they are influenced by each other's published views.
This finding has implications for how we should weight claims in the information environment. The consensus of multiple independent expert assessments is a stronger epistemic signal than any single authoritative source, however confident.
Section 28.7: Calibration and Epistemic Humility
Measuring Calibration
Calibration can be measured precisely. For a large sample of predictions, a forecaster is well-calibrated if, among all predictions they made with 70% confidence, approximately 70% turned out to be true; among those made with 90% confidence, approximately 90% turned out to be true; and so on.
A reliability diagram (or calibration curve) plots stated confidence on the x-axis against actual accuracy on the y-axis. A perfectly calibrated forecaster produces a 45-degree diagonal line. Most people produce a curve that sags below the diagonal — they are overconfident, claiming 90% confidence when they are right only 70% of the time.
The Brier score is a mathematical measure of forecast accuracy that penalizes both overconfidence and underconfidence. For a binary prediction (will event X occur?), the Brier score is:
BS = (forecast probability - actual outcome)²
where actual outcome = 1 if the event occurred, 0 if it didn't. A perfect forecast of 1.0 for an event that occurred earns BS = 0; a forecast of 0.5 for an event that either occurred or didn't earns BS = 0.25; a forecast of 1.0 for an event that didn't occur earns BS = 1.0. Lower is better.
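The three worked cases translate directly into a few lines; a sketch:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better; 0.25 is the score of always saying 50%.
    """
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# The three cases from the text:
print(brier_score([1.0], [1]))  # 0.0  - confident and right
print(brier_score([0.5], [1]))  # 0.25 - hedged either way
print(brier_score([1.0], [0]))  # 1.0  - confident and wrong
```

Because the penalty is quadratic, confident errors cost far more than hedged ones: the score rewards forecasters whose stated probabilities track their actual hit rates.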
Overconfidence Bias
Overconfidence is perhaps the most robustly documented bias in the psychology of judgment. Studies consistently show that:
- When people say they are "90% sure," they are right approximately 70-75% of the time.
- When people produce 90% confidence intervals (ranges they are 90% sure contain the true value), the true value falls outside the interval about 40-50% of the time — four to five times more often than expected.
- Experts are often more overconfident than laypeople in their domains of expertise.
Overconfidence is relevant to misinformation in two directions. First, it makes people more likely to accept confidently stated claims: overconfidence in sources reads as competence. Second, it makes people less likely to seek additional verification — if you already feel 90% sure of something, you are unlikely to invest energy in finding out if you're wrong.
Strategies for Better Calibration
Reference class forecasting: Instead of reasoning from first principles about a specific situation, ask what the base rate is for similar situations. How often do new restaurants survive their first year? How often do first-time authors get published? Anchoring on the base rate prevents excessive optimism or pessimism.
Premortem analysis: Before committing to a belief or decision, actively imagine that you are wrong and reason about why. "If this claim turned out to be false, what would the most likely explanation be?" This exercises the imagination in the direction of disconfirmation.
Outside view vs. inside view: The "inside view" focuses on the specific details of the case at hand and tends to generate overconfidence. The "outside view" asks how cases like this typically turn out. Good calibrators deliberately shift to the outside view before finalizing probability estimates.
Track record maintenance: Keeping explicit records of your predictions and their outcomes — even informally — provides feedback that improves calibration over time. Most people never find out they were wrong because they never write down their predictions.
Section 28.8: Communicating Uncertainty
The Words-vs.-Numbers Problem
Scientific uncertainty is typically expressed in verbal terms — "likely," "probable," "possible," "unlikely" — but these words map onto very different numbers for different readers. Research consistently shows that people interpret probability words idiosyncratically:
- "Possible" ranges from 5% to 60% in different people's interpretations
- "Likely" ranges from 55% to 90%
- "Probable" ranges from 50% to 90%
- "Rarely" ranges from near 0% to 20%
This interpretive variability is not merely a philosophical curiosity. When a physician says a treatment "might help," patients vary enormously in how optimistic they are. When a forecast says a hurricane "could" affect a coastal area, residents vary in their response.
IPCC Uncertainty Language
The Intergovernmental Panel on Climate Change (IPCC) developed a formal system for communicating uncertainty in its assessment reports, precisely because the words-vs.-numbers problem had created miscommunication between scientists, policymakers, and the public.
The IPCC system maps verbal terms to numerical probability ranges:
- Virtually certain: >99% probability
- Extremely likely: >95% probability
- Very likely: >90% probability
- Likely: >66% probability
- About as likely as not: 33-66% probability
- Unlikely: <33% probability
- Very unlikely: <10% probability
- Extremely unlikely: <5% probability
- Exceptionally unlikely: <1% probability
This system is explicitly defined in IPCC reports, but surveys show that many readers — including journalists and policymakers — apply their own interpretations to these terms rather than using the defined values.
A specific consequence: when the IPCC says it is "likely" (>66%) that global warming will exceed 1.5°C, some readers interpret "likely" as meaning "more probable than not" (50%+), which is correct; others interpret it as meaning "a real possibility but not the most probable outcome," which significantly underestimates the expressed confidence.
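The IPCC scale above is simple enough to encode directly, which makes the defined thresholds explicit rather than leaving them to interpretation. A sketch encoding each term as a probability interval:

```python
# IPCC likelihood terms mapped to their defined probability intervals,
# taken from the scale listed above (expressed as [low, high] fractions).
IPCC_LIKELIHOOD = {
    "virtually certain":      (0.99, 1.00),
    "extremely likely":       (0.95, 1.00),
    "very likely":            (0.90, 1.00),
    "likely":                 (0.66, 1.00),
    "about as likely as not": (0.33, 0.66),
    "unlikely":               (0.00, 0.33),
    "very unlikely":          (0.00, 0.10),
    "extremely unlikely":     (0.00, 0.05),
    "exceptionally unlikely": (0.00, 0.01),
}

def describe(term):
    """Render a likelihood term with its defined numerical range."""
    lo, hi = IPCC_LIKELIHOOD[term]
    return f"{term}: {lo:.0%}-{hi:.0%} probability"

# describe("likely") -> "likely: 66%-100% probability"
```

A reader applying this table would not mistake "likely" for a coin flip: the defined lower bound is 66%, not 50%.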
The Problem of "Maybe" in Science Communication
Science communication faces a structural tension. Scientists are trained to be epistemically humble — to acknowledge uncertainty, to hedge claims, to avoid overstating conclusions. Journalists are trained to make claims newsworthy and clear, which pushes toward declarative statements and vivid claims. The intersection of these incentives produces characteristic distortions.
"Scientists find X," when the paper actually reports a modest correlation in a small sample. "Experts warn of X," when X is one of many possible scenarios. "New study shows X is dangerous," when the study showed a relative risk increase of 20% from a very low base rate.
The challenge for news consumers is to read through the journalistic transformation and recover the actual epistemic status of the underlying claim. This requires asking: What was the actual study design? What was the sample size? What was the effect size and confidence interval? Was this peer-reviewed? Has it been replicated? Is this consistent with the existing literature?
Section 28.9: Decision-Making Under Uncertainty
Expected Value
When decisions must be made under uncertainty, expected value (EV) provides a normative framework. The expected value of an action is the probability-weighted average of its possible outcomes:
EV = Σ [P(outcome_i) × Value(outcome_i)]
Example: Suppose that without vaccination the probability of infection is 15%, and infection costs -5,000 utility units (illness, lost work, medical costs). A vaccine carries a 0.1% chance of mild side effects (valued at -50 utility units for discomfort and time) and is 85% effective, reducing the infection probability to 0.15 × (1 - 0.85) = 2.25%.
EV(vaccinate) = 0.001 × (-50) + 0.0225 × (-5,000) ≈ -0.05 - 112.5 ≈ -113
EV(not vaccinate) = 0.15 × (-5,000) = -750
Even in this simplified case, vaccination has a far higher expected value. Note also that vaccination provides community protection (a positive externality) not counted here, and the side-effect probability was simplified. Real calculations are more complex, but the framework is the same.
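The calculation generalizes to any set of outcomes. A sketch of a generic expected-value helper, applied to the vaccine case under the assumptions that baseline infection risk is 15% and the 85% figure is vaccine efficacy (all utilities are the hypothetical units used above):

```python
# Expected value of an action: the probability-weighted sum of outcome utilities.
def expected_value(outcomes):
    """outcomes: list of (probability, utility) pairs; probabilities sum to 1."""
    return sum(p * u for p, u in outcomes)

BASE_RISK, EFFICACY = 0.15, 0.85     # assumed baseline infection risk, vaccine efficacy
INFECTION, SIDE_EFFECT = -5000, -50  # illustrative utility costs

risk_if_vaccinated = BASE_RISK * (1 - EFFICACY)  # 2.25%

# Simplification: side effects and infection are treated as mutually exclusive.
ev_vaccinate = expected_value([
    (0.001, SIDE_EFFECT),
    (risk_if_vaccinated, INFECTION),
    (1 - 0.001 - risk_if_vaccinated, 0),  # no adverse outcome
])
ev_no_vaccine = expected_value([
    (BASE_RISK, INFECTION),
    (1 - BASE_RISK, 0),
])
# ev_vaccinate is about -112.55; ev_no_vaccine is -750
```

The helper makes the structure of the argument visible: change the assumed probabilities or utilities and the comparison updates mechanically.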
The Precautionary Principle
The precautionary principle, often stated as "better safe than sorry," is a response to situations of deep uncertainty — particularly when potential outcomes are catastrophic and irreversible. In its strong form: if an action risks harm to the public or environment and there is scientific uncertainty about the harm, the burden of proof falls on those taking the action to prove it is not harmful.
The precautionary principle has been applied in environmental regulation, pharmaceutical approval, and biosafety policy. It represents a rational asymmetry: in cases where false negatives (failing to prevent a catastrophe) are much more costly than false positives (unnecessary precaution), we should set a lower threshold for caution.
However, the precautionary principle is also misused. Because almost any action carries some risk of some harm, the principle can be selectively applied to oppose actions one dislikes (nuclear power, GMOs, vaccines) while ignoring the risks of inaction (energy poverty, malnutrition, infectious disease). Rational application of the precautionary principle requires symmetric analysis: the risks of action must be compared to the risks of inaction, and both must be assessed probabilistically.
Pascal's Mugging and Probability Extremes
Pascal's Wager, Blaise Pascal's famous case for belief in God, is decision-theoretic in structure: if God exists and you believe, you gain infinite reward; if God exists and you don't believe, you suffer infinite punishment; therefore the expected value of belief is infinite regardless of how small the probability of God's existence. Even a tiny probability of infinite reward makes belief the rational choice.
The philosopher Nick Bostrom identified a related problem called "Pascal's Mugging": a mugger threatens to cause enormous harm (say, to a trillion people in a simulation) unless you give them $10. The probability of this is extremely small, but the potential harm is so large that the expected value calculation might seem to favor complying.
These thought experiments reveal a limitation of pure expected value reasoning: it can be derailed by extreme utilities and low probabilities in ways that produce intuitively absurd conclusions. Rational decision-making requires bounding claims about probability and utility by realistic prior distributions, not just accepting any stated probabilities and utilities at face value.
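One simple guard against this failure mode is to cap the utility that any single unverified claim may contribute to a calculation, on the view that astronomical stated stakes should be discounted rather than taken at face value. A sketch (all numbers hypothetical, and the cap itself is a modeling choice, not a principled constant):

```python
# Why bounding utilities blocks Pascal's Mugging. "EV of paying" is the cost
# of handing over the money, offset by the claimed catastrophe avoided,
# weighted by its (generously estimated) probability.
def naive_ev_of_paying(prob, claimed_stakes, cost):
    # Takes the mugger's stated stakes at face value.
    return prob * claimed_stakes - cost

def bounded_ev_of_paying(prob, claimed_stakes, cost, cap=1e6):
    # Refuses to let any single claim contribute more than `cap` utility units.
    return prob * min(claimed_stakes, cap) - cost

p, stakes, cost = 1e-9, 1e15, 10  # one-in-a-billion probability, astronomical stakes
# Naive EV:   1e-9 * 1e15 - 10 = 999,990 > 0  -> "pay the mugger"
# Bounded EV: 1e-9 * 1e6  - 10 ≈ -10     < 0  -> refuse
```

The point is not the particular cap but the structure: without some bound, the mugger can always quote stakes large enough to swamp any skepticism expressed through a small probability.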
How Uncertainty is Weaponized by Misinformers
Misinformation actors exploit uncertainty in several characteristic ways:
False symmetry: Presenting a fringe scientific view as equivalent to the consensus view, implying genuine uncertainty where there is strong consensus. "Scientists disagree about whether climate change is man-made" — technically true (a tiny number of scientists disagree) but deeply misleading about the distribution of scientific opinion.
Uncertainty laundering: Taking genuine scientific uncertainty about one aspect of a question and using it to cast doubt on the whole. "Scientists aren't sure exactly how much sea levels will rise, therefore we don't know if sea levels will rise significantly at all."
Demanding certainty to paralyze action: "We can't be certain that pesticide X causes cancer, so we shouldn't restrict it." This inverts the precautionary principle and demands certainty that science never provides, thereby preventing any regulation based on probabilistic risk evidence.
Exploiting the asymmetry of evidence: Establishing a scientific claim requires strong positive evidence; undermining it in the public mind requires only effective doubt about that evidence. Misinformers exploit this by perpetually generating new pieces of superficially plausible counter-evidence, each demanding scientific resources to rebut, even though the burden of establishing the original claim has largely already been met.
Key Terms
Calibration — The property of forecasts in which stated confidence levels accurately reflect actual accuracy rates; a well-calibrated forecaster who says "80% confident" is right approximately 80% of the time.
Bayes' Theorem — A mathematical theorem specifying how to update probability estimates when new evidence is acquired: P(A|B) = P(A) × P(B|A) / P(B).
Prior probability — The probability assigned to a hypothesis before new evidence is considered.
Posterior probability — The probability assigned to a hypothesis after incorporating new evidence.
Likelihood ratio — The ratio P(E|hypothesis true) / P(E|hypothesis false); measures how much evidence E increases or decreases the odds of a hypothesis.
Base rate neglect — The failure to incorporate the prior probability of an event when evaluating new evidence; often leads to overestimation of the probability of rare events after positive test results.
Prosecutor's fallacy — Confusing P(evidence|innocent) with P(innocent|evidence); assuming that low probability of evidence given innocence implies high probability of guilt.
Conjunction fallacy — Rating the probability of two events occurring together as higher than the probability of either event alone; exemplified by the Linda problem.
Superforecaster — A person demonstrating significantly above-average calibration in probabilistic forecasting across diverse domains, as identified in Tetlock's Good Judgment Project.
Brier score — A mathematical measure of forecast accuracy: the squared difference (probability - outcome)², averaged across a set of forecasts; lower is better; 0 is perfect.
Expected value — The probability-weighted sum of possible outcomes of an action; the central concept in rational decision-making under uncertainty.
Precautionary principle — The principle that when an action risks serious or irreversible harm, the absence of full scientific certainty about that harm should not be used as a reason to postpone preventive measures.
Manufactured uncertainty — The deliberate creation of doubt about scientific consensus by interested parties, used historically by tobacco and fossil fuel industries.
Discussion Questions
- The base rate problem in medical testing has the counterintuitive implication that a positive test result from a 99%-accurate test can have less than 50% probability of indicating true disease. If this mathematical reality were better known among patients, how do you think it would change the experience of being tested and receiving results? Would there be costs as well as benefits?
- Tetlock's research shows that hedgehog-style experts (who know one big thing and explain everything through it) are less accurate forecasters than fox-style thinkers (who draw on many frameworks). Yet hedgehogs are more sought after as media commentators because they deliver confident, clear-cut predictions. What does this tell us about the media system's relationship to epistemic quality?
- The IPCC's formal uncertainty communication system (defining "likely" as >66%, etc.) is explicitly defined in every report. Yet journalists and policymakers routinely misapply these terms. Should scientists change how they communicate uncertainty, or should the audience be expected to learn the technical vocabulary? Who bears the burden of improving science-public communication?
- Pascal's Mugging shows that pure expected value reasoning can produce absurd conclusions when enormous utilities are involved. Does this undermine the expected value framework entirely, or does it point to specific conditions under which expected value reasoning should be applied with caution?
- Manufactured uncertainty (as practiced by the tobacco and fossil fuel industries) was highly effective for decades at preventing regulatory action. If you were designing institutions to detect and counter manufactured uncertainty, what features would they need to have? What are the obstacles to creating such institutions?
- Calibration training can measurably improve forecasting accuracy. Should calibration be taught in schools as part of mathematics or social studies curricula? What objections might be raised to making this a standard educational goal, and how would you respond to them?
- Superforecasters perform better than domain experts in many contexts. What are the limits of this finding? Are there domains — medicine, law, engineering — where domain expertise clearly outperforms general probabilistic skill?
Summary
This chapter has built the probabilistic thinking framework that underlies rational engagement with an uncertain information environment. We began with the costs of binary thinking and the alternative of calibrated probabilistic reasoning. We established the mathematical foundations: probability axioms, conditional probability, and Bayes' Theorem, with detailed analysis of the base rate problem that trips up even trained physicians.
We then developed the Bayesian epistemological framework — belief as probability distribution, rational belief update, and the prior/posterior distinction — and examined how this framework exposes common failures of reasoning, particularly base rate neglect and the conjunction fallacy. Tetlock's superforecasting research provided empirical evidence that probabilistic reasoning skill is learnable and consequential, and that calibration is both measurable and improvable.
The chapter concluded by connecting probabilistic thinking to communication and decision-making: the difficulties of translating numerical uncertainty into verbal terms, the IPCC's systematic approach to this problem, and the ways in which uncertainty — both genuine and manufactured — affects decision-making under conditions that misinformation actors strategically exploit.
Probabilistic thinking will not inoculate anyone against all misinformation. But it provides the cognitive infrastructure for the most important judgment: distinguishing between confident assertion and well-supported probability estimate, between manufactured doubt and genuine scientific uncertainty, and between the improbable and the impossible.