39 min read

> "Reductio ad absurdum, which Euclid loved so much, is one of a mathematician's finest weapons. It is a far finer gambit than any chess gambit: a chess player may offer the sacrifice of a pawn or even a piece, but a mathematician offers the game."

Prerequisites

  • 4

Learning Objectives

  • Write a rigorous proof by contradiction, clearly stating the assumption that is driven to absurdity.
  • Prove a statement by exhaustive cases, justifying that the cases are complete and using 'without loss of generality' correctly.
  • Prove existence claims (constructively and non-constructively) and uniqueness claims.
  • Disprove a universal statement by exhibiting a single counterexample, and explain why one suffices.
  • Choose the proof strategy that fits the shape of a claim, and recognize how contradiction underlies impossibility results in computer science.

Chapter 5: Proof Strategies II — Contradiction and Cases

"Reductio ad absurdum, which Euclid loved so much, is one of a mathematician's finest weapons. It is a far finer gambit than any chess gambit: a chess player may offer the sacrifice of a pawn or even a piece, but a mathematician offers the game." — G. H. Hardy, A Mathematician's Apology

Overview

Some claims refuse to be proved head-on. Try to prove directly that there is no largest prime number and you are stuck before you start: there is nothing to construct, nothing to compute, no object to point at. The claim is about an absence — "no such thing exists" — and you cannot build an absence. The same wall stands in front of every impossibility result in computer science. Why can't a program decide whether an arbitrary program halts? Why can't any comparison sort beat $n \log n$ in the worst case? Why is there no lossless compressor that shrinks every file? Each of these is a statement that something is impossible, and impossibility is exactly what a direct proof cannot touch.

The technique that breaks through the wall is proof by contradiction: assume the thing you want to forbid actually happens, then follow the consequences until they collapse into nonsense. The nonsense is the proof. It is the single most powerful move in a prover's repertoire — Hardy called it a mathematician's finest weapon — and it is the move behind the deepest results in all of computing, including the one we will build toward all book long: that the halting problem is unsolvable (Chapter 36).

This chapter also equips you with the other workhorses of everyday proof. Proof by cases splits a messy claim into a handful of clean ones — the same instinct as an exhaustive switch statement, and the same obligation to cover every branch. Existence and uniqueness proofs answer the two questions a software designer asks about any specification: is there a valid solution at all, and is it the only one? And disproof by counterexample is how you kill a false conjecture in one shot — the cheapest, most decisive move in the whole subject, and the daily complement to the property-based tests you already write.

By the end you'll own a decision procedure: given a claim, you'll recognize its shape and reach for the strategy that fits.

In this chapter, you will learn to:

  • Prove a statement by assuming its negation and deriving a contradiction.
  • Prove the two most famous results in elementary mathematics — that $\sqrt{2}$ is irrational and that there are infinitely many primes — and see why both need contradiction.
  • Recognize contradiction as the engine of impossibility results across computer science.
  • Split a proof into exhaustive cases and use "without loss of generality" to avoid redundant work.
  • Prove that something exists (with or without building it) and that it is unique.
  • Disprove a universal claim with a single counterexample, and judge when "no counterexample found" is reassurance versus when it demands a proof.

Learning Paths

🏃 Fast Track: If proof by contradiction is already comfortable, skim 5.1, read 5.2 (the CS impossibility preview) carefully because it frames the rest of the book, then go to 5.5 (counterexamples) and 5.6 (choosing a strategy). Do the ⭐⭐ and ⭐⭐⭐ exercises.

📖 Standard Path: Read in order. Work every 🔄 Check Your Understanding before continuing, and try to reconstruct the $\sqrt{2}$ and infinitude-of-primes proofs from memory before reading ours. Attempt each 🧩 Productive Struggle first.

🔬 Deep Dive: After the chapter, investigate the line between constructive and non-constructive existence proofs (5.4) and why some logicians reject proof by contradiction for existence claims. Then read ahead to how diagonalization (Chapter 10) turns contradiction into a tool for comparing infinities. Exercise set, Part E.


5.1 Proof by contradiction

Here is the idea in one sentence, the same way we'll state the idea behind every strategy in this book: to prove a statement, assume it is false and show that something impossible follows. If assuming "$P$ is false" forces a contradiction — a statement that is both true and false at once — then "$P$ is false" cannot stand, so $P$ must be true.

Why is that valid? It rests on a principle you proved truth tables for back in Chapter 1: a proposition is either true or false, with no third option (the law of the excluded middle). So if $\neg P$ leads to absurdity, $\neg P$ is untenable, and the only remaining possibility is $P$. In symbols, the strategy proves the implication $$\neg P \rightarrow (\,q \land \neg q\,)$$ for some statement $q$. Since $q \land \neg q$ is always false (a contradiction, in the precise sense from Chapter 1), the only way that implication can be true is if $\neg P$ is false — that is, if $P$ is true.

💡 Intuition: Think of it as a proof by trap. You let the adversary assume what you want to forbid, give them all the rope they ask for, and they hang themselves on it. You don't have to build anything; you just have to show their assumption is self-destructive. This is why contradiction is the natural weapon against "there is no such thing" claims — you assume there is such a thing and watch it self-immolate.

Let's see the two proofs that, more than any others, made mathematicians fall in love with this technique.

The square root of 2 is irrational

Recall from Chapter 4 the definitions we'll lean on. An integer is even if it equals $2m$ for some integer $m$, and odd otherwise. A real number is rational if it can be written as a fraction $\frac{p}{q}$ of integers with $q \neq 0$; otherwise it is irrational. The ancient Greeks believed, as a near-religious conviction, that every length was a ratio of whole numbers. The number $\sqrt{2}$ — the diagonal of a unit square, by the Pythagorean theorem — destroyed that belief. Here is why.

The strategy first. We cannot directly exhibit "a number that is not a fraction" — irrationality is again an absence, the absence of any integer pair $p, q$ that works. So we assume the opposite: that $\sqrt{2}$ is a fraction, in lowest terms. Then we squeeze that fraction until both its numerator and denominator turn out to be even — which contradicts "lowest terms." The phrase "in lowest terms" is the rope we hand the adversary; it is what eventually strangles the assumption.

🧩 Productive Struggle. Before reading the proof: suppose $\frac{p}{q}$ is a fraction in lowest terms and $\frac{p^2}{q^2} = 2$, so $p^2 = 2q^2$. What does $p^2 = 2q^2$ tell you about whether $p$ is even or odd? (Hint: use the fact from Chapter 4 that an integer's square has the same parity as the integer.) Push as far as you can before continuing.

Theorem. $\sqrt{2}$ is irrational.

Proof (by contradiction). Assume, for contradiction, that $\sqrt{2}$ is rational. Then we can write $$\sqrt{2} = \frac{p}{q}$$ where $p$ and $q$ are integers, $q \neq 0$, and — crucially — the fraction is in lowest terms, meaning $p$ and $q$ share no common factor greater than 1. (Any fraction can be reduced to lowest terms, so this assumption costs us nothing.)

Squaring both sides gives $2 = \dfrac{p^2}{q^2}$, hence $$p^2 = 2q^2. \tag{5.1}$$ Equation (5.1) says $p^2$ is even (it is two times an integer). By the parity fact from Chapter 4 — if $p^2$ is even then $p$ is even (its contrapositive, "$p$ odd $\Rightarrow p^2$ odd," is a direct computation) — we conclude $p$ is even. So $p = 2a$ for some integer $a$.

Substitute $p = 2a$ into (5.1): $$(2a)^2 = 2q^2 \quad\Longrightarrow\quad 4a^2 = 2q^2 \quad\Longrightarrow\quad q^2 = 2a^2.$$ Now the same reasoning applies to $q$: $q^2$ is even, so $q$ is even.

But we have just shown that both $p$ and $q$ are even — they share the common factor 2. This contradicts our assumption that $\frac{p}{q}$ was in lowest terms. The assumption that $\sqrt{2}$ is rational is therefore untenable, so $\sqrt{2}$ is irrational. $\blacksquare$

Sit with the shape of that argument, because it is the template for every contradiction proof: assume the negation → derive consequences → arrive at a statement that contradicts either a known fact or an earlier line of the proof → conclude the negation was false. The contradiction here was internal — it contradicted our own "lowest terms" setup. Sometimes, as in the next proof, the contradiction is against an established fact instead.

We can test the claim computationally before trusting the proof. The file code/example-01-sqrt2-irrational.py searches every fraction $p/q$ with denominator up to 1000 for one whose square is exactly 2:

def exact_hit_within(limit):
    """First q in [1, limit] with some integer p making p*p == 2*q*q, else None."""
    for q in range(1, limit + 1):
        p = round((2 ** 0.5) * q)        # nearest possible numerator
        if p * p == 2 * q * q:           # would mean (p/q)^2 == 2 exactly
            return q
    return None

print("denominator giving an exact hit:", exact_hit_within(1000))
# Expected output:
# denominator giving an exact hit: None

The search returns None — no fraction works — which is reassuring but proves nothing: we only checked $q \le 1000$. The proof, by contrast, settles every denominator at once. That division of labor between code and proof is theme four of this book, and it will recur in every chapter.

🔄 Check Your Understanding 1. In the $\sqrt{2}$ proof, what exactly is the contradiction — which two statements clash? 2. Where in the proof did we use the assumption "in lowest terms"? Would the proof work without it? 3. The proof used "if $p^2$ is even then $p$ is even." Is that the statement we proved in Chapter 4, or its contrapositive?

Answers

  1. "$p$ and $q$ share no common factor $> 1$" (our lowest-terms assumption) clashes with "$p$ and $q$ are both even" (derived). 2. Only at the very end — it is the fact the derived consequence contradicts. Without it there is no contradiction: every fraction equals infinitely many others, so "both even" wouldn't be absurd. The lowest-terms assumption is the entire trap. 3. We proved (directly) "$p$ odd $\Rightarrow p^2$ odd"; the form we used, "$p^2$ even $\Rightarrow p$ even," is its contrapositive (Chapter 4) — logically equivalent.

There are infinitely many primes

This is Euclid's theorem, roughly 2300 years old, and still the cleanest illustration of contradiction against a constructed object. Recall a prime is an integer $p > 1$ whose only positive divisors are 1 and $p$; we'll use the divisibility notation $a \mid b$ ("$a$ divides $b$") from Chapter 4. We also use one fact we'll prove properly in Chapter 22 (the Fundamental Theorem of Arithmetic, in its weak form): every integer greater than 1 has at least one prime divisor. (You can see why quickly: the smallest divisor of $n$ above 1 cannot itself be composite, or it would have a still-smaller divisor.)

The strategy first. "Infinitely many primes" is a statement about an unbounded supply — another thing you can't display all at once. So assume the opposite: only finitely many primes exist, and write them all down. The trick is then to manufacture a number that no prime on the list can divide, yet which must have a prime divisor. That divisor is a prime you forgot — contradicting "we listed them all." The manufacturing step is the whole proof; everything else is bookkeeping.

Theorem (Euclid). There are infinitely many prime numbers.

Proof (by contradiction). Assume, for contradiction, that there are only finitely many primes. Then we can list them all: $$p_1, p_2, \dots, p_k.$$ Consider the number $$N = (p_1 \cdot p_2 \cdots p_k) + 1,$$ the product of all the primes, plus one. Now, $N > 1$, so by the fact above $N$ has some prime divisor; call it $q$. Since $p_1, \dots, p_k$ is the complete list of primes, $q$ must be one of them, say $q = p_i$.

But $p_i$ divides the product $p_1 \cdots p_k$ (it's one of the factors), and we are supposing $p_i = q$ divides $N$. If a number divides two integers, it divides their difference, so $p_i$ divides $$N - (p_1 \cdots p_k) = 1.$$ That means $p_i \mid 1$ — but the only positive divisor of 1 is 1 itself, and $p_i > 1$ because it's prime. Contradiction. Hence the assumption that there are finitely many primes is false, and there are infinitely many. $\blacksquare$

⚠️ Common Pitfall: A frequent misreading is that $N$ itself is always prime. It need not be! The proof only claims $N$ has a prime factor outside the list. The smallest counterexample is famous: with the first six primes, $$2\cdot 3\cdot 5\cdot 7\cdot 11\cdot 13 + 1 = 30031 = 59 \times 509,$$ which is composite — yet 59 and 509 are both primes missing from the list, exactly as the proof guarantees. The argument promises a new prime divisor, not a new prime.

The file code/example-02-infinitude-primes.py makes the construction concrete by computing $N$ and its smallest prime factor for two starting lists:

for plist in ([2, 3, 5, 7], [2, 3, 5, 7, 11, 13]):
    n, newp = euclid_witness(plist)      # n = product(plist)+1
    print(plist, "-> N =", n, ", new prime factor:", newp)
# Expected output:
# [2, 3, 5, 7] -> N = 211 , new prime factor: 211
# [2, 3, 5, 7, 11, 13] -> N = 30031 , new prime factor: 59

For $[2,3,5,7]$ the manufactured $N = 211$ happens to be prime; for the six-prime list, $N = 30031$ is composite but coughs up the new prime 59. Either way, the list was incomplete — which is the contradiction.

🔗 Connection. Primes and divisibility are not idle curiosities. They are the entire foundation of the RSA cryptosystem that secures the web, which you will implement in Chapter 25. The fact that there are infinitely many primes — and, harder, that large ones are plentiful — is what makes it possible to hand every device on Earth its own key. Contradiction proofs about primes in Chapter 5 are the seed of the security in Part IV.

🚪 Threshold Concept. Proof by contradiction is a genuine threshold — once you internalize "to forbid something, assume it and break it," a whole category of claims that looked unattackable becomes routine. Every impossibility result you will ever meet — no largest prime, no rational $\sqrt 2$, no algorithm that solves the halting problem (Chapter 36), no comparison sort faster than $n\log n$ — is proved this way, because impossibility is the absence of a thing, and contradiction is how we reason about absence. If you take one technique from this chapter into the rest of computer science, take this one.


5.2 Contradiction in computer science: the shape of impossibility

Why does a programmer need a 2300-year-old proof technique? Because the most important results in computer science are not "here is a faster algorithm" — they are "no algorithm can do this," and every one of them is a proof by contradiction. Learning the move now means that when these results arrive later in the book, you will recognize the machinery instead of being mystified by it.

Here is the pattern, stripped to its skeleton. To prove "no program can do task $T$," you:

  1. Assume a program $D$ exists that does $T$, for every input.
  2. Build a new program (or input) that uses $D$ against itself — typically by feeding $D$ a description of itself.
  3. Derive that this new program both must and must not behave a certain way — a contradiction.
  4. Conclude that $D$ cannot exist.

That self-referential twist in step 2 — feeding a program its own description — is the same diagonal trick we'll meet in Chapter 10 when we prove some infinities are bigger than others, and it is exactly how Chapter 36 proves the halting problem unsolvable. Let's preview that argument at a level you can already follow.

The strategy first (a preview, not a full proof). The halting problem asks for a program halts(P, x) that returns True if program P eventually stops on input x, and False if P loops forever — and is itself always guaranteed to stop and give an answer. We'll assume such a halts exists and turn it into a paradox by building a program that does the opposite of what halts predicts about it.

Suppose, for contradiction, that the function halts(P, x) exists and is always correct. Then we could write this troublemaker:

def trouble(P):
    if halts(P, P):     # does P halt when given its own source as input?
        while True:      # if so, loop forever
            pass
    else:
        return           # if not, halt immediately

Now ask the fatal question: does trouble halt when given itself as input — that is, what does trouble(trouble) do?

  • If trouble(trouble) halts, then by definition halts(trouble, trouble) returned True. But look at the code: when halts(...) returns True, trouble enters while True and loops forever. So it does not halt. Contradiction.
  • If trouble(trouble) loops forever, then halts(trouble, trouble) returned False. But when halts(...) returns False, trouble hits the else and returns immediately. So it does halt. Contradiction.

Either way we reach a statement that is both true and false. The only assumption we made was that halts exists, so halts cannot exist: no program can decide, in general, whether an arbitrary program halts. $\blacksquare$ (We'll make every step of this rigorous in Chapter 36, once we have a precise model of "program.")

💡 Intuition: The halting proof is the $\sqrt{2}$ proof wearing a lab coat. Both assume a thing exists, both extract a self-contradiction, both conclude the thing is impossible. The only new ingredient is self-reference — letting the program inspect itself. Self-reference plus contradiction is one of the most productive pairings in all of logic and computing; we'll return to it for uncomputability (Chapter 10) and undecidability (Chapter 36).

This is also the answer to a question working programmers actually ask: why doesn't my compiler just warn me about every infinite loop, or every program that will crash? It cannot, even in principle — not because compiler writers are lazy, but because such a tool is mathematically impossible, and the proof is the contradiction above. Knowing this saves you from chasing impossible features and tells you when to reach for approximate tools (linters, bounded model checkers, timeouts) instead.

🔄 Check Your Understanding 1. In the halting argument, what plays the role that "in lowest terms" played in the $\sqrt{2}$ proof — i.e., what is the assumption the contradiction ultimately destroys? 2. Why is the step "feed the program its own description" essential? What would go wrong if trouble called halts(P, x) on some unrelated input instead of halts(P, P)?

Answers

  1. The assumption that halts exists and is always correct — that is the single supposition every line depends on, so it is what the contradiction refutes. 2. Self-reference is what lets the program contradict its own predicted behavior. If trouble consulted halts about an unrelated program, there'd be no clash between prediction and behavior, and no paradox — the contradiction comes precisely from trouble doing the opposite of what halts says trouble will do.

5.3 Proof by cases

Not every claim yields to a single clean argument. Sometimes the objects you're reasoning about come in distinct flavors — even or odd, positive or negative or zero, prime or composite — and the cleanest proof handles each flavor separately. That is proof by cases: you split the universe of possibilities into a finite list of exhaustive cases, prove the claim in each one, and conclude it holds in all.

💡 Intuition: Proof by cases is the switch statement of mathematics. You enumerate every branch the input could fall into, handle each branch, and — this is the part students forget — prove your branches are exhaustive, just as a switch with a missing default is a lurking bug. A proof that omits a case is exactly a switch that silently drops an input on the floor.

The one non-negotiable obligation is completeness: the cases must cover every possibility, with no gap. (They may overlap — that's harmless — but they may not leave anything out.) Let's see the canonical example.

The strategy first. We want to show $n^2 + n$ is always even, for every integer $n$. There's no single formula manipulation that obviously yields "even," but every integer is either even or odd — that's a complete two-way split — and in each case the algebra is trivial. So we split on the parity of $n$ and grind out two short sub-proofs.

Theorem. For every integer $n$, the number $n^2 + n$ is even.

Proof (by cases on the parity of $n$). Every integer is either even or odd; these two cases are exhaustive.

Case 1: $n$ is even. Then $n = 2a$ for some integer $a$. So $$n^2 + n = (2a)^2 + 2a = 4a^2 + 2a = 2(2a^2 + a).$$ Since $2a^2 + a$ is an integer, $n^2 + n$ is even.

Case 2: $n$ is odd. Then $n = 2a + 1$ for some integer $a$. So $$n^2 + n = (2a+1)^2 + (2a+1) = (4a^2 + 4a + 1) + (2a + 1) = 4a^2 + 6a + 2 = 2(2a^2 + 3a + 1).$$ Since $2a^2 + 3a + 1$ is an integer, $n^2 + n$ is even.

In both cases $n^2 + n$ is even, and the cases are exhaustive, so $n^2 + n$ is even for every integer $n$. $\blacksquare$

🔗 Connection. There's a slicker one-line proof: $n^2 + n = n(n+1)$, the product of two consecutive integers, and of any two consecutive integers one is even — so the product is even. That's still a proof by cases (which of the two is even?), just compressed. Often a clever factoring hides the case split rather than removing it. Recognizing the hidden cases is part of reading proofs critically.

The file code/example-03-proof-by-cases.py confirms the claim on a sample, tagging each value with the proof case it falls under:

sample = range(-4, 6)
for n in sample:
    print(f"n={n:>2} ({case_of(n)}): n^2+n = {n*n + n}, even? {is_even(n*n + n)}")
print("all even on sample?", all(is_even(n*n + n) for n in sample))
# Expected output (first and last lines shown):
# n=-4 (even): n^2+n = 12, even? True
# ...
# n= 5 (odd): n^2+n = 30, even? True
# all even on sample? True

Without loss of generality

Cases sometimes come in symmetric pairs, where the argument for one is identical to the other with the labels swapped. Writing both out is pure redundancy. The phrase without loss of generality (abbreviated WLOG) lets you handle one representative case and assert the others follow by the same reasoning — provided the symmetry is real.

The strategy first. To prove "the product of two integers of the same parity is ... " or any claim symmetric in two variables $a$ and $b$, note that swapping $a$ and $b$ changes nothing essential. So we may assume an ordering — say $a \le b$ — handle that, and observe the case $b < a$ is identical after relabeling. WLOG is a labor-saving device, not a logical shortcut: it is only legitimate when swapping genuinely leaves the claim unchanged.

Theorem. For all real numbers $a$ and $b$, $\lvert a + b\rvert \le \lvert a\rvert + \lvert b\rvert$ (the triangle inequality), in the special case where $a$ and $b$ have the same sign.

Proof (by cases, using WLOG). Suppose $a$ and $b$ have the same sign (or are zero). There are two sign-cases — both non-negative, or both non-positive — but they are symmetric under multiplying everything by $-1$ (which leaves all the absolute values unchanged). Without loss of generality, assume $a \ge 0$ and $b \ge 0$. Then $a + b \ge 0$, so $\lvert a + b\rvert = a + b = \lvert a\rvert + \lvert b\rvert$, and in particular $\lvert a+b\rvert \le \lvert a\rvert + \lvert b\rvert$. The case $a \le 0, b \le 0$ follows by replacing $a, b$ with $-a, -b$, which negates nothing inside the absolute values. $\blacksquare$

⚠️ Common Pitfall: WLOG is abused far more often than it's used correctly. You may only write "without loss of generality, assume $a \le b$" when the roles of $a$ and $b$ are truly interchangeable in the claim. If the statement treats them asymmetrically — say it mentions $a^b$, or "$a$ is the input and $b$ is the output" — then swapping them changes the problem, and WLOG is a logical error that can "prove" false things. Test for real symmetry before invoking it: swap the variables in the statement and check you get back the identical statement.

🔄 Check Your Understanding 1. In a proof by cases, which obligation is the one students most often neglect, and what's the analogous bug in a switch statement? 2. You're proving a claim about two integers $m$ and $n$, and the claim is "$m^n$ is positive whenever $m > 0$." Can you say "WLOG, assume $m \le n$"? Why or why not?

Answers

  1. Completeness — proving the cases are exhaustive (cover every possibility). The analogous bug is a switch with no default (or a missing branch) that silently mishandles some inputs. 2. No. The claim treats $m$ and $n$ asymmetrically ($m$ is the base, $n$ the exponent), so swapping them changes the statement. WLOG requires genuine symmetry between the variables, which is absent here.

5.4 Existence and uniqueness proofs

Two of the most practical questions a programmer can ask about a specification are: does a valid solution exist at all? and is it the only one? A type system asks the first ("is there any value of this type?"); a database key constraint asserts the second ("at most one row per key"). Mathematics has a dedicated proof shape for each.

Existence proofs

An existence proof establishes a claim of the form "there exists an $x$ such that $P(x)$" — symbolically $\exists x\, P(x)$. There are two flavors, and the difference matters enormously in computer science.

A constructive existence proof exhibits a specific $x$ (or gives an algorithm to find one) and verifies $P(x)$. A non-constructive existence proof proves that some $x$ exists without producing one — often by contradiction ("if no such $x$ existed, absurdity follows"). Constructive proofs are gold for programmers because they hand you the object; non-constructive proofs tell you a search will succeed without telling you where to look.

The strategy first (constructive). To prove "there exist two irrational numbers whose product is rational," we just need to name two and check. The square root we proved irrational in 5.1 is right there: $\sqrt 2 \cdot \sqrt 2 = 2$, rational. Done — the witnesses are explicit.

Theorem. There exist irrational numbers $a$ and $b$ such that $a \cdot b$ is rational.

Proof (constructive). Take $a = b = \sqrt{2}$. By the theorem of 5.1, $\sqrt 2$ is irrational, so both $a$ and $b$ are irrational. Their product is $a \cdot b = \sqrt 2 \cdot \sqrt 2 = 2$, which is rational. The pair $(\sqrt 2, \sqrt 2)$ is an explicit witness. $\blacksquare$

Now contrast a famously non-constructive gem.

The strategy first (non-constructive). We'll prove there exist irrationals $a, b$ with $a^b$ rational — but by a trick that proves existence without revealing which pair works. Consider the number $\sqrt 2^{\sqrt 2}$. We don't know offhand whether it's rational or irrational, so we split into those two cases — and show that whichever case holds, a valid pair exists. The proof never tells us which case is the true one; that's what makes it non-constructive.

Theorem. There exist irrational numbers $a$ and $b$ such that $a^b$ is rational.

Proof (non-constructive, by cases). Consider the number $x = \sqrt{2}^{\,\sqrt{2}}$. Either $x$ is rational or it is irrational — these cases are exhaustive.

Case 1: $x$ is rational. Then take $a = b = \sqrt 2$. Both are irrational (5.1), and $a^b = \sqrt 2^{\,\sqrt 2} = x$ is rational by assumption. Done.

Case 2: $x$ is irrational. Then take $a = \sqrt 2^{\,\sqrt 2}$ (irrational, in this case) and $b = \sqrt 2$ (irrational). Then $$a^b = \left(\sqrt 2^{\,\sqrt 2}\right)^{\sqrt 2} = \sqrt 2^{\,(\sqrt 2 \cdot \sqrt 2)} = \sqrt 2^{\,2} = 2,$$ which is rational. Done.

In either case, a pair of irrationals $a, b$ with $a^b$ rational exists. Therefore such a pair exists. $\blacksquare$

💡 Intuition: Read that proof again and notice the strange thing: we have proved a suitable pair exists, but at the end we still cannot say which pair it is, because we never settled whether $\sqrt 2^{\sqrt 2}$ is rational. (It is in fact irrational — a hard 20th-century theorem — but the proof above doesn't need that and is the more elegant for it.) Non-constructive existence is a real and sometimes uncomfortable thing: a guarantee that a solution is out there, with no map to it. For a programmer this is the difference between "a solution exists" and "here is the solution" — and it is exactly why constructive proofs are prized in computer science.

🔗 Connection. This distinction returns with force in the probabilistic method (Chapter 20), where we prove an object with some desired property exists by showing a random object has it with positive probability — without ever constructing the object. It also underlies the gap between deciding a solution exists and finding it, which is the heart of the P-versus-NP question (Chapter 37).

Uniqueness proofs

To prove uniqueness — "there is exactly one $x$ with $P(x)$," symbolically $\exists!\, x\, P(x)$ — you owe two separate things:

  1. Existence: at least one such $x$ exists (prove it as above).
  2. Uniqueness: at most one such $x$ exists.

The standard way to prove the second is itself a small contradiction-flavored move: assume two objects both satisfy the property, then show they must be equal. Suppose $x_1$ and $x_2$ both satisfy $P$; derive $x_1 = x_2$.

The strategy first. For a uniqueness claim, lead with existence (often constructive), then prove uniqueness by supposing $x_1$ and $x_2$ both work and forcing $x_1 = x_2$. The forcing step usually subtracts or compares the two and shows the difference is zero.

Theorem. For every real number $a$, there is a unique real number $x$ such that $a + x = 0$ (the additive inverse).

Proof. Existence: take $x = -a$; then $a + (-a) = 0$, so a solution exists.

Uniqueness: suppose $x_1$ and $x_2$ both satisfy the property, so $a + x_1 = 0$ and $a + x_2 = 0$. Then $a + x_1 = a + x_2$. Adding $-a$ to both sides (or subtracting $a$) gives $x_1 = x_2$. So any two solutions coincide; the solution is unique. $\blacksquare$

⚠️ Common Pitfall: A uniqueness proof that forgets the existence half proves the wrong thing. "At most one $x$ works" is vacuously true if no $x$ works at all! Always prove both halves. Conversely, an existence proof says nothing about uniqueness — a quadratic can have two roots. Match the proof obligations to the exact claim: $\exists$ needs one witness; $\exists!$ needs a witness and a collapse argument.

🔄 Check Your Understanding 1. What is the practical difference, for a programmer, between a constructive and a non-constructive existence proof? 2. A uniqueness proof has two obligations. Name both, and explain why skipping the first makes "at most one solution" misleading.

Answers

  1. A constructive proof yields the actual object (or an algorithm to compute it) — you can return it from a function. A non-constructive proof only guarantees the object exists, giving you no way to produce it; you know a search would succeed but not where. 2. Existence (at least one solution) and uniqueness (at most one). Skipping existence is dangerous because "at most one solution" is vacuously true when there are zero solutions — so without existence, the uniqueness claim can be technically true yet practically empty.

5.5 Counterexamples and disproof

So far we've proved statements true. Often the more urgent task — especially when debugging your own reasoning or a teammate's confident claim — is to prove a statement false. For universal statements, there is a move so cheap and so decisive that it deserves its own discipline: the counterexample.

You met counterexamples back in Chapter 2 as the way to refute a $\forall$ statement. Here we put them to work as a proof strategy in their own right and connect them to the testing you do every day.

🔗 Connection. A counterexample is a single object that makes a universal claim false. Recall from Chapter 2 that the negation of "$\forall x\, P(x)$" is "$\exists x\, \neg P(x)$" — so disproving a "for all" claim is exactly exhibiting one $x$ where the property fails. That is why a single counterexample is a complete disproof: it is a constructive proof of the negation.

💡 Intuition: Proving a "for all" claim is expensive — you must cover infinitely many cases. Disproving it is cheap — you need exactly one bad case. This asymmetry is the most lopsided trade in all of mathematics, and it's the same asymmetry behind Karl Popper's view of science: a universal law can never be confirmed by finitely many observations, but it can be refuted by a single one. One black swan disproves "all swans are white."

Here is a conjecture that looks unbreakable and survives a startling amount of testing.

The strategy first. The polynomial $n^2 + n + 41$ produces a prime for $n = 0, 1, 2, \dots$ — try the first several and you'll get 41, 43, 47, 53, 61, all prime. It is tempting to conjecture "$n^2 + n + 41$ is prime for every $n \ge 0$." To disprove it we don't argue cleverly; we hunt for the first $n$ that breaks it. The hunt is the disproof.

Conjecture (false). For every integer $n \ge 0$, the value $n^2 + n + 41$ is prime.

Disproof. The claim holds for $n = 0$ through $n = 39$ (Euler noticed this in 1772). But take $n = 40$: $$40^2 + 40 + 41 = 1600 + 40 + 41 = 1681 = 41 \times 41 = 41^2,$$ which is composite. So $n = 40$ is a counterexample, and the universal claim is false. $\blacksquare$

The number 40 was not pulled from a hat — it's found by searching, which is precisely what code is for. The file code/example-04-counterexample-search.py hunts for it:

def first_counterexample(upto):
    """Smallest n in [0, upto] with n^2+n+41 composite, else None."""
    for n in range(upto + 1):
        if not is_prime(n * n + n + 41):
            return n
    return None

print("first n making n^2+n+41 composite:", first_counterexample(100))
# Expected output:
# first n making n^2+n+41 composite: 40

⚠️ Common Pitfall: Passing 40 consecutive tests is not a proof. The conjecture above is true for $n = 0, \dots, 39$ — forty cases! — and still false. This is the daily lesson of theme two of this book: "it passed the tests" is not "it is correct." Tests (and first_counterexample's search) can only ever find a counterexample; they can never certify its absence over an infinite domain. When the search comes up empty, you've earned confidence, not proof — and the gap between them is exactly where the hardest bugs and the deepest theorems live.

🐛 Find the Error. A teammate claims: "For all integers $a, b$: if $a \mid b$ and $b \mid a$, then $a = b$." They support it with examples: $3 \mid 3$ and $3 \mid 3$ give $a = b = 3$; $12 \mid 12$ likewise. Is the claim true? Find a counterexample or prove it.

Answer

False. Counterexample: $a = 3$, $b = -3$. Then $3 \mid -3$ (since $-3 = 3 \cdot (-1)$) and $-3 \mid 3$ (since $3 = -3 \cdot (-1)$), yet $a \neq b$. The correct statement is "$a = b$ or $a = -b$," i.e. $\lvert a\rvert = \lvert b\rvert$. The teammate only tried positive examples, which is exactly the blind spot a counterexample search over a wider domain (including negatives) would catch. Moral: when you conjecture, test the boundaries — zero, negatives, empty inputs — because that's where universal claims tend to break.

When you can't find a counterexample

Failing to find a counterexample does not prove a statement true — but it does two useful things. First, it raises your confidence enough to justify investing effort in a proof. Second, the attempt often reveals the proof: the reason your search keeps failing is frequently the reason the statement is true, and watching why every candidate survives can hand you the argument. Use code to triage conjectures — refute the false ones instantly, and concentrate your proving energy on the survivors.

🔄 Check Your Understanding 1. Why does a single counterexample fully disprove a universal claim, while a million confirming examples don't prove it? 2. Your test suite passes 10,000 randomized cases for the property "reverse(reverse(xs)) == xs." Have you proved the property? What would a proof require that the tests don't provide?

Answers

  1. The negation of "$\forall x\,P(x)$" is "$\exists x\,\neg P(x)$" (Chapter 2), so one failing case is a complete proof of the negation. But "$\forall x\,P(x)$" ranges over (typically) infinitely many $x$; finitely many confirmations leave infinitely many cases unchecked. 2. No — you've gained strong confidence, not proof. A proof would have to cover every possible list (e.g., by induction on the list's length, Chapter 6/7), whereas the tests cover only the 10,000 lists actually generated.

5.6 Choosing a strategy

You now have, between Chapters 4 and 5, a toolkit of proof strategies: direct proof, contraposition (Chapter 4), and contradiction, cases, existence/uniqueness, and counterexample (this chapter). The skill that separates a fluent prover from a flailing one is picking the right tool fast. Strategy selection is driven by the shape of the claim — its logical form and what it asserts. Here is the decision procedure.

Read the claim's logical form, then:

If the claim looks like… First reach for… Because…
"$P \rightarrow Q$" (a conditional) Direct proof (Ch. 4): assume $P$, derive $Q$ It's the default; try it first.
"$P \rightarrow Q$" but $Q$ is awkward and $\neg P$ is concrete Contraposition (Ch. 4): prove $\neg Q \rightarrow \neg P$ Sometimes the contrapositive is easier to attack.
"There is no $x$…" / "it's impossible to…" / "$X$ is irrational/infinite/unsolvable" Contradiction You can't construct an absence; assume it exists and break it.
The objects split into natural types (parity, sign, prime/composite) Proof by cases Handle each type; prove the cases exhaust the possibilities.
"There exists an $x$ with $P(x)$" Existence — constructive if you can build $x$, else non-constructive Exhibit a witness, or prove a search must succeed.
"There is exactly one $x$…" Existence + uniqueness Prove one exists; then assume two and force them equal.
"For all $x$, $P(x)$" that you suspect is false Counterexample One failing $x$ disproves the whole claim.
"For all $x$, $P(x)$" over the naturals that you suspect is true Induction (Ch. 6) The subject of the next chapter.

💡 Intuition: The two biggest tells. (1) The words no, impossible, cannot, irrational, infinite, unsolvable almost always signal contradiction — they describe absences. (2) A claim you're suspicious of, especially a sweeping "for all," deserves a counterexample hunt before any proof attempt; refuting it in thirty seconds beats an hour of trying to prove something false.

The strategy first — a worked selection. Claim: "the sum of a rational and an irrational number is irrational." Scan the form: it's a conditional ("if $r$ rational and $x$ irrational, then $r + x$ irrational") whose conclusion is an absence ("$r+x$ is not a fraction"). Absence in the conclusion → reach for contradiction. Assume $r + x$ is rational and derive that $x$ would be rational too — contradicting the hypothesis.

Theorem. If $r$ is rational and $x$ is irrational, then $r + x$ is irrational.

Proof (by contradiction). Suppose $r$ is rational, $x$ is irrational, but — for contradiction — $r + x$ is rational. The rationals are closed under subtraction (the difference of two fractions is a fraction), so $$x = (r + x) - r$$ is the difference of two rationals, hence rational. But $x$ was assumed irrational. Contradiction. Therefore $r + x$ is irrational. $\blacksquare$

Notice how reading the claim's shape ("conditional with an absence in the conclusion") pointed straight at the technique. With practice this becomes instant.

🧩 Productive Struggle. For each claim, name the strategy you'd try first (don't prove them): (a) "Every integer $n \ge 2$ has a prime divisor." (b) "There is no smallest positive rational number." (c) "$2^n > n^3$ for all $n \ge 10$." (d) "Some set of 5 integers contains two whose difference is divisible by 4." (e) "Every odd square leaves remainder 1 when divided by 8."

Suggested strategies

(a) Existence (constructive: the least divisor $>1$) — or induction/strong induction later. (b) Contradiction — "no smallest" is an absence; assume one exists and find a smaller. (c) Induction (Ch. 6) — a "for all $n \ge$ base" that's true. (d) Existence via pigeonhole (Ch. 17), but provable now by cases on remainders mod 4. (e) Proof by cases on the odd number's form, or direct: write the odd number as $2k+1$ and compute $\pmod 8$.


Project Checkpoint: the disproof half of the Toolkit

Theme four says computation and proof are complementary; theme two says passing tests is not correctness. Chapter 5's contribution to your growing dmtoolkit package sits exactly at that seam. We add the disproof engine: a counterexample finder. It can kill a false universal claim instantly — a real, complete disproof — but when it finds nothing, it returns only evidence, never a proof. That honest boundary is the whole point.

We extend dmtoolkit/logic.py — the module begun in Chapters 1–3 with truth_table, is_tautology, and equivalent — with two composable functions:

def find_counterexample(pred, domain):
    """Return the first x in domain with pred(x) False, else None.
    A returned x DISPROVES 'for all x in domain, pred(x)'."""
    for x in domain:
        if not pred(x):
            return x
    return None

def disprove_forall(pred, domain):
    """True iff a counterexample to 'for all x, pred(x)' exists in domain."""
    return find_counterexample(pred, domain) is not None

Used on Euler's polynomial conjecture from 5.5 over $n = 0, \dots, 59$:

def prime_claim(n):
    v = n * n + n + 41
    return v > 1 and all(v % d for d in range(2, int(v ** 0.5) + 1))

print("counterexample n:", find_counterexample(prime_claim, range(0, 60)))
print("claim disproved?", disprove_forall(prime_claim, range(0, 60)))
# Expected output:
# counterexample n: 40
# claim disproved? True

The tool reports 40 — the smallest $n$ where $n^2 + n + 41$ goes composite — disproving the conjecture in one call. Point it at a true universal (say, "$n^2 + n$ is even") over any finite range and it returns None: no counterexample in that range, which is encouragement to go prove it, not a proof.

Toolkit so far: logic.py now holds the truth-table tools (Chapters 1–3) plus this disproof pair, and Chapter 6 will seed recurrences.py with a conjecture tester (check_identity). Together they are the computational arm of the book's central discipline — use code to refute, use proof to certify — which culminates in the capstone (Chapter 39), where a real project leans on both.


Summary

This chapter added four proof strategies to your repertoire. Reference-grade recap:

The strategies and their triggers

Strategy Proves a claim of the form Core move Ends when…
Proof by contradiction "$P$" — especially impossibility/absence Assume $\neg P$; derive $q \land \neg q$ a statement contradicts a known fact or an earlier line
Proof by cases "$P$" where objects split into types Split into exhaustive cases; prove each every case is proved and shown to be exhaustive
Existence ($\exists$) "there exists $x$ with $P(x)$" Exhibit a witness (constructive) or prove a search must succeed (non-constructive) a valid $x$ is produced or guaranteed
Uniqueness ($\exists!$) "exactly one $x$ with $P(x)$" Prove existence, then assume $x_1, x_2$ both work and force $x_1 = x_2$ both obligations are met
Counterexample disproving "$\forall x\, P(x)$" Exhibit one $x$ with $\neg P(x)$ a single failing case is shown

Key results proved

  • $\sqrt{2}$ is irrational (contradiction; "lowest terms" is the trap).
  • There are infinitely many primes (Euclid; manufacture $N = p_1\cdots p_k + 1$).
  • $n^2 + n$ is even for every integer $n$ (cases on parity).
  • The triangle inequality in the same-sign case (cases + WLOG).
  • There exist irrationals $a, b$ with $a^b$ rational (non-constructive).
  • Every real $a$ has a unique additive inverse (existence + uniqueness).
  • A rational plus an irrational is irrational (contradiction).
  • "$n^2 + n + 41$ is always prime" is false (counterexample at $n = 40$).

Rules and warnings

  • Proof by contradiction is valid because of the law of the excluded middle: if $\neg P$ is impossible, $P$ holds.
  • The contradiction may be internal (against your own setup, as in $\sqrt 2$) or external (against a known fact).
  • Proof by cases must prove its cases are exhaustive — the missing-default bug.
  • WLOG is legal only under genuine symmetry between the variables; test by swapping them in the statement.
  • A uniqueness claim has two obligations; "at most one" is vacuous without existence.
  • One counterexample fully disproves a "for all"; any number of confirming cases proves nothing (theme two).
  • Impossibility results in CS — halting problem (Ch. 36), uncomputability (Ch. 10) — are contradiction proofs with self-reference.

Spaced Review

Retrieval keeps knowledge available. Answer from memory before checking.

  1. (Ch. 4) The $\sqrt 2$ proof used "if $p^2$ is even then $p$ is even." State this implication's contrapositive, and explain which one is easier to prove directly and why.
  2. (Ch. 4) Classify each as theorem, lemma, corollary, or conjecture: (a) "$\sqrt 2$ is irrational"; (b) the parity fact "$p$ odd $\Rightarrow p^2$ odd" used inside that proof; (c) "$n^2 + n + 41$ is always prime."
  3. (Ch. 1) In "assume $\neg P$ and derive $q \land \neg q$," why is $q \land \neg q$ guaranteed false? Name the truth-table category it belongs to.
  4. (Ch. 1) De Morgan's laws turn the negation of $p \land q$ into what? Use this to write the negation of "$N$ is divisible by 2 and by 3."

Answers

  1. Contrapositive: "if $p$ is odd then $p^2$ is odd." The contrapositive is the easier one to prove directly: write $p = 2a+1$ and compute $p^2 = 4a^2 + 4a + 1 = 2(2a^2+2a)+1$, manifestly odd — a one-line algebra, whereas "$p^2$ even $\Rightarrow p$ even" has no such direct handle (you'd prove it via the contrapositive anyway). This is the Chapter 4 payoff: prove whichever direction is concrete. 2. (a) theorem (a major result); (b) lemma (a helper used to prove the theorem); (c) conjecture — and a false one, disproved by counterexample. 3. $q \land \neg q$ is false in every row of its truth table (whenever $q$ is true, $\neg q$ is false, and vice versa), so the conjunction is never true — it is a contradiction (Chapter 1). 4. $\neg(p \land q) \equiv \neg p \lor \neg q$. So the negation is "$N$ is not divisible by 2 or $N$ is not divisible by 3."

What's Next

You can now prove a claim true by contradiction, split it into cases, establish existence and uniqueness, and demolish a false claim with one counterexample. But one enormous class of claims still resists every tool so far: statements that must hold for all natural numbers at once, like "this formula sums correctly for every $n$" or "this recursive function is correct on every input." You can't check infinitely many cases, and there's no single algebraic trick that covers them all. The technique built precisely for this — proving infinitely many statements with two finite pieces of work, and the formal backbone of recursion and loops — is mathematical induction. It's the workhorse of the rest of the book, and it's Chapter 6.