38 min read

> "The infinite we shall do right away. The finite may take a little longer."

Prerequisites

  • 6

Learning Objectives

  • Write a proof by strong induction, choosing the correct number of base cases for the recurrence at hand.
  • Decide when ordinary induction suffices and when an argument genuinely requires strong induction.
  • State and apply the well-ordering principle, including the 'smallest counterexample' proof pattern.
  • Give a recursive definition of a sequence, a set, or a data structure using a basis clause and a recursive clause.
  • Prove a property of every element of a recursively defined set by structural induction.
  • Explain why ordinary induction, strong induction, and well-ordering are logically equivalent.

Chapter 7: Strong Induction, Well-Ordering, and Recursive Definitions

"The infinite we shall do right away. The finite may take a little longer." — attributed to the mathematician Stanislaw Ulam

Overview

In Chapter 6 you learned to knock over an infinite row of dominoes by checking two things: the first one falls, and each falling domino topples its immediate neighbor. That instinct — each one back to the previous one — is exactly how a recursive function that calls itself on n - 1 is built. But not every recursive function reaches back exactly one step.

Think about the function you would write to factor an integer, or the naive Fibonacci function from last chapter that calls itself on both n - 1 and n - 2, or a merge sort that splits its input in half and recurses on two pieces each roughly n / 2 in size. When fib(n) depends on fib(n - 1) and fib(n - 2), the ordinary inductive hypothesis — "assume it works for the one case just below" — isn't enough. You need to assume it works for several smaller cases, or all of them. That is strong induction, and it is the proof technique that matches the recursion you actually write.

This chapter does three connected things. First, it upgrades induction so that the inductive hypothesis can lean on every smaller case, not just the immediately preceding one — the natural language for divide-and-conquer and for any recurrence that reaches back more than one step. Second, it reveals the bedrock fact about the natural numbers that makes all forms of induction work — the well-ordering principle — and turns it into a proof pattern of its own. Third, it teaches you to define objects recursively (sequences, sets, and the tree and list structures that hold up half of computer science) and then prove things about every such object by structural induction — induction that follows the shape of the definition rather than counting up through the integers.

By the end, you'll see that these are not three unrelated tricks. Ordinary induction, strong induction, and well-ordering are the same principle in three costumes, and structural induction is what that principle looks like when your objects are built by recursive definitions instead of indexed by counting numbers.

In this chapter, you will learn to:

  • Recognize when a proof needs more than the single preceding case, and write it with strong induction.
  • Count the base cases correctly — the single most common way a strong induction proof silently breaks.
  • Use the well-ordering principle directly, especially the "smallest counterexample" argument.
  • Define sequences, sets, and data structures recursively, with a clear basis and recursive clause.
  • Prove universal claims about recursively defined objects by structural induction.
  • Explain, and use, the equivalence of the three induction principles.

Learning Paths

🏃 Fast Track: If strong induction is already comfortable, skim 7.1, read 7.2 (well-ordering) for the "smallest counterexample" pattern you'll use in number theory, then concentrate on 7.4 (structural induction) and 7.5 (trees and lists), which power Chapters 18 and 31. Do the ⭐⭐ and ⭐⭐⭐ exercises.

📖 Standard Path: Read in order. Before reading each proof, try to state its strategy yourself. Work every 🔄 Check Your Understanding. Pay special attention to how many base cases each strong induction needs — that is the skill that separates a correct proof from a plausible-looking wrong one.

🔬 Deep Dive: After 7.6, write out the full three-way equivalence (ordinary ⟹ strong ⟹ well-ordering ⟹ ordinary) with every implication proved, and then define a non-trivial recursive set of your own (e.g., balanced bracket strings) and prove a structural property of it. Exercise set, Part E.


7.1 Strong induction (and when you need it)

Let's start where ordinary induction runs out of road. Recall the Fibonacci numbers from Chapter 6:

$$F_0 = 0, \quad F_1 = 1, \quad F_n = F_{n-1} + F_{n-2} \ \text{ for } n \ge 2.$$

Suppose we want to prove a bound on how fast they grow.

Claim. For all $n \ge 1$, $F_n \le 2^{\,n-1}$.

Try to prove this with ordinary induction and watch it stall. In the inductive step you would assume $F_k \le 2^{\,k-1}$ and try to show $F_{k+1} \le 2^{\,k}$. But $F_{k+1} = F_k + F_{k-1}$, and your single hypothesis tells you about $F_k$ — it says nothing about $F_{k-1}$. You are stuck, not because the claim is false, but because the tool only handed you one case and the recurrence reaches back two.

💡 Intuition: Ordinary induction is a ladder where each rung is built from the one directly below it. Strong induction is a ladder where each rung is built from all the rungs below it. If your recurrence reaches back two steps (Fibonacci), or an unpredictable number of steps (factoring, divide-and-conquer), you need the stronger ladder.

Here is the fix. Instead of assuming the claim only for $k$, assume it for every value from the base up to $k$, and use whichever of those you need.

The Principle of Strong Induction. Let $P(n)$ be a statement about the integer $n \ge n_0$. If

  1. the base case(s) hold — $P(n_0)$ is true (and, when the recursive structure requires, $P(n_0+1), \dots, P(n_0+j)$ as well), and
  2. for every $k \ge n_0$, the implication $$\big(P(n_0) \land P(n_0+1) \land \dots \land P(k)\big) \rightarrow P(k+1)$$ holds (the strong inductive step),

then $P(n)$ is true for all integers $n \ge n_0$.

The only change from Chapter 6 is the hypothesis: instead of assuming just $P(k)$, you may assume all of $P(n_0), \dots, P(k)$ — "everything proved so far." This bundled assumption is the strong inductive hypothesis. The conclusion is identical, and so, as we'll prove in 7.6, is the power: strong induction proves exactly the same theorems ordinary induction can. It is not stronger in what it can establish — it is more convenient when the recurrence reaches back more than one step.

Now the Fibonacci bound goes through cleanly.

The strategy first. The recurrence $F_{k+1} = F_k + F_{k-1}$ reaches back two steps, so we will assume the bound for all indices up to $k$ and apply it to both $F_k$ and $F_{k-1}$. Because the step uses two predecessors, it is only valid once both exist — that forces us to verify two base cases ($n=1$ and $n=2$) by hand before the step can take over.

Proof. Let $P(n)$ be the statement $F_n \le 2^{\,n-1}$, for $n \ge 1$.

Base cases. For $n = 1$: $F_1 = 1$ and $2^{0} = 1$, so $1 \le 1$ holds. For $n = 2$: $F_2 = 1$ and $2^{1} = 2$, so $1 \le 2$ holds. Both $P(1)$ and $P(2)$ are true.

Strong inductive step. Fix $k \ge 2$ and assume $P(1), P(2), \dots, P(k)$ all hold — in particular, $F_k \le 2^{\,k-1}$ and $F_{k-1} \le 2^{\,k-2}$. (Both predecessors exist precisely because $k \ge 2$.) Then $$F_{k+1} = F_k + F_{k-1} \le 2^{\,k-1} + 2^{\,k-2} = 2^{\,k-2}(2 + 1) = 3 \cdot 2^{\,k-2} \le 4 \cdot 2^{\,k-2} = 2^{\,k}.$$ Since $2^{\,k} = 2^{\,(k+1)-1}$, this is exactly $P(k+1)$. By strong induction, $P(n)$ holds for all $n \ge 1$. $\blacksquare$

Notice the two places the "strong" part earned its keep. First, the step used two earlier cases at once ($F_k$ and $F_{k-1}$), which ordinary induction does not grant. Second — and this is the part students miss — because the step reaches back two cases, we needed two base cases. If we had verified only $P(1)$, then the very first use of the step (proving $P(2)$ from $P(1)$ and $P(0)$) would have referenced $P(0)$, which lies below our starting point and was never claimed. Count the base cases by counting how far back the step reaches.

⚠️ Common Pitfall: too few base cases. The number of base cases you must check equals the largest "reach" of your inductive step. A recurrence using $n-1$ and $n-2$ needs two base cases; one using $n-1, n-2, n-3$ needs three. Skipping a base case is the strong-induction analogue of the horses fallacy from Chapter 6: the chain looks complete but one early link was never forged.

🔗 Connection. Strong induction is the proof technique that matches divide-and-conquer. When merge sort splits a list of size $n$ into two halves and recurses, its correctness proof assumes the algorithm works on both smaller pieces — two cases, neither of them $n-1$. We'll formalize divide-and-conquer recurrences in Chapter 18 and solve them in Chapter 19; the inductive scaffolding is exactly what you're building now.

Let's verify the bound computationally, in the spirit of theme four — computation tests, proof guarantees:

def fib(n):
    """F_0 = 0, F_1 = 1, F_n = F_{n-1} + F_{n-2}."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

# Check the bound F_n <= 2**(n-1) we just proved, for n = 1..12.
print([fib(n) <= 2 ** (n - 1) for n in range(1, 13)])
# Expected output:
# [True, True, True, True, True, True, True, True, True, True, True, True]

Twelve Trues are reassuring, but they are not the proof — the proof covers all $n$ at once. The list is what you'd run before proving, to make sure the claim is even worth the effort.

🔄 Check Your Understanding 1. In the Fibonacci-bound proof, exactly which two earlier cases did the strong inductive step use, and where would ordinary induction have failed? 2. A recurrence is defined by $a_n = a_{n-1} + a_{n-3}$ for $n \ge 3$. How many base cases must a strong-induction proof about $a_n$ establish, and which ones?

Answers

  1. It used $F_k \le 2^{k-1}$ and $F_{k-1} \le 2^{k-2}$. Ordinary induction supplies only $P(k)$ (the bound on $F_k$), so it cannot bound $F_{k-1}$ in $F_{k+1} = F_k + F_{k-1}$ — the step stalls. 2. Three base cases, for $n = 0, 1, 2$. The step reaches back as far as $a_{n-3}$, so the first index at which the step is fully supported is $n = 3$ (needing $a_2, a_1, a_0$); every index below 3 must be a base case.

7.2 The well-ordering principle

We've now used induction in two strengths, and twice promised to explain why it works. The answer is a single, almost obvious-sounding fact about the natural numbers — and it is powerful enough to be a proof technique in its own right.

Start concrete. Take any non-empty collection of natural numbers — say $\{17, 4, 23, 4, 100\}$, or the set of all even numbers greater than a million, or the set of all $n$ for which some program crashes. Does each of these have a smallest member? The finite one obviously does (it's 4). The evens-over-a-million set does (it's $1{,}000{,}002$). And the third — whatever it is — does too, as long as it isn't empty. That is the whole principle.

The Well-Ordering Principle. Every non-empty subset of the natural numbers $\mathbb{N} = \{0, 1, 2, \dots\}$ has a least element.

Three things make this fact special, and worth dwelling on:

  • It is false for the integers $\mathbb{Z}$: the set of all integers has no least element (you can always go more negative). It is false for the non-negative rationals: the set $\{x \in \mathbb{Q} : x > 0\}$ has no least element (halve any candidate). The well-ordering principle is a distinguishing property of $\mathbb{N}$ — it is, in a precise sense, what makes the naturals "discrete."
  • It needs non-empty. The empty set has no least element simply because it has no elements at all; the hypothesis rules that case out.
  • It is the engine under induction. We previewed this in Chapter 6: induction works because the naturals are well-ordered. We'll make that rigorous in 7.6.

🚪 Threshold Concept. Well-ordering is the quiet axiom beneath every induction proof you will ever write. Once you internalize "any non-empty set of naturals has a smallest element," you gain a second, often slicker, way to prove the same theorems — by contradiction, through a smallest counterexample. Many number-theory proofs (and the correctness of the Euclidean algorithm in Chapter 22) are cleanest in this form. It reshapes how you see the integers: not as a line you climb one step at a time, but as a set that can never sink below a floor.

The smallest-counterexample pattern

Here is the proof technique well-ordering gives you. To show $P(n)$ holds for all $n \ge n_0$:

The smallest-counterexample template. 1. Suppose, for contradiction, the set $C = \{ n \ge n_0 : P(n) \text{ is false} \}$ is non-empty. 2. By well-ordering, $C$ has a least element $m$. 3. Note $m \ne n_0$ because you check the base case directly, so $m > n_0$ and $P(j)$ holds for every $n_0 \le j < m$ (else $m$ wouldn't be the least counterexample). 4. Derive a contradiction — typically by using those smaller true cases to force $P(m)$ to be true after all. 5. Conclude $C$ is empty: $P(n)$ holds for all $n \ge n_0$. $\blacksquare$

This is strong induction wearing a different hat: step 3 hands you all the smaller cases as true, exactly the strong inductive hypothesis. Let's use it to prove a result you'll meet again in number theory.

The Division Algorithm (existence part). For every integer $a \ge 0$ and every integer $d > 0$, there exist integers $q$ and $r$ with $a = dq + r$ and $0 \le r < d$.

The strategy first. We want a remainder $r$ that is non-negative but smaller than $d$. The trick is to look at all the non-negative numbers of the form $a - dq$ as $q$ ranges over the integers, and grab the smallest one — well-ordering guarantees it exists. We'll then show that smallest value can't be $\ge d$, because if it were, subtracting one more $d$ would give an even smaller non-negative value, contradicting "smallest."

Proof. Consider the set $$S = \{ a - dq : q \in \mathbb{Z} \text{ and } a - dq \ge 0 \}.$$ First, $S$ is non-empty: taking $q = 0$ gives $a - 0 = a \ge 0$ (since $a \ge 0$), so $a \in S$. And $S \subseteq \mathbb{N}$ by construction. By the well-ordering principle, $S$ has a least element; call it $r$, and let $q$ be the integer with $r = a - dq$. By definition of $S$, $r \ge 0$, and rearranging gives $a = dq + r$. It remains to show $r < d$.

Suppose instead $r \ge d$. Then $$a - d(q+1) = (a - dq) - d = r - d \ge 0,$$ so $r - d$ is a non-negative number of the form $a - d q'$ — that is, $r - d \in S$. But $r - d < r$ (because $d > 0$), contradicting that $r$ is the least element of $S$. Therefore $r < d$, and we have $a = dq + r$ with $0 \le r < d$. $\blacksquare$

That argument would have been clumsy with a counting-up induction (what would you induct on?). Well-ordering let us reach straight for the object we wanted — the smallest valid remainder — and the contradiction did the rest.

🔗 Connection. This existence proof is the foundation of the divmod you call without thinking, and of every modular-arithmetic computation in Part IV. When we build the Euclidean algorithm in Chapter 22, its termination is a well-ordering argument: each step produces a strictly smaller non-negative remainder, and a strictly decreasing sequence of natural numbers cannot go on forever — because there's a floor.

🔄 Check Your Understanding 1. Why does the well-ordering principle fail for $\mathbb{Z}$ but hold for $\mathbb{N}$? Give the one-sentence reason. 2. In the division-algorithm proof, where exactly did we use "$r$ is the least element," as opposed to merely "an element"? 3. The smallest-counterexample pattern requires you to check a base case (step 3). Which earlier proof technique from Chapter 5 is this whole pattern an instance of?

Answers

  1. $\mathbb{Z}$ has no floor — you can always subtract one to get a smaller integer — whereas any non-empty set of naturals cannot descend below 0 forever. 2. When we found $r - d \in S$ with $r - d < r$: that contradicts minimality. If $r$ were merely some element, finding a smaller one would be no contradiction at all. 3. Proof by contradiction — we assumed the set of counterexamples was non-empty and derived an impossibility.

7.3 Recursive definitions (sequences, sets, structures)

So far we've proved things about objects that were already defined. Now we turn the same self-referential idea into a way to define objects. You define a recursive function by giving a base case and a rule that reduces to smaller inputs; you define an object recursively the same way.

A recursive definition (also called an inductive definition) specifies a collection in two parts:

  • a basis clause (or base case) that names one or more starting elements outright, and
  • a recursive clause that gives rules for building new elements from elements already known to be in the collection.

By convention there is also an implicit exclusion clause: nothing is in the collection unless it can be produced by finitely many applications of the basis and recursive clauses. That last clause matters — it's what stops the definition from accidentally including extra junk, and it's the reason structural induction (next section) is valid.

You already met a recursive definition in Chapter 6 without the name: the Fibonacci sequence. Let's restate it as a textbook recursive definition of a sequence.

Recursive definition (Fibonacci). - Basis: $F_0 = 0$ and $F_1 = 1$. - Recursive clause: for $n \ge 2$, $F_n = F_{n-1} + F_{n-2}$.

The factorial is another: - Basis: $0! = 1$. - Recursive clause: for $n \ge 1$, $n! = n \cdot (n-1)!$.

These define sequences — functions on $\mathbb{N}$. But the real power of recursive definitions is that they also define sets, including sets whose members are themselves structured objects (strings, expressions, trees). When the thing being defined is a set, we call it a recursively defined set.

Here is the canonical example, and it's one you'll lean on in formal languages (Chapter 35) and compilers: a set of arithmetic expressions.

Recursive definition (a tiny language of arithmetic expressions over the symbol x). Let $E$ be the smallest set such that: - Basis: the symbol x is in $E$, and every decimal numeral (0, 1, 2, …) is in $E$. - Recursive clause: if $a \in E$ and $b \in E$, then the strings (a + b) and (a * b) are in $E$. - Exclusion: nothing else is in $E$.

From these rules, x is in $E$ (basis); (x + 1) is in $E$ (apply the recursive clause to x and 1); ((x + 1) * x) is in $E$ (apply it again, to (x + 1) and x). Each element carries, in its very construction, a recipe for how it was built — and that recipe is what we induct over.

💡 Intuition: A recursive definition is a factory. The basis is the raw stock already on the shelf; the recursive clause is the machine that takes finished parts and bolts them into bigger parts; the exclusion clause says the warehouse contains only what the factory actually produced. Because every product has a build history, you can prove things about all products by reasoning about the build steps — that's structural induction.

We can mirror the expression grammar directly in Python, which makes the "build history" concrete — each expression is a small tree of tuples:

def make_num(v):     return ("num", v)          # basis: a numeral
def make_var():      return ("var", "x")        # basis: the variable x
def add(a, b):       return ("+", a, b)         # recursive clause
def mul(a, b):       return ("*", a, b)         # recursive clause

# Build  ((x + 1) * x)  from the rules, bottom-up:
e = mul(add(make_var(), make_num(1)), make_var())
print(e)
# Expected output:
# ('*', ('+', ('var', 'x'), ('num', 1)), ('var', 'x'))

Every parenthesis in the original string corresponds to one application of the recursive clause — one add or mul node in the tuple. That correspondence is not a coincidence; it is the content of the next section's proof.

⚠️ Common Pitfall: forgetting the exclusion clause. "Let $E$ be the smallest set such that…" is doing real work. Without "smallest" (the exclusion clause), $E$ could secretly contain anything at all and still satisfy the basis and recursive clauses — the universe of all strings satisfies them too. Smallest pins $E$ to exactly the strings the rules generate, and that is precisely what makes structural induction sound.

🔄 Check Your Understanding 1. Write a recursive definition of the set of all non-negative even integers, $\{0, 2, 4, \dots\}$, using a basis clause and a recursive clause. 2. In the expression set $E$, is the string x + 1 (without the outer parentheses) a member? Why or why not?

Answers

  1. Basis: $0 \in S$. Recursive clause: if $n \in S$ then $n + 2 \in S$. Exclusion: nothing else. 2. No. The recursive clause produces the parenthesized string (a + b); it never produces a + b without the surrounding parentheses. Since x + 1 cannot be generated by any sequence of clause applications, the exclusion clause keeps it out of $E$.

7.4 Structural induction

When a set is defined recursively, the natural way to prove that every element has some property is not to count up through the integers — the elements aren't numbers — but to follow the structure of the definition. This is structural induction, and it has the same two-part shape as the recursive definition it mirrors:

The Principle of Structural Induction. To prove that a property $P$ holds for every element of a recursively defined set $S$:

  1. Basis step: prove $P$ holds for each element named in the basis clause.
  2. Recursive (inductive) step: prove that if $P$ holds for the elements used by a recursive-clause rule, then $P$ holds for the new element that rule produces.

Then $P$ holds for every element of $S$.

The assumption "$P$ holds for the elements used to build this one" is the structural inductive hypothesis. The justification is exactly the exclusion clause: every element of $S$ is built from basis elements by finitely many applications of the recursive clauses, so if $P$ survives the basis and is preserved by every clause, it must hold for the finished element — by induction on the number of construction steps. Structural induction is ordinary induction in disguise, with the induction variable being "how many build steps it took."

Let's prove the parenthesis fact we noticed in 7.3.

Claim. Every expression in the set $E$ (from 7.3) has an equal number of left parentheses ( and right parentheses ).

The strategy first. The set $E$ is built by two clauses, so the proof has two matching parts. For the basis (a numeral or x) there are zero parentheses of each kind — equal, trivially. For the recursive step, a new expression (a + b) or (a * b) adds exactly one ( and exactly one ) to whatever $a$ and $b$ already had; if $a$ and $b$ were each balanced (the hypothesis), adding one of each keeps them balanced. Structure of the proof = structure of the definition.

Proof (by structural induction on $E$). For an expression $e$, let $L(e)$ and $R(e)$ be its numbers of left and right parentheses; let $P(e)$ be the statement $L(e) = R(e)$.

Basis step. The basis elements are the variable x and the numerals. None of these contains any parenthesis, so $L(e) = 0 = R(e)$. Thus $P(e)$ holds for every basis element.

Recursive step. Suppose $a, b \in E$ and assume the inductive hypothesis $P(a)$ and $P(b)$ — that is, $L(a) = R(a)$ and $L(b) = R(b)$. The recursive clause forms $e = $ (a + b) (the * case is identical). Counting parentheses, $e$ has all of $a$'s, all of $b$'s, plus the one new outer pair: $$L(e) = L(a) + L(b) + 1, \qquad R(e) = R(a) + R(b) + 1.$$ Using the hypotheses $L(a) = R(a)$ and $L(b) = R(b)$, $$L(e) = L(a) + L(b) + 1 = R(a) + R(b) + 1 = R(e).$$ So $P(e)$ holds. By structural induction, $P(e)$ holds for every $e \in E$. $\blacksquare$

That is the entire method: one case per clause of the definition, the recursive clauses getting an inductive hypothesis about their inputs. No reference to "$n$" anywhere — the recursion is over structure, not over a counter.

Let's do one more, on a recursively defined set of numbers, to show the same machine works when the elements happen to be integers.

Recursive definition. Let $S$ be the smallest set of integers with: basis $3 \in S$; recursive clause if $x \in S$ and $y \in S$ then $x + y \in S$.

Claim. Every element of $S$ is divisible by 3.

The strategy first. Two clauses, two cases. The basis element 3 is divisible by 3. The recursive clause adds two members; the sum of two multiples of 3 is again a multiple of 3. Unpack "divisible by 3" into "equals $3m$ for some integer $m$" (the Chapter 4 definition) and the algebra is one line.

Proof (by structural induction on $S$). Let $P(s)$ be "$3 \mid s$."

Basis step. $s = 3$: since $3 = 3 \cdot 1$, we have $3 \mid 3$, so $P(3)$ holds.

Recursive step. Suppose $x, y \in S$ with $P(x)$ and $P(y)$: write $x = 3m$ and $y = 3n$ for integers $m, n$. The new element is $x + y = 3m + 3n = 3(m + n)$, and $m + n \in \mathbb{Z}$, so $3 \mid (x + y)$ — that is, $P(x+y)$ holds. By structural induction, $3 \mid s$ for every $s \in S$. $\blacksquare$

(In fact $S = \{3, 6, 9, \dots\}$ — every positive multiple of 3 — but notice we proved the property without needing to first identify the set exactly. That is structural induction's gift: you reason from the construction rules, not from a closed-form description you might not have.)

🔄 Check Your Understanding 1. In a structural induction proof, what is the structural inductive hypothesis allowed to assume, and why is assuming it legitimate (which clause of the recursive definition licenses it)? 2. The set $E$ also satisfies: every expression has at least as many right parentheses as left, reading left to right at no point... actually, what single property did our proof establish, and what did it not establish about the order of the parentheses?

Answers

  1. It may assume $P$ holds for the sub-objects that the recursive clause combines (e.g., $P(a)$ and $P(b)$ when forming (a + b)). It is legitimate because the exclusion clause guarantees every element is built from basis elements by finitely many clause applications — so an induction on the number of build steps reaches every element. 2. We proved the counts are equal, $L(e) = R(e)$. We did not prove the parentheses are properly nested/ordered (e.g., that no prefix has more ) than (); that is a strictly stronger "well-balanced" property requiring a different invariant. Equal counts is necessary but not sufficient for proper nesting.

7.5 Recursion in data structures (trees and lists)

Structural induction isn't an abstract curiosity — it is the everyday tool for reasoning about the recursive data structures you program with. Two of them, lists and trees, are defined recursively, and properties of them are proved by structural induction. This section is the bridge to Chapter 31 (trees) and the recursion that runs through all of Part V.

Lists

A (singly linked) list of elements is one of the most fundamental recursive structures:

Recursive definition (lists). - Basis: the empty list, written Nil, is a list. - Recursive clause: if L is a list and x is an element, then Cons(x, L) — the list with x stuck on the front — is a list.

So Cons(1, Cons(2, Cons(3, Nil))) is the list $[1, 2, 3]$. Define the length of a list recursively too: $$\operatorname{len}(\texttt{Nil}) = 0, \qquad \operatorname{len}(\texttt{Cons}(x, L)) = 1 + \operatorname{len}(L).$$ And define concatenation append(L, M) (gluing M onto the end of L): $$\operatorname{append}(\texttt{Nil}, M) = M, \qquad \operatorname{append}(\texttt{Cons}(x, L), M) = \texttt{Cons}(x, \operatorname{append}(L, M)).$$

Now a property worth knowing — that length distributes over concatenation:

Claim. For all lists L and M, $\operatorname{len}(\operatorname{append}(L, M)) = \operatorname{len}(L) + \operatorname{len}(M)$.

The strategy first. Both len and append recurse on their first argument, so we do structural induction on L (holding M fixed). The basis is L = Nil; the recursive step assumes the claim for the tail L and proves it for Cons(x, L). At each step we simply unfold the recursive definitions of append and len and apply the hypothesis — the proof writes itself once you let the definitions do the talking.

Proof (by structural induction on L, for arbitrary fixed M). Let $P(L)$ be the statement $\operatorname{len}(\operatorname{append}(L, M)) = \operatorname{len}(L) + \operatorname{len}(M)$.

Basis step (L = Nil). By the definitions, $\operatorname{append}(\texttt{Nil}, M) = M$, so $$\operatorname{len}(\operatorname{append}(\texttt{Nil}, M)) = \operatorname{len}(M) = 0 + \operatorname{len}(M) = \operatorname{len}(\texttt{Nil}) + \operatorname{len}(M).$$ So $P(\texttt{Nil})$ holds.

Recursive step (L = Cons(x, L')). Assume the inductive hypothesis $P(L')$: $\operatorname{len}(\operatorname{append}(L', M)) = \operatorname{len}(L') + \operatorname{len}(M)$. Then, unfolding append and then len: $$\operatorname{len}(\operatorname{append}(\texttt{Cons}(x, L'), M)) = \operatorname{len}(\texttt{Cons}(x, \operatorname{append}(L', M))) = 1 + \operatorname{len}(\operatorname{append}(L', M)).$$ By the hypothesis, this equals $1 + \big(\operatorname{len}(L') + \operatorname{len}(M)\big) = \big(1 + \operatorname{len}(L')\big) + \operatorname{len}(M) = \operatorname{len}(\texttt{Cons}(x, L')) + \operatorname{len}(M)$. That is $P(\texttt{Cons}(x, L'))$. By structural induction, $P(L)$ holds for all lists L. $\blacksquare$

Read that proof again and notice: it never mentions a number $n$. The induction is on the shape of the list. Every recursive function you write on lists can be verified this way, and the proof's two cases line up exactly with the function's two branches (if not L: versus the recursive case).

Trees

The same idea, one level up, gives us binary trees — and a classic counting fact. A full binary tree is one in which every node has either zero children (a leaf) or exactly two children (an internal node).

Recursive definition (full binary trees). - Basis: a single node with no children is a full binary tree (just a leaf). - Recursive clause: if $T_1$ and $T_2$ are full binary trees, then the tree with a new root whose two children are $T_1$ and $T_2$ is a full binary tree.

Claim. In every full binary tree, the number of leaves is exactly one more than the number of internal nodes. Writing $\ell(T)$ for leaves and $i(T)$ for internal nodes: $\ell(T) = i(T) + 1$.

The strategy first. Structure of the tree definition, structure of the proof. A single leaf has $\ell = 1$ and $i = 0$, so $1 = 0 + 1$ — basis done. For a combined tree, the new root is internal and the leaves and internal nodes are just the totals of the two subtrees (plus the one new internal root). Add the two inductive hypotheses and watch the "+1"s line up.

Proof (by structural induction on $T$). Let $P(T)$ be $\ell(T) = i(T) + 1$.

Basis step. $T$ is a single leaf: $\ell(T) = 1$ and $i(T) = 0$, and indeed $1 = 0 + 1$. So $P(T)$ holds.

Recursive step. Suppose $T$ has root $r$ with subtrees $T_1, T_2$, and assume the inductive hypotheses $\ell(T_1) = i(T_1) + 1$ and $\ell(T_2) = i(T_2) + 1$. The leaves of $T$ are precisely the leaves of $T_1$ and $T_2$ (the new root $r$ has two children, so it is not a leaf): $\ell(T) = \ell(T_1) + \ell(T_2)$. The internal nodes of $T$ are those of $T_1$, those of $T_2$, plus $r$ itself: $i(T) = i(T_1) + i(T_2) + 1$. Therefore $$\ell(T) = \ell(T_1) + \ell(T_2) = \big(i(T_1) + 1\big) + \big(i(T_2) + 1\big) = \big(i(T_1) + i(T_2) + 1\big) + 1 = i(T) + 1.$$ So $P(T)$ holds. By structural induction, $\ell(T) = i(T) + 1$ for every full binary tree. $\blacksquare$

A small ASCII tree makes the count tangible. Here $r$ is the root, $a$ is an internal node, and $L_1, L_2, L_3$ are leaves:

        r
       / \
      a   L3
     / \
    L1  L2

Count them: internal nodes $\{r, a\}$, so $i = 2$; leaves $\{L_1, L_2, L_3\}$, so $\ell = 3$; and indeed $3 = 2 + 1$. The proof guarantees this balance for every full binary tree, of any size.

🔗 Connection. This leaf/internal-node identity is exactly the fact that makes Huffman coding (Chapter 31) and prefix-free codes work: a full binary tree with $L$ leaves encodes $L$ symbols using $L - 1$ internal decision points. Structural induction is the standard way to prove properties of binary search trees, heaps, and expression trees — the recursion in the proof mirrors the recursion in the code that walks the tree.

🐛 Find the Error. A student tries to prove $\ell(T) = i(T) + 1$ by ordinary induction "on the number of nodes $n$": "Assume it holds for a tree with $n$ nodes; add one node to get a tree with $n+1$ nodes; …" What is the flaw in even setting the proof up this way?

Answer

You cannot, in general, turn a full binary tree into another full binary tree by "adding one node": adding a single child to a leaf makes that former leaf have exactly one child, which violates the full property. Full binary trees grow by adding two children at once (the recursive clause), so the building block isn't "+1 node." Structural induction sidesteps this entirely by inducting on the definition's construction rule rather than on a node count — which is precisely why we prefer it for recursively defined structures.


7.6 The equivalence of the induction forms

We've now used three principles — ordinary induction (Chapter 6), strong induction (7.1), and well-ordering (7.2) — and leaned on the claim that they're "the same idea in three costumes." Time to cash that in. The remarkable fact is that the three are logically equivalent: assume any one (for the natural numbers) and you can prove the other two. None is more powerful than the others; they prove exactly the same theorems.

🚪 Threshold Concept. It is tempting to think "strong induction is stronger — look at the name." It is not stronger in deductive power; it proves no theorem ordinary induction can't. The word "strong" refers only to the hypothesis you get to assume (all previous cases, not just one), which makes some proofs more convenient. Realizing that a more generous-looking assumption buys no extra theorems — only convenience — is a genuine shift in how you think about proof techniques.

We'll prove the equivalence as a cycle of implications: $$\text{(Ordinary induction)} \ \Rightarrow\ \text{(Strong induction)} \ \Rightarrow\ \text{(Well-ordering)} \ \Rightarrow\ \text{(Ordinary induction)}.$$ A cycle suffices: if each implies the next, then any one implies all the others.

Ordinary induction ⟹ strong induction

The strategy first. We're handed ordinary induction and must derive strong induction. The trick is a change of statement: to prove $P(n)$ by strong induction, apply ordinary induction to a cleverly bundled statement $Q(n)$ that says "$P$ holds for everything up to $n$." Bundling converts a strong hypothesis into an ordinary one.

Proof. Suppose ordinary induction is valid, and suppose we have a strong-induction setup for $P$: a base case $P(0)$ and a step proving $P(k+1)$ from $P(0) \land \dots \land P(k)$. Define $$Q(n) : \quad P(0) \land P(1) \land \dots \land P(n) \ \text{ are all true}.$$ We prove $Q(n)$ for all $n$ by ordinary induction. Base case: $Q(0)$ is just $P(0)$, which holds. Inductive step: assume $Q(k)$ — i.e. $P(0), \dots, P(k)$ all hold. That is exactly the strong-induction hypothesis, so the given strong step yields $P(k+1)$; combined with $Q(k)$, all of $P(0), \dots, P(k+1)$ hold, which is $Q(k+1)$. By ordinary induction, $Q(n)$ holds for all $n$, and since $Q(n)$ includes $P(n)$, we get $P(n)$ for all $n$. $\blacksquare$

Strong induction ⟹ well-ordering

The strategy first. Assume strong induction; prove every non-empty $T \subseteq \mathbb{N}$ has a least element. We prove the contrapositive-flavored statement: if $T$ has no least element, then $T$ is empty — by using strong induction to show every $n$ is absent from $T$.

Proof. Suppose strong induction holds, and let $T \subseteq \mathbb{N}$ have no least element. Let $P(n)$ be the statement "$n \notin T$." We show $P(n)$ for all $n$ by strong induction. Strong step (which also subsumes the base case): fix $n$ and assume $P(0), \dots, P(n-1)$ — i.e. none of $0, 1, \dots, n-1$ is in $T$. If $n$ were in $T$, then $n$ would be its least element (everything smaller is excluded by the hypothesis), contradicting that $T$ has no least element. So $n \notin T$, which is $P(n)$. (When $n = 0$ the hypothesis is vacuous and the same argument shows $0 \notin T$, since $0$ would be least.) By strong induction, $n \notin T$ for all $n \in \mathbb{N}$, so $T = \emptyset$. Contrapositively, every non-empty subset of $\mathbb{N}$ has a least element — the well-ordering principle. $\blacksquare$

Well-ordering ⟹ ordinary induction

The strategy first. This is the "smallest counterexample" pattern from 7.2, now used to justify induction itself. Assume well-ordering and an ordinary-induction setup ($P(0)$ and $P(k) \rightarrow P(k+1)$). If $P$ failed somewhere, well-ordering would hand us a smallest failure — and we'll show that smallest failure is impossible.

Proof. Suppose well-ordering holds, and suppose $P(0)$ is true and $P(k) \rightarrow P(k+1)$ for all $k \ge 0$. Let $C = \{ n \in \mathbb{N} : P(n) \text{ is false} \}$ be the set of counterexamples, and suppose for contradiction that $C \ne \emptyset$. By well-ordering, $C$ has a least element $m$. Now $m \ne 0$, because $P(0)$ is true, so $m \ge 1$ and $m - 1 \in \mathbb{N}$. Since $m$ is the least counterexample, $m - 1 \notin C$, i.e. $P(m-1)$ is true. But the inductive step gives $P(m-1) \rightarrow P(m)$, forcing $P(m)$ true — contradicting $m \in C$. Hence $C = \emptyset$ and $P(n)$ holds for all $n$, which is ordinary induction. $\blacksquare$

Three implications, one cycle, and the equivalence is established. Use whichever form makes a given proof cleanest: ordinary induction when the step reaches back one case, strong induction when it reaches back several, and well-ordering when "grab the smallest bad case" is the most natural move. They are interchangeable lenses on the same truth about $\mathbb{N}$.

💡 Intuition: Picture the three as translations of one sentence into three languages. Ordinary induction says "start at the bottom and climb one rung at a time." Strong induction says "stand on everything below you." Well-ordering says "there is no bottomless pit." Each sentence implies the others; you reach for whichever phrasing the problem speaks most fluently.

🔄 Check Your Understanding 1. In the proof that ordinary induction implies strong induction, what was the bundled statement $Q(n)$, and why was bundling the key idea? 2. "Strong induction can prove theorems ordinary induction cannot." True or false — and what is the one-sentence justification?

Answers

  1. $Q(n)$ was "$P(0) \land P(1) \land \dots \land P(n)$ all hold." Bundling turns the strong hypothesis (all previous cases) into an ordinary hypothesis about the single statement $Q$, so ordinary induction can carry it. 2. False. The equivalence (this section) shows strong induction and ordinary induction prove exactly the same theorems; "strong" describes a more convenient hypothesis, not greater deductive power.

Project Checkpoint: recursive definitions and strong-induction tools for the Toolkit

This chapter is about reaching back to many smaller cases and about recursive definitions — so let's grow the recurrences.py module we began in Chapter 6 (it reaches its full form in Chapters 18–19). We'll add two things that capture this chapter's ideas: a general evaluator for a recursively defined sequence whose step may reach back several cases, and a strong-induction demonstrator — a function that decides whether an amount of postage is constructible from 4-cent and 5-cent stamps, which is a problem whose proof is a strong induction.

Add to dmtoolkit/recurrences.py:

def recursive_seq(step, base, n):
    """Evaluate a recursively defined sequence with multiple base cases.
    base : list of initial terms [a_0, a_1, ..., a_{j}].
    step : function(prev_terms_list, k) -> a_k, given ALL prior terms a_0..a_{k-1}.
    Returns a_n. Strong-induction friendly: step sees every earlier term."""
    terms = list(base)
    for k in range(len(base), n + 1):
        terms.append(step(terms, k))
    return terms[n]

def make_postage(n):
    """Return counts (fours, fives) of 4- and 5-cent stamps summing to n cents,
    or None if impossible. Provable by strong induction: every n >= 12 works."""
    for fives in range(n // 5 + 1):
        rest = n - 5 * fives
        if rest % 4 == 0:
            return (rest // 4, fives)
    return None

Here is the expected behavior, traced by hand (we never run the code):

# Fibonacci via the generic evaluator: step reaches back TWO terms (strong-induction shaped).
fib_step = lambda t, k: t[k - 1] + t[k - 2]
print(recursive_seq(fib_step, [0, 1], 10))   # F_10
# Strong-induction theorem in action: every amount >= 12 is constructible.
print(make_postage(12), make_postage(23), make_postage(11))
# Expected output:
# 55
# (3, 0) (2, 3) None

Trace check: $F_{10} = 55$ (from $0,1,1,2,3,5,8,13,21,34,55$). For postage, $12 = 3 \cdot 4 + 0 \cdot 5$ gives (3, 0); $23 = 2 \cdot 4 + 3 \cdot 5 = 8 + 15$ gives (2, 3); and $11$ is the famous gap — the largest amount the two stamps cannot make — so None. That None at 11, and success at every value from 12 upward, is exactly the boundary the strong-induction proof in this chapter's exercises pins down (the step reduces $n$ by 4 to reach a known case, which is why four consecutive base cases — 12, 13, 14, 15 — are required).

Toolkit so far: logic.py (Chapters 1–3) and recurrences.py, now holding check_identity and fib (Chapter 6) plus recursive_seq and make_postage (this chapter). The generic recursive_seq foreshadows the systematic recurrence solving of Chapters 18–19, where Fibonacci finally gets its closed form and an $O(\log n)$ algorithm.


Summary

This chapter extended induction to match the recursion you actually write, exposed the principle beneath all of it, and taught you to define and reason about recursively built objects.

The three induction principles (all equivalent — see 7.6):

Principle Hypothesis in the step Base cases needed Reach for it when…
Ordinary induction (Ch. 6) assume $P(k)$ one (where the claim starts) the step uses only the immediately preceding case
Strong induction (7.1) assume $P(n_0) \land \dots \land P(k)$ as many as the step reaches back the recurrence uses several earlier cases (Fibonacci, divide-and-conquer, factoring)
Well-ordering (7.2) "take the least counterexample" check the base, then derive a contradiction "grab the smallest bad case" is the natural move (number theory, termination)

Counting base cases (the #1 strong-induction error): the number of base cases equals how far back the strong step reaches. A step using $n-1$ and $n-2$ needs two; $n-1, n-2, n-3$ needs three; the 4-and-5 postage step (reduce by 4) needs four.

Well-ordering principle: every non-empty subset of $\mathbb{N}$ has a least element. False for $\mathbb{Z}$ and for $\mathbb{Q}_{>0}$. It is the bedrock under all induction and the engine of termination arguments.

Recursive definitions specify a collection with: a basis clause (starting elements), a recursive clause (rules to build new from old), and an implicit exclusion clause ("smallest such set" — nothing else belongs). They define sequences, sets, and structures.

Structural induction proves $P$ for every element of a recursively defined set:

Step What you prove
Basis step $P$ holds for each basis element
Recursive step if $P$ holds for the parts a clause combines, then $P$ holds for the result

It is valid because of the exclusion clause (every element is built in finitely many steps), and it makes the proof's structure mirror the definition's — and the code's branches.

Key results proved this chapter:

  • $F_n \le 2^{\,n-1}$ for $n \ge 1$ (strong induction, two base cases).
  • The division algorithm's existence (well-ordering, via the least non-negative remainder).
  • Every expression in $E$ has equal ( and ) counts (structural induction).
  • $\operatorname{len}(\operatorname{append}(L, M)) = \operatorname{len}(L) + \operatorname{len}(M)$ (structural induction on lists).
  • A full binary tree has $\ell(T) = i(T) + 1$ (structural induction on trees).
  • Ordinary induction ⟺ strong induction ⟺ well-ordering.

Spaced Review

Answer from memory first; then check. These revisit Chapters 5 and 6.

  1. (Ch. 6) State the ordinary principle of mathematical induction in one sentence, naming the base case and the inductive step. How is its inductive hypothesis different from strong induction's?
  2. (Ch. 6) The "all horses are the same color" proof failed at the implication $P(1) \rightarrow P(2)$. Re-express that failure in this chapter's vocabulary: it is really a base-case-counting error in disguise. Explain.
  3. (Ch. 5) The well-ordering "smallest counterexample" pattern is an instance of which Chapter 5 proof strategy? Name it and state the first move of that strategy.
  4. (Ch. 6) In Chapter 6 you proved a recursive triangle(n) correct by ordinary induction. Why was ordinary induction (not strong) sufficient there, but not sufficient for the naive recursive fib(n)?

Answers

  1. If $P(\text{start})$ holds (base case) and $P(k) \rightarrow P(k+1)$ for all $k \ge \text{start}$ (inductive step), then $P(n)$ holds for all $n \ge \text{start}$. Ordinary induction's hypothesis assumes only the single case $P(k)$; strong induction's assumes all cases $P(\text{start}) \land \dots \land P(k)$. 2. The overlap argument in the horses proof is only valid for $k \ge 2$, so the step really starts at $k = 2$; making it sound would require a second base case at $n = 2$ — which is exactly the strong-induction discipline of "check as many base cases as the step needs." The proof is broken because that needed base case ($P(2)$) is false. 3. Proof by contradiction. First move: assume the opposite — here, assume the set of counterexamples is non-empty (then derive a contradiction from its least element). 4. triangle(n) recurses on a single smaller input, n - 1, so the single ordinary hypothesis $P(k)$ exactly matches the one recursive call. fib(n) recurses on two smaller inputs, n - 1 and n - 2; a proof of its correctness needs to assume both are correct — a strong inductive hypothesis.

What's Next

You can now prove things about objects defined by recurrences, and you've met the well-ordering principle that guarantees such recurrences and algorithms terminate. The natural next question is quantitative: given a recurrence like $F_n = F_{n-1} + F_{n-2}$ or the cost equation of a divide-and-conquer algorithm, can we find a closed form — a direct formula that skips the recursion entirely? That is the subject of recurrence relations in Chapter 18, where the strong-induction reasoning of this chapter becomes the foundation for solving recurrences, not just proving facts about them. The Fibonacci thread you've followed since Chapter 6 will there finally yield its closed form (the golden ratio) and, in Chapter 19, an algorithm that computes $F_n$ in $O(\log n)$ time. And the structural induction you practiced on lists and trees will pay off directly in Chapter 31, where trees become first-class objects of study.