Exercises: Where Discrete Math Goes Next
This is a survey chapter, so these exercises are a survey too: most ask you to recognize and apply
one idea from each of the four directions, not to grind a long technique. A few go deeper, and the
Interleaved section closes the book by pulling threads from across all forty chapters. Difficulty:
⭐ foundational, ⭐⭐ intermediate, ⭐⭐⭐ challenging. Worked solutions to the daggered (†) and
odd-numbered problems are in the appendix answers-to-selected.md — try them before you peek.
A note on spirit: the chapter is a map. The point of these exercises is the habit it was secretly training — when something looks new, ask "what is this, structurally?" Several problems below are deliberately dressed in unfamiliar clothes; the win is naming what they really are.
Part A — Warm-ups (⭐)
40.1 Compute the Shannon entropy, in bits, of a source with probabilities $\{1/2, 1/4, 1/4\}$. Show the three terms before summing.
40.2 † A fair eight-sided die is rolled. State its entropy in bits and the number of bits a fixed-length code needs per outcome. Is the fixed-length code wasteful here? Answer in one sentence.
40.3 For the workshop LP of §40.1 (maximize $3x + 5y$ subject to $x + 2y \le 14$, $2x + y \le 10$, $x, y \ge 0$), evaluate the objective at the corner $(5, 0)$ and at the corner $(0, 7)$. Which of the four listed corners is optimal, and what is the optimum?
40.4 † State $R(3,3)$ and write, in plain English, exactly what that number guarantees about any group of that many people.
40.5 Write the two functor laws (for map over lists) in plain English — one sentence each, no
symbols.
40.6 A binary symmetric channel flips each bit independently with probability $p = 0$. Use $C = 1 - H(p)$ to compute its capacity, and say in one sentence why the answer is exactly what you'd expect.
Part B — Prove it (⭐⭐)
40.7 † (Scaffolded — fill in the missing steps.) Prove the weak duality bound for the workshop LP: every primal-feasible $(x, y)$ and dual-feasible $(u, v)$ satisfy $3x + 5y \le 14u + 10v$. The dual constraints are $u + 2v \ge 3$ and $2u + v \ge 5$, with $x, y, u, v \ge 0$. Fill the blanks:
- Multiply $u + 2v \ge 3$ by $x \ge 0$ and $2u + v \ge 5$ by $y \ge 0$ (multiplying an inequality by a nonnegative number preserves its direction), then add to get $x(u + 2v) + y(2u + v) \ge \underline{\hphantom{xxxx}}$.
- Regroup the left side by the dual variables instead: $x(u + 2v) + y(2u + v) = \underline{\hphantom{xxxxxxxx}}$.
- Apply the primal constraints $x + 2y \le 14$ and $2x + y \le 10$ with nonnegative multipliers $u, v \ge 0$ to bound that regrouped expression above by $\underline{\hphantom{xxxx}}$.
- Chain the displays to conclude $3x + 5y \le 14u + 10v$. $\blacksquare$
40.8 Prove the upper bound $R(3,3) \le 6$ in full: every red/blue coloring of $K_6$ contains a monochromatic triangle. State where the pigeonhole principle (Chapter 17) is used and why six vertices are needed rather than five.
40.9 † Prove that for any discrete source with $n$ outcomes, the entropy satisfies $H(X) \ge 0$, with equality exactly when one outcome has probability 1. (Hint: each term $-p_i \log_2 p_i$ is $\ge 0$ for $0 \le p_i \le 1$; argue why, and say when the whole sum is 0.)
40.10 Using only the weak-duality fact from §40.1, prove the following certificate claim without re-solving either LP: if you exhibit a primal-feasible point with objective value $36$ and a dual-feasible point with objective value $36$, then $36$ is the optimal value of both programs. (This is the "produce a witness that proves optimality" pattern.)
40.11 † Verify the functor composition law on a fresh instance. Let $f(x) = 2x$ and $g(y) = y - 3$, and take the list $[5, 6, 7]$. Compute both $F(g \circ f)$ (map the composed function once) and $F(g) \circ F(f)$ (map $f$, then map $g$), show the two results, and state what their equality illustrates.
Part C — Implement it (⭐⭐)
Write Python for each. Do not run it on the grader's machine — hand-trace and include an
# Expected output: comment, matching the book's convention. Reference solutions are in
code/exercise-solutions.py.
40.12 † Write a function binary_entropy(p) that returns the binary entropy
$H(p) = -p\log_2 p - (1-p)\log_2(1-p)$, handling the endpoints $p = 0$ and $p = 1$ (where the
convention $0\log_2 0 = 0$ makes $H = 0$). Call it on $p = 0$, $p = 1/2$, and $p = 1/4$, and give the
expected output (round to 3 decimals).
40.13 Write a function capacity_bsc(p) for the capacity $C = 1 - H(p)$ of a binary symmetric
channel, reusing your binary_entropy. Print the capacity for $p = 0$, $p = 0.11$, and $p = 0.5$
(round to 3 decimals), and add a one-line comment on what the $p = 0.5$ value means.
40.14 † Write a function dual_objective(u, v) returning $14u + 10v$ and a checker
dual_feasible(u, v) returning True iff $u + 2v \ge 3$, $2u + v \ge 5$, and $u, v \ge 0$. Evaluate
both at the optimal dual point $(7/3, 1/3)$ and print the results.
40.15 Extend the chapter's has_mono_triangle(n, color) (from §40.3) with a one-line uniform
random coloring built from a fixed list of edge colors you write out by hand (do not call an RNG —
hard-code a coloring of $K_4$). Check whether your hand-made $K_4$ coloring has a monochromatic
triangle and report the expected output.
Part D — Find the error (⭐⭐)
Each argument below is wrong. State precisely which part fails and why.
40.16 † Claim: "Since the probabilistic method (§40.3) proves a triangle-free-of-one-color coloring of $K_n$ exists whenever $E[X] < 1$, we can simply read the good coloring off the proof." "Justification": the proof computes $E[X] = \binom{n}{k}\,2^{1-\binom{k}{2}}$ and concludes $X = 0$ somewhere, so that somewhere is the coloring we want. — What is wrong with "read it off the proof"? Name the property of the argument that this misses.
40.17 Claim: "A source with three outcomes of probabilities $1/2, 1/4, 1/4$ can be compressed below $1.5$ bits per symbol on average, because a clever enough code can always beat the entropy." "Justification": entropy is just the average of $-\log_2 p_i$, and averages can be beaten by optimizing. — Find the flaw, naming the theorem it violates.
40.18 † Claim: "$R(3,3) \le 5$." "Proof": fix a vertex $v$ in $K_5$; it has 4 incident edges, so by pigeonhole at least $\lceil 4/2 \rceil = 2$ share a color, say red, to vertices $a, b$. If $ab$ is red, $v a b$ is a red triangle; if $ab$ is blue, we look further… — Pinpoint exactly where this argument fails to deliver a monochromatic triangle, and connect the failure to why $R(3,3) = 6$, not 5.
40.19 Claim: "Every linear program has an optimum, and it is in the interior of the feasible region where the objective is largest." — Two things are wrong with this sentence. Identify both (one about existence, one about where the optimum sits), citing the relevant fact from §40.1.
Part E — Model it (⭐⭐⭐)
Translate each real scenario into the discrete-math objects of this chapter. You are setting up the model, not necessarily solving it.
40.20 † A delivery company must assign 3 trucks to 3 routes to minimize total cost; truck $i$ on route $j$ costs $c_{ij}$, each truck takes exactly one route, each route exactly one truck. (a) Name the genre this belongs to (one term from §40.1). (b) Write it as an integer linear program: give the decision variables, the objective, and the constraints. (c) In one sentence, say why "just try all assignments" is fine for 3 trucks but not for 300.
40.21 A streaming service logs which of four UI layouts a user lands on, with observed frequencies $\{0.7, 0.1, 0.1, 0.1\}$. The data team wants to know the theoretical minimum average bits to log one layout choice. (a) Which §40.2 quantity answers this? (b) Set up the exact expression (do not evaluate). (c) Explain in one sentence why a fixed 2-bit-per-event log is wasteful here.
40.22 † A social network wants to guarantee that in any group of guests it seats together, some ordered structure is unavoidable: among any six attendees, three are mutual contacts or three are mutual non-contacts. (a) Model "contact / non-contact among six people" as an edge-2-coloring of which graph? (b) Which theorem from §40.3 guarantees the property, and what is the smallest group size for which it holds? (c) Show by naming a construction why five guests would not suffice.
40.23 A compiler team notices it can replace map(g, map(f, xs)) with a single pass
map(lambda x: g(f(x)), xs) and wants to be sure this is always safe, not just usually. (a) Which
§40.4 concept (and which of its two laws) justifies the rewrite? (b) State, in the language of
functors, why the optimization (called map fusion) is sound for every f, g, and list — not a
coincidence of one program.
Part F — Conjecture and test (⭐⭐⭐)
For each, write code to gather evidence on small cases, form a conjecture, then prove or disprove it.
40.24 † (Entropy and uniformity.) Conjecture, then settle: among all probability distributions on $n$ outcomes, the entropy $H$ is maximized by the uniform distribution, with value $\log_2 n$. (a) Write code that, for $n = 3$, evaluates $H$ on the uniform distribution $\{1/3, 1/3, 1/3\}$ and on three non-uniform distributions you choose by hand, and reports which is largest. (b) State the conjecture the evidence supports. (c) Prove the special case $n = 2$ in full: show $H(p) = -p\log_2 p - (1-p)\log_2(1-p)$ is maximized at $p = 1/2$ with value 1. (You may argue from symmetry and the two endpoints, as the chapter does, or use calculus if you know it.)
40.25 (Erdős's bound, numerically.) The chapter states that $\binom{n}{k}\,2^{1-\binom{k}{2}} < 1$ implies $R(k,k) > n$. (a) Write code that, for $k = 4$, finds the largest $n$ for which $\binom{n}{k}\,2^{1-\binom{k}{2}} < 1$ still holds, giving the lower bound $R(4,4) > n$ this yields. (b) Compare your computed lower bound to the known value $R(4,4) = 18$ from the chapter: is the probabilistic bound tight here, loose, or wrong? (c) Explain in one sentence why a loose lower bound is still a genuine theorem, not a failure of the method.
40.26 † (A pattern that breaks — recognizing a false conjecture.) Define $C = 1 - H(p)$ on a grid of values $p = 0, 0.1, 0.2, \dots, 0.5$. A teammate conjectures "capacity decreases linearly in $p$ on this range." (a) Tabulate $C$ at those six values with code. (b) Is the decrease linear? Disprove or support the conjecture by comparing consecutive differences. (c) State what the true shape of $C$ vs. $p$ is (increasing/decreasing, and convex or concave near $p = 0$), connecting it to the behavior of $H(p)$.
Part G — Interleaved & Deep Dive
The book closes the way it ran: by mixing the tools. Each of these reaches back across the chapters.
40.27 † (Ch. 34 + 40.1.) The max-flow min-cut theorem says max flow $=$ min cut. Explain, in two or three sentences, why this is an instance of LP strong duality — identify which side is the primal maximization, which is the dual minimization, and what strong duality contributes.
40.28 (Ch. 31 + 40.2.) Huffman coding (Chapter 31) builds an optimal prefix-free code, and its average codeword length lies in $[H(X), H(X) + 1)$. For the source $\{1/2, 1/4, 1/8, 1/8\}$, give the Huffman code, compute its exact average length, compute the entropy $H(X)$, and confirm the code hits the entropy floor. Why does it hit exactly here when it usually doesn't?
40.29 † (Ch. 17 + 20 + 40.3.) Two principles power the $R(3,3) = 6$ story: one counting principle for the upper bound, one property of expectation for Erdős's lower bound. Name both, and for each write the single sentence in which it does its work.
40.30 (Ch. 9 + 24 + 40.4.) Function composition (Chapter 9) and a group operation (Chapter 24) share two algebraic laws that are exactly the two laws defining a category. Name both laws, then state which additional group axiom a general category is allowed to drop, and what one-object category with that axiom restored becomes.
40.31 † (Ch. 25 + 40.5.) RSA (Chapter 25) rests on one hardness assumption. (a) Name it. (b) Name the algorithm that breaks it on a quantum computer. (c) Name the kind of mathematical structure many post-quantum replacement schemes are built on. (d) In one sentence, say why "the number theory you learned is the on-ramp, not the destination."
40.32 (Deep Dive — Ch. 36, 37 + 40.3.) The chapter groups three "you can know more than you can construct / decide" ideas: uncountability (Chapter 10), undecidability (Chapter 36), and non-constructive existence (the probabilistic method, §40.3). In one paragraph, explain the common thread: in each case, what do we establish, and what do we not get? Tie it to theme four of the book (computation and proof are complementary).
40.33 (Deep Dive — the whole book.) Pick any three rows of the tools table in §40.5. For each, write one sentence of the form: "Because I can now [concrete skill from this book], I am on the on-ramp to [research/advanced area], which I'd see in [industry use]." Make the skills genuinely things you can do after this book — a specific algorithm, proof technique, or construction — not vague gestures.
Solutions to † and odd-numbered exercises: see appendices/answers-to-selected.md. This being the
final chapter, the rubric rewards the meta-skill above all: correctly naming the structure of a
disguised problem, then attacking it with the standard tool. A model answer recognizes, sets up, and
only then computes.