Exercises: Capstone — Applying Discrete Mathematics

DataField.Dev

Exercises: Capstone — Applying Discrete Mathematics

Unlike every other chapter's problem set, these exercises do not introduce a new technique. They ask you to compose what you already have — to model a problem, assemble Toolkit pieces, defend the result, and report honestly. They span all four tracks (A: RSA, B: social-network analyzer, C: Sudoku solver, D: error-correcting codes) and the presentation skills of §39.7, and the final section interleaves the whole book. Difficulty: ⭐ foundational, ⭐⭐ intermediate, ⭐⭐⭐ challenging. Worked solutions to the daggered (†) and odd-numbered problems are in the appendix answers-to-selected.md; attempt them before you peek. No solutions live in this file.

A note on scope. A capstone problem rewards depth over breadth — the same advice §39.2 gave for the project itself. For the longer modeling and correctness exercises, a precise, modest claim you can defend beats an ambitious sketch you cannot. Where an exercise says "state the exact guarantee," that is the graded part.

Part A — Inventory and warm-ups (⭐)

These check that you can map the Toolkit to its mathematics and read off the load-bearing facts before you build anything.

39.1 † For each capstone track, name its load-bearing theorem (the one result the whole track rests on) and the dmtoolkit module that holds the operational core. Do this from memory, then check against the chapter's summary table.

39.2 The §39.1 inventory lists ten modules. Which single module does Track A's text_to_int / int_to_text layer not touch at all, even though the track as a whole depends on numbertheory.py? Explain in one sentence what each of those two functions actually does mathematically (hint: think of the "char → integer" map of Chapter 9).

39.3 † In the worked Track A key ($n = 3233$, $e = 17$, $d = 2753$), the message "A" has byte value 65 and $65 < 3233$, so it fits in one block. Compute the largest single byte value that the toy modulus can hold, and explain why "Hi" ($m = 18537$) does not fit.

39.4 Track B's analyzer reports the most-central vertex as the maximum-degree vertex. For the chapter's six-person graph, two vertices tie at degree 3. List both, and explain in one sentence why the chapter insists you report the tie rather than letting max silently pick one.

39.5 † State the minimum-distance bound from Track D in full: a code with minimum distance $d$ can detect up to how many errors, and correct up to how many? Then specialize both numbers to Hamming(7,4), whose minimum distance is $d = 3$.

39.6 Track C models Sudoku as a constraint satisfaction problem. Name its three ingredients (variables, domains, constraints) for $9\times 9$ Sudoku, and state how many pairwise inequalities the "all different" rule for a single row unpacks into. (It is a binomial coefficient.)

Part B — Prove it, with scaffolding (⭐⭐)

39.7 † (Scaffolded — fill the missing step.) Complete the proof that RSA decryption inverts encryption in the coprime case. Let $n = pq$, $\phi(n) = (p-1)(q-1)$, and $ed \equiv 1 \pmod{\phi(n)}$, and suppose $\gcd(m, n) = 1$.

Because $ed \equiv 1 \pmod{\phi(n)}$, we may write $ed = 1 + k\phi(n)$ for some integer $\underline{\hphantom{xxxx}}$.
By Euler's theorem (Chapter 23), since $\gcd(m, n) = 1$, we have $m^{\phi(n)} \equiv \underline{\hphantom{xxxx}} \pmod n$.
Therefore $m^{ed} = m^{1 + k\phi(n)} = m \cdot \big(m^{\phi(n)}\big)^{k} \equiv m \cdot \underline{\hphantom{xxxx}} = \underline{\hphantom{xxxx}} \pmod n$.

State which single hypothesis you used at each step.

39.8 (Scaffolded.) Complete the single-error-correction argument for Track D. Let $x$ be the transmitted codeword, $y$ the received word with exactly one flipped bit, and $x' \ne x$ any other codeword, in a code with minimum distance $d \ge 3$.

$\operatorname{dist}(x, y) = \underline{\hphantom{xx}}$ (one bit flipped).
$\operatorname{dist}(x, x') \ge \underline{\hphantom{xx}}$ (definition of minimum distance).
By the triangle inequality, $\operatorname{dist}(y, x') \ge \operatorname{dist}(x, x') - \operatorname{dist}(x, y) \ge \underline{\hphantom{xxxx}}$.
Since $\operatorname{dist}(y, x') > \operatorname{dist}(y, x)$, nearest-codeword decoding returns $\underline{\hphantom{xx}}$.

Name the one property of Hamming distance that the triangle-inequality step relies on.

39.9 † Prove the soundness half of the backtracking Sudoku solver: if solve(grid) returns True, the final grid violates no row, column, or box constraint. (Hint: a digit is only ever written after is_legal confirms it; and constraints, once satisfied, are never weakened by later placements. Structure it as an induction on the number of filled cells, exactly as §39.5 sketches.)

39.10 Prove that on the chapter's six-person friendship graph, the BFS distance from "Ana" to "Eve" is exactly 3 — not by running code but by exhibiting a 3-edge path and arguing no shorter path exists. (For "no shorter," use the bridge: every Ana→{Dev,Eve,Fay} path must cross the single edge Cam–Dev.)

Part C — Implement it (⭐⭐)

Write Python for each. Do not run it — hand-trace and include an # Expected output: comment, the book's convention. Each should be at most ~30 lines and compose Toolkit functions where possible. Reference implementations are in code/exercise-solutions.py.

39.11 † Write the blocking layer for Track A that §39.3 lists as the must-do extension. Given a byte string and a modulus n, write to_blocks(data, n) that splits the bytes into chunks each guaranteed to encode to an integer below n, and from_blocks(blocks) that reassembles them. Show, in an # Expected output: comment, that for a small n your blocking turns "Hi" into two single-byte blocks [72, 105] rather than the over-large 18537.

39.12 Write the "people you may know" recommender from Track B's extensions: a function recommend(g, v) that scores every non-neighbor w of v by the number of common friends $\lvert N(v) \cap N(w) \rvert$ (a set intersection — Chapter 8) and returns the candidates sorted by score, highest first. Run it by hand on the chapter's graph for v = "Ana" and put the result in the # Expected output: comment.

39.13 † Write the uniqueness checker from Track C's extensions: modify the backtracking solver so that instead of stopping at the first solution it counts solutions (stopping early once the count exceeds 1). Call it count_solutions(grid). State, in a comment, what a well-formed Sudoku must return, and hand-trace the return value on a $4\times 4$ "Shidoku"-style grid small enough to reason about (or describe the trace if you keep the full $9\times 9$).

39.14 Write the self_test integration test from the Project Checkpoint from memory: one assertion exercising each of numbertheory, crypto, graphs, and coding, returning True if all pass. Then add a fifth assertion of your own that exercises sieve from numbertheory.py (e.g., that sieve(10) returns the primes below or equal to 10). Hand-derive the expected primes.

Part D — Find the error (⭐⭐)

Each item below is a plausible-looking capstone mistake. State precisely what is wrong and how it would manifest — and note whether it fails loudly (an exception) or silently (a wrong answer that looks fine). Silent failures are the dangerous ones, which is exactly why a correctness argument matters.

39.15 † (Find the error — Track A.) A student's RSA text layer encrypts "Hi" directly under the toy key $n = 3233$ without blocking: c = rsa_encrypt(text_to_int("Hi"), pub), then decrypts. They report "it ran without errors, so it works." Explain exactly what rsa_decrypt returns and why it is not 18537, and name the constraint from §39.3 that was violated.

39.16 (Find the error — Track C.) A student deletes the line grid[r][c] = 0 from solve, reasoning "if the digit was legal when I placed it, why erase it?" Describe what the search now explores after the first dead end, and explain why the solver may now either fail to find a real solution or return an invalid board.

39.17 † (Find the error — correctness argument.) A write-up claims: "I tested decryption on 50,000 random messages and every one decrypted correctly, so RSA decryption is proven correct." This conflates two kinds of evidence. State which theme of the book it violates, and write the one sentence that would turn the demonstration into a guarantee (name the theorem and the chapter).

39.18 (Find the error — Track D.) A student claims Hamming(7,4) "is robust because it corrects any errors in a block." Correct the overclaim: state exactly how many errors it corrects per block, what happens when two bits flip, and why that failure is silent (the decoder returns a confident wrong answer). Connect your answer to the code rate $4/7$ and the robustness-vs-efficiency trade-off.

Part E — Model it (⭐⭐⭐)

These ask you to do the first and most important step of any capstone: translate a messy real scenario into discrete-math objects. Lead with the model — the model is the intellectual core.

39.19 † (Model it — a scheduling problem as a CSP.) A university must assign each of 12 final exams to one of 4 time slots so that no student sits two exams at once. You are given, for every pair of exams, whether some student is enrolled in both. Model this as a discrete-math object so that Track C's engine (the constraint-independent solve skeleton from §39.5) could solve it. Specify the variables, their domains, and the constraint — and name the classical graph problem this is in disguise (Chapter 33). What do the four time slots correspond to?

39.20 (Model it — a recommendation feed as a graph.) A music app wants a "fans also follow" feature for artists. Model the data as a graph and define, in graph-theory terms, (a) the recommendation score between two artists, (b) "the most influential artist," and (c) "two artists in unrelated scenes." For each, name the Toolkit tool from graphs.py you would reach for. State one way the model simplifies away something real (an honest limitation, §39.7).

39.21 † (Model it — a noisy storage channel as a code.) A backup system writes 4-bit nibbles to a flash chip that flips at most one bit per nibble before the next read. Model the reliability requirement as a coding-theory problem: what minimum distance $d$ must the code have, which Hamming code from §39.6 meets it, and what is the storage overhead (code rate) you pay? If the chip could instead flip two bits per nibble, explain in model terms (minimum distance) why Hamming(7,4) is no longer enough.

39.22 (Model it — a digital signature.) §39.3's extensions describe running RSA "backwards" to sign a message. Model the signing-and-verification protocol precisely: write the formula for the signature $s$ in terms of the private exponent $d$ and a hash $H(m)$, the formula the verifier checks using the public exponent $e$, and state in one sentence what property of RSA (which you proved in the correctness argument) makes verification succeed exactly when the signature is genuine.

Part F — Conjecture and test (⭐⭐⭐)

Use code on many cases to form a conjecture, then settle it with a proof or a counterexample — theme four of the book in its purest form. Remember: passing cases are evidence, not proof.

39.23 † (Conjecture and test, then prove.) For Track B, conjecture a relationship between a connected graph's diameter and its number of vertices $V$. Write code (do not run it) that, for paths $P_2, P_3, \dots, P_8$ (a line of $V$ vertices), computes the diameter via your diameter function and tabulates it against $V$. Read off the pattern, conjecture a formula for the diameter of $P_V$, then prove it. (Then note: is your formula an upper bound for all connected graphs on $V$ vertices, or exact only for paths? Justify.)

39.24 (Conjecture and test.) In Track A, does the order of the two primes matter? Conjecture whether rsa_keygen_from_primes(p, q, e) and rsa_keygen_from_primes(q, p, e) produce the same public modulus and the same private exponent. Test the conjecture by hand-deriving both for $(p, q) = (61, 53)$, then prove or disprove it from the definitions of $n = pq$ and $\phi(n) = (p-1)(q-1)$.

39.25 † (Conjecture and test — a Track D claim.) Conjecture: "In Hamming(7,4), flipping any single bit position changes the decoded 4 data bits, so the decoder always notices the flip." Write code that, for the codeword of [1,0,1,1], flips each of the 7 positions in turn, decodes, and checks whether the recovered data still equals [1,0,1,1]. Predict the seven results by hand. Then decide: is the conjecture true or false, and what does the correct statement of single-error correction say instead?

Part G — Ordinary computation (⭐ / ⭐⭐)

Mechanical practice on the quantities each track reports.

39.26 For the worked Track A key, encryption is $c = m^e \bmod n$ with $e = 17$, $n = 3233$. Using the fast-exponentiation idea (square-and-multiply, Chapter 23), and given that $65^{16} \equiv 1409 \pmod{3233}$, compute $65^{17} \bmod 3233$ by hand and confirm it equals the chapter's reported $c = 2790$.

39.27 † For the chapter's six-person graph, write out the full BFS distance dictionary from source "Dev" (every vertex's distance from Dev). Then state the eccentricity of Dev (its farthest distance) and compare it to the graph's diameter of 3.

39.28 Compute the Hamming distance between the two 7-bit strings [0,1,1,0,0,1,1] (the codeword of [1,0,1,1]) and [0,1,1,1,0,1,1]. Is the second string within correcting range of the first under Hamming(7,4)? State the rule you used.

39.29 † Run the integration test's first assertion by hand: compute $\gcd(252, 198)$ using the Euclidean algorithm (Chapter 22), showing each remainder step, and confirm it equals the asserted 18.

Part H — Interleaved & Deep Dive

These mix techniques from across the book — the capstone draws on everything, so this section does too.

39.30 (Ch. 17 + 39.) Track D's correction guarantee rests on codewords being spread far apart in $\{0,1\}^7$. Using the pigeonhole principle (Chapter 17), argue informally why a code on length-7 strings cannot have arbitrarily many codewords if every pair must differ in at least 3 positions. (Think of disjoint "balls" of radius 1 around each codeword that must not overlap.)

39.31 † (Ch. 5 + 39.) When a Sudoku has no solution, Track C's solver returns False by exhausting the search tree. Identify which Chapter 5 proof technique "check every case and find none works" is an instance of, and explain in two sentences how exhausting a finite tree establishes the universal negative "no satisfying assignment exists."

39.32 (Ch. 9 + 23 + 39.) Track A's text_to_int is a function in the precise sense of Chapter 9. State its domain and codomain, argue that it is injective (one-to-one) on byte strings of a fixed length, and explain why injectivity is exactly what makes int_to_text a well-defined inverse — tying this to why RSA's $m^{ed} \equiv m$ (Chapter 23) must hold for recovery, not just for some output.

39.33 † (Ch. 28 + 30 + 39.) Track B's components runs BFS once per component for total work $O(V + E)$, while a naive "BFS from every vertex" would be $O(V(V+E))$. Explain how the seen set achieves the speedup, and classify both running times using the Big-O notation of Chapter 30. For a sparse social graph where $E = O(V)$, what does each bound simplify to?

39.34 (Deep Dive — cross-track.) The chapter's Deep Dive path suggests building Tracks A and B together. Pick one shared discipline that both tracks demand — for example, "cite the proven theorem under the code" or "state the exact guarantee, not one inch more" — and write a one-paragraph argument, grounded in the two tracks' correctness sections, for why that discipline is what separates a mathematics capstone from a coding exercise.

39.35 (Deep Dive — the whole book.) Choose any one track and draft the five-section technical write-up of §39.7 for its core (problem/model, design/build, correctness, evidence, complexity and limits) in roughly half a page per section. The graded qualities are: a model stated before any code, a correctness claim that cites an earlier theorem by chapter, and a "limits" section that names at least one thing you did not verify. This is the capstone of the capstone — and the closest this book comes to asking you to produce a portfolio artifact.

Solutions to † and odd-numbered exercises: see appendices/answers-to-selected.md. For the modeling and write-up problems, the rubric rewards: the right discrete-math object chosen and named, a correctness claim that cites (does not re-derive) the load-bearing theorem, the exact guarantee stated with no overclaim, and an honest accounting of limits.