Exercises: Algorithms and Complexity

DataField.Dev

Exercises: Algorithms and Complexity

These run from mechanical warm-ups (count the operations) up to full asymptotic proofs, instrumented code, modeling, and a first lower bound. Difficulty: ⭐ foundational, ⭐⭐ intermediate, ⭐⭐⭐ challenging. Worked solutions to the daggered (†) and odd-numbered problems are in the appendix answers-to-selected.md — try them before you peek. For every Big-O/Ω/Θ proof, the rubric rewards exhibited witnesses $(c, n_0)$ and a verified inequality "for all $n \ge n_0$," exactly as in §14.3.

Part A — Warm-ups: count the work (⭐)

These reinforce the two mechanical skills of §14.2–14.5: counting operations in the RAM model, and naming the resulting growth class. Do them quickly; they are the muscle memory the rest of the set assumes.

14.1 † For each line of maximum from §14.1, state how many primitive RAM operations it performs on a list of length $n$ (as a constant, or as a function of $n$). Then write the total $T(n)$ as a sum and name its growth class with $\Theta$.

14.2 A loop runs for i in range(n): total = total + values[i]. Give the per-pass operation count in the RAM model and the overall asymptotic running time.

14.3 † Classify each into the standard hierarchy by giving the simplest $g$ with the expression $= \Theta(g)$: (a) $8n + 200$, (b) $3n^2 + 17n$, (c) $\frac{n(n-1)}{2}$, (d) $6$, (e) $4n^3 + n^2 + n$.

14.4 Put these in increasing order of growth (slowest first), using the §14.5 hierarchy: $n!,\ n^2,\ \log n,\ 2^n,\ n,\ n \log n,\ \sqrt{n},\ 1$.

14.5 † An integer $N$ is given as input to a factoring routine. Is the input size $N$ or $\log_2 N$? If the routine performs $\sqrt{N}$ trial divisions, express that work as a function of the input size $n$, and say whether it is polynomial or exponential in $n$.

14.6 For binary_search from §14.4 on a sorted list of length $n = 1{,}000{,}000$, give the maximum number of loop iterations (you may use $\log_2(10^6) \approx 20$). Compare to the worst case of linear_search on the same list, and use the comparison to explain in one sentence why a sorted input is worth the cost of sorting when you will search it many times.

Part B — Prove it (⭐⭐)

Each claim must be proved straight from the definition in §14.3: exhibit witnesses $c$ and $n_0$ and verify the inequality for all $n \ge n_0$. State your strategy in one sentence first, as the chapter does. Remember the recipe: for an upper bound, pick $c$ slightly above the leading coefficient and let the slack absorb the lower-order terms; for a lower bound, throw away the helpful smaller terms and keep the leading one. Witnesses are never unique — any valid pair earns full credit.

14.7 † Prove $5n^2 + 3n + 100 = O(n^2)$ by giving explicit witnesses $c$ and $n_0$.

14.8 Prove $5n^2 + 3n + 100 = \Omega(n^2)$, and conclude $5n^2 + 3n + 100 = \Theta(n^2)$. (For the $\Omega$ direction, note $5n^2 + 3n + 100 \ge 5n^2$ for every $n \ge 0$.)

14.9 † (Scaffolded — fill in the missing step.) Prove $2n^3 + 4n = O(n^3)$. Fill the blanks: - Strategy: choose $c$ just above the leading coefficient $2$, leaving margin to absorb $4n$. - Proof. Choose $c = \underline{\hphantom{xx}}$ and $n_0 = \underline{\hphantom{xx}}$. For all $n \ge n_0$ we have $4n \le \underline{\hphantom{xxxx}}$ (because $n^2 \ge \underline{\hphantom{xx}}$ once $n \ge n_0$), and therefore $$2n^3 + 4n \le 2n^3 + \underline{\hphantom{xxxx}} = \underline{\hphantom{xxxx}} = c\,n^3.$$ This is the definition of $2n^3 + 4n = O(n^3)$ with witnesses $\big(\underline{\hphantom{xx}},\, \underline{\hphantom{xx}}\big)$. $\blacksquare$

14.10 Prove that $n = O(n^2)$ but $n^2 \ne O(n)$. (For the second part, assume witnesses existed and derive a contradiction, as in the §14.3 Check Your Understanding.)

14.11 † Prove the log-base-change fact the chapter relied on: for any fixed bases $a, b > 1$, $\log_a n = \Theta(\log_b n)$. (Use $\log_a n = \frac{\log_b n}{\log_b a}$ and treat $\frac{1}{\log_b a}$ as a constant.) Conclude that writing $O(\log n)$ with no base is justified.

14.12 Prove $\log_2 n = O(n)$. (Strategy: it suffices to show $\log_2 n \le n$ for $n \ge 1$; you may use that $n < 2^n$, the chapter's exercise-set fact $n! \ge 2^{n-1}$ is not needed here.) Then explain in one sentence why this, together with the hierarchy of §14.5, means a $\Theta(\log n)$ algorithm is never asymptotically slower than a $\Theta(n)$ one.

14.13 † Prove the transitivity of $O$: if $f(n) = O(g(n))$ and $g(n) = O(h(n))$, then $f(n) = O(h(n))$. (Combine the two witness pairs; take the larger threshold.)

14.14 Prove the sum rule for $O$: if $f_1 = O(g_1)$ and $f_2 = O(g_2)$, then $f_1 + f_2 = O(\max(g_1, g_2))$. (Let $g = \max(g_1, g_2)$; bound each summand by a constant times $g$.)

14.15 † Prove that $n! \ge 2^{n-1}$ for all $n \ge 1$, and use it to justify the chapter's claim that $\Theta(n!)$ grows at least as fast as $\Theta(2^n)$ — i.e. $2^n = O(n!)$. (Induction on $n$ is the clean route; this also reuses Chapter 6.)

Part C — Implement it (⭐⭐)

Write Python for each. Do not run it on the grader's machine — hand-trace and include an # Expected output: comment, matching the book's convention. The point of these is to make the operation count visible: instead of returning the algorithm's normal answer, each function returns the number of primitive operations it performed, so you can compare the count against the asymptotic bound you derived on paper. Reference implementations live in code/exercise-solutions.py.

14.16 † Write a function count_comparisons(values) that runs the has_duplicate_slow algorithm from §14.4 but, instead of returning a boolean, returns the number of values[i] == values[j] comparisons it performs in the worst case (no duplicate present). Hand-trace it on a 4-element list with no duplicates and confirm the count matches $n^2$ (the loops never short-circuit). Then state in one line why the best case (a duplicate in the first two slots) is only $O(1)$.

14.17 Write count_halvings(n) that returns how many times you can integer-divide n by 2 before reaching 0 (i.e. simulate the trip count of a halving loop). Hand-trace count_halvings(16) and relate the result to $\lfloor \log_2 n \rfloor + 1$.

14.18 † Implement linear_search_compares(values, target) returning the number of element comparisons made (not the index). Then write three lines computing it for the best, worst, and a middle position of target in a 5-element list, and confirm they match the §14.6 analysis ($\Theta(1)$, $\Theta(n)$, and something in between).

14.19 Instrument the count_pairs algorithm from §14.4 to return the number of times the inner if executes, and verify by hand on a list of length $n = 5$ that the count equals $\frac{n(n-1)}{2} = 10$.

Part D — Find the error (⭐⭐)

Each "proof" or claim below is wrong. State precisely which part fails and why — name the category of the mistake (e.g. "a constant that secretly depends on $n$", "confusing the bound shape with the input case"), not just the symptom. Diagnosing a flawed argument is harder than writing a correct one, and it is exactly the skill a code reviewer needs when a colleague's complexity claim "looks right."

14.20 † Claim: $n^2 = O(n)$. "Proof": "Take $c = n$. Then $n^2 = n \cdot n \le c \cdot n$ for all $n \ge 1$, so $n^2 = O(n)$ with witness $c = n$." — Find the flaw.

14.21 Claim: "Two nested loops always run in $O(n^2)$." A student supports it with for i in range(n): for j in range(5): work(). Explain why the claim is false and give the correct running time of this specific code.

14.22 † Claim: $2^{n+1} = O(2^n)$ is false because "the exponent grew." "Proof": "No constant $c$ can satisfy $2^{n+1} \le c \cdot 2^n$, since the left side keeps doubling." — Is the claim ($2^{n+1} = O(2^n)$) true or false? Find the error in the reasoning and give correct witnesses if the claim holds.

14.23 Claim & "proof": "Since the average-case time of linear search is $\Theta(n)$, its worst-case time is also $\Theta(n)$, and therefore $O$ and 'worst case' mean the same thing." — Two things are conflated here. Name the category error (cf. the §14.6 pitfall) and explain why the conclusion does not follow even though both numbers happen to be $\Theta(n)$.

14.24 † Claim: "An $O(n^{100})$ algorithm is exponential, because $100$ is a huge power." Diagnose the error using the §14.5 definitions of polynomial vs. exponential time, and state whether the algorithm is nonetheless "fast" in practice.

Part E — Model it (⭐⭐⭐)

Translate each real scenario into the discrete-math objects of this chapter (a running-time function, a growth class, a best/worst/average distinction, or a lower bound). You are setting up the analysis, not just guessing an answer.

14.25 † A photo-sharing service stores $n$ user records. A new feature compares every pair of users to suggest "people you may know." Model the number of comparisons as a function $T(n)$, name its $\Theta$-class, and use the §14.5 timing table (1 op = 1 ns) to estimate the wall-clock cost at $n = 100{,}000$. Then state, in one sentence, what asymptotic improvement would make the feature feasible at $n = 10^8$.

14.26 A login system checks a typed password against a stored list of $n$ banned passwords using typed in banned_list. (a) Model the per-login cost as a growth class and identify the worst case. (b) The engineers switch banned_list to a set. Model the new per-login cost and name which §14.2 pitfall the original code fell into. (c) Frame the difference as a choice between two algorithms computing the same function (cf. §14.1). (d) An attacker who controls the typed input wants the slowest possible check. Which data structure leaves them no leverage, and how does this connect to the §14.6 remark that worst-case guarantees matter for security?

14.27 † (Model a lower bound.) A puzzle game must determine the minimum of $n$ distinct scores using only pairwise comparisons. Adapt the §14.6 "count the losers" argument to model why any such algorithm needs at least $n - 1$ comparisons, and state the resulting $\Omega$ bound on the problem. Is the obvious single-pass scan optimal? Justify with a matching $O$ bound.

14.28 A batch job runs two phases on $n$ records: phase 1 sorts them ($\Theta(n \log n)$), phase 2 does an all-pairs scan ($\Theta(n^2)$). A manager proposes spending the quarter optimizing phase 1 down to $\Theta(n)$. Model the total cost before and after using the sum rule (§14.4), and explain why the optimization barely helps. What would you optimize instead?

Part F — Conjecture and test (⭐⭐⭐)

Use code to gather evidence on many cases (theme four), then prove or disprove. Remember: a passing test is evidence, not a proof — and a failing test is a disproof, since a single counterexample refutes a universal claim. The workflow is always the same: form a conjecture, instrument something to test it across a range, look at the trend (not one data point), and then settle the question with a definition-level argument.

14.29 † Using the Toolkit's doubling_ratios from this chapter's Project Checkpoint, instrument a routine that does a single pass over its input (an $O(n)$ scan that counts one op per element). Predict the doubling ratios before running it in your head, report what doubling_ratios returns for start=10, doublings=3, and then prove that an $O(1)$-body single loop of $n$ passes is $\Theta(n)$ (so the ratio must approach $2$). Why is the empirical ratio "$\approx 2$" not by itself a proof of $\Theta(n)$?

14.30 Conjecture a closed form for the trip count of the doubly nested loop for i in range(n): for j in range(i): body(). Instrument it to count body() calls, tabulate the counts for $n = 1, \dots, 6$ by hand, conjecture a formula in $n$, then prove it by induction (Chapter 6) and give its $\Theta$-class.

14.31 † (Conjecture, then disprove a tempting "rule.") A classmate conjectures: "if $f(n) = O(g(n))$ then $2^{f(n)} = O\!\left(2^{g(n)}\right)$." Test it on the pair $f(n) = 2n$, $g(n) = n$ (note $2n = O(n)$). Compute $2^{f(n)}/2^{g(n)}$ for $n = 1, 2, 3, 4$ in your head, observe the trend, and disprove the conjecture with this counterexample. What does this reveal about treating $O$-statements inside an exponent?

14.32 Conjecture which is larger for large $n$: $n^{10}$ or $2^n$. Evaluate both at $n = 10, 30, 60$ (you may use $2^{10} \approx 10^3$), watch the crossover, and then prove that $n^{10} = O(2^n)$ using the chapter's fact that every polynomial is eventually beaten by any exponential. Identify, roughly, the $n$ beyond which $2^n$ wins permanently.

Part G — Interleaved & Deep Dive

Mixing techniques from earlier chapters keeps them sharp.

14.33 † (Ch. 12 + 14.) In §14.3 we noted $\Theta$ is symmetric and $O$ is not. Prove that the relation "$f \sim g$ iff $f = \Theta(g)$" is an equivalence relation on functions $\mathbb{N} \to \mathbb{R}_{>0}$ (reflexive, symmetric, transitive — Chapter 12). Then explain why "$f \preceq g$ iff $f = O(g)$" is a preorder but not a partial order, naming the property it lacks.

14.34 (Ch. 11 + 14.) Evaluate the loop cost $\sum_{i=1}^{n} \sum_{j=1}^{i} 1$ as a closed form using Chapter 11's summation results, then give its $\Theta$-class. This is the exact trip count of a common triangular nested loop.

14.35 † (Ch. 9 + 14.) Binary search's iteration count is $\lceil \log_2 n \rceil$. Recall from Chapter 9 that $\lceil \cdot \rceil$ is the ceiling function. Explain, using domain/codomain language, why the number of iterations must be a natural number even though $\log_2 n$ is generally irrational, and why this rounding does not change the $\Theta(\log n)$ classification.

14.36 (Ch. 6 + 14.) Prove by induction on $k$ that a halving loop starting from $n = 2^k$ performs exactly $k + 1$ iterations before the range size reaches $0$. Connect the result to $k = \log_2 n$ and to the §14.4 claim that binary search is $O(\log n)$.

14.37 † (Deep Dive — Ch. 12 + 14.) The hierarchy in §14.5 uses $\prec$. Prove that $\prec$ (defined by "$f \prec g$ iff $f = O(g)$ and $g \ne O(f)$") is transitive and irreflexive, so it behaves like a strict order on $\Theta$-classes. Then exhibit two functions that are incomparable under $\prec$ (neither is $O$ of the other), showing the order is only partial. (Hint for the last part: try a function that oscillates.)

14.38 (Deep Dive.) Prove $\log(n!) = \Theta(n \log n)$ — the bound the §14.5 hierarchy needs to place comparison sorting. (Upper bound: $n! \le n^n$, so $\log(n!) \le n \log n$. Lower bound: at least half the factors in $n!$ exceed $n/2$, so $n! \ge (n/2)^{n/2}$; take logs.) State which growth class this puts $\log(n!)$ in, and why that matters for sorting lower bounds you'll meet later.

Solutions to † and odd-numbered exercises: see appendices/answers-to-selected.md. For each asymptotic proof, the rubric rewards: a clearly stated strategy, explicit witnesses $(c, n_0)$, a verified inequality for all $n \ge n_0$, and the correct conclusion ($O$, $\Omega$, or $\Theta$). For "find the error" problems, name the category of the mistake, not just the symptom.