Exercises: Introduction to Complexity Theory

DataField.Dev

Exercises: Introduction to Complexity Theory

These build from mechanical class-placement warm-ups to full reductions, code, and modeling. Difficulty: ⭐ foundational, ⭐⭐ intermediate, ⭐⭐⭐ challenging. Worked solutions to the starred-with-a-dagger (†) and odd-numbered problems are in the appendix answers-to-selected.md; do them before you peek. Reference implementations for the "implement it" problems live in code/exercise-solutions.py.

A standing reminder for this chapter: the direction of a reduction is the single most error-prone point. Whenever you write $A \le_p B$, say out loud "$A$ is no harder than $B$; $B$ is at least as hard" before moving on.

Part A — Warm-ups: place the problem (⭐)

37.1 † For each decision problem, give a one-line certificate and the verifier's job, then say it is in NP. (a) PARTITION: can a multiset of integers be split into two parts with equal sum? (b) SUBGRAPH ISOMORPHISM: does graph $H$ appear as a subgraph of graph $G$? (c) DOMINATING SET: is there a set of $k$ vertices such that every vertex is in the set or adjacent to it?

37.2 Classify each statement as true or false (no justification yet — just commit): (a) $\text{P} \subseteq \text{NP}$. (b) NP means "not polynomial." (c) Every problem in P is in NP. (d) If a problem is NP-complete, no algorithm can solve it at any speed. (e) The decision version of TSP is in NP.

37.3 † "Sort a list" is not a decision problem. Phrase two different decision problems related to sorting, and for each, say whether it is in P and why.

37.4 Write the optimization problem "find the largest clique in $G$" as a decision problem with a budget parameter, and explain in one sentence how a fast algorithm for the decision version would let you find the size of the largest clique.

37.5 † State, from memory, the two things you must prove to establish that a problem $B$ is NP-complete, and the direction the reduction must go. (This is the chapter's keystone; if you cannot do it without notes, reread §37.4.)

Part B — Computation and short arguments (⭐⭐)

37.6 A 3-CNF formula has $m = 5$ clauses. In the $\text{3-SAT} \le_p \text{CLIQUE}$ reduction of §37.5, how many vertices does the constructed graph have, and what value of $k$ do we ask CLIQUE about? What is the maximum possible number of edges?

37.7 † Run the SAT verifier by hand. For the formula $\phi = (x_1 \lor \neg x_2) \land (\neg x_1 \lor x_3) \land (x_2 \lor \neg x_3)$, determine whether each assignment satisfies $\phi$, clause by clause: (a) $x_1 = T, x_2 = T, x_3 = T$; (b) $x_1 = F, x_2 = F, x_3 = F$; (c) $x_1 = T, x_2 = F, x_3 = T$.

37.8 Suppose $A \le_p B$ and you are given an algorithm for $B$ running in time $O(n^3)$, and the reduction $f$ runs in time $O(n^2)$ and produces an instance $f(x)$ of size at most $O(n^2)$. Give a big-O bound on the running time of the resulting algorithm for $A$, and explain each term. (This is the "compose two polynomials" argument of §37.4 made concrete.)

37.9 † In the $\text{3-SAT} \le_p \text{CLIQUE}$ reduction, consider the formula $\phi = (a \lor b \lor \neg c) \land (\neg a \lor \neg b \lor c)$. Build the graph by hand: list the 6 vertices, and decide for each of the 9 cross-clause pairs whether it is an edge (connected unless contradictory). How many edges are there? Does a clique of size $k = 2$ exist, and if so, what satisfying assignment does it correspond to?

37.10 The class P is closed under complement: if a decision problem is in P, so is its complement (swap yes and no answers). Argue this in two sentences. Then explain why the same simple argument does not obviously work for NP (this gap is the source of the open class co-NP, mentioned in the further reading).

37.11 † Give the certificate and verifier for GRAPH 3-COLORING ("can $G$ be properly colored with 3 colors?"). State the running time of your verifier in terms of $V$ and $E$, and confirm it is polynomial.

Part C — Prove it (with scaffolding) (⭐⭐)

37.12 † (Fill in the missing step.) Prove that PARTITION $\in$ NP. Recall PARTITION: given a multiset $S = \{s_1, \dots, s_n\}$ of positive integers, is there a subset $T \subseteq S$ with $\sum_{s \in T} s = \sum_{s \in S \setminus T} s$? Fill the blanks:

Certificate: a subset $T \subseteq S$ (encode it as $\underline{\hphantom{xxxxxxxx}}$, e.g. a bit string of length $n$). Its size is $\underline{\hphantom{xxxx}}$ bits, which is polynomial in the input.
Verifier $V(S, T)$: compute $\sigma = \sum_{s \in S} s$, then compute $\underline{\hphantom{xxxxxx}}$ and accept iff it equals $\underline{\hphantom{xxxx}}$. This takes $\underline{\hphantom{xxxx}}$ time (one pass), which is polynomial.
Correctness: if $S$ is a yes-instance there exists a valid $T$, so a certificate exists; if $S$ is a no-instance then $\underline{\hphantom{xxxxxxxxxxxx}}$, so the verifier rejects every candidate. $\blacksquare$

37.13 Prove that $\le_p$ is reflexive: every decision problem $A$ satisfies $A \le_p A$. (Hint: what is the simplest possible reduction function?) State explicitly why your $f$ is polynomial-time and why it preserves yes/no answers.

37.14 † (Fill in the missing step.) Prove: if $B \in \text{P}$ and $A \le_p B$, then $A \in \text{P}$. (This is "easiness flows backward" — the mirror of "hardness flows forward.") Fill the blanks:

Let $M_B$ solve $B$ in time $\underline{\hphantom{xxxx}}$, and let $f$ be the reduction from $A$ to $B$, computable in time $\underline{\hphantom{xxxx}}$, with $x \in A \iff \underline{\hphantom{xxxx}} \in B$.
Algorithm for $A$ on input $x$: compute $\underline{\hphantom{xxxx}}$, then return $\underline{\hphantom{xxxxxx}}$.
Running time: computing $f(x)$ is polynomial; $\lvert f(x) \rvert$ is $\underline{\hphantom{xxxxxx}}$ in $\lvert x \rvert$; so running $M_B$ on it is polynomial. The composition of polynomials is $\underline{\hphantom{xxxx}}$. Hence $A \in \text{P}$. $\blacksquare$
In one sentence, state the contrapositive of this theorem and explain why it is the engine of every NP-hardness proof.

37.15 Prove that INDEPENDENT SET $\le_p$ CLIQUE. (Recall: an independent set is a set of pairwise non-adjacent vertices.) Use the complement graph $\overline{G}$. State the reduction $f$, prove the iff ($S$ is an independent set of size $k$ in $G$ $\iff$ $S$ is a clique of size $k$ in $\overline{G}$), and confirm $f$ is polynomial-time. Which problem does this prove is NP-hard, given that CLIQUE is NP-complete — or does it prove nothing of the sort? (Mind the direction!)

Part D — Find the error (⭐⭐)

Each argument below is wrong. State precisely which part fails and why. Reversed-direction reductions are the most common bug in this whole subject — at least one of these is exactly that.

37.16 † Claim: "TSP (optimization) is in NP." "Proof": a tour is a certificate; given a tour we can add up its cost in $O(n)$ time and check it; therefore TSP is in NP. — Find the flaw. (Hint: what exactly is the verifier supposed to confirm for an optimization problem?)

37.17 Claim: "I proved my problem $B$ is NP-hard by exhibiting a polynomial-time reduction $B \le_p \text{3-SAT}$. Since 3-SAT is NP-complete, $B$ is NP-hard." — Find the flaw, and state the reduction the student should have built.

37.18 † Claim: "NP-complete problems are unsolvable, like the halting problem." — This conflates two different walls from the book. Explain the category error in two sentences, naming the precise difference between undecidable (Chapter 36) and intractable.

37.19 Claim: "Since $\text{P} \subseteq \text{NP}$ and we showed SAT $\in$ NP, SAT must be in P." — Find the flaw in this reasoning. What would it take to conclude SAT $\in$ P, and what would that imply?

37.20 † Claim: "2-SAT is in P and 3-SAT is NP-complete; therefore $k$-SAT gets steadily harder as $k$ grows, so 4-SAT is harder than 3-SAT, 5-SAT harder still, and so on." — There is a subtle error. Show that $\text{3-SAT} \le_p \text{4-SAT}$ (pad each clause with a fresh duplicated literal or a new variable) and explain what that says about the "steadily harder" claim. (Hint: all of 3-SAT, 4-SAT, $k$-SAT for $k \ge 3$ are NP-complete — they are equally hard, not steadily harder.)

Part E — Model it (⭐⭐⭐)

Translate each real scenario into the discrete-math objects of this chapter (a decision problem, a certificate, a verifier, and where relevant a known NP-complete problem to reduce from or to).

37.21 † (Model it.) A conference organizer must schedule talks into time slots so that no two talks sharing an attendee are in the same slot, using at most $k$ slots. Model this as a decision problem. Which classic NP-complete problem from the §37.5 zoo is this exactly (give the graph construction: what are the vertices, what are the edges, what is $k$)? State the certificate and verifier.

37.22 (Model it.) A delivery company has a depot and $n$ addresses with known pairwise driving times, and asks: "Is there a route starting and ending at the depot, visiting every address once, taking at most $B$ minutes?" Name the decision problem precisely, give its certificate and verifier, and name the reduction chain (from SAT) that establishes its NP-hardness — using only links stated in §37.5.

37.23 † (Model it.) A chip-verification team wants to know whether two Boolean circuits compute the same function. They reduce it to checking whether a certain formula is unsatisfiable (the circuits differ iff some input makes their XOR output true). (a) Explain why "are these circuits equivalent?" is naturally a question about unsatisfiability, and (b) discuss why this is good news in practice even though SAT is NP-complete — connect to the §37.6 coping menu and the remark about industrial SAT solvers.

37.24 (Model it.) A social network (the book's anchor, Chapters 27–34) wants to find the largest group of users who are all mutual friends, to seed a group chat. (a) State this as the decision problem CLIQUE on the friendship graph: what are vertices, edges, and the parameter? (b) Your manager says "just write an efficient exact algorithm." Using this chapter, explain in two sentences why you should push back, and propose one item from the coping menu suited to a real social graph.

Part F — Conjecture and test (⭐⭐⭐)

Use code to gather evidence on many cases, then prove or disprove. Remember theme four: a test suggests, a proof settles.

37.25 † (Conjecture and test, then prove.) Conjecture: for the brute-force SAT solver that tries all assignments, a formula on $v$ variables requires checking exactly $2^v$ assignments in the worst case. (a) Write a function count_assignments(v) that returns the number of truth assignments over $v$ variables and test it for $v = 0, 1, 2, 3, 4$ (hand-trace the output). (b) Prove the count is exactly $2^v$ by a one-line counting argument (each variable independently true/false — connect to Chapter 1's truth tables and the product rule). (c) Explain why this exponential blow-up is consistent with SAT being in NP but is not a proof that $\text{P} \ne \text{NP}$.

37.26 (Conjecture and test.) For the complement-graph reduction (Exercise 37.15), conjecture a formula relating the number of edges in $G$ and in $\overline{G}$ on $n$ vertices. (a) Write edge_counts(n, edges) returning the pair $(\lvert E(G) \rvert, \lvert E(\overline{G}) \rvert)$ and test it on the path graph $0\!-\!1\!-\!2\!-\!3$ (hand-trace). (b) Conjecture and then prove that $\lvert E(G) \rvert + \lvert E(\overline{G}) \rvert = \binom{n}{2}$ for every simple graph on $n$ vertices.

37.27 † (Conjecture and test, then disprove.) A classmate conjectures: "The nearest-neighbor heuristic for TSP (always go to the closest unvisited city) always returns the optimal tour." (a) Using the distance matrix from §37.6's Example 5 — or a small one you write out — compute the nearest-neighbor tour by hand from a start city, then compute at least one other tour's cost. (b) Exhibit a small instance where nearest-neighbor is provably not optimal (you may construct one). (c) Conclude: nearest-neighbor is an instance of which coping strategy, and what guarantee does it sacrifice?

Part G — Interleaved & Deep Dive

Mixing techniques from earlier chapters keeps them sharp — and complexity theory is where the whole book converges.

37.28 † (Ch. 36 + 37.) Both Chapter 36 and this chapter use reductions to spread hardness. State, in two sentences, the single key difference between a reduction used to prove a problem undecidable and a polynomial-time reduction used to prove a problem NP-hard. Why does the complexity setting need the extra adjective?

37.29 (Ch. 1 + 37.) SAT is "the logic of Chapter 1 made computational." Write the Boolean formula (in CNF) that is satisfiable iff at least one of the propositional variables $p, q, r$ is true and not all three are true. Then state its certificate and confirm it is a SAT instance.

37.30 † (Ch. 14 + 37.) Chapter 14 showed trial-division factoring is exponential in the number of bits of the input. (a) Explain why FACTORING (decision form: "does $N$ have a factor $\le k$ with $1 < k$?") is in NP. (b) The chapter notes FACTORING is not known to be NP-complete. Explain in two sentences why a fast factoring algorithm would not by itself prove $\text{P} = \text{NP}$, and why this matters for RSA (Chapter 25).

37.31 (Ch. 30 + 37.) Chapter 30 contrasted Euler circuits (easy) with Hamilton circuits (hard). (a) State which class each decision problem (EULER CIRCUIT, HAMILTON CIRCUIT) is known to belong to (be precise: P, NP, NP-complete). (b) Give the polynomial-time algorithm that puts EULER CIRCUIT in P (one sentence — recall the degree condition). (c) Explain why this pair is the perfect illustration of "small change in a problem can cross the tractability boundary."

37.32 † (Ch. 9 + 37.) A polynomial-time reduction is a function $f$ (Chapter 9) with two extra properties. Name both properties precisely (one about resources, one about correctness), and state what it would mean, in function terms, for $f$ to additionally be a bijection — and whether reductions are required to be bijective. (They are not; explain why surjectivity/injectivity are irrelevant to the definition.)

37.33 (Deep Dive.) Prove the transitivity of $\le_p$ in full (the §37.4 result), being careful about the size of intermediate instances: if $A \le_p B$ via $f$ and $B \le_p C$ via $g$, show $h = g \circ f$ is a polynomial-time reduction from $A$ to $C$. Identify the one place where "the output of a polynomial-time algorithm has polynomial length" is essential, and explain what would go wrong without it.

37.34 (Deep Dive.) The "pivotal consequence" of §37.4 says: if one NP-complete problem is in P, then $\text{P} = \text{NP}$. Reconstruct this proof from scratch, then state and prove the contrapositive form ("if $\text{P} \ne \text{NP}$, then no NP-complete problem is in P"). Explain why the two NP-complete problems CLIQUE and 3-SAT "stand or fall together."

37.35 (Deep Dive — synthesis.) In one page, explain to a skeptical friend who has taken no complexity theory why almost all experts believe $\text{P} \ne \text{NP}$, despite there being no proof. Your argument must use at least three ideas from this chapter: the verify/find asymmetry (§37.3), the reduction web tying thousands of problems into one knot (§37.5), and the decades of collective failure to find a single fast NP-complete algorithm. Be honest about the difference between strong evidence and proof (theme four).

Solutions to † and odd-numbered exercises: see appendices/answers-to-selected.md. For each reduction, the rubric rewards: the correct direction (the known-hard problem is the source), a polynomial-time construction $f$, a proven iff (both directions), and a membership argument ($B \in \text{NP}$) where NP-completeness is claimed.