Chapter 37 β Key Takeaways
Introduction to Complexity Theory: P, NP, and the Biggest Open Problem in CS β the reread-before-the-exam card.
The one question this chapter answers
When I can't make a problem fast: is it my fault (a fast algorithm exists, I haven't found it) or the problem's (no fast algorithm exists for anyone)?
Complexity theory gives the vocabulary to tell these apart.
Setup conventions
- Study decision problems (yes/no output). Optimization β decision: binary-search the budget $B$ β same tractability.
- Input size $n$ = bits/symbols to write the instance. Cost = worst-case asymptotic time.
- Efficient := polynomial time. Robust because polynomial time is model-independent (any reasonable model simulates another with polynomial overhead β the strong ChurchβTuring thesis).
P and NP (memorize these)
| Class | Formal | Plain English | Examples |
|---|---|---|---|
| P | solvable in $O(n^k)$, fixed $k$ | can find the answer fast | connectivity, shortest path, PRIMES, Euler circuit, 2-SAT |
| NP | yes-instance has poly-size certificate, verifier runs in poly time | can check a proposed answer fast | SAT, TSP (dec.), Hamilton circuit, CLIQUE, $k$-coloring |
- NP $\ne$ "not polynomial". NP = Nondeterministic Polynomial = poly-time verifiable.
- Certificate = the proposed solution (tour, assignment, vertex set). Verifier $V(x,c)$ = poly-time checker.
- yes-instance β some certificate passes; no-instance β no certificate passes.
$\text{P} \subseteq \text{NP}$ (proved): a verifier can ignore the certificate and re-solve. (Open: is the inclusion strict?)
The verify/find gap (πͺ threshold)
$\text{P} \stackrel{?}{=} \text{NP}$ asks: is finding a solution as easy as recognizing one?
- The existential quantifier "β a solution" makes finding hard; a named witness discharges it, leaving only checking.
- Almost everyone bets $\text{P} \ne \text{NP}$ (checking is genuinely easier). Unproven.
Reductions β DIRECTION IS EVERYTHING
$$A \le_p B \quad\Longleftrightarrow\quad \exists\ \text{poly-time } f:\ \big(x \in A \iff f(x) \in B\big)$$
- Means: $A$ is no harder than $B$; $B$ is at least as hard as $A$. (Mnemonic: $\le$ β left side is the easier one.)
- A fast algo for $B$ β a fast algo for $A$ (translate, then solve). Hardness flows backward: $A$ hard β $B$ hard.
- Transitive: $A \le_p B$ and $B \le_p C$ β $A \le_p C$ (compose $g \circ f$; poly-of-poly is poly).
| Concept (Ch. 36) | Concept (Ch. 37) | Difference |
|---|---|---|
| reduction for undecidability | reduction for NP-hardness | undecidability reduction need only be computable; complexity reduction must be polynomial-time |
NP-hard vs. NP-complete
| Term | Definition | In NP? |
|---|---|---|
| NP-hard | every $A \in \text{NP}$ has $A \le_p H$ | not necessarily (can be harder / not even a decision problem) |
| NP-complete | in NP AND NP-hard | yes β the hardest problems within NP |
Pivotal consequence: ONE NP-complete problem in P β $\text{P} = \text{NP}$ (all collapse together). So all NP-complete problems stand or fall together.
To prove $B$ is NP-complete: 1. Show $B \in \text{NP}$ (give a poly-time verifier / short certificate). 2. Show $B$ is NP-hard: reduce a known NP-complete problem TO $B$ β $\text{SAT} \le_p B$ (or $\text{3-SAT} \le_p B$). - β οΈ The known-hard problem is the SOURCE. Reducing $B \le_p \text{SAT}$ proves nothing about $B$'s hardness.
CookβLevin theorem (1971) β the keystone
SAT is NP-complete.
- SAT = "is this Boolean formula satisfiable?" (β a true/false assignment making it true).
- Proof idea: encode any NP verifier's poly-time computation as a Boolean formula that is satisfiable iff the input is a yes-instance. (Variables = machine config over time; clauses = correct start + valid steps + accepting end.)
- Why it matters: gives the first NP-complete problem from scratch. After it, prove others NP-complete by a single reduction (transitivity) instead of arguing about all of NP.
- SAT is in NP via the verifier: check each clause has β₯1 true literal β $O(\text{formula size})$.
The NP-complete zoo (all interreducible from SAT)
Cook-Levin (from ALL of NP)
|
SAT
| (break long clauses)
3-SAT βββββββββββββββββββββββββ
/ \ \ \
CLIQUE 3-COLORING SUBSET SUM HAMILTON CIRCUIT
| |
VERTEX COVER / INDEPENDENT SET TSP (decision)
- 3-SAT $\le_p$ CLIQUE gadget: vertex per literal-occurrence; $m$ clauses = $m$ "columns"; edge iff different clause AND not contradictory ($x_i$ vs $\neg x_i$); set $k = m$. Clique of size $m$ = one true literal per clause, consistent.
- Parameter-sensitivity: 2-SAT β P, but 3-SAT is NP-complete. Tiny change crosses the tractability line.
P vs. NP β the \$1,000,000 question
- One of 7 Millennium Prize Problems (Clay Institute, 2000), \$1M.
- $\text{P} = \text{NP}$ βΊ some (βΊ every) NP-complete problem has a poly-time algorithm.
- If $\text{P} = \text{NP}$: intractable problems become easy; most cryptography collapses (breaking a code is an NP problem).
- If $\text{P} \ne \text{NP}$: the hardness is permanent; finding β checking, provably.
- Factoring (RSA, Ch. 25): in NP, but not known to be NP-complete β possibly a middle zone.
Coping menu β give up exactly ONE of
| Sacrifice | Strategy | Example | Keep |
|---|---|---|---|
| general | exploit input structure | 2-SAT, trees, small parameters (FPT) | exact + fast on restricted class |
| exact | approximation / heuristics | Christofides ($\le\frac32\times$, metric TSP, Ch.30); nearest-neighbor; 2-opt | fast + general, near-optimal |
| fast | smart exponential search | branch-and-bound; HeldβKarp $O(n^2 2^n)$; modern SAT solvers (millions of vars!) | exact + general, tolerable in practice |
"NP-complete" bounds the WORST case over ALL inputs β not a verdict on your instances. Real inputs are rarely worst-case. The label tells you which tool to reach for, not "give up."
Common pitfalls
| Pitfall | Fix |
|---|---|
| Reading NP as "non-polynomial" | NP = poly-verifiable; $\text{P} \subseteq \text{NP}$ (sorting is in NP). |
| Reducing $B \le_p \text{SAT}$ to prove $B$ hard | Backwards. Need $\text{SAT} \le_p B$ β known-hard problem is the source. |
| Forgetting step (1) "$B \in \text{NP}$" | NP-complete = NP-hard AND in NP. Both required. |
| "NP-complete β unsolvable / give up" | It's solvable (exp. search); worst-case β your case; use the coping menu. |
| Confusing undecidable (Ch.36) with intractable (Ch.37) | Undecidable = no algorithm at any speed; intractable = exists but super-poly. |
| Assuming a fast factoring algo proves $\text{P}=\text{NP}$ | Factoring isn't known NP-complete; the implication doesn't follow. |
Toolkit addition (graphs.py, builds on Ch.27β34)
| Function | Returns | Cost | Point |
|---|---|---|---|
verify_hamilton_circuit(g, tour) |
is tour a valid Hamilton circuit? |
$O(V+E)$ | the NP verifier β checks, never finds |
Toolkit holds verifiers and tractable solvers β never an efficient NP-complete solver (that would prove $\text{P}=\text{NP}$). Encodes the verify/find gap in code.
Cross-references
- Builds on: Ch.14 (Big-O, polynomial vs. exponential, tractable/intractable, lower bounds), Ch.30 (TSP, Hamilton, NP-hard preview, the easy/hard boundary, coping menu), Ch.36 (Turing machines, decidable/undecidable, reductions, ChurchβTuring thesis).
- Synthesizes: Ch.1 (SAT = Boolean logic), Part V graphs (CLIQUE, Hamilton, coloring), Ch.9 (reductions = poly-time functions).
- (No forward references β Ch.37 is a terminal node in the dependency map.)