Chapter 37 — Key Takeaways

DataField.Dev

Chapter 37 — Key Takeaways

Introduction to Complexity Theory: P, NP, and the Biggest Open Problem in CS — the reread-before-the-exam card.

The one question this chapter answers

When I can't make a problem fast: is it my fault (a fast algorithm exists, I haven't found it) or the problem's (no fast algorithm exists for anyone)?

Complexity theory gives the vocabulary to tell these apart.

Setup conventions

Study decision problems (yes/no output). Optimization ↔ decision: binary-search the budget $B$ → same tractability.
Input size $n$ = bits/symbols to write the instance. Cost = worst-case asymptotic time.
Efficient := polynomial time. Robust because polynomial time is model-independent (any reasonable model simulates another with polynomial overhead — the strong Church–Turing thesis).

P and NP (memorize these)

Class	Formal	Plain English	Examples
P	solvable in $O(n^k)$, fixed $k$	can find the answer fast	connectivity, shortest path, PRIMES, Euler circuit, 2-SAT
NP	yes-instance has poly-size certificate, verifier runs in poly time	can check a proposed answer fast	SAT, TSP (dec.), Hamilton circuit, CLIQUE, $k$-coloring

NP $\ne$ "not polynomial". NP = Nondeterministic Polynomial = poly-time verifiable.
Certificate = the proposed solution (tour, assignment, vertex set). Verifier $V(x,c)$ = poly-time checker.
yes-instance ⇒ some certificate passes; no-instance ⇒ no certificate passes.

$\text{P} \subseteq \text{NP}$ (proved): a verifier can ignore the certificate and re-solve. (Open: is the inclusion strict?)

The verify/find gap (🚪 threshold)

$\text{P} \stackrel{?}{=} \text{NP}$ asks: is finding a solution as easy as recognizing one?

The existential quantifier "∃ a solution" makes finding hard; a named witness discharges it, leaving only checking.
Almost everyone bets $\text{P} \ne \text{NP}$ (checking is genuinely easier). Unproven.

Reductions — DIRECTION IS EVERYTHING

$$A \le_p B \quad\Longleftrightarrow\quad \exists\ \text{poly-time } f:\ \big(x \in A \iff f(x) \in B\big)$$

Means: $A$ is no harder than $B$; $B$ is at least as hard as $A$. (Mnemonic: $\le$ → left side is the easier one.)
A fast algo for $B$ ⇒ a fast algo for $A$ (translate, then solve). Hardness flows backward: $A$ hard ⇒ $B$ hard.
Transitive: $A \le_p B$ and $B \le_p C$ ⇒ $A \le_p C$ (compose $g \circ f$; poly-of-poly is poly).

Concept (Ch. 36)	Concept (Ch. 37)	Difference
reduction for undecidability	reduction for NP-hardness	undecidability reduction need only be computable; complexity reduction must be polynomial-time

NP-hard vs. NP-complete

Term	Definition	In NP?
NP-hard	every $A \in \text{NP}$ has $A \le_p H$	not necessarily (can be harder / not even a decision problem)
NP-complete	in NP AND NP-hard	yes — the hardest problems within NP

Pivotal consequence: ONE NP-complete problem in P ⇒ $\text{P} = \text{NP}$ (all collapse together). So all NP-complete problems stand or fall together.

To prove $B$ is NP-complete: 1. Show $B \in \text{NP}$ (give a poly-time verifier / short certificate). 2. Show $B$ is NP-hard: reduce a known NP-complete problem TO $B$ → $\text{SAT} \le_p B$ (or $\text{3-SAT} \le_p B$). - ⚠️ The known-hard problem is the SOURCE. Reducing $B \le_p \text{SAT}$ proves nothing about $B$'s hardness.

Cook–Levin theorem (1971) — the keystone

SAT is NP-complete.

SAT = "is this Boolean formula satisfiable?" (∃ a true/false assignment making it true).
Proof idea: encode any NP verifier's poly-time computation as a Boolean formula that is satisfiable iff the input is a yes-instance. (Variables = machine config over time; clauses = correct start + valid steps + accepting end.)
Why it matters: gives the first NP-complete problem from scratch. After it, prove others NP-complete by a single reduction (transitivity) instead of arguing about all of NP.
SAT is in NP via the verifier: check each clause has ≥1 true literal → $O(\text{formula size})$.

The NP-complete zoo (all interreducible from SAT)

              Cook-Levin (from ALL of NP)
                       |
                      SAT
                       |  (break long clauses)
                    3-SAT  ────────────────────────┐
                    /   \              \            \
              CLIQUE   3-COLORING   SUBSET SUM   HAMILTON CIRCUIT
                |                                      |
          VERTEX COVER / INDEPENDENT SET            TSP (decision)

3-SAT $\le_p$ CLIQUE gadget: vertex per literal-occurrence; $m$ clauses = $m$ "columns"; edge iff different clause AND not contradictory ($x_i$ vs $\neg x_i$); set $k = m$. Clique of size $m$ = one true literal per clause, consistent.
Parameter-sensitivity: 2-SAT ∈ P, but 3-SAT is NP-complete. Tiny change crosses the tractability line.

P vs. NP — the \$1,000,000 question

One of 7 Millennium Prize Problems (Clay Institute, 2000), \$1M.
$\text{P} = \text{NP}$ ⟺ some (⟺ every) NP-complete problem has a poly-time algorithm.
If $\text{P} = \text{NP}$: intractable problems become easy; most cryptography collapses (breaking a code is an NP problem).
If $\text{P} \ne \text{NP}$: the hardness is permanent; finding ≠ checking, provably.
Factoring (RSA, Ch. 25): in NP, but not known to be NP-complete — possibly a middle zone.

Sacrifice	Strategy	Example	Keep
general	exploit input structure	2-SAT, trees, small parameters (FPT)	exact + fast on restricted class
exact	approximation / heuristics	Christofides ($\le\frac32\times$, metric TSP, Ch.30); nearest-neighbor; 2-opt	fast + general, near-optimal
fast	smart exponential search	branch-and-bound; Held–Karp $O(n^2 2^n)$; modern SAT solvers (millions of vars!)	exact + general, tolerable in practice

"NP-complete" bounds the WORST case over ALL inputs — not a verdict on your instances. Real inputs are rarely worst-case. The label tells you which tool to reach for, not "give up."

Common pitfalls

Pitfall	Fix
Reading NP as "non-polynomial"	NP = poly-verifiable; $\text{P} \subseteq \text{NP}$ (sorting is in NP).
Reducing $B \le_p \text{SAT}$ to prove $B$ hard	Backwards. Need $\text{SAT} \le_p B$ — known-hard problem is the source.
Forgetting step (1) "$B \in \text{NP}$"	NP-complete = NP-hard AND in NP. Both required.
"NP-complete ⇒ unsolvable / give up"	It's solvable (exp. search); worst-case ≠ your case; use the coping menu.
Confusing undecidable (Ch.36) with intractable (Ch.37)	Undecidable = no algorithm at any speed; intractable = exists but super-poly.
Assuming a fast factoring algo proves $\text{P}=\text{NP}$	Factoring isn't known NP-complete; the implication doesn't follow.

Toolkit addition (`graphs.py`, builds on Ch.27–34)

Function	Returns	Cost	Point
`verify_hamilton_circuit(g, tour)`	is `tour` a valid Hamilton circuit?	$O(V+E)$	the NP verifier — checks, never finds

Toolkit holds verifiers and tractable solvers — never an efficient NP-complete solver (that would prove $\text{P}=\text{NP}$). Encodes the verify/find gap in code.

Cross-references

Builds on: Ch.14 (Big-O, polynomial vs. exponential, tractable/intractable, lower bounds), Ch.30 (TSP, Hamilton, NP-hard preview, the easy/hard boundary, coping menu), Ch.36 (Turing machines, decidable/undecidable, reductions, Church–Turing thesis).
Synthesizes: Ch.1 (SAT = Boolean logic), Part V graphs (CLIQUE, Hamilton, coloring), Ch.9 (reductions = poly-time functions).
(No forward references — Ch.37 is a terminal node in the dependency map.)

Chapter 37 — Key Takeaways

The one question this chapter answers

Setup conventions

P and NP (memorize these)

The verify/find gap (🚪 threshold)

Reductions — DIRECTION IS EVERYTHING

NP-hard vs. NP-complete

Cook–Levin theorem (1971) — the keystone

The NP-complete zoo (all interreducible from SAT)

P vs. NP — the \$1,000,000 question

Coping menu — give up exactly ONE of

Common pitfalls

Toolkit addition (graphs.py, builds on Ch.27–34)

Cross-references

Toolkit addition (`graphs.py`, builds on Ch.27–34)