39 min read

> "The problem, then, is not 'can it be done?' but 'can it be done before the heat death of the universe?'"

Prerequisites

  • 27
  • 28

Learning Objectives

  • Characterize exactly which graphs have an Euler path or Euler circuit, and prove the characterization.
  • Distinguish Hamilton paths and circuits from Euler paths and circuits, and explain why the two problems behave so differently.
  • Explain why deciding Euler is efficient while deciding Hamilton is believed to be intractable, and place this on the easy/hard boundary (an NP-hard preview).
  • State the Traveling Salesman Problem in both decision and optimization form, and argue why brute force is hopeless.
  • Apply and analyze heuristics (nearest-neighbor, 2-opt, Christofides) that produce good-enough tours when optimal is out of reach.

Chapter 30: Euler Paths, Hamilton Paths, and the Traveling Salesman Problem

"The problem, then, is not 'can it be done?' but 'can it be done before the heat death of the universe?'" — a sentiment every working programmer eventually internalizes

Overview

Two problems are about to look almost identical, and one of them will quietly ruin your week if you try to brute-force it at scale.

The first: a mail carrier wants to walk down every street in a neighborhood exactly once and return to the truck. The second: a delivery driver wants to visit every house exactly once and return to the truck. These sound like the same problem with the nouns swapped — "every street" versus "every house." They are not. The first is easy in a precise, provable sense: there is a simple rule that tells you instantly whether such a walk exists, and a fast algorithm that finds it. The second is hard in an equally precise sense: nobody knows a fast algorithm, an enormous amount of money and effort suggests none exists, and finding the cheapest such tour is one of the most famously intractable problems in computer science.

This chapter is where the boundary between tractable and intractable first becomes visible. You have spent Part V learning algorithms that just work — breadth-first search, Dijkstra's algorithm — each running in time you can write down as a polynomial in the size of the graph. Here you meet the wall. The Traveling Salesman Problem (TSP) is the canonical example of a problem that is easy to state, easy to check an answer for, and apparently impossible to solve efficiently. Learning to recognize that wall — and what to do when you hit it — is one of the most valuable instincts a programmer can develop. The alternative is to ship code that works beautifully on your 10-city test case and times out catastrophically on the customer's 200 cities.

We start, as the field itself did, on a river in eighteenth-century Prussia.

In this chapter, you will learn to:

  • Decide whether a graph has an Euler path or circuit by counting odd-degree vertices, and prove why that count is the whole story.
  • Find an Euler circuit efficiently when one exists.
  • Recognize a Hamilton path/circuit problem and explain why no comparable shortcut is known.
  • State the TSP precisely and quantify why brute force fails (it's worse than exponential).
  • Reach for the right heuristic when optimal is unaffordable, and reason about how far from optimal a heuristic can land.

Learning Paths

🏃 Fast Track: Read 30.1–30.2 for Euler's theorem (the one clean, provable result here), then jump to 30.5 (TSP) and 30.6 (heuristics) for the practical payoff. Skim 30.3–30.4. Do the ⭐⭐ and ⭐⭐⭐ exercises.

📖 Standard Path: Read in order. Work every 🔄 Check Your Understanding. Before reading our proof of Euler's theorem in 30.2, try to convince yourself why odd-degree vertices are the obstruction.

🔬 Deep Dive: After the chapter, prove that nearest-neighbor can be made arbitrarily bad relative to optimal (Exercise set, Part E), and read ahead to how Chapter 37 turns the informal "NP-hard" of this chapter into a formal theorem about TSP.


30.1 The bridges of Königsberg

The Prussian city of Königsberg sat on both banks of the Pregel River, which split around two islands, leaving four landmasses connected by seven bridges. The townspeople had a Sunday pastime: could you take a stroll that crossed every one of the seven bridges exactly once? Not twice, not zero times — each bridge once, and end up wherever you liked. Nobody could find such a walk, but nobody could prove it was impossible either. It was a puzzle, the kind that feels like it should have an answer if you're just clever enough about the route.

In 1736 Leonhard Euler settled it, and in doing so arguably invented graph theory. His move is the move at the heart of this entire book: strip away everything irrelevant. The exact shape of the islands does not matter. The lengths of the bridges do not matter. What matters is only which landmasses are connected to which, and by how many bridges. So Euler replaced the city with four dots — one per landmass — and seven lines — one per bridge. The map became a graph.

Here is the structure. Label the landmasses $A$ (north bank), $B$ (south bank), $C$ (east island), $D$ (west island). The bridge counts were:

Between Number of bridges
$A$ and $D$ (west island) 2
$B$ and $D$ 2
$A$ and $C$ (east island) 1
$B$ and $C$ 1
$C$ and $D$ 1

That last point — more than one bridge between the same pair — means Königsberg is not a simple graph but a multigraph: a graph that allows multiple edges between the same two vertices. Everything in this section works for multigraphs; the degree of a vertex just counts edge-ends, so a double bridge contributes 2 to each endpoint's degree.

💡 Intuition: The whole question — "can I cross every bridge exactly once?" — is really "can I draw this graph without lifting my pen and without retracing any line?" That childhood puzzle ("draw the envelope without lifting your pen") is the bridges of Königsberg in disguise. By the end of 30.2 you'll be able to glance at any such figure and answer instantly.

Now compute the degree of each vertex (recall from Chapter 27 that $\deg(v)$ counts the edges incident to $v$):

  • $\deg(A) = 2 + 1 = 3$ (two bridges to $D$, one to $C$).
  • $\deg(B) = 2 + 1 = 3$.
  • $\deg(C) = 1 + 1 + 1 = 3$.
  • $\deg(D) = 2 + 2 + 1 = 5$.

Every vertex has odd degree. Euler's insight was that this is exactly what dooms the walk, and to see why, think about what a vertex experiences during a stroll.

Imagine you are walking a route that uses every bridge once. Consider any landmass that is not where you start or stop. Every time you walk into it across a bridge, you must later walk out of it across a different bridge — bridges come in "in-out" pairs at every intermediate stop. So the number of bridges touching an intermediate landmass must be even. The only landmasses that can get away with an odd number of bridges are the two endpoints of your walk: the start (which you leave without first arriving) and the finish (which you arrive at without leaving).

So a walk crossing every bridge exactly once can have at most two odd-degree landmasses. Königsberg has four. Therefore no such walk exists — not because the townspeople weren't clever, but because it is logically impossible. That is the difference between "we couldn't find one" and "there isn't one": a proof of impossibility, not the mere absence of a discovered route. (This is theme two of the book — proofs guarantee what testing never can.)

🔗 Connection. This "every entrance pairs with an exit" counting argument is a cousin of the handshaking lemma from Chapter 27 (the sum of all degrees is even because every edge contributes 2). Both are parity arguments — reasoning about evenness — and parity is one of the sharpest tools in the discrete-math kit. We'll make the Königsberg argument fully rigorous in the next section.

🔄 Check Your Understanding 1. Königsberg has four odd-degree vertices. If the city built one new bridge between $A$ and $B$, how many odd-degree vertices would there be, and would a once-each walk then be possible? 2. Why does a double bridge between two landmasses contribute 2 to each of their degrees rather than 1?

Answers

  1. Adding an $A$–$B$ bridge raises $\deg(A)$ from 3 to 4 and $\deg(B)$ from 3 to 4, both now even. That leaves only $C$ and $D$ odd — exactly two — so a once-each walk would now exist (starting at one of $C, D$ and ending at the other). This previews the theorem in 30.2. 2. Degree counts edges incident to the vertex (equivalently, edge-ends at the vertex), and a double bridge is two distinct edges. Each of the two edges has one end at each landmass, so each landmass gains 2.

30.2 Euler paths and circuits (and the theorem)

Let's promote Euler's argument from bridges to a general, provable theorem about graphs. First, the definitions — these are reserved to this chapter, so we state them carefully.

An Euler path in a graph $G$ is a path (more precisely, a trail: a walk that may repeat vertices but uses each edge exactly once) that traverses every edge of $G$ exactly once. An Euler circuit (also called an Euler tour) is an Euler path that starts and ends at the same vertex.

Note the careful wording: an Euler path may revisit vertices freely — you can pass through the same intersection many times — but it uses each edge exactly once. This is the key contrast with the Hamilton notions of the next section, where it's the vertices that must each be visited exactly once.

The Königsberg walk the townspeople wanted was an Euler path (cross every bridge once, ending anywhere). The mail carrier who must walk every street and return to the truck wants an Euler circuit. When does each exist? Euler's theorem gives a complete, instantly checkable answer.

Euler's Theorem (existence of Euler circuits and paths). Let $G$ be a connected graph (or multigraph) with at least one edge.

  1. $G$ has an Euler circuit if and only if every vertex has even degree.
  2. $G$ has an Euler path (that is not a circuit) if and only if exactly two vertices have odd degree. The Euler path must start at one odd-degree vertex and end at the other.

Before the proof, sit with how remarkable this is. To decide whether an Euler circuit exists, you do not search through possible routes — of which there can be astronomically many. You just count odd-degree vertices. That count is computable in time proportional to the size of the graph (one pass to tally degrees). A potentially exponential search collapses to a linear scan. This is what an easy problem looks like. Hold that feeling; in 30.4 we'll meet a problem that looks almost the same and offers no such shortcut.

💡 Intuition: Even degree means "every time you arrive, there's an unused edge to leave by." If every vertex has that property, you can never get stranded mid-tour — you only run out of edges back where you started. Odd degree is a "dead end risk": the two odd vertices are forced to be your start and finish, the two places where arriving and leaving don't have to balance.

The "only if" direction (the necessity, Euler's original argument)

The strategy first. The "only if" direction is the easy half and it's exactly the Königsberg counting argument, cleaned up. We assume an Euler circuit exists and show every degree must be even, by tracking how the circuit enters and leaves each vertex. Each visit to a vertex consumes exactly two edges (one in, one out), and the circuit uses every edge, so each vertex's edges are consumed two at a time — forcing even degree.

Proof (necessity for a circuit). Suppose $G$ has an Euler circuit $C$. Pick any vertex $v$ and walk along $C$. Each time the circuit passes through $v$, it arrives on one edge and departs on a different edge, consuming two of the edges incident to $v$. Because the circuit starts and ends at the same vertex, even the starting vertex is "passed through" in this paired sense: the very first departure pairs with the very last arrival. Since $C$ uses every edge of $G$ exactly once, every edge incident to $v$ is consumed, and they are consumed strictly in in/out pairs. Therefore the number of edges incident to $v$ — that is, $\deg(v)$ — is even. As $v$ was arbitrary, every vertex has even degree. $\blacksquare$

The path version is the same argument with one adjustment: the start and end vertices are no longer the same, so the start has one unpaired departure and the end has one unpaired arrival, making exactly those two vertices odd while all others stay even.

The "if" direction (sufficiency)

The hard half is the converse: if every degree is even (and $G$ is connected), then an Euler circuit actually exists. Necessity tells us where to not bother looking; sufficiency is what lets us build the tour. We give the standard constructive proof, because the construction is the algorithm we'll implement in the Project Checkpoint.

The strategy first. We argue by strong induction on the number of edges. The key construction: starting anywhere and greedily walking along unused edges, we can never get stuck except back at the start — because even degree guarantees an exit whenever we enter. That produces a closed circuit, but it might miss some edges. We then splice in additional closed circuits on the leftover edges, using connectivity to find a splice point. This "walk until you loop, then patch the gaps" idea is exactly Hierholzer's algorithm.

Proof (sufficiency for a circuit). Let $G$ be connected with every vertex of even degree, and argue by strong induction on the number of edges $m$.

Base case. If $m = 0$ there is nothing to traverse (or, for the smallest nontrivial connected even graph, a single vertex with a self-loop or a pair of vertices joined by two edges forms an obvious circuit). The claim holds vacuously / trivially.

Inductive step. Assume every connected even-degree graph with fewer than $m$ edges has an Euler circuit. Take such a $G$ with $m$ edges. Start at any vertex $v_0$ and walk, never reusing an edge, choosing any unused incident edge each time. Claim: this walk can only terminate by returning to $v_0$. Indeed, suppose the walk arrives at some vertex $w \ne v_0$ and gets stuck. Each previous visit to $w$ used edges in pairs (one in, one out); the current arrival uses one more edge into $w$. So an odd number of $w$'s edges have been used, but $\deg(w)$ is even, leaving at least one unused edge to depart by — contradicting "stuck." Hence the walk can only halt at $v_0$, producing a closed circuit $C_0$.

If $C_0$ uses every edge, we are done. Otherwise, remove the edges of $C_0$ from $G$. In the remaining graph $G'$, every vertex still has even degree (we removed an even number of edges at each vertex, since $C_0$ entered and left each vertex it touched in pairs). $G'$ may be disconnected, but because the original $G$ is connected, at least one vertex $u$ of $C_0$ is incident to leftover edges. The connected component of $G'$ containing $u$ is connected, has even degrees, and has fewer than $m$ edges, so by the inductive hypothesis it has an Euler circuit $C_u$ starting and ending at $u$. Now splice: traverse $C_0$ until you reach $u$, detour through the entire circuit $C_u$ (returning to $u$), then finish $C_0$. The result is a single closed circuit using strictly more edges. Repeat the splicing until all edges are used. This yields an Euler circuit of $G$. $\blacksquare$

⚠️ Common Pitfall: connectivity is not optional. Euler's theorem requires the graph to be connected (ignoring isolated vertices, which have degree 0 and don't matter). A graph can have every vertex of even degree and still have no Euler circuit if it falls into two separate pieces — you cannot traverse edges in a component you can't reach. Always check connectivity and the degree condition. The full rule is: connected, plus the right parity.

Worked example: does this graph have an Euler circuit?

Consider a graph on five vertices $\{1,2,3,4,5\}$ with edges $$\{1,2\},\{2,3\},\{3,4\},\{4,5\},\{5,1\},\{1,3\},\{3,5\}.$$

Compute degrees by tallying: vertex 1 appears in $\{1,2\},\{5,1\},\{1,3\}$ → $\deg(1)=3$; vertex 2 in $\{1,2\},\{2,3\}$ → $\deg(2)=2$; vertex 3 in $\{2,3\},\{3,4\},\{1,3\},\{3,5\}$ → $\deg(3)=4$; vertex 4 in $\{3,4\},\{4,5\}$ → $\deg(4)=2$; vertex 5 in $\{4,5\},\{5,1\},\{3,5\}$ → $\deg(5)=3$.

Two vertices (1 and 5) have odd degree; the rest are even. By Euler's theorem, there is no Euler circuit, but there is an Euler path, and it must start at one of $\{1,5\}$ and end at the other. (One such path: $1\to2\to3\to4\to5\to1\to3\to5$ — check that it uses each of the seven edges exactly once and starts at 1, ends at 5.)

Now the same logic in code. We won't search for the path yet — we'll just decide existence, which is the whole point of the theorem: deciding is cheap.

def euler_status(adj):
    """adj: dict vertex -> list of neighbors (undirected, may repeat for multigraph).
    Assumes the graph is connected (ignoring isolated vertices)."""
    odd = [v for v in adj if len(adj[v]) % 2 == 1]
    if len(odd) == 0:
        return "Euler circuit exists"
    if len(odd) == 2:
        return f"Euler path exists, between {odd[0]} and {odd[1]}"
    return "No Euler path or circuit"

g = {1: [2, 5, 3], 2: [1, 3], 3: [2, 4, 1, 5], 4: [3, 5], 5: [4, 1, 3]}
print(euler_status(g))
# Expected output:
# Euler path exists, between 1 and 5

The function does a single pass over the vertices, tallying parities — $O(V + E)$ work, exactly as promised. No route-searching. That's a tractable problem.

🔄 Check Your Understanding 1. A connected graph has degree sequence $4, 4, 4, 2, 2$. Euler circuit, Euler path, or neither? 2. In the sufficiency proof, where exactly did we use the assumption that every degree is even? Name both places. 3. The mail carrier wants to walk every street and return to the depot. Which of {Euler path, Euler circuit} is that, and what must be true of every intersection's degree?

Answers

  1. All degrees even and connected → Euler circuit. 2. (a) To show the greedy walk can only get stuck at the start: at any other vertex an odd number of used edges plus even total degree leaves an exit. (b) To show the leftover graph $G'$ still has all-even degrees, so the inductive hypothesis applies. 3. An Euler circuit (start = end = depot). Every intersection must have even degree, and the street network must be connected.

30.3 Hamilton paths and circuits

Swap the noun. Euler is about edges; the next problem is about vertices.

A Hamilton path in a graph $G$ is a path that visits every vertex of $G$ exactly once. A Hamilton circuit (or Hamilton cycle) is a Hamilton path that returns to its starting vertex — a cycle passing through every vertex exactly once.

The name honors William Rowan Hamilton, who in 1857 sold a puzzle ("the Icosian game") asking players to find a cycle visiting all 20 corners of a dodecahedron exactly once. Where Euler wanted to traverse every connection, Hamilton wanted to visit every place.

The delivery driver who must visit every house exactly once and return to the depot wants a Hamilton circuit. A circuit board tester that must probe every test point in one continuous stylus motion wants a Hamilton path. A genome-assembly algorithm stitching together overlapping DNA fragments is, under the hood, hunting for a Hamilton path through a graph of fragments.

The two problems look like mirror images, so you'd expect a mirror-image theorem: "a graph has a Hamilton circuit if and only if [some easily checked local condition]." No such theorem is known. This is not a gap in your education or in this textbook — it is one of the genuine asymmetries of mathematics. There is no known degree-counting shortcut, no local test, that decides Hamiltonicity. The best general method known is, in essence, search: try to build the path and backtrack when stuck.

💡 Intuition: Euler's condition is local — it depends only on each vertex's degree, checkable one vertex at a time. Hamiltonicity is global — whether a Hamilton circuit exists can depend on the entire structure in a way no single vertex reveals. Local properties are cheap to check; global ones, in general, are not. That distinction — local versus global — is much of why one problem is easy and the other is hard.

Sufficient conditions exist — but no clean characterization

We do have sufficient conditions: rules guaranteeing a Hamilton circuit exists. They're useful, but note what they are not — they are not "if and only if." A graph can fail them and still be Hamiltonian.

Dirac's Theorem (1952). If $G$ is a simple graph with $n \ge 3$ vertices and every vertex has degree at least $n/2$, then $G$ has a Hamilton circuit.

The intuition: if every vertex is connected to at least half the others, the graph is so densely woven that you can always route a cycle through all of them. Dirac's theorem is a real, provable result — but it's a one-way street. Many Hamiltonian graphs are sparse and violate the $n/2$ bound (a simple cycle $C_n$ is Hamiltonian — it is a Hamilton circuit — yet every vertex has degree only 2). So Dirac's condition lets you sometimes say "yes, definitely," but never "no, definitely." There is no known condition that lets you efficiently say "no" in general.

Worked example: contrasting Euler and Hamilton on the same graph

Take the complete graph $K_4$ — four vertices, every pair joined by an edge (recall $K_n$ from Chapter 27).

  • Euler: every vertex of $K_4$ has degree 3 (joined to the other three). Three is odd, and all four vertices are odd, so by Euler's theorem $K_4$ has neither an Euler path nor an Euler circuit.
  • Hamilton: $K_4$ obviously has a Hamilton circuit — e.g., $1\to2\to3\to4\to1$ visits every vertex once and returns. In fact every complete graph $K_n$ with $n \ge 3$ has a Hamilton circuit (just visit the vertices in any order; all edges exist).

So here is a graph with a Hamilton circuit but no Euler circuit. And Königsberg was (essentially) a graph being asked for an Euler path. The two properties are genuinely independent: a graph can have either, both, or neither.

from itertools import permutations

def has_hamilton_circuit_bruteforce(vertices, edge_set):
    """Brute force: try every ordering of the vertices as a candidate cycle.
    edge_set: set of frozenset({u, v}). WARNING: O(n!) — tiny graphs only."""
    v0 = vertices[0]                      # fix a start to avoid rotations
    rest = vertices[1:]
    for perm in permutations(rest):
        cycle = [v0] + list(perm) + [v0]
        if all(frozenset({cycle[i], cycle[i+1]}) in edge_set
               for i in range(len(cycle) - 1)):
            return True
    return False

V = [1, 2, 3, 4]
E = {frozenset(e) for e in [(1,2),(1,3),(1,4),(2,3),(2,4),(3,4)]}  # K4
print(has_hamilton_circuit_bruteforce(V, E))
# Expected output:
# True

Look hard at that permutations call. It is the tell. To decide Hamiltonicity by this method we march through orderings of the vertices — and there are $(n-1)!$ of them to consider (fixing the start). For $K_4$ that's $3! = 6$ candidates, trivial. For 15 cities it's $14! \approx 8.7 \times 10^{10}$. For 20 cities it's $19! \approx 1.2 \times 10^{17}$. The next section makes precise why this explosion is not a coding inefficiency you can optimize away.

🔄 Check Your Understanding 1. State the one-word difference between the definitions of an Euler circuit and a Hamilton circuit (what gets visited exactly once?). 2. Dirac's theorem says degree $\ge n/2$ guarantees a Hamilton circuit. The cycle graph $C_6$ (six vertices in a ring) is Hamiltonian but every vertex has degree 2. Does this contradict Dirac's theorem? 3. Does $K_5$ have an Euler circuit? A Hamilton circuit?

Answers

  1. Euler: every edge exactly once. Hamilton: every vertex exactly once. 2. No contradiction. Dirac's theorem is a sufficient condition (an "if-then"), not an "if and only if." $C_6$ fails the hypothesis ($2 < 3 = 6/2$), so the theorem simply says nothing about it; the theorem is never violated by a Hamiltonian graph that fails its hypothesis. 3. $K_5$: every vertex has degree 4 (even) and it's connected, so it has an Euler circuit. It also has a Hamilton circuit (any ordering of the 5 vertices, since all edges exist).

30.4 Why Hamilton is hard but Euler is easy

We've asserted the asymmetry; now let's make it sharp. Both problems ask "does a certain kind of route through the graph exist?" Yet:

  • Euler is decided by counting odd-degree vertices: $O(V + E)$ time. A polynomial in the input size — in fact linear. We call such problems tractable.
  • Hamilton has no known polynomial algorithm. Every general method known is, at its core, a search that can blow up exponentially. We strongly believe (though no one has proved) that no polynomial algorithm exists.

🚪 Threshold Concept: the tractable/intractable boundary. Up to now, "we have an algorithm" has felt the same as "the problem is solved." This chapter breaks that equivalence. There is a profound, structural difference between problems solvable in polynomial time ($n$, $n^2$, $n^3$, …) and problems for which the only known methods take exponential or factorial time ($2^n$, $n!$). Polynomial means "scales to large inputs"; exponential means "dies past a few dozen elements, forever, no matter how fast your hardware." Euler is on the easy side of this line; Hamilton is (apparently) on the hard side. Once you can feel this boundary, you read every new problem differently: the first question becomes not "how do I code this?" but "which side of the line is this on?" Almost everything in Chapter 37 — the theory of P, NP, and NP-completeness — is the project of mapping that line precisely.

Just how bad is exponential?

It is worth internalizing the numbers, because "exponential is slow" is an abstraction until you see it bite. Suppose a computer checks one billion ($10^9$) candidate routes per second. A brute-force Hamilton/TSP search examines roughly $(n-1)!$ orderings:

Cities $n$ Orderings $\approx (n-1)!$ Time at $10^9$/sec
10 $362{,}880$ < 0.001 second
15 $8.7 \times 10^{10}$ ~87 seconds
20 $1.2 \times 10^{17}$ ~3.9 years
25 $6.2 \times 10^{23}$ ~20 million years
30 $8.8 \times 10^{30}$ ~$2.8 \times 10^{14}$ years

The age of the universe is roughly $1.4 \times 10^{10}$ years. By 25 cities, brute force already needs a million times that. Buying a computer a thousand times faster moves you from 25 cities to … about 27. You cannot hardware your way out of factorial growth. This is the concrete meaning of "intractable," and it's why the epigraph asks whether a thing can be done "before the heat death of the universe."

💡 Intuition: Polynomial algorithms turn a 10× bigger input into a constant-factor more work (10× bigger input, 100× the time for an $n^2$ algorithm — annoying but survivable). Exponential algorithms turn each single added element into a multiplicative blowup. Polynomial scales; exponential surrenders.

NP-hard: naming the difficulty (a preview)

Computer scientists have a precise vocabulary for "as hard as the hardest problems of this type." A problem is NP-hard if it is at least as hard as every problem in the class NP — informally, the class of problems whose proposed solutions can be checked quickly even if finding them seems to require search. Both the Hamilton circuit problem and the TSP are NP-hard.

We are deliberately keeping this informal here; this is a preview. The formal machinery — what NP is, what a polynomial-time reduction is, what it means for TSP to be NP-complete — is the subject of Chapter 37, which builds directly on this chapter. For now, carry three takeaways:

  1. Checking is easy; finding is (apparently) hard. Given a proposed Hamilton circuit, you can verify it in linear time (walk it, confirm each vertex appears once and each consecutive pair is an edge). The asymmetry between verifying a solution and producing one is the entire drama of NP.
  2. NP-hard problems are linked. A fast algorithm for any one NP-hard problem would yield a fast algorithm for all of them. That's why a breakthrough on TSP would be earth-shaking — and why, after decades without one, most researchers believe none exists.
  3. "NP-hard" is a license to stop searching for a perfect fast algorithm and start looking for something cheaper-but-imperfect — which is exactly 30.6.

🔗 Connection. The question of whether NP-hard problems truly have no polynomial algorithm is the famous P vs. NP problem — one of the seven Millennium Prize Problems, with a \$1,000,000 bounty. Chapter 37 states it precisely. When you recognize TSP as NP-hard in a code review, you are invoking, informally, the same boundary that million-dollar problem is about.

🐛 Find the Error. A teammate says: "Euler and Hamilton are both 'visit-everything-once' problems, so I'll just adapt our fast Euler-circuit code to find Hamilton circuits — same idea, swap edges for vertices." What's wrong with this plan?

Answer

The Euler-circuit algorithm is fast precisely because Euler's theorem reduces existence to a local degree count, and Hierholzer's construction (30.2/Project Checkpoint) exploits that local structure to build a tour greedily without backtracking. Hamiltonicity has no such local characterization — there is no degree count or greedy splice that decides it. "Swap edges for vertices" is a verbal analogy, not an algorithmic one; the underlying mathematics that makes Euler easy simply does not transfer. The teammate would end up reinventing exponential backtracking search and calling it "the same idea." The deep lesson: surface similarity of problem statements says nothing about similarity of computational difficulty.


30.5 The Traveling Salesman Problem

Hamilton asks a yes/no question: does a tour visiting every vertex once exist? The Traveling Salesman Problem asks the harder, money-driven question every logistics company actually faces: of all such tours, which is cheapest?

The Traveling Salesman Problem (TSP) is: given a set of $n$ cities and the cost (distance, time, or price) of travel between each pair, find a tour that visits every city exactly once and returns to the start, minimizing the total cost.

A TSP tour is a Hamilton circuit — on the complete weighted graph where every pair of cities is joined by an edge labeled with its cost. TSP layers an optimization demand on top of Hamilton's existence demand. (On a complete graph a Hamilton circuit always exists, so TSP's difficulty is entirely in the minimization, not in finding some tour.)

TSP is everywhere logistics is: UPS and Amazon routing delivery vehicles, a CNC drilling machine ordering the holes to punch in a circuit board (move the drill the least total distance), a telescope scheduling the order to slew between targets, a tour operator sequencing stops. Anywhere "visit a set of things in the cheapest order and come back" appears, TSP is underneath.

Two forms: decision and optimization

It helps to state TSP two ways, because Chapter 37's theory is phrased in terms of decision problems (yes/no), while practice cares about optimization (the actual best tour).

  • Optimization TSP: output the minimum-cost tour (and its cost).
  • Decision TSP: given a budget $B$, is there a tour of total cost $\le B$? (Yes/no.)

These are tightly linked: if you could answer the decision version quickly for every budget, you could binary-search the budget to find the optimum quickly. So they stand or fall together on tractability. The decision form is the one proved NP-complete in Chapter 37; the optimization form is NP-hard.

Why brute force is hopeless (counting the tours)

How many distinct tours are there on $n$ cities? Fix a starting city (a tour is a cycle, so the start is just a rotation and doesn't create a new tour). That leaves $(n-1)!$ orderings of the remaining cities. If costs are symmetric (cost from $A$ to $B$ equals $B$ to $A$), each tour and its reverse are the same cycle, halving the count to $(n-1)!/2$.

So the number of genuinely distinct symmetric tours is $\frac{(n-1)!}{2}$. We already saw what factorial growth does. Exhaustively pricing every tour is the table from 30.4 all over again: fine to about 12 cities, hopeless by 25. Real delivery routes have hundreds of stops.

Let's make the brute force concrete on a tiny instance so the structure is unmistakable — then never speak of running it at scale again.

from itertools import permutations

def tsp_bruteforce(dist):
    """dist: n x n cost matrix. Returns (best_cost, best_tour). O(n!) — n <= ~10."""
    n = len(dist)
    cities = range(1, n)                       # fix city 0 as start/end
    best_cost, best_tour = float("inf"), None
    for perm in permutations(cities):
        tour = (0,) + perm + (0,)
        cost = sum(dist[tour[i]][tour[i+1]] for i in range(n))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour

D = [[0,   1,   2, 100],
     [1,   0,   3,   4],
     [2,   3,   0,   5],
     [100, 4,   5,   0]]
print(tsp_bruteforce(D))
# Expected output:
# (12, (0, 1, 3, 2, 0))

Let me hand-verify that output, because reasoning out the answer is the discipline. With city 0 fixed as start, the $3! = 6$ permutations of $\{1,2,3\}$ give 6 tours; by symmetry they pair into 3 distinct cycles:

  • $0\to1\to2\to3\to0$: $1 + 3 + 5 + 100 = 109$.
  • $0\to1\to3\to2\to0$: $1 + 4 + 5 + 2 = 12$.
  • $0\to2\to1\to3\to0$: $2 + 3 + 4 + 100 = 109$.

(The other three permutations are these reversed, with identical costs.) The minimum is 12, achieved by $0\to1\to3\to2\to0$ — matching the printed output. Note that this tour avoids the expensive edge between cities 0 and 3 (cost 100) by never traversing it. On four cities this is instant; the point is purely to expose the $(n-1)!$ structure before we abandon it.

🔗 Connection. Held–Karp dynamic programming solves TSP exactly in $O(n^2 2^n)$ time — dramatically better than $O(n!)$ (for $n=20$, about $4 \times 10^8$ versus $10^{17}$), and a lovely application of the recurrence techniques from Chapters 18–19. But $2^n$ is still exponential: Held–Karp pushes the wall from ~12 cities to ~25, not to thousands. Exact TSP remains intractable at scale. When you truly need exactness on bigger instances, specialized branch-and-bound solvers (e.g., Concorde) routinely handle thousands of cities by aggressive pruning — but their worst case is still exponential, because TSP is NP-hard.

🔄 Check Your Understanding 1. How many distinct tours must brute force consider for a symmetric TSP on 6 cities? 2. Why is finding some tour on a complete graph trivial, even though finding the cheapest tour is NP-hard? 3. If someone hands you a tour and claims it costs at most \$500, how long does it take to check that claim? What does that say about which complexity class TSP's decision form sits in?

Answers

  1. $\frac{(6-1)!}{2} = \frac{120}{2} = 60$ distinct tours. 2. On a complete graph every pair of cities is connected, so any ordering of the cities is a valid tour — existence is free. All the difficulty is in minimizing the cost over the $(n-1)!/2$ tours. 3. Checking is fast: sum the $n$ edge costs along the given tour and compare to \$500 — $O(n)$ time. Because a proposed solution is verifiable in polynomial time, decision-TSP lies in the class NP (the "easy to check" class), even though finding the solution seems to require exponential search. That verify/find gap is the heart of P vs. NP (Chapter 37).

30.6 Heuristics: coping with hard problems

If TSP is NP-hard, what does the Amazon driver do every morning? They don't wait twenty million years for the optimal route. They use a heuristic: a method that runs fast and produces a good answer, while giving up the guarantee of the best answer.

A heuristic is an algorithm that trades the guarantee of an optimal (or even correct) solution for speed and practicality — it aims to be good enough, fast enough. A heuristic with a proven bound on how far from optimal it can land is called an approximation algorithm.

This is a genuine engineering philosophy, not a defeat. For an NP-hard problem, "fast and within a few percent of optimal" is usually worth far more than "optimal but never finishes." The art is choosing a heuristic whose answers are good enough for the stakes and knowing how good — or how bad — they can be.

Heuristic 1: nearest-neighbor (greedy, fast, sometimes terrible)

The simplest idea: from your current city, always go to the nearest unvisited city; when all are visited, return to the start. This is a greedy algorithm (the strategy you met informally with Dijkstra in Chapter 29 — repeatedly take the locally cheapest option).

def nearest_neighbor(dist, start=0):
    """Greedy TSP heuristic. dist: n x n matrix. Returns (cost, tour). O(n^2)."""
    n = len(dist)
    unvisited = set(range(n)) - {start}
    tour, cost, current = [start], 0, start
    while unvisited:
        nxt = min(unvisited, key=lambda c: dist[current][c])  # nearest unvisited
        cost += dist[current][nxt]
        current = nxt
        tour.append(current)
        unvisited.remove(current)
    cost += dist[current][start]                               # return home
    tour.append(start)
    return cost, tour

D = [[0,   1,   2, 100],     # same instance as the brute-force example above
     [1,   0,   3,   4],
     [2,   3,   0,   5],
     [100, 4,   5,   0]]
print(nearest_neighbor(D, start=0))
# Expected output:
# (109, (0, 1, 2, 3, 0))

Hand-trace it on $D$, starting at 0. From 0 the nearest is city 1 (cost 1; compare 1, 2, 100) → cost 1. From 1 the nearest unvisited is city 2 (cost 3; compare 3 to city 2, 4 to city 3) → cost 4. From 2 the only unvisited is city 3 (cost 5) → cost 9. Now the trap springs: the only way home is the edge $3\to0$, which costs 100 → total 109, tour $(0,1,2,3,0)$.

Notice: nearest-neighbor returned 109, but the true optimum (from 30.5) was 12. Greed grabbed the cheap edges 0→1→2 and walked itself into a corner, forced to pay the 100-cost edge to get home; the optimal tour pays a little more early ($0\to1\to3$) precisely to avoid that edge. Nearest-neighbor runs in $O(n^2)$ (fast!), but as this 9×-overshoot shows, its tours can be arbitrarily worse than optimal in the worst case. It's the first thing to try and rarely the last.

⚠️ Common Pitfall: greedy is not optimal. It is tempting to believe that always taking the cheapest immediate step yields the cheapest overall tour. It does not — TSP is exactly a problem where local greed misleads. (Contrast Chapter 29's Dijkstra and Chapter 32's MST algorithms, where greedy is provably optimal. Greedy works for some problems and fails for others; knowing which is which is the skill, and "it's NP-hard" is a strong hint that simple greed won't be optimal.)

Heuristic 2: 2-opt (local search / improvement)

A different strategy: start with any tour (say nearest-neighbor's), then repeatedly improve it by small local edits until no edit helps. The classic edit is the 2-opt swap: pick two edges of the current tour, remove them, and reconnect the two resulting paths the other way (which reverses a segment). If the swap shortens the tour, keep it; otherwise undo it. Repeat until no 2-opt swap improves the tour — a local optimum.

💡 Intuition: A 2-opt swap untangles a "crossing" in the tour. If you draw a tour on a map and two of its legs cross like an X, swapping those two edges uncrosses them and is always shorter (the triangle inequality). 2-opt systematically irons out such crossings. It typically gets within a few percent of optimal on realistic instances — a huge improvement over raw nearest-neighbor for modest extra cost.

2-opt is a local search heuristic: it explores the neighborhood of the current solution for an improving move. It can get stuck in a local optimum that isn't global (no single swap helps, yet a better tour exists). Richer methods — 3-opt, Or-opt, Lin–Kernighan, simulated annealing, genetic algorithms — escape local optima with cleverer or randomized moves, all under the same banner: don't demand optimal; iterate toward good.

Heuristic 3: Christofides — a heuristic with a guarantee

Nearest-neighbor and 2-opt are fast but offer no promise about quality. For metric TSP — where the costs satisfy the triangle inequality ($\text{cost}(A,C) \le \text{cost}(A,B) + \text{cost}(B,C)$, true for real road/air distances) — there is a celebrated approximation algorithm.

Christofides' algorithm (1976). For metric TSP, Christofides' algorithm produces a tour whose cost is at most $\tfrac{3}{2}$ times the optimal, in polynomial time.

A factor-$\frac{3}{2}$ guarantee means: whatever the optimal tour costs, Christofides' tour costs at most 50% more — provably, on every metric instance, in polynomial time. That's the gold standard for an NP-hard problem: you can't get the exact optimum fast, but you can get provably close fast. (Christofides builds on a minimum spanning tree and a minimum-weight matching — machinery from Chapters 32 and 34 — so the full construction will make more sense after those chapters; here, absorb the idea that approximation can come with a proof.)

🔗 Connection. The hierarchy of responses to an NP-hard problem is a template you'll reuse across CS: (1) exact but exponential (brute force, Held–Karp, branch-and-bound) when $n$ is small or exactness is mandatory; (2) approximation with a proven bound (Christofides) when you need a quality guarantee; (3) fast heuristics with no guarantee (nearest-neighbor, 2-opt, metaheuristics) when speed dominates and "usually good" suffices. Chapter 37 ("Coping with intractability") formalizes this menu. Recognizing which response a situation calls for is exactly the engineering judgment this chapter is teaching.

🔄 Check Your Understanding 1. What does a "$\frac{3}{2}$-approximation" guarantee, precisely? 2. Nearest-neighbor and 2-opt both run fast. What's the key difference in how they search? 3. Why might a company deliberately choose nearest-neighbor over Christofides even though Christofides has a quality guarantee?

Answers

  1. On every (metric) instance, the tour it returns costs at most $1.5\times$ the cost of the optimal tour — a worst-case ratio proven to hold always, in polynomial time. 2. Nearest-neighbor constructs a tour once, greedily, in a single pass and stops. 2-opt starts from a complete tour and iteratively improves it via local swaps until no improvement is found (local search). 3. Christofides is more complex to implement (it needs MST + minimum-weight matching) and has higher constant-factor overhead; if the instances are not metric, its guarantee doesn't even apply; and for many real workloads nearest-neighbor + a few 2-opt passes is simpler and empirically good enough. Engineering is about fit, not just worst-case bounds.

Project Checkpoint: Euler circuits for the Toolkit

This chapter's clean, tractable result is Euler's theorem, and its constructive proof is an algorithm — Hierholzer's "walk until you loop, then splice the gaps." Let's add Euler detection and construction to graphs.py (the module begun in Chapter 27, where the Graph class lives, and extended with traversal in Chapters 28–29). These functions compose with the existing Graph by reading its adjacency structure.

We add three functions: a parity check, an existence test, and the Hierholzer constructor.

# dmtoolkit/graphs.py  (add to the Chapter 27 Graph module)

def has_euler_circuit(g):
    """True iff connected graph g has an Euler circuit: every vertex even degree."""
    return all(len(g.neighbors(v)) % 2 == 0 for v in g.vertices())

def has_euler_path(g):
    """True iff g has an Euler path (0 or 2 odd-degree vertices)."""
    odd = sum(1 for v in g.vertices() if len(g.neighbors(v)) % 2 == 1)
    return odd == 0 or odd == 2

def euler_circuit(g, start):
    """Hierholzer's algorithm: return an Euler circuit (vertex list) or None.
    Assumes g is connected with all even degrees. Edges consumed via a copy."""
    if not has_euler_circuit(g):
        return None
    # adjacency as multiset counters so we can 'remove' used edges
    adj = {v: list(g.neighbors(v)) for v in g.vertices()}
    stack, circuit = [start], []
    while stack:
        v = stack[-1]
        if adj[v]:                       # unused edge out of v?
            w = adj[v].pop()             # take it
            adj[w].remove(v)             # remove the twin (undirected)
            stack.append(w)              # walk forward
        else:
            circuit.append(stack.pop())  # dead end -> backtrack, record
    return circuit[::-1]                 # reverse for forward order

The algorithm walks forward along unused edges (the greedy walk from the sufficiency proof); when it dead-ends, it pops vertices into the circuit, which automatically splices subtours in the right place. It runs in $O(V + E)$ — linear, the hallmark of a tractable problem.

Here's the checkpoint test, on the graph $1\!-\!2\!-\!3\!-\!1$ (a triangle, all degrees 2):

# project-checkpoint.py
g = Graph(directed=False)
for u, v in [(1, 2), (2, 3), (3, 1)]:
    g.add_edge(u, v)
print(has_euler_circuit(g))
print(euler_circuit(g, start=1))
# Expected output:
# True
# [1, 3, 2, 1]

Hand-trace (Python's list.pop() removes the last element). After the build, adj = {1:[2,3], 2:[1,3], 3:[2,1]}. Start stack [1]. From 1, adj[1].pop() takes 3: stack [1,3]. From 3, pop takes 2: stack [1,3,2]. From 2, pop takes 1: stack [1,3,2,1]. Now 1 has no unused edges → pop 1 into the circuit [1]; 2 none → [1,2]; 3 none → [1,2,3]; 1 none → [1,2,3,1]. Reverse → [1,3,2,1], a valid Euler circuit. (Edge-pop order can vary the specific tour; any valid Euler circuit is correct — [1,2,3,1] is equally valid and is what a front-popping variant would yield.)

Toolkit so far: logic.py (Ch.1–3), sets.py (Ch.8), relations.py (Ch.12–13), combinatorics.py (Ch.15–17), recurrences.py (Ch.18–19), probability.py (Ch.20–21), numbertheory.py / crypto.py (Ch.22–25), coding.py (Ch.26), and the growing graphs.py (Ch.27–29), now with Euler-circuit detection and construction. We deliberately stop at Euler — the tractable side. Your Toolkit will not contain an efficient Hamilton/TSP solver, because none is known; instead, a later checkpoint can add the nearest-neighbor heuristic, encoding the chapter's real lesson: know which problems you can solve exactly, and which you must merely approximate.


Summary

This chapter drew the line between problems that scale and problems that don't, using two deceptively similar route-finding questions.

The two route problems (the one-word difference is everything):

Visits exactly once Existence test Difficulty
Euler path/circuit every edge count odd-degree vertices easy — $O(V+E)$
Hamilton path/circuit every vertex none known (search) hard — NP-hard, no known polynomial algorithm

Euler's theorem (connected graph, $\ge 1$ edge):

Walk Exists iff
Euler circuit (closed) every vertex has even degree
Euler path (open) exactly two vertices have odd degree (start/end at those two)
Neither more than two odd-degree vertices
  • Necessity is a parity argument: each pass through a vertex consumes edges in in/out pairs.
  • Sufficiency is constructive (Hierholzer): walk until you loop, splice subtours into the gaps; this is the $O(V+E)$ algorithm in the Project Checkpoint.
  • Königsberg has four odd-degree vertices → no Euler path; this 1736 result launched graph theory.

Hamilton & TSP:

  • No known efficient characterization of Hamiltonicity. Sufficient conditions exist (Dirac: degree $\ge n/2 \Rightarrow$ Hamilton circuit) but they are not "if and only if."
  • TSP = cheapest Hamilton circuit on a complete weighted graph; an optimization layered on Hamilton's existence.
  • Distinct symmetric tours: $\frac{(n-1)!}{2}$. Brute force is factorial; Held–Karp DP is $O(n^2 2^n)$ — better but still exponential.
  • Both Hamilton and TSP are NP-hard (preview; formalized in Chapter 37). Checking a tour is easy ($O(n)$); finding the best is (apparently) not — the verify/find gap behind P vs. NP.

The tractable/intractable boundary (threshold concept): polynomial time scales to large inputs; exponential/factorial time dies past a few dozen elements regardless of hardware. The first question about any new problem: which side of the line is it on?

Coping with NP-hardness (the response menu):

Response Example Trade-off
Exact, exponential brute force, Held–Karp, branch-and-bound optimal; only small $n$
Approximation (proven bound) Christofides ($\frac{3}{2}\times$ optimal, metric TSP) provably close; polynomial
Heuristic (no guarantee) nearest-neighbor ($O(n^2)$), 2-opt (local search) fast; "usually good," sometimes bad

Spaced Review

Retrieval keeps knowledge available. Answer from memory before checking.

  1. (Ch. 28) Euler-circuit detection needs the graph to be connected. Which traversal from Chapter 28 would you run to verify connectivity before applying Euler's theorem, and what is its time complexity?
  2. (Ch. 28) Both BFS and DFS run in $O(V + E)$. Our Euler existence test also runs in $O(V+E)$. In one sentence, why is "$O(V+E)$" the signature of a tractable graph problem, in contrast to the $(n-1)!$ of brute-force TSP?
  3. (Ch. 27) State the handshaking lemma. How is the parity argument behind Euler's theorem (each visit consumes two edge-ends at a vertex) related to it?
  4. (Ch. 27) $K_5$ is the complete graph on 5 vertices. Using only degree facts, decide whether $K_5$ has an Euler circuit — and explain why you can answer without searching for one.

Answers

  1. Run BFS or DFS from any vertex and check that every (non-isolated) vertex was reached; it runs in $O(V + E)$. 2. $O(V+E)$ is polynomial (in fact linear) in the input size, so it scales to huge graphs; $(n-1)!$ is factorial, so it explodes past ~12–25 elements no matter the hardware — polynomial scales, factorial surrenders. 3. The handshaking lemma: $\sum_{v}\deg(v) = 2\lvert E\rvert$ (the degree sum is even because each edge contributes 2). Euler's necessity argument is the local version of the same idea — at a single vertex, the circuit consumes edge-ends two at a time, forcing each individual degree to be even. 4. Every vertex of $K_5$ has degree 4 (even) and $K_5$ is connected, so by Euler's theorem it has an Euler circuit — we answer purely from the degree count, the point of the theorem: deciding is a linear scan, not a search.

What's Next

You now hold both halves of a crucial contrast: a problem with a clean, local, polynomial characterization (Euler), and a problem that resists every such shortcut (Hamilton, TSP). That contrast is the seed of two later chapters. In Chapter 33 we return to graph structure — coloring vertices, drawing graphs without crossing edges (planarity), and Euler's other famous formula relating a planar graph's vertices, edges, and faces — and we'll meet more problems straddling the easy/hard line (some colorings are easy, optimal coloring is hard). Then in Chapter 37 the informal "NP-hard" of this chapter becomes a precise theorem: we define the classes P and NP, prove what it means for TSP and the Hamilton circuit problem to be NP-complete, and confront the million-dollar question of whether the wall we hit here is permanent. The bridges of Königsberg were the start of graph theory; the Traveling Salesman is the gateway to the theory of computation's deepest open problem.