Case Study: Building a Delivery-Route Optimizer for a Hard Problem

DataField.Dev

Case Study: Building a Delivery-Route Optimizer for a Hard Problem

"NP-hard is a license to stop searching for a perfect fast algorithm and start looking for something cheaper-but-imperfect." — paraphrasing this chapter's §30.4

Executive Summary

A regional courier, SwiftDrop, dispatches a single van each morning from its depot to a list of delivery stops, and wants the cheapest route that visits every stop once and returns to the depot. That is the Traveling Salesman Problem (§30.5) — NP-hard, with no known fast exact algorithm. Our job is to build a small, honest route optimizer: an exact solver for tiny days, a fast greedy heuristic, and a local-search improver, then measure how each behaves so SwiftDrop's dispatcher knows which tool to reach for as the stop count grows.

This is a build case study (contrast Case Study 1, which analyzed an existing network). We write three solvers from scratch — tsp_bruteforce, nearest_neighbor, and two_opt — exactly mirroring §30.5–30.6, hand-derive every output, and assemble them into a tiny decision tool. The lesson is the chapter's central engineering instinct: recognize the wall (factorial growth), then deploy the right response (exact when small, heuristic when not), rather than shipping code that works on the 8-stop demo and times out on the customer's 200 stops.

Skills applied

Implementing exact (brute-force) and heuristic (nearest-neighbor, 2-opt) TSP solvers from scratch (§30.5–30.6).
Hand-deriving expected output for each — never running code (a Hard Rule of this book).
Quantifying factorial blow-up to set the brute-force cutoff (§30.4).
Reasoning about heuristic quality: when greedy fails, and how local search recovers (§30.6).
Composing the pieces into a single "pick the right solver" dispatcher.

Background

The scenario

SwiftDrop's depot is location $0$. On a representative morning there are four delivery stops, so five locations total: $\{0, 1, 2, 3, 4\}$. The dispatcher has a cost matrix D (drive-time in minutes between every pair; symmetric, so D[i][j] == D[j][i], and D[i][i] = 0). We will use this matrix throughout — chosen small enough that we can hand-check every tour, exactly as this book's "reason out the output" discipline demands:

$$D = \begin{bmatrix} 0 & 2 & 9 & 10 & 7 \\ 2 & 0 & 6 & 4 & 3 \\ 9 & 6 & 0 & 8 & 5 \\ 10 & 4 & 8 & 0 & 6 \\ 7 & 3 & 5 & 6 & 0 \end{bmatrix}$$

As code:

D = [[ 0,  2,  9, 10,  7],
     [ 2,  0,  6,  4,  3],
     [ 9,  6,  0,  8,  5],
     [10,  4,  8,  0,  6],
     [ 7,  3,  5,  6,  0]]

The dispatcher wants three things: the true cheapest route (while the day is small enough to afford it), a fast route for mornings with too many stops to solve exactly, and a sense of how far off the fast route tends to be.

💡 Intuition: Every route is a Hamilton circuit on the complete weighted graph of locations (§30.5). With the depot fixed as start and end, a route on five locations is an ordering of the four stops — $(5-1)!/2 = 12$ genuinely distinct routes. Twelve is small enough to enumerate by hand; the structure (a permutation search) is what matters, because it is the same structure that becomes hopeless at 25 stops.

Why this matters

SwiftDrop is exactly the situation §30.5 describes: "anywhere 'visit a set of things in the cheapest order and come back' appears, TSP is underneath." If the dispatcher naively runs an exact solver on a big day, the van sits idle while the laptop grinds for years (the §30.4 table is unforgiving). If they always use a crude greedy route, they quietly burn fuel on bad tours. The professional move is to build the menu and choose deliberately. That choice is the whole point of recognizing TSP as NP-hard.

Phase 1: Build the exact solver (and find the cutoff)

Start with truth: brute force, so we have a correct answer to measure heuristics against. This mirrors tsp_bruteforce from §30.5 exactly.

from itertools import permutations

def tsp_bruteforce(dist):
    """Exact TSP. dist: n x n cost matrix. Returns (best_cost, best_tour). O(n!)."""
    n = len(dist)
    best_cost, best_tour = float("inf"), None
    for perm in permutations(range(1, n)):        # fix city 0 as start/end
        tour = (0,) + perm + (0,)
        cost = sum(dist[tour[i]][tour[i+1]] for i in range(n))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour

Because there are only $12$ distinct tours, we can enumerate them all by hand and be certain of the optimum the way §30.5 insists — reason it out, don't just trust the print. Fixing the depot $0$ as start and end, each tour is $0 \to a \to b \to c \to d \to 0$ for an ordering $(a,b,c,d)$ of $\{1,2,3,4\}$; reversal symmetry pairs the $24$ orderings into $12$ distinct tours. Here are all twelve, priced edge by edge (using $D[0][1]{=}2,\,D[0][2]{=}9,\,D[0][3]{=}10,\,D[0][4]{=}7,\,D[1][2]{=}6,\,D[1][3]{=}4,\,D[1][4]{=}3,\,D[2][3]{=}8,\,D[2][4]{=}5,\,D[3][4]{=}6$):

Tour	Cost
$0\,1\,2\,3\,4\,0$	$2+6+8+6+7 = 29$
$0\,1\,2\,4\,3\,0$	$2+6+5+6+10 = 29$
$0\,1\,3\,2\,4\,0$	$2+4+8+5+7 = \mathbf{26}$
$0\,1\,3\,4\,2\,0$	$2+4+6+5+9 = \mathbf{26}$
$0\,1\,4\,2\,3\,0$	$2+3+5+8+10 = 28$
$0\,1\,4\,3\,2\,0$	$2+3+6+8+9 = 28$
$0\,2\,1\,3\,4\,0$	$9+6+4+6+7 = 32$
$0\,2\,1\,4\,3\,0$	$9+6+3+6+10 = 34$
$0\,2\,3\,1\,4\,0$	$9+8+4+3+7 = 31$
$0\,2\,4\,1\,3\,0$	$9+5+3+4+10 = 31$
$0\,3\,1\,2\,4\,0$	$10+4+6+5+7 = 32$
$0\,3\,2\,1\,4\,0$	$10+8+6+3+7 = 34$

The minimum is 26, achieved by two equivalent tours. The brute-force loop scans orderings in itertools.permutations order — $(1,2,3,4), (1,2,4,3), (1,3,2,4), \dots$ — so the first tour reaching cost 26 is $(1,3,2,4)$, i.e. $0\to1\to3\to2\to4\to0$. That is what the solver returns:

print(tsp_bruteforce(D))
# Expected output:
# (26, (0, 1, 3, 2, 4, 0))

Where is the cutoff? Our five-location day was trivial, but brute force is $O(n!)$. Using the §30.4 table, at a billion tours per second: ~12 stops is the practical ceiling; by 15 it is ~87 seconds and by 20 it is ~3.9 years. So our exact solver earns its keep only for small mornings — and we must set the boundary in code, not by feel.

import math

def brute_force_feasible(n, tours_per_sec=1e9, max_seconds=60.0):
    """Can we brute-force n locations within max_seconds? (distinct symmetric tours)."""
    return math.factorial(n - 1) / 2 / tours_per_sec <= max_seconds

print([n for n in range(4, 18) if brute_force_feasible(n)])
# Expected output:
# [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

Hand-check the boundary at a 60-second budget: $(15-1)!/2 = 14!/2 \approx 4.36\times10^{10}$ tours, which at $10^9$/sec is about $43.6$ s — under 60, so $n=15$ passes. The next size, $n=16$, is $15!/2 \approx 6.5\times10^{11}$ tours $\approx 654$ s — over the minute, so it fails. Hence the list stops at 15. The cutoff is not a fixed number — it depends entirely on your time budget, which is exactly why you compute it rather than guess.

⚠️ Common Pitfall: Do not eyeball the brute-force cutoff. "It felt fast on my test data" is how a solver ships that dies on the customer's first big day. Compute $(n-1)!/2$ against your time budget explicitly, as above, and set the cutoff in code. Factorial growth gives almost no warning: it is fine, fine, fine — then catastrophic within a stop or two.

🔄 Check Your Understanding The §30.5 Connection notes Held–Karp solves TSP exactly in $O(n^2 2^n)$. For SwiftDrop, would swapping brute force for Held–Karp let the dispatcher solve a 200-stop day exactly?

Answer

No. Held–Karp is dramatically better than $O(n!)$ — it pushes the exact-solving ceiling from ~12 to ~25 stops — but $2^n$ is still exponential. At $n = 200$, $2^{200}$ is astronomically beyond reach. Held–Karp moves the wall; it does not remove it. For 200 stops, exact is off the table and we must use a heuristic.

Phase 2: Build the fast greedy heuristic

When the day is too big for exact, reach for speed. Nearest-neighbor (§30.6): from the current location, always drive to the nearest unvisited stop, then return to the depot.

def nearest_neighbor(dist, start=0):
    """Greedy TSP heuristic. Returns (cost, tour). O(n^2)."""
    n = len(dist)
    unvisited = set(range(n)) - {start}
    tour, cost, current = [start], 0, start
    while unvisited:
        nxt = min(unvisited, key=lambda c: dist[current][c])   # nearest unvisited
        cost += dist[current][nxt]
        current = nxt
        tour.append(current)
        unvisited.remove(current)
    cost += dist[current][start]                               # return home
    tour.append(start)
    return cost, tour

Hand-trace it on D from the depot $0$, exactly as §30.6 models:

At $0$: nearest unvisited is $1$ (cost $D[0][1]=2$; compare $2, 9, 10, 7$). Go to $1$. Cost $2$.
At $1$: unvisited $\{2,3,4\}$; costs $D[1][\cdot] = 6, 4, 3$. Nearest is $4$ (3). Cost $5$.
At $4$: unvisited $\{2,3\}$; costs $D[4][\cdot] = 5, 6$. Nearest is $2$ (5). Cost $10$.
At $2$: only $3$ left; $D[2][3] = 8$. Cost $18$.
Return $3 \to 0$: $D[3][0] = 10$. Total cost $28$, tour $(0, 1, 4, 2, 3, 0)$.

print(nearest_neighbor(D, start=0))
# Expected output:
# (28, (0, 1, 4, 2, 3, 0))

Nearest-neighbor produced 28, while the true optimum (Phase 1) is 26. That's about 8% over optimal — the greedy van grabbed the cheap early hops ($0\to1\to4$) and was then forced into the costly closing legs $2\to3$ (8) and $3\to0$ (10), the two priciest edges in the whole matrix. Fast ($O(n^2)$), but suboptimal, just as §30.6 warns. We don't stop here.

🔗 Connection. Greed misled us again — the same trap as the §30.6 worked example and Case Study 1's snowplow. Contrast Dijkstra (Chapter 29) and Kruskal's MST (Chapter 32), where greedy is provably optimal. "It's NP-hard" is itself the hint that simple greed won't be optimal here (§30.6 pitfall).

Phase 3: Build the local-search improver (2-opt)

Take nearest-neighbor's tour and iteratively improve it with 2-opt swaps (§30.6): repeatedly pick two edges, remove them, and reconnect the other way (reversing a segment); keep the swap if it shortens the tour; stop when no swap helps (a local optimum).

def tour_cost(dist, tour):
    return sum(dist[tour[i]][tour[i+1]] for i in range(len(tour) - 1))

def two_opt(dist, tour):
    """Local search: apply improving 2-opt swaps until none remain. Returns (cost, tour)."""
    best = tour[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 2):              # endpoints fixed at depot
            for k in range(i + 1, len(best) - 1):
                candidate = best[:i] + best[i:k+1][::-1] + best[k+1:]   # reverse segment
                if tour_cost(dist, candidate) < tour_cost(dist, best):
                    best, improved = candidate, True
    return tour_cost(dist, best), best

Let's hand-trace it starting from nearest-neighbor's tour $T_0 = (0, 1, 4, 2, 3, 0)$, cost $28$. The double loop scans positions $i = 1, 2, 3$ and $k > i$, reversing the segment best[i:k+1] each time. Scanning from $i=1$:

$i=1, k=2$ (reverse $[1,4]\to[4,1]$): $0\,4\,1\,2\,3\,0 = 7+3+6+8+10 = 34$. Not $< 28$; reject.
$i=1, k=3$ (reverse $[1,4,2]\to[2,4,1]$): $0\,2\,4\,1\,3\,0 = 9+5+3+4+10 = 31$. Reject.
$i=1, k=4$ (reverse $[1,4,2,3]\to[3,2,4,1]$): $0\,3\,2\,4\,1\,0 = 10+8+5+3+2 = 28$. Equal, not strictly less; reject.
$i=2, k=3$ (reverse $[4,2]\to[2,4]$): $0\,1\,2\,4\,3\,0 = 2+6+5+6+10 = 29$. Reject.
$i=2, k=4$ (reverse $[4,2,3]\to[3,2,4]$): $0\,1\,3\,2\,4\,0 = 2+4+8+5+7 = \mathbf{26}$. Improves $28 \to 26$! Keep it; set improved = True.

The improving swap untangled a crossing: the legs $1\to4$ and $2\to3$ in $T_0$ crossed, and reversing the middle segment uncrossed them. Because a swap was found, the outer while loop runs again from the new tour $T_1 = (0, 1, 3, 2, 4, 0)$, cost $26$. Re-scanning every $(i,k)$ on $T_1$ by hand (twelve candidates) finds no strictly improving swap — every reversal costs $26$ or more. So 2-opt halts:

nn_cost, nn_tour = nearest_neighbor(D, start=0)     # (28, (0,1,4,2,3,0))
print(two_opt(D, nn_tour))
# Expected output:
# (26, (0, 1, 3, 2, 4, 0))

So 2-opt improved nearest-neighbor's 28 down to 26 — and on this small instance it happened to reach the global optimum (Phase 1 found 26), closing the entire gap for almost no extra engineering. On realistic, larger instances, 2-opt usually lands within a few percent of optimal (§30.6) — not always at it — which is the practical sweet spot.

⚠️ Common Pitfall: Do not over-generalize from this happy ending. 2-opt finds a local optimum: a tour no single swap can improve. Here that local optimum coincided with the global one (26), but in general it need not — local search can stall in a valley that isn't the deepest. The §30.6 worked example is exactly such a case. Escaping a bad local optimum needs richer moves (3-opt, Or-opt, Lin–Kernighan) or randomized restarts / simulated annealing — all variations on "don't demand optimal; iterate toward good."

🐛 Find the Error. A teammate sets the 2-opt loop bounds as for i in range(0, len(best)) and for k in range(0, len(best)), reversing segments that include position $0$ (the depot) or run past the end. Why does this break the optimizer for a closed tour that must start and end at the depot?

Answer

The depot is fixed as both the first and last element of the tour list. Reversing a segment that includes index $0$ (or the final index) can move the depot out of its endpoint position, producing a "tour" that no longer starts/ends at the depot — an invalid route. The correct bounds keep $i \ge 1$ and $k \le \text{len}-2$ so the two depot endpoints are never disturbed (as in the two_opt above). Also, $k$ must exceed $i$ so the reversed segment is non-trivial. Off-by-one bounds here silently corrupt the route rather than crash — the worst kind of bug.

Phase 4: Assemble the dispatcher (pick the right solver)

Now compose the three solvers into one decision tool that embodies the §30.6 response menu: exact when the day is small enough, heuristic-plus-improvement when it isn't.

def plan_route(dist, exact_cutoff=12):
    """Pick a solver by problem size. Returns (method, cost, tour)."""
    n = len(dist)
    if n <= exact_cutoff:
        cost, tour = tsp_bruteforce(dist)          # exact; only for small n
        return ("exact", cost, tour)
    cost, tour = nearest_neighbor(dist, start=0)    # fast start
    cost, tour = two_opt(dist, tour)                # then improve
    return ("heuristic (NN + 2-opt)", cost, tour)

print(plan_route(D)[0], plan_route(D)[1])
# Expected output:
# exact 26

For SwiftDrop's five-location morning, plan_route chooses the exact solver and returns the optimal cost 26 — because five is well under the cutoff. On a 50-stop day it would instead return the heuristic route, fast and "good enough," because exact is off the table. That branch — which side of the tractability line am I on? — is the entire lesson of §30.4 turned into a line of code.

🔗 Connection. The cutoff embeds the §30.4 threshold concept: below it, exact is affordable; above it, you must approximate. A production system would also offer Held–Karp ($O(n^2 2^n)$) as a middle tier (exact up to ~25 stops) and reach for a library solver (e.g. Concorde, or OR-Tools' metaheuristics) for the hard middle ground — but the shape of the decision is exactly this one.

Discussion Questions

plan_route uses a conservative exact_cutoff=12. Using brute_force_feasible and the §30.4 table, work out the largest cutoff that still "finishes within a minute," then redo it for "within a millisecond." How much does the safe cutoff move between those two budgets, and why might you still pick a value below the maximum?
Nearest-neighbor's result depended on starting at the depot ($0$). Would starting the greedy search at a different location (then rotating the tour back to start at the depot) ever give a cheaper nearest-neighbor tour? Sketch how you'd build a "best-of-all-starts" nearest-neighbor and what it costs in time.
On this small instance 2-opt happened to reach the optimum, but in general it stalls at a local optimum above the global one (see the §30.6 worked example, where it cannot recover from the trap). Describe one concrete way to escape a bad local optimum, and explain the trade-off it introduces (hint: more moves or randomness means more time per route).
The dispatcher cares about minutes saved, not big-O. On this instance, raw greedy cost $28$ while both exact and 2-opt found $26$ — a $2$-minute saving that 2-opt captured for almost no extra compute. Argue when each solver's extra compute time is worth its extra route savings — i.e., when is "good enough, instantly" better than "optimal, eventually"?
SwiftDrop's costs are drive-times, which obey the triangle inequality (a detour never beats the direct leg). What does that unlock from §30.6 that a non-metric cost (e.g., with surge pricing that makes some direct legs more expensive than a two-hop path) would forbid?

Your Turn: Extensions

Option A (verify a claimed route). Write verify_tour(dist, tour, claimed_cost) that confirms tour visits every location exactly once, returns to the depot, and actually costs claimed_cost. Run it on the optimal tour and the nearest-neighbor tour above; include hand-derived # Expected output: comments. This is the linear-time verifier §30.5 highlights.
Option B (add Held–Karp as a middle tier). Research the Held–Karp recurrence (it uses the dynamic-programming ideas of Chapters 18–19) and sketch — in pseudocode, no need to run — how you'd slot it into plan_route as the solver for $13 \le n \le 25$. State its time and space complexity.
Option C (measure the heuristic gap). For three different $5\times 5$ matrices you write out by hand, compute nearest_neighbor, two_opt, and tsp_bruteforce, and tabulate the ratio heuristic_cost / optimal_cost. Conjecture from your three data points whether 2-opt is reliably within, say, 15% of optimal on small metric instances. (Connect to Exercise 30.26.)
Option D (multi-start 2-opt). Combine Options from Discussion Q2 and Q3: run nearest-neighbor from every possible start, apply 2-opt to each, and keep the best overall. Hand-trace it for two starts on D and report the best tour found. Explain why this "multi-start local search" often beats a single run for modest extra cost.

Key Takeaways

Build the whole menu, then choose. A professional TSP tool is not one algorithm but three: exact for small $n$, a fast greedy start, and a local-search improver — selected by problem size (§30.6).
Compute the brute-force cutoff; never guess it. Factorial growth gives almost no warning. Set the exact/heuristic boundary in code against an explicit time budget, as brute_force_feasible does.
Greedy is fast but not optimal; local search recovers the gap. Nearest-neighbor hit 28 (8% over), and 2-opt improved it to 26 — for little extra work — matching the optimum on this instance.
Heuristics stall in local optima — in general. 2-opt reached the global optimum here, but its only guarantee is a local optimum; on harder instances it can stall above the best tour (the §30.6 trap). Knowing a heuristic's limits is part of using it responsibly.
The dispatcher's if n <= cutoff is the threshold concept in code. "Which side of the tractability line am I on?" is the first question to ask of any new problem — and for TSP the honest answer reshapes the whole design (§30.4).