Appendix F: Graph Algorithms Reference

This is the page to keep open while you work through Part V (Chapters 27–34) or reach for a graph algorithm in real code. It gives the nine algorithms every programmer is expected to know one quick-reference card each: what the algorithm is for, what it eats and returns, how fast it runs, the single idea that makes it work, and where the book develops it in full.

Throughout, $V$ is the number of vertices and $E$ the number of edges of $G = (V, E)$. Running times are for the adjacency-list representation, the right default for the sparse graphs you meet in practice (Chapter 28); adjacency-matrix bounds are usually a factor of $V$ worse for traversal, noted where it matters. A graph is sparse when $E$ is close to $V$ and dense when $E$ is close to $V^2$ — a distinction that decides several choices below.

💡 Intuition: Almost every algorithm on this page is one of two traversals — breadth-first or depth-first search — with a small twist. Dijkstra is BFS with a priority queue. Topological sort is DFS writing down each vertex as it leaves it. Prim is Dijkstra with a different key. Learn the two traversals deeply and the rest is variations on a theme.

The cards at a glance

Algorithm	Solves	Time (adj. list)	Chapter
BFS	Reachability; shortest paths in unweighted graphs	$\Theta(V + E)$	28
DFS	Reachability; cycle detection; components; basis for topo sort	$\Theta(V + E)$	28
Topological sort	Linear order of a DAG respecting all dependencies	$\Theta(V + E)$	13, 28
Dijkstra	Single-source shortest paths, non-negative weights	$O((V + E)\log V)$	29
Bellman–Ford	Single-source shortest paths with negative edges; detects negative cycles	$O(V E)$	29
Kruskal	Minimum spanning tree	$O(E \log V)$	32
Prim	Minimum spanning tree	$O(E \log V)$	32
Ford–Fulkerson	Maximum flow in a capacitated network	$O(E \cdot \lvert f^* \rvert)$; Edmonds–Karp $O(V E^2)$	34
Bipartite matching	Largest set of independent edges in a bipartite graph	$O(V E)$ via flow	34

The remaining sections expand each row into a full card.

Breadth-First Search (BFS)

Purpose. Explore a graph in expanding "rings" from a start vertex. Answers reachability (which vertices can I get to?) and, crucially, the shortest path in an unweighted graph — the fewest edges between two vertices, the famous "degrees of separation" in a social network.
Input / output. In: a graph $G$ and a source $s$. Out: for every reachable vertex, its distance from $s$ (in edges) and a predecessor pointer; following predecessors back from any vertex reconstructs a shortest path to it.
Time / space. $\Theta(V + E)$ time on an adjacency list (each vertex dequeued once, each edge examined once); $\Theta(V^2)$ on a matrix. $O(V)$ space for the queue and visited/distance arrays.
Key idea. A FIFO queue guarantees you finish all vertices at distance $d$ before any at distance $d+1$, so the first time you reach a vertex is along a shortest path.
Watch for. Mark a vertex visited when you enqueue it, not when you dequeue it, or you will enqueue the same vertex many times. BFS gives shortest paths only when every edge counts equally; the moment edges carry weights, you need Dijkstra.
Toolkit: bfs(g, s). Chapter 28.

Depth-First Search (DFS)

Purpose. Explore a graph by plunging as deep as possible along each branch before backtracking. The workhorse behind cycle detection, finding connected components, topological sort, and strongly-connected-components algorithms.
Input / output. In: a graph $G$ and a source $s$ (or run it from every unvisited vertex to cover the whole graph). Out: the set of reachable vertices, a DFS tree (via predecessor pointers), and — if you record them — discovery and finish times that classify every edge.
Time / space. $\Theta(V + E)$ time on an adjacency list; $\Theta(V^2)$ on a matrix. $O(V)$ space for the visited set plus the recursion (or explicit) stack, whose depth can reach $V$.
Key idea. A LIFO stack (often the call stack via recursion) drives the "go deep first" order; the finish order — when a vertex's whole subtree is done — is what powers topological sort and cycle detection.
Watch for. On large graphs, deep recursion can overflow the call stack; convert to an explicit stack. To detect a cycle in a directed graph, look for a back edge — an edge to a vertex still on the recursion stack (gray), not merely one already visited.
Toolkit: dfs(g, s). Chapter 28.

Topological Sort

Purpose. Order the vertices of a directed acyclic graph (DAG) in a line so that every edge points forward — every prerequisite comes before what depends on it. This is exactly what a build system, a course-prerequisite checker, a spreadsheet recalculator, or a task scheduler needs.
Input / output. In: a DAG. Out: a linear ordering $v_1, v_2, \dots, v_n$ of all vertices such that whenever $(u, v)$ is an edge, $u$ appears before $v$. (The order is not unique in general.) If the graph has a cycle, no such order exists — and a good implementation reports the cycle instead.
Time / space. $\Theta(V + E)$ time, $O(V)$ space, by either method below.
Key idea — two equivalent routes. (1) DFS post-order: run DFS and prepend each vertex to the output as it finishes; reversing the finish order yields a topological order. (2) Kahn's algorithm: repeatedly remove a vertex with in-degree $0$ and decrement its neighbors' in-degrees.
Watch for. Topological sort requires acyclicity — it both produces the order and serves as a cycle test (if Kahn's algorithm cannot empty the graph, a cycle remains). Chapter 13 first defines it abstractly as a linear extension of a partial order; Chapter 28 re-derives it as a DFS post-order. Those are two views of one idea.
Toolkit: topo_sort(dag) (in relations.py). Chapters 13 and 28.

Dijkstra's Algorithm

Purpose. Find the shortest path from one source to every other vertex in a graph whose edge weights are non-negative — the canonical "fastest route on a road map" or "cheapest sequence of hops" problem.
Input / output. In: a weighted graph $G$ with $w(e) \ge 0$ for every edge, and a source $s$. Out: for each vertex $v$, the true shortest-path distance $\delta(s, v)$ and a predecessor pointer for reconstructing the path itself (not just its length).
Time / space. $O((V + E)\log V)$ with a binary-heap priority queue — each vertex extracted once and each edge triggering at most one push, both $O(\log V)$. A Fibonacci heap gives $O(E + V \log V)$, relevant only for very dense graphs. $O(V)$ space.
Key idea. Greedily finalize the closest unfinalized vertex. Because no edge is negative, once a vertex is the nearest remaining one, no later (longer-prefix) path can beat the one you already have — so its distance is settled. Each finalization relaxes its outgoing edges (updates a neighbor's tentative distance if going through this vertex is cheaper).
Watch for. Dijkstra is wrong on negative edges — non-negativity is the one hinge of the correctness proof (Chapter 29). Use a lazy heap (push duplicates, skip a vertex when you pop it already finalized) unless your priority queue supports decrease-key.
Toolkit: dijkstra(g, s). Chapter 29.

🔗 Connection: Dijkstra is the book's first certified greedy algorithm — greedy strategies are usually wrong, so the loop-invariant proof in Chapter 29 is the point, not a formality.

Bellman–Ford Algorithm

Purpose. Single-source shortest paths that survive negative edge weights, and — just as important — detect a negative-weight cycle (a loop you could traverse forever to drive the cost to $-\infty$, which makes "shortest path" meaningless). Think arbitrage in currency-exchange graphs.
Input / output. In: a weighted graph $G$ (weights may be negative) and a source $s$. Out: shortest-path distances and predecessors from $s$; or a report that a negative cycle is reachable from $s$, in which case no finite shortest paths exist.
Time / space. $O(V E)$ time — relax all $E$ edges, $V - 1$ times — and $O(V)$ space. Slower than Dijkstra, but strictly more general.
Key idea. Relax every edge, repeatedly. A shortest path has at most $V - 1$ edges, so after $V - 1$ full passes of relaxing all edges, every distance is final. One extra pass that still improves some distance proves a negative cycle is reachable.
Watch for. Do not stop early just because one path "looks" settled — the guarantee is the full $V-1$ rounds. The negative-cycle check is the $V$-th pass; skipping it means silently returning garbage on graphs that have one.
Toolkit: not a separate Toolkit function; built in Chapter 29 alongside dijkstra.

⚠️ Common Pitfall: Reaching for Bellman–Ford "to be safe" when all weights are non-negative. If there are no negative edges, Dijkstra is asymptotically faster ($O((V+E)\log V)$ vs. $O(VE)$) and just as correct. Pay the Bellman–Ford price only when you actually have negative edges.

Kruskal's Algorithm

Purpose. Build a minimum spanning tree (MST) — the cheapest set of edges that connects all vertices of a weighted, connected, undirected graph with no cycle. Classic for laying out a network (cable, road, pipe) at minimum total cost.
Input / output. In: a connected, weighted, undirected graph $G$. Out: a set of $V - 1$ edges forming a spanning tree of minimum total weight.
Time / space. $O(E \log V)$ time, dominated by sorting the edges; the cycle checks cost only $O(E\,\alpha(V))$ with union-find, effectively constant. $O(V)$ space for the union-find structure.
Key idea. Sort edges cheapest-first; add an edge iff it joins two different components. A union-find (disjoint-set) structure answers "are these endpoints already connected?" in nearly constant time, so you can cheaply skip any edge that would close a cycle.
Watch for. Kruskal grows a forest of fragments that gradually merge — it does not keep one connected tree until the end. Without path compression and union by rank, the union-find degrades and the "negligible" cycle-check cost stops being negligible.
Toolkit: mst_kruskal(g). Chapter 32.

Prim's Algorithm

Purpose. The other classic minimum spanning tree algorithm — same output as Kruskal, opposite strategy. Often the better fit on dense graphs and when the graph is given as an adjacency list.
Input / output. In: a connected, weighted, undirected graph $G$ and any start vertex. Out: an MST (the same total weight as Kruskal's; the identical tree when all edge weights are distinct).
Time / space. $O(E \log V)$ with a binary heap — the same asymptotic bound as Kruskal. A Fibonacci heap gives $O(E + V \log V)$, an advantage only on very dense graphs. $O(V)$ space.
Key idea. Grow one tree from a seed, always adding the cheapest edge that reaches a new vertex. Picture an ink blot spreading across the graph: at every step it crosses its own boundary (a cut) at the cheapest crossing edge.
Watch for. Prim looks almost identical to Dijkstra — same priority-queue skeleton — but the heap key is the weight of the single connecting edge, not the accumulated path distance. Confusing the two keys is the classic bug.
Toolkit: not a separate Toolkit function; built in Chapter 32 alongside mst_kruskal.

🔗 Connection: Both MST algorithms are proved correct by the cut property (Chapter 32): the cheapest edge crossing any cut is safe to include. Kruskal and Prim just choose which cut to cross at each step — different greedy choices, one shared theorem.

Ford–Fulkerson Method (Maximum Flow)

Purpose. Push as much "flow" as possible from a source $s$ to a sink $t$ through a network whose edges have capacities — the maximum throughput of a pipe system, road network, or data link. A surprising range of problems (scheduling, project selection, image segmentation, and the matching below) reduce to max flow.
Input / output. In: a directed graph with a non-negative capacity $c(u, v)$ on each edge, a source $s$, and a sink $t$. Out: a maximum flow (a valid per-edge flow of greatest total value), and — by the max-flow min-cut theorem — a minimum cut that certifies the value is optimal.
Time / space. It is a method, not one algorithm: it does not say how to find an augmenting path. Using BFS for the shortest augmenting path gives the Edmonds–Karp algorithm, $O(V E^2)$, independent of capacities. The generic method runs in $O(E \cdot \lvert f^* \rvert)$, where $\lvert f^* \rvert$ is the max-flow value — slow on large integer capacities. $O(V + E)$ space.
Key idea. Repeatedly find an augmenting path in the residual network and push its bottleneck amount. The residual network's crucial feature is back edges, which let later paths undo and reroute earlier flow — the single trick that makes the method correct. When no augmenting path remains, the flow is maximum.
Watch for. Forgetting back edges is the number-one error and gives wrong (too-small) answers. Use BFS (Edmonds–Karp) to guarantee termination and the $O(VE^2)$ bound; with arbitrary path choice and irrational capacities the generic method may not even terminate.
Toolkit: max_flow(g, s, t) (implemented with BFS = Edmonds–Karp). Chapter 34.

🚪 Threshold Concept: Duality. The max-flow min-cut theorem says the maximum flow equals the minimum cut capacity. A flow being below some cut proves nothing; optimality is certified only by a cut whose capacity equals the flow. Seeing one quantity bounded — and met — by its dual is a pattern you will recognize again and again in algorithms and optimization.

Bipartite Matching

Purpose. Find the largest possible set of pairings with no shared endpoint — assign workers to jobs, students to projects, applicants to slots — when the underlying "is-compatible-with" graph is bipartite (two sides, edges only across).
Input / output. In: a bipartite graph $G = (L \cup R, E)$ with every edge joining the left set $L$ to the right set $R$. Out: a maximum matching — a largest set of edges, no two sharing a vertex.
Time / space. $O(V E)$ via the flow reduction below (each of up to $V$ augmenting paths costs an $O(E)$ search). The specialized Hopcroft–Karp algorithm improves this to $O(E\sqrt{V})$.
Key idea. Reduce to max flow. Add a super-source $s$ joined to every vertex in $L$ and a super-sink $t$ joined from every vertex in $R$, give every edge capacity $1$, and run Ford–Fulkerson; the edges carrying flow between $L$ and $R$ are a maximum matching. (Equivalently, repeatedly find augmenting paths directly in the bipartite graph.)
Watch for. The graph must genuinely be bipartite — two-color it first (BFS, Chapter 28) to confirm. Unit capacities are what make the flow come out as a clean matching.
Theory hook. König's theorem (in a bipartite graph the maximum matching size equals the minimum vertex cover size) is exactly the max-flow min-cut theorem in disguise; Hall's marriage theorem characterizes when a perfect matching of $L$ exists.
Toolkit: built on max_flow in Chapter 34.

Which algorithm should I use?

Start from the question you are asking, not the algorithm you remember. This guide takes you from a problem to the right tool.

What do you want to compute?

├─ Can I get from A to B? / What's reachable from here?
│     → BFS or DFS  (either works; both Θ(V+E))                      [Ch. 28]
│
├─ Fewest *hops* (edges) between two vertices, weights all equal?
│     → BFS  (the queue gives shortest unweighted paths for free)   [Ch. 28]
│
├─ Cheapest path, edges have *weights*?
│     ├─ all weights ≥ 0          → DIJKSTRA   O((V+E) log V)        [Ch. 29]
│     ├─ some weights negative    → BELLMAN–FORD   O(VE)            [Ch. 29]
│     └─ need to *detect* a negative cycle (e.g. arbitrage)
│                                 → BELLMAN–FORD (the V-th pass)     [Ch. 29]
│
├─ Order tasks so every prerequisite comes first? (build, schedule)
│     → TOPOLOGICAL SORT  (requires a DAG; also detects cycles)      [Ch. 13, 28]
│
├─ Is there a cycle?
│     ├─ directed   → DFS, look for a back edge                      [Ch. 28]
│     └─ undirected → DFS, or union-find while adding edges          [Ch. 28, 32]
│
├─ Connect everything as cheaply as possible? (a tree, no cycle)
│     → MINIMUM SPANNING TREE
│         ├─ sparse / edge-list input → KRUSKAL   O(E log V)         [Ch. 32]
│         └─ dense / adjacency-list   → PRIM      O(E log V)         [Ch. 32]
│
├─ Maximum throughput source→sink with capacities?
│     → MAX FLOW (Ford–Fulkerson; use BFS = Edmonds–Karp, O(VE²))   [Ch. 34]
│
└─ Largest set of valid pairings (workers↔jobs, two sides)?
      → BIPARTITE MATCHING via max flow   O(VE)                      [Ch. 34]
        (Hopcroft–Karp O(E√V) if you need speed)

A few decisions deserve a sentence of "why," because they trip people up:

BFS vs. Dijkstra. BFS is Dijkstra's special case when every edge has weight $1$. Unweighted (or all-equal) edges → use BFS; it is simpler and skips the heap's $\log V$ factor. Add weights → Dijkstra.
Dijkstra vs. Bellman–Ford. Default to Dijkstra (faster). Switch to Bellman–Ford only for possible negative weights, or to detect a negative cycle. Paying $O(VE)$ on a non-negative graph is a needless slowdown.
Kruskal vs. Prim. Same MST, same $O(E \log V)$ bound, so the choice is pragmatic: Kruskal suits a sparse graph given as a sortable edge list; Prim suits a dense adjacency-list graph (and a Fibonacci heap pushes it to $O(E + V\log V)$). Either is certified-optimal.
Topological sort vs. cycle detection. Same DFS, different bookkeeping: topo sort succeeds on a DAG and fails (reporting a cycle) otherwise, so it doubles as a directed-cycle test.
"It reduces to max flow." When a problem asks for the largest/cheapest set of choices under capacity-like constraints — matching, disjoint paths, project selection — look for a flow network hiding inside it (Chapter 34); the reduction often turns "no idea" into "solved."

🔄 Check Your Understanding: You must find the cheapest route between two cities on a map where every road has a positive toll; which algorithm, and why not BFS?

Answer
Dijkstra (Chapter 29). BFS counts edges, so it would minimize the number of roads, not the total toll — wrong objective the moment edges have unequal weights. Dijkstra's greedy finalization is valid because the tolls are non-negative.

Cross-references. Graph definitions, degree, and paths: Chapter 27. Representations (adjacency matrix vs. list) and the $\Theta(V+E)$ analysis: Chapter 28. Big-O / $\Theta$ notation used throughout: Chapter 14. Partial orders behind topological sort: Chapter 13. The full, runnable implementations of bfs, dfs, dijkstra, mst_kruskal, and max_flow are assembled in the graphs.py module of the Discrete Math Toolkit (Appendix I).