39 min read

> "How far apart are two people in a social network? Knock on every door one ring at a time, and count the rings."

Prerequisites

  • 27
  • 14

Learning Objectives

  • Represent a graph in code using both an adjacency matrix and an adjacency list, and choose the right one for a given problem.
  • Implement breadth-first search and depth-first search from scratch, and trace their execution by hand.
  • Use traversal to decide connectivity, find connected components, detect cycles, and compute shortest paths in unweighted graphs.
  • Re-derive topological sort as a depth-first-search post-order on a directed acyclic graph.
  • Analyze the running time of BFS and DFS as $\Theta(V + E)$ and explain why the representation determines the bound.

Chapter 28: Graph Representations, Traversal, and Search

"How far apart are two people in a social network? Knock on every door one ring at a time, and count the rings."

Overview

In Chapter 27 you learned to describe a graph — vertices, edges, degrees, paths, cycles. But a description is not a program. The moment you want a computer to answer a question about a graph — Are these two servers connected? What is the shortest chain of friends between Ana and Bo? Does this build configuration contain a circular dependency? — you face two questions a definition never asked. First: how do you store a graph in memory so the machine can work with it? Second: how do you visit its vertices systematically, so that you reach every relevant one exactly once and miss nothing?

This chapter answers both. The storage question has two classic answers — the adjacency matrix and the adjacency list — and the choice between them is one of the most common space–time tradeoffs in all of computing. The visiting question has two equally classic answers — breadth-first search (BFS) and depth-first search (DFS) — and almost every graph algorithm you will ever write is one of these two with a small twist. Dijkstra's algorithm (Chapter 29) is BFS with a priority queue. Cycle detection is DFS watching its own footprints. Topological sort is DFS writing down vertices as it leaves them. Learn these two traversals deeply and the rest of Part V is variations on a theme.

The payoff is concrete and immediate. By the end of this chapter you will be able to take the friendship graph from Chapter 27 and compute, for any two people, their degrees of separation — the famous "six degrees" number — with a dozen lines of code whose correctness you can prove. That is the social-network thread of this book reaching its first real algorithm.

In this chapter you will learn to:

  • Store a graph as an adjacency matrix or an adjacency list, and predict the memory each uses.
  • Run BFS to explore a graph in expanding rings and read off shortest unweighted distances.
  • Run DFS to plunge deep along each branch, and recognize its recursive (and stack-based) structure.
  • Apply traversal to four staple problems: connectivity, connected components, cycle detection, and topological ordering.
  • Bound the running time of both traversals and explain why the bound is $\Theta(V+E)$ on an adjacency list but $\Theta(V^2)$ on a matrix.

Learning Paths

🏃 Fast Track: If you have implemented BFS and DFS before, skim 28.1 (representations) and 28.6 (complexity) for the precise bounds, then read 28.5 (topological sort via DFS) carefully — the post-order argument is subtler than it looks. Do the ⭐⭐ and ⭐⭐⭐ exercises.

📖 Standard Path: Read in order. Hand-trace every worked example on the sample graph before reading the trace tables, and answer each 🔄 Check Your Understanding from memory. The Project Checkpoint adds bfs and dfs to your Toolkit.

🔬 Deep Dive: After the chapter, prove that BFS computes shortest unweighted distances (we sketch the invariant in 28.2; the full proof is exercise Part C), and prove that a directed graph has a topological ordering if and only if it is acyclic. Then read CLRS Chapter 20 for the white/gray/black edge-classification machinery we only gesture at.


28.1 Representing a graph: adjacency matrix vs. adjacency list

Suppose you are handed the friendship graph from Chapter 27 — six people, some pairs friends — and asked to put it into a program. You cannot pass a picture to a function. You need a data structure. Two structures dominate practice, and they make opposite bets about what you will do with the graph.

Take this small undirected graph as our running example for the whole chapter. The vertices are $0,1,2,3,4,5$; the edges are $\{0,1\}, \{0,2\}, \{1,3\}, \{2,3\}, \{3,4\}$, and an isolated extra edge $\{?\}$ — actually, let us keep vertex $5$ isolated (no edges) so we have something to say about disconnected graphs later.

        0
       / \
      1   2
       \ /
        3
        |
        4          5   (isolated)

The adjacency matrix

The most literal idea: make a table that says, for every pair of vertices, whether an edge joins them.

Definition. For a graph $G=(V,E)$ with vertices labeled $0,1,\dots,n-1$, the adjacency matrix is the $n \times n$ matrix $A$ where $$A[i][j] = \begin{cases} 1 & \text{if } \{i,j\}\in E \ (\text{or } (i,j)\in E \text{ for a directed graph}),\\ 0 & \text{otherwise.}\end{cases}$$

For our example graph, the adjacency matrix is

$$A = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 & 0\\ 1 & 0 & 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 1 & 0 & 0\\ 0 & 1 & 1 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

Read row $3$: it has $1$s in columns $1, 2, 4$, meaning vertex $3$ is adjacent to vertices $1, 2, 4$ — exactly its neighbors in the picture. A few structural facts fall out immediately, and they are worth internalizing because they show how a representation encodes graph properties:

  • For an undirected graph the matrix is symmetric: $A[i][j] = A[j][i]$, because the edge $\{i,j\}$ is the same edge as $\{j,i\}$.
  • The degree of vertex $i$ is the sum of row $i$ (equivalently, column $i$): $\deg(i) = \sum_j A[i][j]$. Vertex $3$ has row sum $3$, so $\deg(3)=3$. (This is the handshaking machinery of Chapter 27, now visible as arithmetic on a matrix.)
  • The diagonal is all $0$ for a simple graph (no self-loops).
# The adjacency matrix of our example graph, as a list of lists.
A = [
    [0, 1, 1, 0, 0, 0],  # vertex 0
    [1, 0, 0, 1, 0, 0],  # vertex 1
    [1, 0, 0, 1, 0, 0],  # vertex 2
    [0, 1, 1, 0, 1, 0],  # vertex 3
    [0, 0, 0, 1, 0, 0],  # vertex 4
    [0, 0, 0, 0, 0, 0],  # vertex 5 (isolated)
]
print("3 and 4 adjacent?", A[3][4] == 1)
print("degree of 3:", sum(A[3]))
# Expected output:
# 3 and 4 adjacent? True
# degree of 3: 3

The matrix's strength is the edge-existence query: "is there an edge from $i$ to $j$?" is a single array lookup, $A[i][j]$, in $O(1)$ time. Its weakness is space and iteration. It always uses $n^2$ entries, whether or not the graph has many edges. Our example has $6$ vertices and $5$ edges, yet the matrix has $36$ cells, $26$ of which are $0$. And to list the neighbors of a vertex you must scan its entire row — $n$ cells — even if it has only one neighbor.

💡 Intuition: An adjacency matrix is like a full attendance grid for a class: one checkbox for every possible pairing. Cheap to ask "are these two paired?", expensive to store when almost no pairs are checked, and slow to ask "who is this person paired with?" because you must read their whole row.

The adjacency list

Most real graphs are sparse: the number of edges $m$ is far less than the maximum possible $\binom{n}{2}$. A social network with a billion users does not have anywhere near $\binom{10^9}{2} \approx 5\times 10^{17}$ friendships. Storing mostly-zero rows is wasteful. The adjacency list stores only what is there.

Definition. The adjacency list of a graph $G=(V,E)$ is, for each vertex $v$, a list of the vertices adjacent to $v$. Concretely it is a dictionary (or array) mapping each vertex to its list of neighbors.

For our example:

# The same graph as an adjacency list.
adj = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2, 4],
    4: [3],
    5: [],            # isolated vertex still appears, with an empty list
}
print("neighbors of 3:", adj[3])
print("degree of 3:", len(adj[3]))
# Expected output:
# neighbors of 3: [1, 2, 4]
# degree of 3: 3

Now listing a vertex's neighbors is instant — you just hand back its list — and the total storage is proportional to the number of edges, not their square. The cost: testing whether a specific edge $\{i,j\}$ exists means scanning $i$'s neighbor list, which takes time proportional to $\deg(i)$ rather than $O(1)$.

The two representations are duals, and the choice is a textbook space–time tradeoff. Here is the decision table; we will justify the complexity column rigorously in 28.6.

Operation Adjacency matrix Adjacency list
Space $\Theta(n^2)$ $\Theta(n+m)$
Is $\{i,j\}$ an edge? $\Theta(1)$ $\Theta(\deg(i))$
List neighbors of $v$ $\Theta(n)$ $\Theta(\deg(v))$
Iterate all edges $\Theta(n^2)$ $\Theta(n+m)$
Add an edge $\Theta(1)$ $\Theta(1)$

Here $n = \lvert V\rvert$ is the number of vertices and $m = \lvert E\rvert$ the number of edges. We will write $V$ and $E$ for these counts inside Big-O when context makes it natural — a standard abuse, used in CLRS, that lets us write $\Theta(V+E)$ for the traversal bound.

⚠️ Common Pitfall: "The adjacency list is always better." Not so. For a dense graph — one where $m$ is close to $n^2$, such as a complete graph $K_n$ — the list stores roughly $n^2$ entries anyway, loses the $O(1)$ edge test, and pays pointer/overhead costs the matrix avoids. Matrices also shine when you want to do linear algebra on the graph (counting walks via $A^k$, spectral methods). Use the matrix for dense graphs and edge-existence-heavy code; use the list for sparse graphs and traversal-heavy code. Since most graphs in practice are sparse and most algorithms traverse, the adjacency list is the default — but "default" is not "always."

🔗 Connection. This is the same representation question you met for relations in Chapter 12, where a relation on a finite set was stored either as a Boolean matrix or as a digraph's edge lists. A graph is a (symmetric, irreflexive) relation, so it is no accident the two storage schemes coincide. The adjacency matrix here is exactly the relation matrix there.

🔄 Check Your Understanding 1. For a graph with $n=1000$ vertices and only $m=2000$ edges, roughly how many cells does the adjacency matrix use, and how many total list entries does the adjacency list use? (For the list, count each undirected edge twice.) 2. You need to answer ten million queries of the form "is $\{i,j\}$ an edge?" on a fixed, dense graph. Which representation do you choose, and why? 3. Why does vertex $5$ still appear in the adjacency list even though it has no edges?

Answers

  1. Matrix: $1000^2 = 1{,}000{,}000$ cells. List: each of the $2000$ undirected edges contributes two entries (one in each endpoint's list), so $4000$ entries plus the $1000$ keys — about $5000$ total. The list uses ~200× less space here. 2. The matrix: ten million $O(1)$ lookups beat ten million $O(\deg)$ scans, and on a dense graph the matrix's space cost is not a penalty. 3. Because it is still a vertex of the graph. Connectivity, traversal, and counting all need to know vertex $5$ exists; an empty neighbor list records "this vertex is present but isolated."

28.2 Breadth-first search: exploring in rings

Here is a question with a tangible answer. In the friendship graph, start at a person $s$. Who are $s$'s friends? Who are the friends-of-friends that you didn't already count? Friends-of-friends-of-friends? If you peel the graph outward in these shells, the shell number at which you first reach a person is their degrees of separation from $s$. Computing those shells, in order, is exactly breadth-first search.

💡 Intuition: Drop a pebble in a pond at vertex $s$. The ripple reaches all distance-1 vertices, then all distance-2 vertices, then distance-3, in expanding rings. BFS is that ripple. The key discipline: fully finish one ring before starting the next. A queue (first-in, first-out) enforces exactly that discipline — vertices discovered earlier (closer rings) come out earlier.

Definition. Breadth-first search (BFS) from a source vertex $s$ visits the vertices of a graph in order of increasing distance from $s$: first $s$ itself (distance 0), then all neighbors of $s$ (distance 1), then all not-yet-visited neighbors of those (distance 2), and so on. It uses a FIFO queue of discovered-but-not-yet-processed vertices and marks each vertex visited the moment it is discovered, so no vertex is enqueued twice.

The marking is the crucial bookkeeping. A graph has cycles, so the same vertex can be reached by several paths; without a "visited" set, BFS would loop forever or re-process vertices. Mark a vertex the instant you discover it (enqueue it), not when you later process it — marking at discovery is what guarantees each vertex enters the queue exactly once.

The algorithm

from collections import deque

def bfs_order(adj, s):
    """Return the vertices reachable from s, in BFS (ring) order."""
    visited = {s}                 # mark the source as discovered
    queue = deque([s])
    order = []
    while queue:
        v = queue.popleft()       # FIFO: take the oldest discovered vertex
        order.append(v)
        for w in adj[v]:          # scan v's neighbors
            if w not in visited:
                visited.add(w)    # mark at DISCOVERY, not at processing
                queue.append(w)
    return order

Hand-tracing BFS

Let us trace bfs_order(adj, 0) on our example graph, where adj is the adjacency list from 28.1 (neighbor lists in increasing order). The table shows the queue (front on the left) and the action at each step. This kind of hand-trace is the single best way to believe a traversal; do it yourself before reading ours.

Step Dequeue Append to order Newly discovered (enqueued) Queue after
start $0$ $[0]$
1 $0$ $0$ $1, 2$ $[1, 2]$
2 $1$ $1$ $3$ (neighbor $0$ already visited) $[2, 3]$
3 $2$ $2$ — ($0,3$ already visited) $[3]$
4 $3$ $3$ $4$ ($1,2$ already visited) $[4]$
5 $4$ $4$ — ($3$ already visited) $[\,]$

So bfs_order(adj, 0) returns [0, 1, 2, 3, 4]. Vertex $5$ never appears — it is unreachable from $0$, which is the correct answer and a fact we will exploit in 28.4 to detect disconnected graphs. Notice the order respects the rings: distance 0 is $\{0\}$, distance 1 is $\{1,2\}$, distance 2 is $\{3\}$, distance 3 is $\{4\}$.

BFS computes shortest unweighted distances

The ring structure is not a coincidence; it is a guarantee, and it is the reason BFS answers "degrees of separation." Let us upgrade the algorithm to record the distance and the predecessor (the vertex we came from), so we can both measure and reconstruct shortest paths.

from collections import deque

def bfs_distances(adj, s):
    """Shortest-path distances (in edges) from s, plus a predecessor map."""
    dist = {s: 0}
    parent = {s: None}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:          # 'in dist' doubles as 'visited'
                dist[w] = dist[v] + 1  # one ring further out than v
                parent[w] = v
                queue.append(w)
    return dist, parent

The strategy first. We claim that when BFS first discovers a vertex $w$, the value dist[w] it assigns equals the true shortest-path distance from $s$ to $w$ (counting edges). The intuition is the ripple: BFS processes vertices in nondecreasing order of the distance it assigns, so it cannot discover a far vertex before a near one. The clean way to capture "processes in nondecreasing order" is a loop invariant on the queue — exactly the loop-invariant technique from Chapter 6, now applied to a graph algorithm. We state the invariant, show it holds initially and is maintained, and read off the result.

Theorem (BFS correctness). Let $\delta(s,w)$ denote the minimum number of edges on any path from $s$ to $w$ (and $\infty$ if none exists). After bfs_distances(adj, s), for every vertex $w$ reachable from $s$, $\mathrm{dist}[w] = \delta(s,w)$.

Proof sketch (by the queue invariant). Invariant: at every moment, the vertices in the queue have $\mathrm{dist}$ values that are nondecreasing from front to back and span at most two consecutive values $d$ and $d+1$.

Initialization: the queue holds only $s$ with $\mathrm{dist}=0$; the invariant holds trivially.

Maintenance: when we dequeue a front vertex $v$ with $\mathrm{dist}[v]=d$, every neighbor $w$ we newly discover is assigned $\mathrm{dist}[w]=d+1$ and appended to the back. Since the front values were $d$ (possibly followed by $d+1$), appending $d+1$ keeps the queue values within $\{d, d+1\}$ then $\{d+1\}$ — still nondecreasing, still spanning at most two consecutive values.

Why this gives shortest distances: one shows by induction on $d$ that the set of vertices assigned $\mathrm{dist}=d$ is exactly $\{w : \delta(s,w)=d\}$. Base case $d=0$: only $s$, and $\delta(s,s)=0$. Inductive step: a vertex $w$ with $\delta(s,w)=d+1$ has a neighbor $u$ with $\delta(s,u)=d$ (take the second-to-last vertex on a shortest path); by the hypothesis $u$ is assigned $d$ and, when $u$ is processed, $w$ is either already discovered (at value $\le d+1$) or discovered now at value $d+1$. The invariant forbids assigning $w$ any value smaller than $d+1$ (that would require a neighbor processed at value $< d$, i.e. a shorter path, contradiction). Hence $\mathrm{dist}[w]=d+1=\delta(s,w)$. $\blacksquare$

To turn distances into an actual shortest path, walk the parent map backward from the target to $s$ and reverse it:

def shortest_path(adj, s, t):
    """Reconstruct one shortest s-t path (list of vertices), or None."""
    dist, parent = bfs_distances(adj, s)
    if t not in parent:
        return None                      # t unreachable from s
    path = []
    v = t
    while v is not None:
        path.append(v)
        v = parent[v]
    path.reverse()
    return path

print(shortest_path(adj, 0, 4))
# Expected output:
# [0, 1, 3, 4]

The path goes $0\to 1\to 3\to 4$: when BFS processed vertex $1$ (dequeued before vertex $2$, since both sit at distance 1 and $1$ was enqueued first), it claimed vertex $3$ and set $\mathrm{parent}[3]=1$. Had the neighbor lists or queue order differed, $3$ could instead have been reached via $2$, giving the equally-short path $[0,2,3,4]$ — both have length 3, the true distance $\delta(0,4)$.

🔗 Connection — the social-network anchor. "Degrees of separation" between two people is precisely the BFS distance between their vertices in the friendship graph. The famous six degrees of separation claim is the empirical observation that, in real social graphs, BFS distances between random pairs are small — typically under six. Run bfs_distances from a person and you have computed everyone's separation from them in one sweep. This is the first genuine algorithm of the social-graph thread; in Chapter 29 we generalize from "number of edges" to "total edge weight" (e.g., travel time) and BFS becomes Dijkstra's algorithm.

⚠️ Common Pitfall: BFS gives shortest paths only when every edge counts the same (unweighted, or all weights equal). The instant edges carry different weights — distances, costs, latencies — the ring argument breaks, because a path with more edges can have smaller total weight. For weighted shortest paths you need Dijkstra (Chapter 29), not BFS. Reaching for BFS on a weighted graph is one of the most common shortest-path bugs.

🔄 Check Your Understanding 1. Why must a vertex be marked visited at the moment it is enqueued, rather than when it is later dequeued and processed? 2. In the trace of bfs_order(adj, 0), vertex $3$ was discovered from vertex $1$. Could it instead have been discovered from vertex $2$? Would the resulting distance differ? 3. What does bfs_distances(adj, 0) return for the key 5? What does that tell you?

Answers

  1. If you marked at dequeue, a vertex could be enqueued several times before it is first processed — once for each neighbor that discovers it — wasting work and, worse, assigning it whatever distance the last discoverer used rather than the first (shortest). Marking at enqueue guarantees each vertex enters the queue exactly once, at its true shortest distance. 2. Yes — $3$ is a neighbor of both $1$ and $2$, both at distance 1, so whichever is processed first claims $3$. Either way $\mathrm{dist}[3]=2$; only the parent (and thus the specific reconstructed path) might differ. Shortest paths need not be unique. 3. The key 5 is absent from the returned dist map (it was never discovered). That tells you vertex $5$ is unreachable from $0$, i.e. $\delta(0,5)=\infty$ — the graph is disconnected.

28.3 Depth-first search: plunging deep

BFS spreads out cautiously, one ring at a time. Depth-first search does the opposite: it commits to a direction and follows it as far as possible before backing up. Picture exploring a maze by always taking an unexplored corridor at every junction, marking your path, and only retreating to the last junction when you hit a dead end. That retreating-to-the-last-choice behavior is last-in, first-out — a stack — which is the data-structure signature of DFS, just as the queue is the signature of BFS.

💡 Intuition: BFS uses a queue and explores like a flood filling a basin level by level. DFS uses a stack (often the call stack via recursion) and explores like a single explorer threading a maze, going as deep as possible, then backtracking. Same goal — visit everything reachable, once each — opposite shape.

Definition. Depth-first search (DFS) from a source vertex $s$ explores as deeply as possible along each branch before backtracking: it visits $s$, then recursively visits the first unvisited neighbor of $s$, fully exploring everything reachable from there before moving to $s$'s next unvisited neighbor. It marks vertices visited to avoid revisiting, and is naturally expressed by recursion (which uses the call stack) or with an explicit stack.

The recursive algorithm

DFS is most transparent recursively — and, pleasingly, its correctness is an induction of exactly the kind you proved in Chapter 6: each recursive call handles a smaller piece of unexplored graph.

def dfs_order(adj, s):
    """Return vertices reachable from s, in DFS (pre-order) visiting order."""
    visited = set()
    order = []
    def visit(v):
        visited.add(v)            # mark on entry (pre-order)
        order.append(v)
        for w in adj[v]:
            if w not in visited:
                visit(w)          # recurse: go deep before going wide
    visit(s)
    return order

Hand-tracing DFS

Trace dfs_order(adj, 0) on the example graph, with neighbor lists in increasing order. The "call stack" column shows which visit calls are currently open (the deepest on the right).

Step Action Call stack (open visits) order so far
1 enter visit(0); mark 0 $0$ $[0]$
2 $0$'s first unvisited neighbor is $1$; enter visit(1) $0,1$ $[0,1]$
3 $1$'s first unvisited neighbor is $3$ ($0$ done); enter visit(3) $0,1,3$ $[0,1,3]$
4 $3$'s first unvisited neighbor is $2$ ($1$ done); enter visit(2) $0,1,3,2$ $[0,1,3,2]$
5 $2$'s neighbors $0,3$ both visited; return from visit(2) $0,1,3$ $[0,1,3,2]$
6 back in visit(3); next unvisited neighbor is $4$; enter visit(4) $0,1,3,4$ $[0,1,3,2,4]$
7 $4$'s neighbor $3$ visited; return from visit(4) $0,1,3$ $[0,1,3,2,4]$
8 back in visit(3); no more neighbors; return $0,1$ $[0,1,3,2,4]$
9 back in visit(1); no more unvisited; return $0$ $[0,1,3,2,4]$
10 back in visit(0); neighbor $2$ already visited; return (empty) $[0,1,3,2,4]$

So dfs_order(adj, 0) returns [0, 1, 3, 2, 4]. Compare with BFS's [0, 1, 2, 3, 4]: DFS dove $0 \to 1 \to 3$ and reached $3$ at the third position, whereas BFS, exploring by rings, reached $3$ only after finishing the whole distance-1 ring. Neither order reflects distance from the source — that is a BFS specialty. DFS's gift is something else entirely, which the next two sections unlock: the structure of when it enters and leaves each vertex.

The iterative version, and a warning about order

Recursion uses the program's call stack, which on most systems is bounded (a few thousand frames). On a graph with a long path — say a linked-list-shaped graph of a million vertices — recursive DFS overflows that stack. An explicit stack avoids the limit:

def dfs_iterative(adj, s):
    """DFS using an explicit stack instead of recursion."""
    visited = set()
    order = []
    stack = [s]
    while stack:
        v = stack.pop()           # LIFO: take the most recently pushed
        if v in visited:
            continue              # may have been pushed more than once
        visited.add(v)
        order.append(v)
        for w in adj[v]:
            if w not in visited:
                stack.append(w)
    return order

⚠️ Common Pitfall: The iterative DFS above does not produce the same vertex order as the recursive version, even on the same graph. Pushing neighbors in increasing order means the largest neighbor is popped first (LIFO), so to match the recursive left-to-right order you must push neighbors in reverse. Also note we check if v in visited at pop time, because a vertex can be pushed by several neighbors before it is first popped — so a vertex may sit on the stack more than once. (BFS avoided this by marking at enqueue; the cleanest iterative DFS marks at pop and tolerates duplicates on the stack.) Do not assume "DFS" names a single canonical order — it depends on neighbor ordering and on these implementation choices.

🔄 Check Your Understanding 1. What data structure gives BFS its behavior, and what gives DFS its behavior? 2. In the DFS trace, vertex $2$ was reached at step 4 through vertex $3$, not directly from $0$ (even though $0$ and $2$ are adjacent). Why? 3. Give one concrete situation where recursive DFS fails but iterative DFS succeeds.

Answers

  1. A FIFO queue gives BFS its ring-by-ring (breadth) behavior; a LIFO stack (the call stack, for recursion) gives DFS its plunge-deep (depth) behavior. 2. DFS explores neighbors in order and fully finishes each before moving on. From $0$ it took neighbor $1$ first and dove all the way down ($1\to3$); by the time it considered going from $3$ to $2$, that path reached $2$ before DFS ever returned to $0$ to try $0$'s second neighbor. The visiting order is a property of the search, not of edge "directness." 3. A graph that is one long path (or any graph with a very deep DFS tree) of, say, $10^6$ vertices: recursive DFS recurses a million levels deep and overflows the call stack, while iterative DFS uses a heap-allocated list that grows as needed.

28.4 Applications: connectivity, components, and cycles

Traversal is rarely the goal in itself; it is the engine under a question you actually care about. Three of the most common questions in all of graph programming are direct one-liners on top of BFS or DFS. (It does not matter which traversal you use for these — both visit exactly the set of vertices reachable from the source — so pick whichever you find clearer.)

Connectivity: is the whole graph reachable?

A graph is connected (Chapter 27) when there is a path between every pair of vertices. Equivalently: a traversal from any single vertex reaches all vertices. So connectivity is one traversal plus a count.

def is_connected(adj):
    """True iff the undirected graph is connected (and nonempty)."""
    if not adj:
        return True               # vacuously: empty graph
    start = next(iter(adj))       # any vertex
    reached = set(bfs_order(adj, start))
    return len(reached) == len(adj)

print(is_connected(adj))          # our example has isolated vertex 5
# Expected output:
# False

Our example returns False: BFS from $0$ reaches $\{0,1,2,3,4\}$ — five vertices — but the graph has six, so vertex $5$ is stranded. Correct.

Connected components: how many separate pieces?

When a graph is not connected, it splits into maximal connected pieces. This is one of the terms reserved to this chapter, so let us define it carefully.

Definition. A connected component of an undirected graph $G$ is a maximal set of vertices that are all reachable from one another. "Maximal" means: it is connected, and you cannot add any other vertex and keep it connected. Every vertex lies in exactly one component, so the components partition the vertex set.

That last sentence is worth pausing on. "Reachable from one another" is an equivalence relation on vertices — it is reflexive (every vertex reaches itself by the empty path), symmetric (paths reverse in an undirected graph), and transitive (concatenate two paths). By the partition theorem of Chapter 12, an equivalence relation carves its set into disjoint classes; here those classes are exactly the connected components. The graph-theory fact and the relation-theory fact are the same fact wearing different clothes.

To find all components, repeatedly launch a traversal from any not-yet-visited vertex; each launch discovers exactly one component.

def connected_components(adj):
    """Return a list of components, each a sorted list of vertices."""
    seen = set()
    components = []
    for start in adj:                      # consider every vertex
        if start not in seen:              # ...that we haven't placed yet
            comp = bfs_order(adj, start)   # one traversal = one component
            seen.update(comp)
            components.append(sorted(comp))
    return components

print(connected_components(adj))
# Expected output:
# [[0, 1, 2, 3, 4], [5]]

The outer loop touches each vertex once; each vertex is traversed exactly once across all the launches (the seen guard prevents re-launching inside a known component), so the whole routine is still a single linear pass — $\Theta(V+E)$ — even though it may start the traversal many times.

🔗 Connection. This is the algorithmic heart of union-find and of "flood fill" in image editors (the paint-bucket tool floods one connected region of like-colored pixels — a connected component of the pixel grid). In a social network, components are isolated friend groups with no link between them; counting components tells you how fragmented the network is. We will meet a different, faster data structure for the same job — union-find — when we build minimum spanning trees in Chapter 32.

The traversal tree and a first look at spanning trees

Every traversal quietly builds a tree. Each time BFS or DFS discovers a new vertex $w$ from a vertex $v$, record the edge $\{v,w\}$ as a tree edge (it is exactly the parent[w] = v we stored in bfs_distances). Collect all tree edges and you get a subgraph that touches every reachable vertex and contains no cycle — a tree. When the graph is connected, this tree reaches all $n$ vertices using exactly $n-1$ edges.

Definition (intro). A spanning tree of a connected graph $G$ is a subgraph that (i) includes every vertex of $G$, (ii) is connected, and (iii) contains no cycle — equivalently, a tree on all of $G$'s vertices, using exactly $n-1$ of $G$'s edges. The tree edges discovered by a single BFS or DFS from any vertex of a connected graph form a spanning tree (called the BFS tree or DFS tree).

This is only an introduction; we will study spanning trees, and the problem of finding the cheapest one, in depth in Chapter 32. For now the point is conceptual: a traversal does not just visit a graph, it imposes a tree structure on it, and that tree is the scaffold on which cycle detection and topological sort are built.

Cycle detection

Does an undirected graph contain a cycle? Run a traversal and watch for an edge that leads to an already-visited vertex other than the one you just came from. Such an edge — a back edge — closes a loop with the tree edges, exposing a cycle.

The strategy first. During a DFS, every edge you walk is either a tree edge (it discovers a new vertex) or it points back to a vertex already visited. In an undirected graph the edge straight back to your parent is not a real cycle (it is the same edge you arrived on). But any other edge to an already-visited vertex must connect two vertices already joined by a tree path — and an extra edge between two tree-connected vertices is precisely a cycle. So: DFS, ignore the parent edge, and the first other "edge to a visited vertex" proves a cycle exists.

def has_cycle(adj):
    """True iff the undirected graph contains a cycle."""
    visited = set()
    def visit(v, parent):
        visited.add(v)
        for w in adj[v]:
            if w not in visited:
                if visit(w, v):       # recurse, remembering where we came from
                    return True
            elif w != parent:         # visited, and NOT the edge we arrived on
                return True           # -> back edge -> cycle
        return False
    for start in adj:                 # handle each component
        if start not in visited:
            if visit(start, None):
                return True
    return False

print(has_cycle(adj))
# Expected output:
# True

Our example returns True: the four vertices $0,1,3,2$ form the cycle $0\!-\!1\!-\!3\!-\!2\!-\!0$. The DFS finds it when, deep inside, it tries to walk from $3$ to $2$... actually it discovers $2$ as a tree vertex, then from $2$ sees the edge back to $0$ — a visited, non-parent vertex — and reports the cycle. (Directed graphs need a subtler test, distinguishing a vertex still on the recursion stack from one fully finished; that is the engine of the next section.)

🔄 Check Your Understanding 1. Why does connected_components still run in linear time even though it may start a traversal many times? 2. In undirected cycle detection, why must we exclude the edge back to the parent — and why is excluding only the parent enough? 3. How many edges does the BFS tree of a connected graph on $n$ vertices have, and why can it never have a cycle?

Answers

  1. Because the seen set ensures every vertex is traversed exactly once across all launches — a launch from an already-seen vertex never happens, and inside a launch the visited guard stops re-exploration. Total work is proportional to vertices plus edges, $\Theta(V+E)$, regardless of how many components (and hence launches) there are. 2. The parent edge is the very edge DFS used to arrive at $v$; seeing it again is not a second route, just the same edge, so it is not a cycle. Excluding only the parent is enough because any other edge to a visited vertex $w$ joins $v$ to a $w$ already connected through tree edges, and a second connection between two already-connected vertices is by definition a cycle. (Subtlety: in a multigraph two parallel edges to the parent would form a cycle; for simple graphs, excluding the single parent edge is exactly right.) 3. Exactly $n-1$ edges: BFS adds one tree edge each time it discovers a new vertex, and it discovers each of the $n-1$ non-source vertices once. It has no cycle because each vertex gets exactly one parent (one incoming tree edge), so the tree edges form, by definition, a tree.

Now for the application that shows off what makes DFS special. Suppose you have tasks with dependencies: "you must compile utils.c before main.c," "you must install the runtime before the app." Model each task as a vertex and each dependency "$a$ must come before $b$" as a directed edge $a \to b$. The question — in what order can I do the tasks so every dependency is respected? — is the topological sort problem.

🔗 Connection. You have met topological sort before. In Chapter 13 we defined it as a linear extension of a partial order — flattening a poset into a sequence consistent with all its order relations — and discussed it for build systems like make. There the lens was order theory. Here we recover the same ordering from a completely different angle: a depth-first search. Seeing one concept arrive from two directions (Chapter 13's posets and this chapter's DFS) is the book's third theme — abstraction is the master tool — in action. We do not redefine topological sort; we re-derive it.

A topological order can exist only if the directed graph has no directed cycle — if $a$ must precede $b$ and $b$ must precede $a$, no order can satisfy both. A directed graph with no directed cycle is a directed acyclic graph (a DAG; introduced via posets in Chapter 13). On a DAG, DFS produces a topological order almost for free, through one beautiful observation about finishing times.

The key idea — finishing order is reverse topological order. Run DFS and record each vertex the moment you finish it — that is, when its visit call returns, after all its descendants are done. Claim: if $a \to b$ is an edge, then $a$ finishes after $b$. Why? When DFS is processing $a$ and looks at the edge to $b$: either $b$ is unvisited, so DFS dives into $b$ and finishes it before returning to finish $a$; or $b$ is already finished (so it finished before $a$). The only other possibility — $b$ is started but not finished, i.e. $b$ is an ancestor of $a$ still on the stack — would mean a path from $b$ back to $a$ plus the edge $a\to b$, a directed cycle, which a DAG forbids. So in every case $b$ finishes before $a$. Therefore finishing time decreasing lists every edge's head before its tail — exactly a topological order if we reverse the finish order.

This is a post-order DFS: append a vertex after its recursive calls return, then reverse.

def topological_sort(adj):
    """Topological order of a DAG via DFS post-order (reversed)."""
    visited = set()
    finish_stack = []                 # vertices in order of FINISHING
    def visit(v):
        visited.add(v)
        for w in adj[v]:
            if w not in visited:
                visit(w)
        finish_stack.append(v)        # record AFTER all descendants finish
    for v in adj:
        if v not in visited:
            visit(v)
    finish_stack.reverse()            # reverse finish order = topo order
    return finish_stack

A worked example

Take the dependency DAG: 5 -> 0, 5 -> 2, 4 -> 0, 4 -> 1, 2 -> 3, 3 -> 1. (Read "$5\to 0$" as "task 5 must come before task 0.")

dag = {
    5: [0, 2],
    4: [0, 1],
    2: [3],
    3: [1],
    0: [],
    1: [],
}
print(topological_sort(dag))
# Expected output:
# [4, 5, 2, 3, 1, 0]

Let us hand-derive that output to be sure (the loop visits keys in insertion order: $5,4,2,3,0,1$).

Event Detail finish_stack (append order)
visit(5) dive to $0$ (leaf)
finish $0$ $0$ has no unvisited neighbors $[0]$
back in visit(5) dive to $2$, then $2\to 3$, then $3\to 1$
finish $1$ leaf $[0,1]$
finish $3$ its only neighbor $1$ done $[0,1,3]$
finish $2$ its only neighbor $3$ done $[0,1,3,2]$
finish $5$ both neighbors done $[0,1,3,2,5]$
visit(4) neighbors $0,1$ already visited
finish $4$ nothing new $[0,1,3,2,5,4]$

Finish order is $[0,1,3,2,5,4]$; reversed, the topological order is $[4,5,2,3,1,0]$. Check a dependency: the edge $2\to 3$ requires $2$ before $3$, and indeed $2$ precedes $3$ in the output. The edge $5\to 0$ requires $5$ before $0$ — yes. Every edge points "forward" in the list, which is exactly what a topological order promises.

⚠️ Common Pitfall: You must record vertices at finish (post-order), not at discovery (pre-order), and then reverse. Recording at discovery does not give a topological order in general — a vertex can be discovered before a vertex that must precede it. The whole argument rests on the finishing-time property; pre-order has no such guarantee. (A common buggy "topological sort" is just a DFS pre-order, which passes on lucky inputs and fails on the rest — exactly the trap that theme two of this book warns about: passing some tests is not correctness.)

🐛 Find the Error. A teammate proposes detecting whether a directed graph has a cycle by reusing the undirected has_cycle from 28.4 — "just ignore edge directions." Spot the flaw.

Answer

Ignoring directions conflates two different questions. The directed graph $a\to b,\ a\to c,\ b\to d,\ c\to d$ (a "diamond") has no directed cycle — it is a perfectly good DAG — yet its underlying undirected graph has the 4-cycle $a\!-\!b\!-\!d\!-\!c\!-\!a$. The undirected test would wrongly report a cycle. Directed cycle detection must respect direction: it marks vertices currently on the recursion stack (started but not finished) and reports a cycle only when an edge leads to such a vertex — a back edge to an ancestor. A correct topological_sort typically folds this check in, raising an error if it ever meets a vertex that is on the stack rather than finished.

🔄 Check Your Understanding 1. Why does a topological order exist only for a DAG? 2. In the worked example, the DFS discovery order started $5, 0, 2, 3, 1$. Why is that not a valid topological order, while the reversed finish order is? 3. If you ran topological_sort starting the outer loop from vertex $4$ instead of $5$, would you necessarily get the same output? Must it still be a valid topological order?

Answers

  1. A topological order is a linear arrangement with every edge pointing forward. A directed cycle $v_1\to v_2 \to \cdots \to v_1$ would force $v_1$ before $v_2$ before $\cdots$ before $v_1$ — $v_1$ strictly before itself, impossible. So cycles forbid any topological order; acyclicity (a DAG) is exactly the condition under which one exists. 2. Discovery order $5,0,\dots$ puts $0$ second, but $0$ is a sink that several tasks ($5,4$) must precede — it should be near the end. Discovery has no ordering guarantee with respect to edges; only the finishing-time property (proved above) does, and it requires reversing the finish order. 3. Not necessarily the same list — a graph can have many valid topological orders, and the starting vertex and neighbor order affect which one DFS produces. But it must still be valid: the finishing-time argument holds regardless of where the outer loop begins, so every output is a legitimate topological order.

28.6 The complexity of traversal

We have claimed BFS and DFS are "linear." Let us make that precise, because the bound depends entirely on the representation — and seeing exactly why is the analytical payoff of Chapter 14 applied to graphs.

The strategy first. To bound a traversal, count the total work, not the per-vertex work, and account for it in two buckets: work done per vertex and work done per edge. The trick — an amortized counting argument — is to notice that although a single vertex's neighbor-scan can be long, the scans across all vertices together examine each edge a fixed number of times. Summing per-vertex costs that vary wildly is hard; summing "one unit per edge, total" is easy. This is the aggregate method of Chapter 14.

Theorem. On an adjacency list, both BFS and DFS run in $\Theta(V+E)$ time, where $V=\lvert V\rvert$ and $E=\lvert E\rvert$. On an adjacency matrix, both run in $\Theta(V^2)$.

Proof (adjacency list). Each vertex is marked visited exactly once and therefore enqueued/pushed and dequeued/popped (or entered and exited by recursion) exactly once; this contributes $\Theta(V)$ across all vertices, plus $\Theta(V)$ for initializing the visited structure. When a vertex $v$ is processed, the algorithm scans its neighbor list once, doing $O(\deg(v))$ work. Summing over all vertices, the total neighbor-scanning work is $$\sum_{v\in V} \deg(v) = 2E$$ for an undirected graph by the handshaking lemma (Chapter 27), and $\sum_v (\text{out-deg}\,v) = E$ for a directed graph. Either way the scanning is $\Theta(E)$. Adding the buckets: $\Theta(V) + \Theta(E) = \Theta(V+E)$. The $+V$ term matters because of isolated or unreachable vertices and the cost of even starting: a graph with many vertices and no edges still takes $\Theta(V)$ just to set up and (for the component/sort variants) to loop over. $\blacksquare$

Proof (adjacency matrix). Identical accounting, except that listing the neighbors of $v$ now requires scanning its entire matrix row — all $n$ cells — regardless of how many are actually edges. So per-vertex work is $\Theta(V)$ and total work is $V \cdot \Theta(V) = \Theta(V^2)$, which swamps the $\Theta(E)$ edge term. On a dense graph $E = \Theta(V^2)$, so the two representations agree; on a sparse graph the matrix's $\Theta(V^2)$ is dramatically worse than the list's $\Theta(V+E)=\Theta(V)$. $\blacksquare$

That single result is the practical reason the adjacency list is the default representation for traversal-heavy code: on the sparse graphs that dominate practice, it turns a quadratic algorithm into a linear one.

Traversal Adjacency list Adjacency matrix Extra space
BFS $\Theta(V+E)$ $\Theta(V^2)$ $O(V)$ (queue + visited + dist)
DFS $\Theta(V+E)$ $\Theta(V^2)$ $O(V)$ (stack/recursion + visited)

💡 Intuition: "$\Theta(V+E)$" is the best you could hope for: any algorithm that must look at the whole graph has to spend time proportional to the size of its input, and the input is $V$ vertices and $E$ edges. BFS and DFS hit that floor. You cannot traverse a graph faster than reading it.

🔄 Check Your Understanding 1. Why is the neighbor-scanning work $\Theta(E)$ rather than $\Theta(V\cdot E)$, even though each vertex scans a neighbor list? 2. For a sparse graph with $V=10^6$ and $E=2\times 10^6$, give the approximate operation counts for list-based vs. matrix-based BFS. Which is feasible? 3. Why does the $\Theta(V+E)$ bound include the $+V$ term — when does it dominate?

Answers

  1. Because of the handshaking-lemma sum: $\sum_v \deg(v)=2E$ (undirected) or $\sum_v \mathrm{outdeg}(v)=E$ (directed). Each edge is examined a constant number of times across the whole run (once or twice total), not once per vertex. The per-vertex scans add up to "one pass over all edges," i.e. $\Theta(E)$, not a product. 2. List: about $V+E \approx 3\times 10^6$ operations — trivial for a modern machine. Matrix: about $V^2 = 10^{12}$ operations — roughly a trillion, far too slow (and the matrix needs $10^{12}$ cells of memory, which won't fit). Only the list version is feasible. 3. The $+V$ accounts for visiting/initializing every vertex even those with no edges, and for outer loops (components, topo sort) that touch every vertex. It dominates when the graph is very sparse — e.g. $E=0$ (all isolated vertices): then the work is purely $\Theta(V)$, and dropping the $+V$ would wrongly suggest $\Theta(0)$.

Project Checkpoint: adding bfs and dfs to graphs.py

In Chapter 27 you started dmtoolkit/graphs.py with a Graph class (an adjacency-list graph with add_edge and a neighbors(v) method). Now you add the two workhorse traversals — bfs(g, s) and dfs(g, s) — the functions on which Dijkstra (Chapter 29), connectivity, and topological sort all build. We return distances and parents from BFS (so the same function powers both "degrees of separation" and shortest-path reconstruction) and a visiting order from DFS.

Add to dmtoolkit/graphs.py (assuming Graph.neighbors(v) yields the neighbors of v):

from collections import deque

def bfs(g, s):
    """BFS from source s. Returns (dist, parent) dicts over reachable vertices.
    dist[v] is the number of edges on a shortest s->v path; parent rebuilds it."""
    dist = {s: 0}
    parent = {s: None}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in g.neighbors(v):
            if w not in dist:                 # 'in dist' == 'already discovered'
                dist[w] = dist[v] + 1
                parent[w] = v
                queue.append(w)
    return dist, parent

def dfs(g, s):
    """DFS from source s. Returns the list of reachable vertices in
    pre-order (discovery order)."""
    visited = set()
    order = []
    def visit(v):
        visited.add(v)
        order.append(v)
        for w in g.neighbors(v):
            if w not in visited:
                visit(w)
    visit(s)
    return order

# --- quick check on the chapter's example graph ---
g = Graph()
for u, w in [(0,1),(0,2),(1,3),(2,3),(3,4)]:
    g.add_edge(u, w)
dist, parent = bfs(g, 0)
print(dist[4], dfs(g, 0))
# Expected output:
# 3 [0, 1, 3, 2, 4]

The BFS dist[4] == 3 is the "degrees of separation" from vertex $0$ to vertex $4$ — the first real social-network measurement your Toolkit can make. The DFS pre-order [0, 1, 3, 2, 4] matches the hand-trace in 28.3.

How this builds toward the capstone. These two functions are the spine of Part V. In Chapter 29 you will refactor bfs into dijkstra by swapping the FIFO queue for a priority queue keyed on accumulated weight; the structure is identical. The DFS becomes the engine of topological_sort and cycle detection. By the capstone's social-network analyzer (Track B), bfs is what computes separation, component sizes, and the network's diameter — so the dozen lines you add now do real analytical work later. Keep the signatures exactly as written so the later modules compose.

Toolkit so far: logic.py (Ch. 1–3), sets.py (Ch. 8), relations.py (Ch. 12–13), combinatorics.py (Ch. 15–17), recurrences.py (Ch. 18–19), probability.py (Ch. 20–21), numbertheory.py (Ch. 22–23), crypto.py (Ch. 24–25), coding.py (Ch. 26), and now graphs.py gains bfs and dfs atop Chapter 27's Graph.


Summary

Graph traversal rests on two choices: how to store the graph and how to visit it. The reference facts to carry forward:

Representations

Adjacency matrix Adjacency list
Structure $n\times n$ table; $A[i][j]=1$ iff edge per-vertex neighbor lists
Space $\Theta(n^2)$ $\Theta(n+m)$
Edge test $\{i,j\}$? $\Theta(1)$ $\Theta(\deg i)$
List neighbors of $v$ $\Theta(n)$ $\Theta(\deg v)$
Best when dense; many edge tests; linear algebra sparse; traversal-heavy (the default)

Traversals

BFS DFS
Data structure FIFO queue LIFO stack / recursion (call stack)
Explores rings (by distance from $s$) deep along each branch, then backtracks
Special power shortest unweighted distances $\delta(s,v)$ finishing-time order (topo sort, cycles)
Marks visited at enqueue (each vertex queued once) at entry (or at pop, for the iterative form)
Time (list / matrix) $\Theta(V+E)$ / $\Theta(V^2)$ $\Theta(V+E)$ / $\Theta(V^2)$

Applications, each a thin layer over a traversal

  • Connectivity: one traversal reaches all $n$ vertices $\iff$ connected.
  • Connected components: relaunch a traversal from each unvisited vertex; one launch = one component; they partition $V$ (an equivalence relation, Ch. 12). Still $\Theta(V+E)$ overall.
  • Cycle detection (undirected): DFS; a back edge to a visited non-parent vertex $\Rightarrow$ cycle.
  • Topological sort (DAG): DFS post-order, reversed. Exists $\iff$ the digraph is acyclic. Re-derives Ch. 13's poset linear extension.
  • Spanning tree (intro): the tree edges of any BFS/DFS on a connected graph form a spanning tree — $n-1$ edges, no cycle (full treatment in Ch. 32).

Decision rules

  • Need shortest paths and all edges count equally → BFS.
  • Need topological order, cycle detection, or to recurse into structure → DFS.
  • Graph is sparse / you mostly traverse → adjacency list.
  • Graph is dense / you mostly test single edges → adjacency matrix.
  • Weighted edges → neither BFS nor DFS suffices for shortest paths; wait for Dijkstra (Ch. 29).

Spaced Review

Retrieval keeps knowledge available. Answer from memory before checking.

  1. (Ch. 27) State the handshaking lemma. Where exactly did we use it in this chapter, and what would the traversal bound become without it?
  2. (Ch. 27) A graph is connected iff what is true of a traversal started at a single vertex? Give the one-line test in terms of the size of the reached set.
  3. (Ch. 14) The traversal bound $\Theta(V+E)$ was obtained by an aggregate (amortized) counting argument. Restate the argument in one sentence, and explain why a naive "each of $V$ vertices scans up to $V$ neighbors" gives the looser bound $O(V^2)$.
  4. (Ch. 14) Order these from slowest- to fastest-growing for large $n$: $\Theta(V+E)$ on a sparse graph ($E=\Theta(V)$), $\Theta(V^2)$, $\Theta(V\log V)$. Which traversal/representation pairing achieves the fastest?

Answers

  1. The handshaking lemma: $\sum_{v\in V}\deg(v) = 2\lvert E\rvert$ (the sum of all degrees is twice the number of edges). We used it to show that the total neighbor-scanning work across all vertices is $\sum_v \deg(v) = 2E = \Theta(E)$. Without it we could only bound each scan by $V$ and would get the looser $O(V^2)$. 2. Connected iff the traversal from one vertex reaches every vertex; test: len(reached) == n (the number of reached vertices equals the total number of vertices). 3. Aggregate argument: each vertex is processed once ($\Theta(V)$) and, summed over all vertices, the neighbor scans examine each edge a constant number of times, totaling $\Theta(E)$ — so $\Theta(V+E)$. The naive bound multiplies instead of summing — "$V$ vertices times up to $V$ neighbors each" $=O(V^2)$ — because it ignores that most vertices have far fewer than $V$ neighbors and that the degrees sum to only $2E$. 4. From slowest- to fastest-growing: $\Theta(V^2) > \Theta(V\log V) > \Theta(V+E)=\Theta(V)$ on a sparse graph. The fastest pairing is BFS or DFS on an adjacency list, achieving $\Theta(V+E)$.

What's Next

BFS answers "how many edges lie between two vertices" — perfect when every edge is equal. But real maps and networks attach a weight to each edge: kilometers of road, milliseconds of latency, dollars of cost. Now the shortest path is the one of least total weight, and a route with more edges can beat one with fewer. BFS's ring argument collapses. The fix is to process vertices not in order of edge count but in order of accumulated weight, using a priority queue — and that single change turns BFS into Dijkstra's algorithm. Chapter 29 develops weighted graphs and shortest paths, proves Dijkstra correct with a greedy-plus-invariant argument that echoes the loop invariants of Chapter 6, and then handles the hard case Dijkstra cannot — negative edge weights — with Bellman–Ford. The traversals you built here are the chassis; Chapter 29 adds the engine.