Case Study: Building a Dependency-Aware Build Scheduler with Topological Sort
"A build system's only real job is to do things in an order that never breaks a promise."
Executive Summary
In this build-focused case study you will construct, from scratch, a miniature build scheduler — the
beating heart of tools like make, Bazel, and every CI pipeline. Given a set of build targets and their
"must come before" dependencies, the scheduler must output an order in which each target is built only
after everything it depends on. That is exactly the topological sort of §28.5, recovered as a
depth-first-search post-order. But a real scheduler must also refuse impossible inputs: if the
dependencies contain a cycle ("A needs B, B needs A"), no valid order exists, and the tool must say
so rather than emit garbage. So you will also build the directed cycle detector that the chapter
gestures at — the three-state DFS that watches its own recursion stack.
You will build in four increments, each a working program: (1) parse dependencies into a directed graph,
(2) topologically sort a known-good DAG, (3) detect a cycle and report it, and (4) fold both into one
robust schedule() that either returns a build order or raises a precise error. Where Case Study 1 was
about reading a graph, this one is about building a tool on top of a traversal.
Skills applied
- Translating "X must be built before Y" into a directed graph and an adjacency list (§28.1, Chapter 27).
- Implementing topological sort as a DFS post-order, reversed (§28.5).
- Implementing directed cycle detection with the unvisited / on-stack / finished state machine (§28.5, contrasted with the undirected test of §28.4).
- Recognizing that topological order exists iff the dependency graph is a DAG (§28.5, Chapter 13).
- Composing the two into a production-shaped API that fails loudly on bad input.
Background
The scenario
You are writing the dependency engine for a small build tool. A project's build is described by targets and dependency edges. We adopt the convention from §28.5: an edge $a \to b$ means "$a$ must be built before $b$" (so the arrow points from prerequisite to dependent). A valid build order is then a listing of all targets in which, for every edge $a \to b$, target $a$ appears before target $b$ — precisely a topological order.
🔗 Connection. You met this problem twice already. In Chapter 13 a topological sort was a linear extension of a partial order — flattening a poset of "must precede" relations into a sequence — and
makewas the motivating example. In §28.5 we re-derived the same ordering from depth-first search. This case study turns that re-derivation into a working tool, and — the new piece — handles the input a theory chapter can assume away: a dependency graph that is not a DAG.
The dependencies
Here is a realistic (if small) C-project build. Read each edge as "left must be built before right."
config -> utils (utils includes config.h)
config -> db
utils -> db
utils -> api
db -> api
api -> web
api -> cli
logger -> utils
logger -> api
Targets with no outgoing edges (web, cli) are final artifacts; targets with no incoming edges
(config, logger) are leaves of the dependency tree that can be built first. Our scheduler must
discover an order — some order — respecting all nine edges.
💡 Intuition: Think of each edge as a promise: "I will not build
dbuntilconfigandutilsare done." A topological sort is a single global ordering that keeps every promise simultaneously. DFS finds one by the §28.5 trick: a target finishes (in DFS) only after everything reachable from it finishes, so listing targets in reverse finish order puts every prerequisite ahead of its dependents.
Phase 1: Parse dependencies into a directed graph
First, build the adjacency list from a flat list of edges. Every target must appear as a key — even ones with no dependencies — so the traversal knows it exists (the §28.1 lesson: an isolated vertex still needs a key, here an empty neighbor list).
def build_graph(edges):
"""edges: list of (prereq, dependent) pairs meaning prereq -> dependent.
Returns an adjacency list with EVERY mentioned target as a key."""
adj = {}
for a, b in edges:
adj.setdefault(a, [])
adj.setdefault(b, []) # ensure dependents are keys too
adj[a].append(b)
return adj
edges = [
("config", "utils"), ("config", "db"),
("utils", "db"), ("utils", "api"),
("db", "api"),
("api", "web"), ("api", "cli"),
("logger", "utils"), ("logger", "api"),
]
adj = build_graph(edges)
for k in adj:
print(k, "->", adj[k])
# Expected output:
# config -> ['utils', 'db']
# utils -> ['db', 'api']
# db -> ['api']
# api -> ['web', 'cli']
# web -> []
# cli -> []
# logger -> ['utils', 'api']
The keys appear in insertion order (Python 3.7+ dicts preserve it): config, utils, db, api, web, cli,
logger. That order matters — it is the order the topological sort's outer loop will consider targets, and
hence (together with neighbor order) determines which valid topological order we get among the many that
exist.
🔄 Check Your Understanding Why do
webandcliappear as keys with empty lists, even though they are never on the left of an edge?
Answer
They are still targets — they must be built, and the scheduler must place them in the output. The
setdefault(b, [])line guarantees every dependent becomes a key. A target absent from the keys would simply be missing from the build order, exactly the §28.1 reason an isolated vertex still gets an empty neighbor list.
Phase 2: Topological sort of the (good) DAG
Now the core algorithm, taken verbatim from §28.5: DFS, append each vertex when it finishes, then reverse.
def topological_sort(adj):
"""Topological order of a DAG via DFS post-order (reversed)."""
visited = set()
finish = [] # vertices in order of FINISHING
def visit(v):
visited.add(v)
for w in adj[v]:
if w not in visited:
visit(w)
finish.append(v) # record AFTER all descendants finish
for v in adj:
if v not in visited:
visit(v)
finish.reverse()
return finish
order = topological_sort(adj)
print(order)
# Expected output:
# ['logger', 'config', 'utils', 'db', 'api', 'cli', 'web']
This output is load-bearing, so let us hand-derive it with a finish-order trace, exactly like the
§28.5 worked example. The outer loop visits keys in insertion order: config, utils, db, api, web, cli,
logger. It starts at config.
| Event | Detail | finish (append order) |
|---|---|---|
visit(config) |
dive to first neighbor utils |
— |
visit(utils) |
dive to first neighbor db |
— |
visit(db) |
dive to its only neighbor api |
— |
visit(api) |
dive to web |
— |
finish web |
leaf (no neighbors) | [web] |
back in api |
next neighbor cli |
— |
finish cli |
leaf | [web, cli] |
finish api |
both neighbors done | [web, cli, api] |
finish db |
its neighbor api done |
[web, cli, api, db] |
back in utils |
next neighbor api already visited |
— |
finish utils |
done | [web, cli, api, db, utils] |
back in config |
next neighbor db already visited |
— |
finish config |
done | [web, cli, api, db, utils, config] |
| outer loop | utils…cli all visited; reach logger |
— |
visit(logger) |
neighbors utils, api already visited |
— |
finish logger |
nothing new | [web, cli, api, db, utils, config, logger] |
Finish order is [web, cli, api, db, utils, config, logger]; reversed, the topological order is
['logger', 'config', 'utils', 'db', 'api', 'cli', 'web']. Now verify the promises — spot-check a few
edges against the output:
def respects(order, edges):
pos = {t: i for i, t in enumerate(order)}
return all(pos[a] < pos[b] for a, b in edges)
print("all dependencies respected?", respects(order, edges))
print("config before db?", order.index("config") < order.index("db"))
print("logger before api?", order.index("logger") < order.index("api"))
# Expected output:
# all dependencies respected? True
# config before db? True
# logger before api? True
Every edge points "forward" in the list — config (index 1) precedes db (index 3); api (index 4)
precedes both cli (5) and web (6); logger (index 0) precedes everything it feeds. That is the whole
promise of a topological order.
⚠️ Common Pitfall: This output is one valid order, not the order.
['config', 'logger', ...]would be equally valid —configandloggerare incomparable (neither must precede the other), so a build system is free to choose, and could even build them in parallel. The §28.5 algorithm produces a specific one determined by key-insertion order and neighbor order; do not write a test that demands an exact list unless you have pinned both. Test the property (respects(order, edges)), not the exact permutation.🐛 Find the Error. A teammate "simplifies" the sort by recording each target at discovery (pre-order) and skipping the
finish.reverse():def visit(v): visited.add(v); order.append(v); for w .... On this graph it happens to emit['config', 'utils', 'db', 'api', 'web', 'cli', 'logger']. Spot the bug and the failing promise.
Answer
Recording at discovery has no ordering guarantee with respect to edges (§28.5). On this graph the emitted list ends with
logger, butlogger -> utilsandlogger -> apirequireloggerto come beforeutilsandapi— and it comes after both. The promiselogger before utilsis broken. Pre-order passes on lucky inputs and fails on the rest; only the reversed finish order is correct. This is exactly theme two of the book: passing on one input is not correctness.
Phase 3: Detect a cycle (so we can reject impossible builds)
Phase 2 assumed the input was a DAG. Real configs contain mistakes. Suppose someone adds one bad edge —
api -> config ("api must be built before config") — to the existing dependencies. Now there is a
directed cycle: config -> utils -> api -> config. No build order can satisfy it (config would
have to come before and after itself), so the scheduler must detect it.
Here is the §28.4 trap, made concrete: the undirected cycle test does not work for directed graphs. We need the directed version — a DFS that tracks three states per target:
- unvisited — not yet reached;
- on the recursion stack ("gray") — started but not finished, i.e. an ancestor of the current call;
- finished ("black") — fully explored.
A directed cycle exists exactly when DFS follows an edge to a target that is currently on the stack — a back edge to an ancestor.
def find_cycle(adj):
"""Return a target that lies on a directed cycle, or None if the graph is a DAG."""
UNVISITED, ON_STACK, DONE = 0, 1, 2
state = {v: UNVISITED for v in adj}
def visit(v):
state[v] = ON_STACK
for w in adj[v]:
if state[w] == ON_STACK: # edge to an ancestor -> cycle!
return w
if state[w] == UNVISITED:
hit = visit(w)
if hit is not None:
return hit
state[v] = DONE
return None
for v in adj:
if state[v] == UNVISITED:
hit = visit(v)
if hit is not None:
return hit
return None
good = build_graph(edges)
bad = build_graph(edges + [("api", "config")]) # inject the cycle
print("cycle in good graph:", find_cycle(good))
print("cycle in bad graph:", find_cycle(bad))
# Expected output:
# cycle in good graph: None
# cycle in bad graph: config
Trace the bad graph to see config get caught. DFS starts at config, marks it ON_STACK, dives
config -> utils -> db -> api (each marked ON_STACK as it is entered). Now inside visit(api) the
neighbors are web, cli, config (the injected edge appended last). It finishes web and cli (marking
them DONE), then examines the edge api -> config: state["config"] is ON_STACK — config is an
ancestor still open on the recursion stack — so DFS returns config. The cycle
config -> utils -> db -> api -> config is exposed. On the good graph, no edge ever points to an
ON_STACK target (every back-pointing edge lands on a DONE one), so find_cycle returns None.
💡 Intuition: "ON_STACK" is the set of targets on the current path from the root of the DFS tree — your unbroken chain of open recursive calls. An edge to one of those is a shortcut back up your own path: a loop. An edge to a DONE target is fine — it points into a region you already finished, off your current path, so it closes no loop. The whole subtlety of directed cycle detection is distinguishing "ancestor on my path" (ON_STACK) from "already finished elsewhere" (DONE) — a distinction the undirected test of §28.4 never needed.
🔄 Check Your Understanding Why is the undirected
has_cyclefrom §28.4 — "report a cycle on any edge to a visited non-parent vertex" — wrong here? Give the smallest DAG it would misjudge.
Answer
Ignoring direction conflates two questions (§28.5). The diamond
a -> b, a -> c, b -> d, c -> dis a perfectly good DAG with no directed cycle, yet its underlying undirected graph has the 4-cycle $a\!-\!b\!-\!d\!-\!c\!-\!a$, which the undirected test would flag. Directed detection must respect direction by tracking the recursion stack, reporting a cycle only on a back edge to an ancestor (ON_STACK), not to any finished vertex.
Phase 4: Compose — a robust schedule() that fails loudly
A production scheduler should return a build order or refuse with a precise error — never silently emit a broken order. Compose Phases 2 and 3: check for a cycle first; if clean, return the topological sort.
class CyclicDependencyError(Exception):
pass
def schedule(edges):
"""Return a valid build order, or raise CyclicDependencyError if impossible."""
adj = build_graph(edges)
bad = find_cycle(adj)
if bad is not None:
raise CyclicDependencyError(
f"dependency cycle detected (involves '{bad}'); no build order exists"
)
return topological_sort(adj)
# Good input -> an order:
print(schedule(edges))
# Bad input -> a precise failure:
try:
schedule(edges + [("api", "config")])
except CyclicDependencyError as e:
print("ERROR:", e)
# Expected output:
# ['logger', 'config', 'utils', 'db', 'api', 'cli', 'web']
# ERROR: dependency cycle detected (involves 'config'); no build order exists
This is the §28.5 theorem made operational: a topological order exists if and only if the dependency
graph is a DAG. schedule enforces both halves — it produces an order when one exists, and proves
none exists (by exhibiting a cycle) when it does not.
⚠️ Common Pitfall: Running
topological_sorton a graph with a cycle does not raise an error on its own — the §28.5 implementation will happily return some list (because itsvisitedset treats on-stack and finished vertices identically), and that list will silently violate a dependency. The integrity comes from checkingfind_cyclefirst. A robust real-world implementation often fuses the two passes — a single DFS that builds the post-order and raises the instant it meets an ON_STACK vertex — but the two-pass version here is easier to read and prove correct, and asymptotically identical at $\Theta(V + E)$.🔄 Check Your Understanding
schedulerunsfind_cycle(one DFS) and thentopological_sort(another DFS). What is its overall running time, and why does running two traversals not change the asymptotic class?
Answer
Each DFS is $\Theta(V + E)$ on the adjacency list (§28.6). Two of them in sequence add: $\Theta(V+E) + \Theta(V+E) = \Theta(V + E)$ — a constant factor of 2, which big-Theta absorbs. The scheduler is linear in the size of the dependency graph.
Phase 5: Counting the valid orders (a structural aside)
How many valid build orders does the good graph admit? Each one is a distinct topological order, and the
count depends on how much freedom the partial order leaves. config and logger are incomparable, and
once their prerequisites are met, web and cli (both depending only on api) can come in either order.
A by-hand enumeration shows the orders factor as "interleave the two independent leaves at the front" ×
"order the two independent artifacts at the end," giving several valid permutations — all of which
respects(order, edges) would accept. The single order schedule returns is just the one DFS happens to
produce. For a build tool the practical consequence is profound: incomparable targets can be built in
parallel, and the count of topological orders measures the available parallelism.
🔗 Connection. This ties back to Chapter 13: the number of linear extensions of a poset is a classic (and computationally hard in general) quantity. A build tool does not enumerate them; it picks one cheaply with DFS and exploits the incomparability to schedule independent work concurrently.
Discussion Questions
- We chose the convention $a \to b$ = "$a$ before $b$" and used DFS post-order reversed. If you flip
the convention so $a \to b$ means "$a$ depends on $b$" (arrow from dependent to prerequisite), what
changes in
topological_sort— do you still reverse, or not? Work it out on a two-node example. scheduledoes two DFS passes. Sketch how to fuse them into one DFS that returns the post-order and raisesCyclicDependencyErrorthe instant it follows an edge to an ON_STACK vertex. Which single data structure replaces bothvisitedand the post-order list?- The error message names one target on the cycle (
config). How would you modifyfind_cycleto report the entire cycle as a path (e.g.config -> utils -> db -> api -> config)? (Hint: keep the recursion stack as an explicit list and slice it when you hit the ON_STACK vertex.) - Real build tools (
make, Bazel) also want to rebuild only what changed. Given the graph and a set of "dirty" targets, how would you use a traversal to compute exactly the set of targets that must be rebuilt — and in what order? Which traversal, and from which sources? - Topological sort says some order exists; it does not minimize build time. If each target had a known compile duration and unlimited parallel workers, which classic quantity over the DAG gives the shortest possible total build time, and why is a plain topological order not enough to find it?
Your Turn: Extensions
- Option A (fuse the passes). Implement
schedule_one_pass(edges): a single DFS that appends to the post-order and raisesCyclicDependencyErroron a back edge to an ON_STACK vertex. Verify it returns the same order asscheduleon the good graph and raises on the bad one. - Option B (report the full cycle). Extend
find_cycleto return the cycle as a list of targets, by maintaining an explicit path stack and slicing it when an ON_STACK vertex is met. Hand-derive the cycle list for the injectedapi -> configedge: it should be['config', 'utils', 'db', 'api'](or a rotation of it). - Option C (incremental rebuild). Given a set
dirtyof changed targets, compute the set of targets that transitively depend on any dirty one (a traversal forward along the edges from each dirty target), then topologically sort just that subset. Test it withdirty = {"utils"}and hand-derive which targets must rebuild. - Option D (Kahn's algorithm). Implement the other standard topological sort — the BFS-flavored Kahn's algorithm: repeatedly remove a target with in-degree $0$, decrementing its dependents' in-degrees. It both sorts and detects cycles (if any targets remain with nonzero in-degree, a cycle exists). Compare its output and its cycle-handling to the DFS version on both graphs.
Key Takeaways
- Topological sort = DFS post-order, reversed. Record each target when it finishes, reverse the list, and every prerequisite precedes its dependents — the §28.5 algorithm, now a working scheduler.
- A real tool must reject impossible inputs. A topological order exists iff the graph is a DAG; a scheduler must detect cycles and fail loudly rather than emit a broken order (§28.5, Chapter 13).
- Directed cycle detection needs three states. The undirected §28.4 test is wrong for directed graphs; you must distinguish on the recursion stack (ancestor) from finished, reporting a cycle only on a back edge to an ON_STACK vertex.
- Sequencing adds. Two DFS passes are still $\Theta(V + E)$ — linear in the dependency graph (§28.6). Composing traversals does not change the asymptotic class.
- Many orders, one chosen. The output is one valid topological order among possibly many; test the
property (
respects), not the exact permutation. Incomparable targets are the parallelism a build tool can exploit (Chapter 13).