Further Reading: Spanning Trees, MSTs, and Network Design

Annotated pointers for going deeper. Start with the textbook treatments to firm up the algorithms and the cut-property proof, then branch toward union-find's analysis, the history, and the applications (clustering, approximation) as your interest pulls you. All sources are Tier-1 or Tier-2.

Core textbook treatments

Rosen, Discrete Mathematics and Its Applications (8th ed.), §11.4–11.5 (trees, spanning trees, minimum spanning trees) The market-standard discrete-math treatment, with Kruskal's and Prim's algorithms presented at exactly this chapter's level and a large exercise bank. The place to go for more drill problems on hand-tracing the two algorithms.

Cormen, Leiserson, Rivest & Stein (CLRS), Introduction to Algorithms (4th ed.), Chapter 23 (Minimum Spanning Trees) and Chapter 19 (Data Structures for Disjoint Sets) The canonical algorithmic reference. Chapter 23 develops the generic MST algorithm and the cut property as a single unifying lemma — Kruskal and Prim both fall out as instances, mirroring §32.5 exactly. Chapter 19 is the reference for union-find, including the union-by-rank-plus-path-compression analysis. Read these two together with this chapter.

Lehman, Leighton & Meyer, Mathematics for Computer Science (MIT 6.042), the graph-theory chapters on trees and spanning trees Freely available. The CS-flavored counterpart: spanning trees developed alongside the proof technique, with the same "prove it, don't just assert it" spirit as this chapter. A good companion for the cut- and cycle-property arguments.

On the algorithms in depth

West, Introduction to Graph Theory (2nd ed.), the spanning-tree sections A more mathematically thorough treatment of trees and spanning trees, including Cayley's formula and the matrix-tree theorem mentioned in §32.1. Reach for this if the "how many spanning trees?" question (and its beautiful closed forms) caught your interest.

Kleinberg & Tardos, Algorithm Design, the chapter on greedy algorithms (minimum spanning trees and clustering) The clearest textbook account of why greedy works here, built entirely on exchange arguments, and it explicitly connects the MST to maximum-spacing clustering — the theoretical backing for Case Study 1. The single best next read if you want to generalize the cut property to other greedy proofs.

Sedgewick & Wayne, Algorithms (4th ed.), §4.3 (Minimum Spanning Trees) Companion site freely available. A working-programmer's presentation with carefully engineered Kruskal and Prim implementations (including the lazy vs. eager Prim distinction) and excellent visualizations of the cut property in action. Good if you want to see production-quality code after building yours from scratch.

History and the bigger picture

Graham & Hell, "On the History of the Minimum Spanning Tree Problem" (Annals of the History of Computing, 1985) Tier 2. The definitive history: Borůvka's 1926 origin (the Moravian electrical grid named in §32.6), Jarník, Kruskal, and Prim, and how the problem was repeatedly rediscovered. A short, readable account that puts the algorithms in context.

Chazelle, "A Minimum Spanning Tree Algorithm with Inverse-Ackermann Type Complexity" (Journal of the ACM, 2000) Tier 2. The frontier of MST algorithms — a near-linear-time method whose complexity involves the same inverse-Ackermann function as union-find. Skim the introduction for a sense of how far past $O(E \log V)$ the problem has been pushed; the full proof is research-level.

Applications

Tan, Steinbach, Karpatne & Kumar, Introduction to Data Mining (2nd ed.), the hierarchical clustering section Tier 2. The standard data-mining treatment of single-linkage (and other) agglomerative clustering, including the chaining failure mode demonstrated in Case Study 1. Read it to see the MST-based view sitting inside the broader clustering landscape.

CLRS (4th ed.), §35.2 (the MST-based 2-approximation for metric TSP) The clean proof that walking an MST yields a traveling-salesman tour at most twice optimal — the approximation result previewed in §32.6. A satisfying payoff showing the MST as scaffolding under a hard problem.

Suggested order

Re-read §§32.3–32.5 here, then work Rosen §11.4–11.5 for hand-tracing drill.
Read CLRS Chapter 23 (generic MST + cut property) and Chapter 19 (union-find) together — this is the algorithmic core.
Read the Kleinberg–Tardos greedy/MST chapter for the exchange-argument generalization and the clustering connection (pairs with Case Study 1).
For history, read Graham & Hell; for the clustering application, the Introduction to Data Mining section; save Chazelle and the TSP approximation for a deep dive after Chapter 30.