Chapter 31: Trees — Key Takeaways

A one-page, exam-ready reference. Skim before a quiz; chase details back into index.md.


Definitions at a glance

Term Definition
Tree Connected, acyclic undirected graph
Forest Acyclic graph (disjoint union of trees; connectivity not required)
Leaf Vertex of degree 1 (rooted: a vertex with no children)
Rooted tree Tree + a distinguished root; edges oriented away from the root
Parent / child On edge $\{u,v\}$ with $u$ nearer the root: $u$ = parent, $v$ = child
Depth of $v$ # edges on the root-to-$v$ path (root has depth 0) — measured down
Height of tree Max depth over all vertices = longest root-to-leaf path — measured up
Binary tree Rooted tree, each node $\le 2$ children, labeled left/right
Full binary tree Every internal node has exactly 2 children
Complete binary tree All levels full except possibly the last, filled left-to-right
BST Binary tree where, at every node, left subtree keys $<$ node $<$ right subtree keys
Heap (min) Complete binary tree where every node $\le$ its children (min at root)
Expression tree Binary tree: internal nodes = operators, leaves = operands
Trie Rooted tree; edge labels are characters; root-to-node path spells a prefix
Prefix code Codeword set where no codeword is a prefix of another (⟺ symbols at leaves)
Huffman coding Greedy build of an optimal prefix-code tree by repeatedly merging the 2 lowest frequencies

Key theorems

Theorem Statement Proof engine
Characterizations (31.1) Tree ⟺ unique path between every pair ⟺ minimally connected ⟺ maximally acyclic cycle of implications
Leaf existence (31.3) Tree with $\ge 2$ vertices has a degree-1 vertex take the longest path; its endpoint has degree 1
Edge count (31.2) Tree on $n$ vertices has exactly $n-1$ edges induction: strip a leaf, recurse
Converse Connected graph with $n-1$ edges is a tree too few edges to also have a cycle
Huffman optimality (31.4) Huffman minimizes $\sum_s f(s)\,d(s)$ over all prefix codes greedy exchange argument + induction

Memory hook: connected + acyclic ⇒ exactly $n-1$ edges. Spend every edge on connectivity, none on redundancy. Remove any edge → disconnects; add any edge → makes a cycle.


Traversals — pick by when the root is visited

Traversal Visit order Yields Use it to…
Preorder root, $L$, $R$ prefix / Polish serialize/copy a tree, top-down listing
Inorder $L$, root, $R$ infix read a BST in sorted order
Postorder $L$, $R$, root postfix / RPN evaluate expression tree; free a tree
Level-order by increasing depth BFS on a tree (Ch. 28)

Sample tree 1(2(4,5),3): pre $=[1,2,4,5,3]$; in $=[4,2,5,1,3]$; post $=[4,5,2,3,1]$. Preorder + inorder uniquely reconstruct a binary tree.


Formulas to memorize

Quantity Formula
Edges in a tree $\lvert E\rvert = n - 1$
BST / heap operation cost $O(h)$, where $h$ = height
Balanced height $h \approx \log_2 n$ → $O(\log n)$ ops
Degenerate height $h \approx n$ → $O(n)$ ops (sorted input!)
Array heap: children of $i$ left $=2i+1$, right $=2i+2$
Array heap: parent of $i$ $\lfloor (i-1)/2 \rfloor$
Heap ops insert/remove-min $O(\log n)$, peek-min $O(1)$
Trie lookup (length $m$) $O(m)$, independent of dictionary size
Prefix-code cost $\sum_s f(s)\,d(s)$ ($f$ = frequency, $d$ = codeword length = leaf depth)

"Which structure?" decision aid

  • Need sorted dictionary with fast search/insert → balanced BST (AVL / red-black). Plain BST only if input is random — sorted input → $O(n)$ stick.
  • Need to repeatedly pull the min/maxbinary heap (priority queue; powers Dijkstra, Ch. 29).
  • Need to evaluate or store arithmeticexpression tree (evaluate in postorder).
  • Need prefix queries / autocomplete on strings → trie ($O(m)$ per query).
  • Need frequency-based compression with unambiguous decoding → Huffman prefix code.

Common pitfalls

  • Depth vs. height — opposite directions. Depth from the root (root = 0); height = max depth.
  • Plain BST ≠ balanced — sorted/adversarial inserts make a degenerate $O(n)$ chain. Use a self-balancing tree for guarantees.
  • Freeing a tree — must be postorder (children before parent), or you orphan/dangle the children.
  • Prefix-code leaves — every symbol must be at a leaf; a symbol at an internal node breaks prefix-freeness.
  • Induction base case — the $n-1$ edge proof starts at $n=1$ (0 edges); a single isolated vertex is itself a (trivial) tree.

Toolkit additions (graphs.py)

Function Purpose Builds on
is_tree(g) True iff $g$ is connected with $\lvert V\rvert-1$ edges Graph/neighbors (Ch. 27), bfs (Ch. 28)
tree_height(g, root) Longest root-to-leaf path length recursion on children

Feeds Ch. 32 (spanning trees / MST): is_tree validates a candidate spanning tree from Kruskal/Prim.


Connections

  • Back to Ch. 7: every tree proof here is structural induction (strip a leaf / split at root).
  • Back to Ch. 27: trees are graphs; edges counted via the handshaking lemma.
  • Heap = priority queue promised in Ch. 29 (Dijkstra).
  • Huffman → Ch. 38 (coding theory: optimal codes, error correction).
  • Exchange argument → Ch. 32 (MST cut property — the same greedy-correctness pattern).