Case Study: Inside the Calculator — Analyzing an Expression-Tree Evaluator

DataField.Dev

Case Study: Inside the Calculator — Analyzing an Expression-Tree Evaluator

"The structure of the data determines the structure of the program." — a folk theorem of computer science, in the spirit of Niklaus Wirth's Algorithms + Data Structures = Programs

Executive Summary

Every spreadsheet cell, every eval-free calculator, and every compiler front-end faces the same problem: a human wrote 3 + 4 * 2 - 1 as a flat string, but a machine can only compute by combining two numbers at a time, in the right order, respecting precedence. The bridge between the flat string and the stepwise computation is a tree — specifically the expression tree of §31.5. In this case study you will analyze a small but complete arithmetic evaluator: you will read its expression-tree representation, hand-derive what each of the three traversals from §31.3 produces, prove why postorder is the evaluation order, and trace the evaluator on a worked input down to the number. The emphasis here is analysis — understanding why an existing design is correct and what each part buys you — rather than building from a blank page. (Case Study 2 takes the build-it route, with a Huffman compressor.)

The punchline you will be able to defend: the three traversals of one expression tree are exactly the three notations humans and machines use for arithmetic — infix, prefix (Polish), and postfix (Reverse Polish) — and the choice of evaluation order is forced by the data dependency "an operator needs its operands' values first."

Skills applied

Reading an expression tree and connecting it to §31.5's definition (operators at internal nodes, operands at leaves).
Hand-deriving preorder, inorder, and postorder traversals (§31.3) and identifying them as prefix, infix, and postfix notation.
Proving, by structural induction, that a postorder evaluation returns the value of the expression.
Tracing a recursive evaluator and reading off its # Expected output: by hand.
Analyzing the cost of evaluation in terms of tree size (a Chapter 14 cost-analysis connection).

Background

The scenario

A teammate has shipped the parsing half of a small calculator — the part that turns a string like 3 + 4 * 2 into an expression tree — and asks you to review the evaluator that consumes the tree. You will not touch the parser (turning a string into a tree is a separate topic); your job is to be sure the tree-to-number step is correct, and to document, for the rest of the team, why the design uses the traversals it does.

Here is the node type and the tree the parser produces for 3 + 4 * 2. Note that the parser has already done the hard work of respecting precedence: multiplication binds tighter than addition, so the * node sits below the + node.

class Node:
    """An expression-tree node: a value plus optional left/right children.
    Leaves hold numbers; internal nodes hold an operator string."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

# Parser output for the string "3 + 4 * 2":
#        +
#       / \
#      3   *
#         / \
#        4   2
tree = Node('+', Node(3), Node('*', Node(4), Node(2)))

💡 Intuition: The tree is the precedence. In the flat string 3 + 4 * 2, precedence is an implicit rule you carry in your head. In the tree, precedence is made structural: because * is deeper than +, the multiplication is forced to happen first, no matter how we walk the tree. The parser's whole job is to translate the precedence rules into depth. Our job is to read a value off the finished tree — and the structure now does the bookkeeping for us.

Why this matters

Expression trees (also called abstract syntax trees, ASTs, when generalized beyond arithmetic) are the central data structure of every compiler and interpreter. Once code is a tree, everything a language tool does is a tree traversal: evaluation is one traversal, pretty-printing is another, type-checking is another, optimization rewrites the tree, and code generation walks it one last time. Understanding the three traversals on the small arithmetic case is the foundation for understanding all of them. And the correctness argument we will give — structural induction over the recursive definition of the tree — is exactly the Chapter 7 pattern the whole of Chapter 31 keeps cashing in.

🔗 Connection: This is the "compiler hook" the chapter's Overview promised: your compiler turns 2 * (3 + 4) into a tree before it emits a single instruction. Here we look inside that step. The same structural-induction proof technique you met in Chapter 7 (and used in Theorem 31.2 to count edges) is what certifies the evaluator.

What "well-formed" means — pinning down the hypothesis

Before we can prove anything about the evaluator, we must say precisely which trees it is supposed to handle, because a proof is only as good as the assumptions it names. We will call an expression tree well-formed when it satisfies the recursive definition of §31.5:

every leaf holds a number (an operand), and
every internal node holds a binary operator and has exactly two children.

This is itself a tiny structural definition (basis: a number-leaf; recursive clause: an operator with two well-formed subtrees), and it is exactly the class the parser is contracted to produce. It rules out malformed shapes that would crash or mislead the evaluator: an operator node with only one child (the recursion would call evaluate(None) and fail the leaf test), or a leaf holding an operator string (the first branch would return '+' as if it were a number). Naming well-formedness now is what lets the Phase-2 proof induct cleanly: the inductive step may assume both subtrees are themselves well-formed, which is precisely the recursive clause of the definition.

⚠️ Common Pitfall: Skipping the well-formedness assumption and "proving" the evaluator correct for all trees. The evaluator is not correct on a malformed tree (e.g. a '+' node with one child) — and that is fine, because the parser never produces one. A correctness theorem must state the class of inputs it covers; "correct on well-formed expression trees" is a true and useful theorem, while "correct on all binary trees" is false. Precision about the hypothesis is not pedantry; it is the difference between a theorem and a wish.

Phase 1: Read the three traversals off the tree

Before evaluating, let us analyze the tree's shape through the three traversals of §31.3. Recall the recursive definitions: for root $r$, left subtree $L$, right subtree $R$,

Preorder: visit $r$, then $L$, then $R$ (root first);
Inorder: visit $L$, then $r$, then $R$ (root between);
Postorder: visit $L$, then $R$, then $r$ (root last).

We hand-trace each on the tree for 3 + 4 * 2, which is Node('+', Node(3), Node('*', Node(4), Node(2))).

Preorder. Visit the root +. Recurse left into the leaf 3 (visit 3). Recurse right into the * node: visit *, then its left leaf 4, then its right leaf 2. Reading the visits in order:

$$\text{preorder} = +\ 3\ *\ 4\ 2.$$

This is prefix (Polish) notation: the operator precedes its operands. A prefix string is unambiguous without parentheses — + 3 * 4 2 can only mean "add 3 to (multiply 4 and 2)."

Inorder. Recurse left first into leaf 3 (visit 3), then visit the root +, then recurse right into the * subtree, whose inorder is 4 * 2. Reading in order:

$$\text{inorder} = 3\ +\ 4\ *\ 2.$$

This is infix notation — the everyday form. Notice the subtlety: the bare inorder string 3 + 4 * 2 has lost the precedence information that the tree made explicit. To recover the exact tree from infix you must add parentheses around each operator's subtree: 3 + (4 * 2). This is why infix needs precedence rules (or parentheses) but the tree does not.

Postorder. Recurse left into leaf 3 (visit 3), recurse right into the * subtree whose postorder is 4 2 *, then visit the root + last. Reading in order:

$$\text{postorder} = 3\ 4\ 2\ *\ +.$$

This is postfix / Reverse Polish Notation (RPN). Like prefix, it is parenthesis-free and unambiguous, and — crucially — it is the order in which a stack machine can evaluate without lookahead: push operands, and when an operator appears, pop the right number of operands and push the result.

def preorder(node):
    if node is None: return []
    return [node.val] + preorder(node.left) + preorder(node.right)

def inorder(node):
    if node is None: return []
    return inorder(node.left) + [node.val] + inorder(node.right)

def postorder(node):
    if node is None: return []
    return postorder(node.left) + [node.val] + postorder(node.right)

tree = Node('+', Node(3), Node('*', Node(4), Node(2)))
print(preorder(tree))    # prefix  (Polish)
print(inorder(tree))     # infix   (loses precedence without parens)
print(postorder(tree))   # postfix (Reverse Polish Notation)
# Expected output:
# ['+', 3, '*', 4, 2]
# [3, '+', 4, '*', 2]
# [3, 4, 2, '*', '+']

🔄 Check Your Understanding Without re-running the trace, give all three traversals for the tree of (3 + 4) * 2, which is Node('*', Node('+', Node(3), Node(4)), Node(2)). How does the inorder compare to the inorder above, and why is that a problem for infix notation?

Answer

Preorder: * + 3 4 2. Inorder: 3 + 4 * 2. Postorder: 3 4 + 2 *. The inorder is identical to the inorder of 3 + 4 * 2 from the body — both flatten to 3 + 4 * 2 — yet the two trees are different and evaluate to different numbers (14 vs. 11). This is exactly why infix is ambiguous without parentheses or precedence rules: two different trees share one infix string. Prefix and postfix, by contrast, distinguish them (* + 3 4 2 vs. + 3 * 4 2).

Phase 2: Why postorder is the evaluation order

The evaluator's correctness rests on one observation: to apply an operator, you need the values of both operand subtrees first. That is a data dependency, and it forces a bottom-up order — which is exactly postorder ($L$, $R$, then the node).

Here is the evaluator under review:

def evaluate(node):
    """Return the numeric value of an expression tree.
    Leaves hold numbers; internal nodes hold '+', '-', '*', or '/'."""
    if node.left is None and node.right is None:    # leaf: an operand
        return node.val
    a = evaluate(node.left)                          # value of left subtree  (recurse)
    b = evaluate(node.right)                         # value of right subtree (recurse)
    op = node.val
    if op == '+': return a + b
    if op == '-': return a - b
    if op == '*': return a * b
    return a / b                                     # op == '/'

Notice the shape of the recursion: it evaluates node.left, then node.right, then combines them at node. That is the postorder pattern — the recursion is a postorder traversal that, instead of appending the node's value to a list, applies the node's operator to the two values returned from below.

The strategy first (why evaluate is correct). We prove, by structural induction on the expression tree (Chapter 7), that evaluate(t) returns the mathematical value the tree denotes. The base case is a leaf (a lone number). The inductive step assumes the evaluator is correct on the two subtrees — which is legitimate because each subtree is strictly smaller — and shows the operator combines their values correctly. The recursion stepping into subtrees, not into "the previous case," is the cue for structural induction, just as in Theorem 31.2's leaf-stripping.

Claim. For every well-formed expression tree $t$ (leaves are numbers, internal nodes are binary operators with two children), evaluate(t) returns the value of the arithmetic expression $t$ denotes.

Proof (structural induction on $t$).

Base case ($t$ is a leaf). A leaf holds a number $v$ and denotes the value $v$. The first branch returns node.val $= v$. ✓

Inductive step ($t$ has root operator $\mathrm{op}$, left subtree $L$, right subtree $R$). By the inductive hypothesis, evaluate(L) returns the value $\llbracket L \rrbracket$ that $L$ denotes, and evaluate(R) returns $\llbracket R \rrbracket$. The code sets $a = \llbracket L \rrbracket$, $b = \llbracket R \rrbracket$, and returns $a \mathbin{\mathrm{op}} b$. By the definition of an expression tree (§31.5), the value $t$ denotes is precisely $\llbracket L \rrbracket \mathbin{\mathrm{op}} \llbracket R \rrbracket$. So the returned value equals the value of $t$. ✓

By structural induction, evaluate(t) is correct for every well-formed expression tree. $\blacksquare$

🔗 Connection: The proof has the identical skeleton to Theorem 31.2 ("strip to a smaller tree, apply the hypothesis"). There we removed a leaf; here we descend into subtrees. Both are structural induction (Chapter 7). When you see a recursive function on a tree, your reflex should be: the proof inducts on the same structure the recursion descends.

Phase 3: Trace the evaluator to a number

Let us trace evaluate on the tree for 3 + 4 * 2, i.e. Node('+', Node(3), Node('*', Node(4), Node(2))). Follow the calls bottom-up — the order the proof guarantees.

evaluate(root '+') is not a leaf, so it first calls evaluate(left) on the leaf 3.
evaluate(3) is a leaf → returns 3. So $a = 3$.
Back at the root, it calls evaluate(right) on the * node.
evaluate('*') is not a leaf; it calls evaluate(4) → 4 (so its $a = 4$), then evaluate(2) → 2 (its $b = 2$), then applies *: returns $4 \cdot 2 = 8$.
Back at the root, $b = 8$. The root applies +: returns $3 + 8 = 11$.

tree = Node('+', Node(3), Node('*', Node(4), Node(2)))
print(evaluate(tree))
# Expected output:
# 11

The result is $11$, matching ordinary arithmetic: $3 + (4 \cdot 2) = 3 + 8 = 11$. The precedence we "remembered" as humans was carried, instead, by the tree's shape — the * being deeper than the +.

🧩 Productive Struggle: Trace evaluate on the tree for (8 - 2) / 3, i.e. Node('/', Node('-', Node(8), Node(2)), Node(3)). Write down each subtree value before the final division. What number comes out, and what would change if the parser had instead built Node('-', Node(8), Node('/', Node(2), Node(3)))?

Answer

For (8 - 2) / 3: the left - subtree evaluates to $8 - 2 = 6$; the right leaf is $3$; the root / returns $6 / 3 = 2.0$. For the second tree 8 - (2 / 3): the right / subtree is $2/3 \approx 0.667$; the root `-` returns $8 - 0.667 = 7.333\ldots$. Same symbols, same operands, different tree, different answer — precedence/association lives in the structure.

Phase 3.5: The same tree, a different job — serializing with preorder

Evaluation is only one thing a compiler does with an expression tree. A second, equally common task is serialization: writing the tree out as text (to a file, a cache, or across a network) so it can be reconstructed later. The point of analyzing this here is that it uses a different traversal — and the reason it does so is, once again, a dependency, just a different one.

To rebuild a tree from a stream, the reconstructor must create each node before it can attach that node's children. So the writer should emit the parent before its children — which is exactly preorder. A reconstructor reading a preorder stream creates the root first, then recursively reads and attaches the left subtree, then the right.

def serialize(node):
    """Preorder serialization. '#' marks an absent child so the structure is recoverable."""
    if node is None:
        return ["#"]
    return [str(node.val)] + serialize(node.left) + serialize(node.right)

tree = Node('+', Node(3), Node('*', Node(4), Node(2)))
print(serialize(tree))
# Expected output:
# ['+', '3', '#', '#', '*', '4', '#', '#', '2', '#', '#']

Hand-tracing the preorder emission: visit +; go left to leaf 3 (emit 3, then # # for its two empty children); back up, go right to * (emit *); its left leaf 4 emits 4 # #; its right leaf 2 emits 2 # #. Reading in order gives the list above. The # markers are essential: a bare preorder value list (+ 3 * 4 2) cannot be reconstructed without also knowing where the empty children are — which is precisely the §31.3 Productive Struggle lesson that you need preorder plus inorder (or, as here, preorder plus null-markers) to pin down the shape.

💡 Intuition: Match the traversal to the dependency. Evaluation needs children-before-parent (postorder), because a parent's value depends on its children's values. Serialization needs parent-before-children (preorder), because a reconstructed child must be attached to an already-created parent. The data dependency points one way for evaluation and the other way for rebuilding — and the traversal simply follows it. This single principle ("which work depends on which?") chooses the traversal every time.

🔄 Check Your Understanding If you wanted to write the expression to a file in a form a human reads most naturally, which traversal would you use, and what extra information must you add to make it unambiguous?

Answer

Inorder, which produces the familiar infix form 3 + 4 * 2. But infix is ambiguous (Phase 1), so to make it unambiguous for a machine you must add parentheses around each operator's subtree — the to_infix extension in "Your Turn." Humans tolerate precedence rules; a round-trippable file should parenthesize.

Phase 4: Cost analysis — how expensive is evaluation?

A reviewer asks: how does evaluation scale? The answer is a clean tree-size argument.

evaluate visits each node exactly once (it makes one recursive call per child, and each node is a child of exactly one parent, plus the root). At each node it does a constant amount of work: one comparison to test for a leaf, and one arithmetic operation if it is an internal node. Therefore the total work is $\Theta(n)$, where $n$ is the number of nodes — linear in the size of the expression. This is the best possible: any evaluator must at least look at every operand and operator once.

🔗 Connection: "Each node visited once, constant work per node, so $\Theta(n)$ total" is the standard tree-traversal cost argument you first met framed as cost analysis in Chapter 14. It recurs for every traversal in this chapter — preorder, inorder, postorder, and the trie and heap walks — and is worth committing to reflex.

One subtlety worth flagging in review: the recursion depth equals the height $h$ of the tree, so the call stack uses $O(h)$ space. For a balanced expression tree $h = O(\log n)$, but for a deeply right-nested expression like 1 + (2 + (3 + (\dots))) the tree is a stick of height $h = \Theta(n)$, and a naive recursive evaluator can overflow the call stack on a very long expression — the same "degenerate stick" phenomenon as the unbalanced BST in §31.4, here costing stack space rather than time.

⚠️ Common Pitfall: Confusing time and stack space for a tree traversal. The time is $\Theta(n)$ regardless of shape (every node is visited once). The recursion depth — and hence stack space — is $\Theta(h)$, which is $\Theta(\log n)$ only when the tree is balanced. A degenerate, stick-shaped expression tree is fine on time but can blow the stack; production evaluators often convert the recursion to an explicit stack (effectively evaluating the postfix form directly).

Discussion Questions

The evaluator's correctness proof inducted on the structure of the tree, not on the number of nodes or the height. Could you instead prove it by induction on the number of nodes? Sketch how, and say which induction feels more natural for tree-shaped data and why.
We saw that inorder loses precedence (two different trees share one infix string), while prefix and postfix do not. Explain, in tree terms, why prefix and postfix are unambiguous but infix is not. (Hint: in prefix/postfix, where does each operator sit relative to the operands it governs, and how does that fix the grouping?)
The evaluator handles +, -, *, /. Suppose you add a unary minus (negation), which has one operand. How would the Node representation and the leaf test in evaluate change? What does a unary operator do to the "binary tree" assumption?
A division-by-zero Node('/', Node(1), Node(0)) would crash evaluate. Where in the recursion does the error occur, and at what point in the postorder schedule? How does this illustrate that errors in tree evaluation surface bottom-up?
The evaluator runs in $\Theta(n)$ time and $\Theta(h)$ stack space. For which class of input expressions are these two quantities the same order, and for which do they differ the most? Tie your answer to §31.4's balanced-versus-degenerate distinction.

Your Turn: Extensions

Option A (pretty-print with parentheses). Write to_infix(node) that returns a fully parenthesized infix string, so that (3 + (4 * 2)) round-trips back to the same tree. This is the inorder traversal that does not lose precedence, because it parenthesizes every subtree. Test it by hand on the two trees from Phase 1's Check Your Understanding and confirm they now produce different strings.
Option B (evaluate postfix directly). Write a stack-based evaluator that consumes the postfix token list (e.g. [3, 4, 2, '*', '+']) without building a tree: push numbers; on an operator, pop two operands, apply, push the result. Hand-trace it on the postfix from Phase 1 and confirm it returns 11. Argue why this uses $O(\text{stack depth}) = O(h)$ space and avoids deep recursion.
Option C (count operations). Instrument evaluate to count how many arithmetic operations it performs, and confirm by hand that for an expression tree with $k$ internal nodes (operators) it performs exactly $k$ operations. Connect this to Exercise 31.10 (a full binary tree with $\ell$ leaves has $\ell - 1$ internal nodes): if there are $\ell$ operands, how many operations does evaluation do?
Option D (variables). Extend leaves to allow variable names (e.g. Node('x')) and give evaluate a dictionary env = {'x': 5}. State the one-line change to the leaf branch, and note that you have now built a tiny interpreter — evaluation of an AST against an environment, the core loop of every scripting language.

Key Takeaways

One tree, three notations. The preorder, inorder, and postorder traversals of an expression tree are prefix (Polish), infix, and postfix (RPN). Prefix and postfix are parenthesis-free and unambiguous; infix loses precedence and needs parentheses or precedence rules.
Postorder is the evaluation order — and it is forced. An operator needs both operands' values first, a data dependency that mandates bottom-up evaluation. The recursive evaluate is a postorder traversal that applies operators instead of listing nodes.
Correctness is structural induction. The evaluator is proved correct by inducting on the tree's recursive structure (Chapter 7) — the same engine as Theorem 31.2's edge count. The recursion and the proof descend the same structure.
Cost is $\Theta(n)$ time, $\Theta(h)$ stack space. Every node is visited once (linear time), but recursion depth equals the height — balanced trees use $O(\log n)$ stack, degenerate sticks use $O(n)$ and can overflow it. This is the §31.4 height story reappearing.
The tree carries the meaning the string only implied. Precedence and association, which a human tracks as rules over a flat string, become structural (depth) in the tree — which is exactly why compilers parse to a tree before doing anything else.