Case Study 1: Building an Expression Evaluator
The Problem
Arithmetic expressions like 3 + 4 * 2 - 1 seem simple when you read them, but teaching a computer to evaluate them correctly — respecting operator precedence and parentheses — is one of the classic problems in computer science. The solution, dating back to Edsger Dijkstra's 1961 "shunting-yard algorithm," relies entirely on stacks.
In this case study, we build a complete expression evaluator in Pascal that:
- Accepts an infix expression as a string (e.g.,
"3 + 4 * 2") - Converts it to postfix notation (also called Reverse Polish Notation, or RPN) using the shunting-yard algorithm
- Evaluates the postfix expression using a stack
- Returns the numeric result
This case study demonstrates that choosing the right data structure — a stack — transforms a seemingly complex problem into a sequence of simple push/pop operations.
Background: Three Notations
Consider the expression "three plus four times two":
| Notation | Form | How It Works |
|---|---|---|
| Infix | 3 + 4 * 2 |
Operator between operands. Requires precedence rules. |
| Prefix (Polish) | + 3 * 4 2 |
Operator before operands. No ambiguity. |
| Postfix (RPN) | 3 4 2 * + |
Operator after operands. No ambiguity. |
Infix notation is what humans prefer, but it is ambiguous without precedence rules: does 3 + 4 * 2 mean (3 + 4) * 2 = 14 or 3 + (4 * 2) = 11? Convention says multiplication binds tighter, giving 11, but a naive left-to-right parser would give 14.
Postfix notation eliminates ambiguity. In 3 4 2 * +, we process tokens left to right:
- 3 → push 3
- 4 → push 4
- 2 → push 2
- * → pop 2 and 4, push 8
- + → pop 8 and 3, push 11
The result is 11. No precedence rules needed, no parentheses.
Step 1: Data Structures
We need two kinds of stacks — one for numbers (reals) and one for operators (characters):
program ExpressionEvaluator;
type
{ Number stack for evaluation }
PNumNode = ^TNumNode;
TNumNode = record
Value: Real;
Next: PNumNode;
end;
TNumStack = record
Top: PNumNode;
Count: Integer;
end;
{ Character stack for shunting-yard }
PCharNode = ^TCharNode;
TCharNode = record
Ch: Char;
Next: PCharNode;
end;
TCharStack = record
Top: PCharNode;
Count: Integer;
end;
The stack operations follow the same pattern from the chapter:
{ Number stack operations }
procedure InitNumStack(var S: TNumStack);
begin
S.Top := nil;
S.Count := 0;
end;
function NumIsEmpty(const S: TNumStack): Boolean;
begin
NumIsEmpty := (S.Top = nil);
end;
procedure NumPush(var S: TNumStack; Value: Real);
var
N: PNumNode;
begin
New(N);
N^.Value := Value;
N^.Next := S.Top;
S.Top := N;
Inc(S.Count);
end;
function NumPop(var S: TNumStack): Real;
var
Temp: PNumNode;
begin
if NumIsEmpty(S) then
begin
WriteLn('Error: Number stack underflow');
Halt(1);
end;
NumPop := S.Top^.Value;
Temp := S.Top;
S.Top := S.Top^.Next;
Dispose(Temp);
Dec(S.Count);
end;
{ Character stack operations }
procedure InitCharStack(var S: TCharStack);
begin
S.Top := nil;
S.Count := 0;
end;
function CharIsEmpty(const S: TCharStack): Boolean;
begin
CharIsEmpty := (S.Top = nil);
end;
procedure CharPush(var S: TCharStack; Ch: Char);
var
N: PCharNode;
begin
New(N);
N^.Ch := Ch;
N^.Next := S.Top;
S.Top := N;
Inc(S.Count);
end;
function CharPop(var S: TCharStack): Char;
var
Temp: PCharNode;
begin
if CharIsEmpty(S) then
begin
WriteLn('Error: Operator stack underflow');
Halt(1);
end;
CharPop := S.Top^.Ch;
Temp := S.Top;
S.Top := S.Top^.Next;
Dispose(Temp);
Dec(S.Count);
end;
function CharPeek(const S: TCharStack): Char;
begin
if CharIsEmpty(S) then
CharPeek := #0
else
CharPeek := S.Top^.Ch;
end;
Step 2: Operator Precedence
The shunting-yard algorithm needs to know which operators bind more tightly:
function Precedence(Op: Char): Integer;
begin
case Op of
'+', '-': Precedence := 1;
'*', '/': Precedence := 2;
else
Precedence := 0;
end;
end;
function IsOperator(Ch: Char): Boolean;
begin
IsOperator := Ch in ['+', '-', '*', '/'];
end;
function IsDigitOrDot(Ch: Char): Boolean;
begin
IsDigitOrDot := (Ch in ['0'..'9']) or (Ch = '.');
end;
Step 3: Infix to Postfix (Shunting-Yard Algorithm)
Dijkstra's shunting-yard algorithm processes each token in the infix expression:
- Number: Output it directly to the postfix string.
- Operator: Pop and output operators from the stack that have greater or equal precedence, then push this operator.
- Left parenthesis
(: Push it onto the stack. - Right parenthesis
): Pop and output operators until we find the matching(. Discard both parentheses. - End of input: Pop and output all remaining operators.
function InfixToPostfix(const Expr: string): string;
var
OpStack: TCharStack;
Output: string;
i: Integer;
Ch: Char;
NumStr: string;
begin
InitCharStack(OpStack);
Output := '';
i := 1;
while i <= Length(Expr) do
begin
Ch := Expr[i];
{ Skip spaces }
if Ch = ' ' then
begin
Inc(i);
Continue;
end;
{ Number: collect all digits and decimal point }
if IsDigitOrDot(Ch) then
begin
NumStr := '';
while (i <= Length(Expr)) and IsDigitOrDot(Expr[i]) do
begin
NumStr := NumStr + Expr[i];
Inc(i);
end;
Output := Output + NumStr + ' ';
Continue; { Do not increment i again }
end;
{ Left parenthesis }
if Ch = '(' then
begin
CharPush(OpStack, Ch);
Inc(i);
Continue;
end;
{ Right parenthesis }
if Ch = ')' then
begin
while (not CharIsEmpty(OpStack)) and (CharPeek(OpStack) <> '(') do
Output := Output + CharPop(OpStack) + ' ';
if not CharIsEmpty(OpStack) then
CharPop(OpStack); { Discard the '(' }
Inc(i);
Continue;
end;
{ Operator }
if IsOperator(Ch) then
begin
while (not CharIsEmpty(OpStack)) and
IsOperator(CharPeek(OpStack)) and
(Precedence(CharPeek(OpStack)) >= Precedence(Ch)) do
Output := Output + CharPop(OpStack) + ' ';
CharPush(OpStack, Ch);
Inc(i);
Continue;
end;
{ Unknown character }
WriteLn('Error: Unexpected character "', Ch, '"');
Halt(1);
end;
{ Pop remaining operators }
while not CharIsEmpty(OpStack) do
Output := Output + CharPop(OpStack) + ' ';
InfixToPostfix := Output;
end;
Tracing an Example
Let us trace 3 + 4 * 2:
| Token | Action | Output | Stack |
|---|---|---|---|
3 |
Output number | 3 |
|
+ |
Push (stack empty) | 3 |
+ |
4 |
Output number | 3 4 |
+ |
* |
* > +, so push |
3 4 |
+ * |
2 |
Output number | 3 4 2 |
+ * |
| End | Pop all | 3 4 2 * + |
Result: 3 4 2 * + — correct postfix.
Now trace (3 + 4) * 2:
| Token | Action | Output | Stack |
|---|---|---|---|
( |
Push | ( |
|
3 |
Output | 3 |
( |
+ |
Push (top is () |
3 |
( + |
4 |
Output | 3 4 |
( + |
) |
Pop until ( |
3 4 + |
|
* |
Push | 3 4 + |
* |
2 |
Output | 3 4 + 2 |
* |
| End | Pop all | 3 4 + 2 * |
Result: 3 4 + 2 *, which evaluates to (3 + 4) * 2 = 14. The parentheses are gone but their effect is preserved in the ordering.
Step 4: Postfix Evaluation
Evaluating postfix is straightforward with a number stack:
function EvaluatePostfix(const Postfix: string): Real;
var
NumStack: TNumStack;
i: Integer;
Ch: Char;
NumStr: string;
A, B: Real;
Code: Integer;
Value: Real;
begin
InitNumStack(NumStack);
i := 1;
while i <= Length(Postfix) do
begin
Ch := Postfix[i];
{ Skip spaces }
if Ch = ' ' then
begin
Inc(i);
Continue;
end;
{ Number: collect digits }
if IsDigitOrDot(Ch) then
begin
NumStr := '';
while (i <= Length(Postfix)) and IsDigitOrDot(Postfix[i]) do
begin
NumStr := NumStr + Postfix[i];
Inc(i);
end;
Val(NumStr, Value, Code);
if Code <> 0 then
begin
WriteLn('Error: Invalid number "', NumStr, '"');
Halt(1);
end;
NumPush(NumStack, Value);
Continue;
end;
{ Operator: pop two operands, compute, push result }
if IsOperator(Ch) then
begin
B := NumPop(NumStack); { Second operand (popped first) }
A := NumPop(NumStack); { First operand }
case Ch of
'+': NumPush(NumStack, A + B);
'-': NumPush(NumStack, A - B);
'*': NumPush(NumStack, A * B);
'/':
begin
if Abs(B) < 1E-10 then
begin
WriteLn('Error: Division by zero');
Halt(1);
end;
NumPush(NumStack, A / B);
end;
end;
Inc(i);
Continue;
end;
Inc(i);
end;
EvaluatePostfix := NumPop(NumStack);
end;
Note the order: when we pop two values for an operator, the first popped value is the second operand (B), and the second popped value is the first operand (A). This matters for subtraction and division: 7 3 - should compute 7 - 3 = 4, not 3 - 7 = -4.
Step 5: Putting It Together
function Evaluate(const Expr: string): Real;
var
Postfix: string;
begin
Postfix := InfixToPostfix(Expr);
WriteLn('Postfix: ', Postfix);
Evaluate := EvaluatePostfix(Postfix);
end;
{ Main program }
var
Expression: string;
begin
WriteLn('=== Expression Evaluator ===');
WriteLn('Enter an arithmetic expression (or "quit" to exit):');
repeat
Write('> ');
ReadLn(Expression);
if Expression = 'quit' then Break;
WriteLn('Result: ', Evaluate(Expression):0:4);
WriteLn;
until False;
WriteLn('Goodbye.');
end.
Sample Session
=== Expression Evaluator ===
Enter an arithmetic expression (or "quit" to exit):
> 3 + 4 * 2
Postfix: 3 4 2 * +
Result: 11.0000
> (3 + 4) * 2
Postfix: 3 4 + 2 *
Result: 14.0000
> 10 / 2 + 3 * (4 - 1)
Postfix: 10 2 / 3 4 1 - * +
Result: 14.0000
> 100 - 50 * 2 + 25
Postfix: 100 50 2 * - 25 +
Result: 25.0000
> quit
Goodbye.
Analysis
Why Stacks?
The shunting-yard algorithm works because of a key insight: operator precedence is a nesting problem, and stacks are the natural tool for tracking nested structures. When we encounter a high-precedence operator like *, we push it, deferring it until we have collected its operands. When we encounter a lower-precedence operator like +, we know the previous higher-precedence operations are complete, so we pop and output them.
The same logic handles parentheses: ( is pushed as a marker, and ) triggers popping until we find the matching (. This is exactly the bracket-matching pattern from Section 15.4, applied to expression parsing.
Complexity
- Infix to postfix: O(n) where n is the length of the expression. Each token is pushed and popped at most once.
- Postfix evaluation: O(n) for the same reason.
- Total: O(n) time, O(n) space for the stacks.
Limitations and Extensions
Our evaluator handles +, -, *, /, and parentheses. Exercise E4 challenges you to extend it with exponentiation (right-associative!), unary minus, and variables. The shunting-yard algorithm handles all of these with small modifications to the precedence and associativity logic.
Key Takeaways
- The stack transforms expression evaluation from a complex parsing problem into a simple token-processing loop.
- Dijkstra's shunting-yard algorithm converts infix to postfix in O(n) time using a single operator stack.
- Postfix evaluation uses a single number stack, also in O(n) time.
- The order of operand popping matters for non-commutative operators (subtraction, division).
- This is a canonical example of Theme 5: the right data structure (stack) makes the algorithm (expression evaluation) almost trivial.