Case Study 1: Building an Expression Evaluator

The Problem

Arithmetic expressions like 3 + 4 * 2 - 1 seem simple when you read them, but teaching a computer to evaluate them correctly — respecting operator precedence and parentheses — is one of the classic problems in computer science. The solution, dating back to Edsger Dijkstra's 1961 "shunting-yard algorithm," relies entirely on stacks.

In this case study, we build a complete expression evaluator in Pascal that:

  1. Accepts an infix expression as a string (e.g., "3 + 4 * 2")
  2. Converts it to postfix notation (also called Reverse Polish Notation, or RPN) using the shunting-yard algorithm
  3. Evaluates the postfix expression using a stack
  4. Returns the numeric result

This case study demonstrates that choosing the right data structure — a stack — transforms a seemingly complex problem into a sequence of simple push/pop operations.


Background: Three Notations

Consider the expression "three plus four times two":

Notation Form How It Works
Infix 3 + 4 * 2 Operator between operands. Requires precedence rules.
Prefix (Polish) + 3 * 4 2 Operator before operands. No ambiguity.
Postfix (RPN) 3 4 2 * + Operator after operands. No ambiguity.

Infix notation is what humans prefer, but it is ambiguous without precedence rules: does 3 + 4 * 2 mean (3 + 4) * 2 = 14 or 3 + (4 * 2) = 11? Convention says multiplication binds tighter, giving 11, but a naive left-to-right parser would give 14.

Postfix notation eliminates ambiguity. In 3 4 2 * +, we process tokens left to right: - 3 → push 3 - 4 → push 4 - 2 → push 2 - * → pop 2 and 4, push 8 - + → pop 8 and 3, push 11

The result is 11. No precedence rules needed, no parentheses.


Step 1: Data Structures

We need two kinds of stacks — one for numbers (reals) and one for operators (characters):

program ExpressionEvaluator;

type
  { Number stack for evaluation }
  PNumNode = ^TNumNode;
  TNumNode = record
    Value: Real;
    Next: PNumNode;
  end;

  TNumStack = record
    Top: PNumNode;
    Count: Integer;
  end;

  { Character stack for shunting-yard }
  PCharNode = ^TCharNode;
  TCharNode = record
    Ch: Char;
    Next: PCharNode;
  end;

  TCharStack = record
    Top: PCharNode;
    Count: Integer;
  end;

The stack operations follow the same pattern from the chapter:

{ Number stack operations }
procedure InitNumStack(var S: TNumStack);
begin
  S.Top := nil;
  S.Count := 0;
end;

function NumIsEmpty(const S: TNumStack): Boolean;
begin
  NumIsEmpty := (S.Top = nil);
end;

procedure NumPush(var S: TNumStack; Value: Real);
var
  N: PNumNode;
begin
  New(N);
  N^.Value := Value;
  N^.Next := S.Top;
  S.Top := N;
  Inc(S.Count);
end;

function NumPop(var S: TNumStack): Real;
var
  Temp: PNumNode;
begin
  if NumIsEmpty(S) then
  begin
    WriteLn('Error: Number stack underflow');
    Halt(1);
  end;
  NumPop := S.Top^.Value;
  Temp := S.Top;
  S.Top := S.Top^.Next;
  Dispose(Temp);
  Dec(S.Count);
end;

{ Character stack operations }
procedure InitCharStack(var S: TCharStack);
begin
  S.Top := nil;
  S.Count := 0;
end;

function CharIsEmpty(const S: TCharStack): Boolean;
begin
  CharIsEmpty := (S.Top = nil);
end;

procedure CharPush(var S: TCharStack; Ch: Char);
var
  N: PCharNode;
begin
  New(N);
  N^.Ch := Ch;
  N^.Next := S.Top;
  S.Top := N;
  Inc(S.Count);
end;

function CharPop(var S: TCharStack): Char;
var
  Temp: PCharNode;
begin
  if CharIsEmpty(S) then
  begin
    WriteLn('Error: Operator stack underflow');
    Halt(1);
  end;
  CharPop := S.Top^.Ch;
  Temp := S.Top;
  S.Top := S.Top^.Next;
  Dispose(Temp);
  Dec(S.Count);
end;

function CharPeek(const S: TCharStack): Char;
begin
  if CharIsEmpty(S) then
    CharPeek := #0
  else
    CharPeek := S.Top^.Ch;
end;

Step 2: Operator Precedence

The shunting-yard algorithm needs to know which operators bind more tightly:

function Precedence(Op: Char): Integer;
begin
  case Op of
    '+', '-': Precedence := 1;
    '*', '/': Precedence := 2;
  else
    Precedence := 0;
  end;
end;

function IsOperator(Ch: Char): Boolean;
begin
  IsOperator := Ch in ['+', '-', '*', '/'];
end;

function IsDigitOrDot(Ch: Char): Boolean;
begin
  IsDigitOrDot := (Ch in ['0'..'9']) or (Ch = '.');
end;

Step 3: Infix to Postfix (Shunting-Yard Algorithm)

Dijkstra's shunting-yard algorithm processes each token in the infix expression:

  • Number: Output it directly to the postfix string.
  • Operator: Pop and output operators from the stack that have greater or equal precedence, then push this operator.
  • Left parenthesis (: Push it onto the stack.
  • Right parenthesis ): Pop and output operators until we find the matching (. Discard both parentheses.
  • End of input: Pop and output all remaining operators.
function InfixToPostfix(const Expr: string): string;
var
  OpStack: TCharStack;
  Output: string;
  i: Integer;
  Ch: Char;
  NumStr: string;
begin
  InitCharStack(OpStack);
  Output := '';
  i := 1;

  while i <= Length(Expr) do
  begin
    Ch := Expr[i];

    { Skip spaces }
    if Ch = ' ' then
    begin
      Inc(i);
      Continue;
    end;

    { Number: collect all digits and decimal point }
    if IsDigitOrDot(Ch) then
    begin
      NumStr := '';
      while (i <= Length(Expr)) and IsDigitOrDot(Expr[i]) do
      begin
        NumStr := NumStr + Expr[i];
        Inc(i);
      end;
      Output := Output + NumStr + ' ';
      Continue;  { Do not increment i again }
    end;

    { Left parenthesis }
    if Ch = '(' then
    begin
      CharPush(OpStack, Ch);
      Inc(i);
      Continue;
    end;

    { Right parenthesis }
    if Ch = ')' then
    begin
      while (not CharIsEmpty(OpStack)) and (CharPeek(OpStack) <> '(') do
        Output := Output + CharPop(OpStack) + ' ';
      if not CharIsEmpty(OpStack) then
        CharPop(OpStack);  { Discard the '(' }
      Inc(i);
      Continue;
    end;

    { Operator }
    if IsOperator(Ch) then
    begin
      while (not CharIsEmpty(OpStack)) and
            IsOperator(CharPeek(OpStack)) and
            (Precedence(CharPeek(OpStack)) >= Precedence(Ch)) do
        Output := Output + CharPop(OpStack) + ' ';
      CharPush(OpStack, Ch);
      Inc(i);
      Continue;
    end;

    { Unknown character }
    WriteLn('Error: Unexpected character "', Ch, '"');
    Halt(1);
  end;

  { Pop remaining operators }
  while not CharIsEmpty(OpStack) do
    Output := Output + CharPop(OpStack) + ' ';

  InfixToPostfix := Output;
end;

Tracing an Example

Let us trace 3 + 4 * 2:

Token Action Output Stack
3 Output number 3
+ Push (stack empty) 3 +
4 Output number 3 4 +
* * > +, so push 3 4 + *
2 Output number 3 4 2 + *
End Pop all 3 4 2 * +

Result: 3 4 2 * + — correct postfix.

Now trace (3 + 4) * 2:

Token Action Output Stack
( Push (
3 Output 3 (
+ Push (top is () 3 ( +
4 Output 3 4 ( +
) Pop until ( 3 4 +
* Push 3 4 + *
2 Output 3 4 + 2 *
End Pop all 3 4 + 2 *

Result: 3 4 + 2 *, which evaluates to (3 + 4) * 2 = 14. The parentheses are gone but their effect is preserved in the ordering.


Step 4: Postfix Evaluation

Evaluating postfix is straightforward with a number stack:

function EvaluatePostfix(const Postfix: string): Real;
var
  NumStack: TNumStack;
  i: Integer;
  Ch: Char;
  NumStr: string;
  A, B: Real;
  Code: Integer;
  Value: Real;
begin
  InitNumStack(NumStack);
  i := 1;

  while i <= Length(Postfix) do
  begin
    Ch := Postfix[i];

    { Skip spaces }
    if Ch = ' ' then
    begin
      Inc(i);
      Continue;
    end;

    { Number: collect digits }
    if IsDigitOrDot(Ch) then
    begin
      NumStr := '';
      while (i <= Length(Postfix)) and IsDigitOrDot(Postfix[i]) do
      begin
        NumStr := NumStr + Postfix[i];
        Inc(i);
      end;
      Val(NumStr, Value, Code);
      if Code <> 0 then
      begin
        WriteLn('Error: Invalid number "', NumStr, '"');
        Halt(1);
      end;
      NumPush(NumStack, Value);
      Continue;
    end;

    { Operator: pop two operands, compute, push result }
    if IsOperator(Ch) then
    begin
      B := NumPop(NumStack);  { Second operand (popped first) }
      A := NumPop(NumStack);  { First operand }
      case Ch of
        '+': NumPush(NumStack, A + B);
        '-': NumPush(NumStack, A - B);
        '*': NumPush(NumStack, A * B);
        '/':
          begin
            if Abs(B) < 1E-10 then
            begin
              WriteLn('Error: Division by zero');
              Halt(1);
            end;
            NumPush(NumStack, A / B);
          end;
      end;
      Inc(i);
      Continue;
    end;

    Inc(i);
  end;

  EvaluatePostfix := NumPop(NumStack);
end;

Note the order: when we pop two values for an operator, the first popped value is the second operand (B), and the second popped value is the first operand (A). This matters for subtraction and division: 7 3 - should compute 7 - 3 = 4, not 3 - 7 = -4.


Step 5: Putting It Together

function Evaluate(const Expr: string): Real;
var
  Postfix: string;
begin
  Postfix := InfixToPostfix(Expr);
  WriteLn('Postfix: ', Postfix);
  Evaluate := EvaluatePostfix(Postfix);
end;

{ Main program }
var
  Expression: string;
begin
  WriteLn('=== Expression Evaluator ===');
  WriteLn('Enter an arithmetic expression (or "quit" to exit):');

  repeat
    Write('> ');
    ReadLn(Expression);
    if Expression = 'quit' then Break;
    WriteLn('Result: ', Evaluate(Expression):0:4);
    WriteLn;
  until False;

  WriteLn('Goodbye.');
end.

Sample Session

=== Expression Evaluator ===
Enter an arithmetic expression (or "quit" to exit):
> 3 + 4 * 2
Postfix: 3 4 2 * +
Result: 11.0000

> (3 + 4) * 2
Postfix: 3 4 + 2 *
Result: 14.0000

> 10 / 2 + 3 * (4 - 1)
Postfix: 10 2 / 3 4 1 - * +
Result: 14.0000

> 100 - 50 * 2 + 25
Postfix: 100 50 2 * - 25 +
Result: 25.0000

> quit
Goodbye.

Analysis

Why Stacks?

The shunting-yard algorithm works because of a key insight: operator precedence is a nesting problem, and stacks are the natural tool for tracking nested structures. When we encounter a high-precedence operator like *, we push it, deferring it until we have collected its operands. When we encounter a lower-precedence operator like +, we know the previous higher-precedence operations are complete, so we pop and output them.

The same logic handles parentheses: ( is pushed as a marker, and ) triggers popping until we find the matching (. This is exactly the bracket-matching pattern from Section 15.4, applied to expression parsing.

Complexity

  • Infix to postfix: O(n) where n is the length of the expression. Each token is pushed and popped at most once.
  • Postfix evaluation: O(n) for the same reason.
  • Total: O(n) time, O(n) space for the stacks.

Limitations and Extensions

Our evaluator handles +, -, *, /, and parentheses. Exercise E4 challenges you to extend it with exponentiation (right-associative!), unary minus, and variables. The shunting-yard algorithm handles all of these with small modifications to the precedence and associativity logic.


Key Takeaways

  1. The stack transforms expression evaluation from a complex parsing problem into a simple token-processing loop.
  2. Dijkstra's shunting-yard algorithm converts infix to postfix in O(n) time using a single operator stack.
  3. Postfix evaluation uses a single number stack, also in O(n) time.
  4. The order of operand popping matters for non-commutative operators (subtraction, division).
  5. This is a canonical example of Theme 5: the right data structure (stack) makes the algorithm (expression evaluation) almost trivial.