Learning Objectives
- Identify and apply the divide-and-conquer strategy to new problems
- Identify and apply the greedy strategy, and recognize when greedy fails
- Explain overlapping subproblems and optimal substructure
- Implement memoization (top-down) and tabulation (bottom-up) dynamic programming
- Solve classic DP problems: Fibonacci, longest common subsequence, knapsack
- Choose the appropriate strategy for a given problem
In This Chapter
- 25.1 Three Strategies for Hard Problems
- 25.2 Divide and Conquer (Recap and New Examples)
- 25.3 Greedy Algorithms (Activity Selection, Coin Change, Fractional Knapsack)
- 25.4 Dynamic Programming (Overlapping Subproblems, Memoization, Tabulation)
- 25.5 Classic DP Problems (Fibonacci, Longest Common Subsequence, Knapsack)
- 25.6 Recognizing DP Problems: A Systematic Approach
- 25.7 Choosing the Right Strategy: A Comparison Framework
- 25.8 Project Checkpoint: PennyWise Budget Optimization
- 25.9 Retrieval Practice: Strategy Identification
- 25.10 Summary
Chapter 25: Algorithm Design Strategies: Divide and Conquer, Greedy, Dynamic Programming
"An algorithm must be seen to be believed." — Donald Knuth
By now, you have a toolkit of specific algorithms: binary search, merge sort, quicksort, tree traversal, BFS, DFS, Dijkstra. Each solves a particular problem. But what do you do when you encounter a new problem — one that does not match any algorithm you have seen before?
You need algorithm design strategies — general approaches for attacking problems that guide you from "I have no idea how to solve this" to "I have a structured plan." This chapter introduces three of the most important strategies in computer science: divide and conquer, greedy algorithms, and dynamic programming. These are not specific algorithms but families of algorithms, each embodying a different philosophy about how to break a problem into manageable pieces.
Divide and conquer says: "Split the problem in half, solve each half recursively, combine the results." Greedy says: "At each step, make the locally optimal choice and never look back." Dynamic programming says: "Break the problem into overlapping subproblems, solve each one once, store the results."
By the end of this chapter, you will be able to recognize which strategy fits a given problem, implement solutions using each strategy, and understand why some strategies fail on certain problems. PennyWise will gain a greedy budget allocator and a dynamic programming optimizer for Tomas's spending.
Prerequisites: Recursion (Chapter 22) and searching/sorting (Chapter 23) are essential. We will revisit merge sort and binary search as examples of divide and conquer, and build on the Fibonacci recursion from Chapter 22.
25.1 Three Strategies for Hard Problems
When you face an optimization or search problem, one of these three strategies usually applies:
Divide and Conquer
Philosophy: Break the problem into independent sub-problems of the same type, solve each recursively, and combine the results.
Key characteristics:
- Sub-problems are independent (solving one does not help solve another).
- A combination step merges sub-solutions.
- Naturally recursive.
Examples: Merge sort, quicksort, binary search, fast exponentiation.
Greedy Algorithms
Philosophy: At each step, make the choice that looks best right now, without considering future consequences. Never revise a past decision.
Key characteristics:
- Makes a single pass through the input (or a small number of passes).
- Never backtracks.
- Fast (usually O(n) or O(n log n)).
- Works only when the "greedy choice property" holds: locally optimal choices lead to a globally optimal solution.
Examples: Activity selection, coin change (with standard denominations), Huffman coding, Dijkstra's algorithm.
Dynamic Programming
Philosophy: Break the problem into overlapping sub-problems. Solve each sub-problem once and store the result in a table. Build up the solution from smaller sub-problems to larger ones.
Key characteristics:
- Sub-problems overlap (the same sub-problem appears multiple times).
- Stores intermediate results (memoization or tabulation).
- Guarantees optimal solutions when the problem has "optimal substructure."
- Trades memory for time.
Examples: Fibonacci (with memoization), longest common subsequence, 0/1 knapsack, edit distance.
📊 Strategy Comparison

| Property | Divide & Conquer | Greedy | Dynamic Programming |
|----------|------------------|--------|---------------------|
| Sub-problems | Independent | None (local choices) | Overlapping |
| Backtracking | No | No | No (but explores all options via table) |
| Optimality guarantee | Problem-specific | Only with greedy property | Yes (with optimal substructure) |
| Typical complexity | O(n log n) | O(n) or O(n log n) | O(n^2) or O(n * W) |
| Memory | O(n) or O(log n) | O(1) or O(n) | O(n) or O(n^2) |
25.2 Divide and Conquer (Recap and New Examples)
We have already seen divide and conquer in action:
- Merge sort (Chapter 23): Divide the array in half, sort each half, merge.
- Binary search (Chapter 23): Divide the search space in half, search the appropriate half.
- Fast exponentiation (Chapter 22): x^n = (x^(n/2))^2 for even n.
The general pattern is:
function DivideAndConquer(Problem):
    if Problem is small enough:
        solve directly (base case)
    else:
        Divide Problem into sub-problems P1, P2, ...
        S1 := DivideAndConquer(P1)
        S2 := DivideAndConquer(P2)
        ...
        return Combine(S1, S2, ...)
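To make the template concrete before revisiting merge sort, here is a minimal sketch of the fast exponentiation idea from Chapter 22 expressed in this pattern (the function name FastPow and the non-negative-exponent assumption are ours):

```pascal
{ Divide and conquer: x^n = (x^(n div 2))^2, times an extra x when n is odd.
  One sub-problem of half the size gives O(log n) multiplications.
  Assumes N >= 0. }
function FastPow(X: Int64; N: Integer): Int64;
var
  Half: Int64;
begin
  if N = 0 then
    FastPow := 1                     { base case: x^0 = 1 }
  else begin
    Half := FastPow(X, N div 2);     { conquer the single sub-problem }
    if N mod 2 = 0 then
      FastPow := Half * Half         { combine: even exponent }
    else
      FastPow := Half * Half * X;    { combine: odd exponent }
  end;
end;
```

Here the recurrence is T(n) = T(n/2) + O(1), which solves to O(log n).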
Deriving Merge Sort's Complexity
Merge sort follows this pattern precisely. Let T(n) be the time to sort n elements:
- Divide: Split the array at the midpoint. Cost: O(1).
- Conquer: Two recursive calls, each on n/2 elements. Cost: 2 * T(n/2).
- Combine: Merge two sorted halves. Cost: O(n).
This gives the recurrence: T(n) = 2T(n/2) + O(n).
We can solve this by expansion:
T(n) = 2T(n/2) + n
= 2[2T(n/4) + n/2] + n = 4T(n/4) + 2n
= 4[2T(n/8) + n/4] + 2n = 8T(n/8) + 3n
...
= 2^k * T(n/2^k) + k*n
When n/2^k = 1 (i.e., k = log2(n)): T(n) = n * T(1) + n * log2(n) = O(n log n).
New Example: Closest Pair of Points
Given n points in 2D space, find the pair with the smallest distance between them. The brute-force approach checks all n(n-1)/2 pairs: O(n^2). Divide and conquer achieves O(n log n).
Idea:
1. Sort points by x-coordinate.
2. Divide: Split into left and right halves along the median x-coordinate.
3. Conquer: Recursively find the closest pair in each half.
4. Combine: The overall closest pair is either entirely in the left half, entirely in the right half, or one point from each half. The "strip" near the dividing line must be checked, but this can be done in O(n) with a clever observation about the geometry.
type
TPoint = record
X, Y: Real;
end;
TPointArray = array of TPoint;
function PointDist(const A, B: TPoint): Real;
begin
PointDist := Sqrt(Sqr(A.X - B.X) + Sqr(A.Y - B.Y));
end;
function ClosestPairRec(var Points: TPointArray;
Left, Right: Integer): Real;
var
Mid, I, J: Integer;
DLeft, DRight, D, StripD: Real;
Strip: TPointArray;
StripCount: Integer;
MidX: Real;
TmpP: TPoint;
begin
{ Base case: at most 3 points — brute force. Accumulate the answer in D;
in Pascal, reading the function identifier on the right-hand side of an
expression would start a recursive call, not read the result. }
if Right - Left < 3 then begin
D := 1e18;
for I := Left to Right do
for J := I + 1 to Right do begin
StripD := PointDist(Points[I], Points[J]);
if StripD < D then D := StripD;
end;
ClosestPairRec := D;
Exit;
end;
Mid := (Left + Right) div 2;
MidX := Points[Mid].X;
{ Conquer }
DLeft := ClosestPairRec(Points, Left, Mid);
DRight := ClosestPairRec(Points, Mid + 1, Right);
if DLeft < DRight then D := DLeft
else D := DRight;
{ Combine: check strip }
SetLength(Strip, Right - Left + 1);
StripCount := 0;
for I := Left to Right do
if Abs(Points[I].X - MidX) < D then begin
Strip[StripCount] := Points[I];
Inc(StripCount);
end;
{ The Break in the loop below is valid only if the strip is sorted by Y,
so sort it first; insertion sort is fine for a small strip }
for I := 1 to StripCount - 1 do begin
J := I;
while (J > 0) and (Strip[J].Y < Strip[J - 1].Y) do begin
TmpP := Strip[J]; Strip[J] := Strip[J - 1]; Strip[J - 1] := TmpP;
Dec(J);
end;
end;
{ Check pairs in strip (at most 7 comparisons per point) }
for I := 0 to StripCount - 1 do
for J := I + 1 to StripCount - 1 do begin
if Strip[J].Y - Strip[I].Y >= D then Break;
StripD := PointDist(Strip[I], Strip[J]);
if StripD < D then D := StripD;
end;
ClosestPairRec := D;
end;
The key insight is that the strip check appears to be O(n^2) but is actually O(n) because each point needs to be compared with at most 7 others in the strip (a geometric proof that we omit here). This makes the total algorithm O(n log n).
Another D&C Example: Maximum Subarray Sum
Given an array of integers (possibly negative), find the contiguous subarray with the largest sum. For [-2, 1, -3, 4, -1, 2, 1, -5, 4], the answer is [4, -1, 2, 1] with sum 6.
Divide and conquer approach:
1. Divide: Split the array at the midpoint.
2. Conquer: Recursively find the maximum subarray in the left half and the right half.
3. Combine: Find the maximum subarray that crosses the midpoint (extends from the left half into the right half). This requires scanning left from the midpoint and right from the midpoint, which takes O(n).
4. Return the maximum of the three candidates (left, right, crossing).
function MaxCrossingSum(const Arr: array of Integer;
Low, Mid, High: Integer): Integer;
var
LeftSum, RightSum, Sum, I: Integer;
begin
LeftSum := -MaxInt;
Sum := 0;
for I := Mid downto Low do begin
Sum := Sum + Arr[I];
if Sum > LeftSum then LeftSum := Sum;
end;
RightSum := -MaxInt;
Sum := 0;
for I := Mid + 1 to High do begin
Sum := Sum + Arr[I];
if Sum > RightSum then RightSum := Sum;
end;
MaxCrossingSum := LeftSum + RightSum;
end;
function MaxSubArrayDC(const Arr: array of Integer;
Low, High: Integer): Integer;
var
Mid, LeftMax, RightMax, CrossMax, Best: Integer;
begin
if Low = High then begin
MaxSubArrayDC := Arr[Low];
Exit;
end;
Mid := (Low + High) div 2;
LeftMax := MaxSubArrayDC(Arr, Low, Mid);
RightMax := MaxSubArrayDC(Arr, Mid + 1, High);
CrossMax := MaxCrossingSum(Arr, Low, Mid, High);
{ Take the best of the three candidates via a local variable;
reading the function identifier in an expression would recurse }
Best := LeftMax;
if RightMax > Best then Best := RightMax;
if CrossMax > Best then Best := CrossMax;
MaxSubArrayDC := Best;
end;
This runs in O(n log n) — the same recurrence as merge sort. There is also a famous O(n) solution using DP (Kadane's algorithm), but the divide-and-conquer approach illustrates the strategy well.
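Since we will not cover it elsewhere, here is a sketch of the O(n) Kadane's algorithm mentioned above (the function name is ours): the best subarray ending at each position either extends the best subarray ending at the previous position or starts fresh.

```pascal
{ Kadane's algorithm: O(n) maximum subarray sum.
  Assumes the array is non-empty; the answer may be negative
  if all elements are. }
function MaxSubArrayKadane(const Arr: array of Integer): Integer;
var
  I, BestEndingHere, BestSoFar: Integer;
begin
  BestEndingHere := Arr[0];
  BestSoFar := Arr[0];
  for I := 1 to High(Arr) do begin
    { Either extend the subarray ending at I-1, or start over at I }
    if BestEndingHere + Arr[I] > Arr[I] then
      BestEndingHere := BestEndingHere + Arr[I]
    else
      BestEndingHere := Arr[I];
    if BestEndingHere > BestSoFar then
      BestSoFar := BestEndingHere;
  end;
  MaxSubArrayKadane := BestSoFar;
end;
```

For [-2, 1, -3, 4, -1, 2, 1, -5, 4] this returns 6, matching the divide-and-conquer result. Kadane's is really a one-dimensional dynamic program, a preview of Section 25.4.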
Recognizing Divide-and-Conquer Problems
A problem is a good candidate for divide and conquer when:
- It can be split into similar sub-problems of roughly equal size.
- The sub-problems are independent — solving one does not affect another.
- There is an efficient way to combine sub-solutions into a solution for the whole problem.
- The base case is trivial — small inputs can be solved directly.
If the sub-problems overlap (the same sub-problem appears in multiple places), you probably need dynamic programming instead.
The Power and Limitation of Divide and Conquer
Divide and conquer is at its best when the problem splits cleanly into independent pieces. Merge sort is the ideal example: the two halves of the array have nothing to do with each other. But divide and conquer struggles when:
- Sub-problems share data. If the left and right halves need information about each other, the "combine" step becomes expensive.
- The split is uneven. Quicksort with a bad pivot creates one huge sub-problem and one tiny one, degrading to O(n^2).
- The recursion generates redundant work. If the same sub-problem appears in both halves (as in naive Fibonacci), divide and conquer without memoization wastes exponential time.
When you recognize these limitations, you know it is time to consider greedy (if local choices suffice) or DP (if sub-problems overlap).
25.3 Greedy Algorithms (Activity Selection, Coin Change, Fractional Knapsack)
A greedy algorithm makes the locally optimal choice at each step, hoping that local optima lead to a global optimum. This works for some problems and fails spectacularly for others. The art is knowing which is which.
Activity Selection
Problem: Given n activities with start and end times, select the maximum number of non-overlapping activities.
Greedy strategy: Sort activities by end time. For each activity, if it does not overlap the last selected activity, select it.
type
TActivity = record
Start, Finish: Integer;
end;
TActivityArray = array of TActivity;
function CompareByFinish(const A, B: TActivity): Integer;
begin
if A.Finish < B.Finish then CompareByFinish := -1
else if A.Finish > B.Finish then CompareByFinish := 1
else CompareByFinish := 0;
end;
function ActivitySelection(var Acts: TActivityArray; N: Integer): Integer;
var
Selected, I, LastFinish: Integer;
begin
{ Acts must already be sorted by finish time, e.g. using CompareByFinish }
Selected := 1; { Always select the first activity }
LastFinish := Acts[0].Finish;
for I := 1 to N - 1 do
if Acts[I].Start >= LastFinish then begin
Inc(Selected);
LastFinish := Acts[I].Finish;
end;
ActivitySelection := Selected;
end;
Why does greedy work here? By always selecting the activity that finishes earliest, we leave the maximum remaining time for future activities. This can be proven optimal using an exchange argument: any solution that does not include the earliest-finishing activity can be improved by swapping in that activity.
Worked Example: Activity Selection
Consider these activities (already sorted by finish time):
Activity Start Finish
A 1 3
B 2 5
C 4 7
D 1 8
E 5 9
F 8 10
G 9 11
H 11 14
Greedy trace:
Select A (finish 3). LastFinish = 3.
B: start 2 < 3? Skip.
C: start 4 >= 3? Select C. LastFinish = 7.
D: start 1 < 7? Skip.
E: start 5 < 7? Skip.
F: start 8 >= 7? Select F. LastFinish = 10.
G: start 9 < 10? Skip.
H: start 11 >= 10? Select H. LastFinish = 14.
Selected: A, C, F, H — 4 activities.
This is optimal. No other selection achieves more than 4 non-overlapping activities.
Coin Change (Standard Denominations)
Problem: Make change for a given amount using the fewest coins. Available denominations: 25, 10, 5, 1.
Greedy strategy: Always use the largest denomination that does not exceed the remaining amount.
procedure GreedyCoinChange(Amount: Integer);
const
Coins: array[1..4] of Integer = (25, 10, 5, 1);
var
I, NumCoins: Integer;
begin
WriteLn('Making change for ', Amount, ' cents:');
for I := 1 to 4 do begin
NumCoins := Amount div Coins[I];
if NumCoins > 0 then begin
WriteLn(' ', NumCoins, ' x ', Coins[I], ' cent coin(s)');
Amount := Amount - NumCoins * Coins[I];
end;
end;
end;
Worked example: Making change for 73 cents:
73 div 25 = 2 quarters (50 cents). Remaining: 23.
23 div 10 = 2 dimes (20 cents). Remaining: 3.
3 div 5 = 0 nickels.
3 div 1 = 3 pennies (3 cents). Remaining: 0.
Total: 2 quarters + 2 dimes + 3 pennies = 7 coins.
This is optimal for US denominations.
⚠️ Warning: Greedy Does Not Always Work for Coin Change With US denominations (25, 10, 5, 1), greedy always finds the optimal solution. But with unusual denominations like (6, 4, 1), greedy fails: for amount 8, greedy gives 6+1+1 (3 coins), but the optimal solution is 4+4 (2 coins). When greedy fails, you need dynamic programming (Section 25.5).
Huffman Coding Preview
One of the most famous greedy algorithms is Huffman coding, used in data compression (ZIP files, JPEG, MP3). The idea is to assign shorter binary codes to more frequent characters and longer codes to less frequent ones. The greedy strategy: repeatedly merge the two least-frequent nodes into a single node. This produces an optimal prefix-free code.
We will not implement Huffman coding in full (it requires a priority queue), but the principle is important: the greedy choice (merge the two smallest) is provably optimal for this specific problem. This illustrates that greedy algorithms can be powerful — when the mathematical structure supports them.
Fractional Knapsack
Problem: Given items with weights and values, fill a knapsack of limited capacity to maximize total value. You can take fractions of items.
Greedy strategy: Compute value-per-unit-weight for each item. Sort by this ratio in decreasing order. Take items greedily, splitting the last item if needed.
type
TItem = record
Weight, Value: Real;
Ratio: Real; { Value per unit weight }
end;
function FractionalKnapsack(var Items: array of TItem;
N: Integer; Capacity: Real): Real;
var
TotalValue, Remaining: Real;
I: Integer;
begin
{ Calculate ratios and sort by ratio descending (assumed done) }
for I := 0 to N - 1 do
Items[I].Ratio := Items[I].Value / Items[I].Weight;
TotalValue := 0;
Remaining := Capacity;
for I := 0 to N - 1 do begin
if Remaining <= 0 then Break;
if Items[I].Weight <= Remaining then begin
{ Take entire item }
TotalValue := TotalValue + Items[I].Value;
Remaining := Remaining - Items[I].Weight;
end
else begin
{ Take fraction }
TotalValue := TotalValue + Items[I].Ratio * Remaining;
Remaining := 0;
end;
end;
FractionalKnapsack := TotalValue;
end;
Worked example: Capacity = 50 kg.
Item Weight Value Ratio($/kg)
A 10 60 6.00
B 20 100 5.00
C 30 120 4.00
Sorted by ratio (already sorted): A, B, C.
Take all of A (10 kg, $60). Remaining capacity: 40 kg.
Take all of B (20 kg, $100). Remaining capacity: 20 kg.
Take 20/30 of C (20 kg, $80). Remaining capacity: 0 kg.
Total value: $60 + $100 + $80 = $240.
Greedy works for the fractional knapsack because taking the highest-value-per-weight item first is always optimal when you can split items. For the 0/1 knapsack (items cannot be split), greedy fails — and we need dynamic programming.
Why Fractional Knapsack Works but 0/1 Does Not
The distinction is subtle but important. In the fractional knapsack, if the best remaining item does not fit entirely, you take a fraction of it. This means the greedy choice — always take the highest-ratio item — never "wastes" capacity. Every unit of capacity is filled with the highest-value material available.
In the 0/1 knapsack, taking a large item might leave unused capacity that could have been filled more efficiently by smaller items. The greedy choice (highest ratio) might "waste" capacity by leaving a gap that exactly the wrong size. This is why DP is needed: it considers all combinations of items and finds the one that uses capacity most efficiently.
To illustrate, reuse the items from the fractional example with capacity 50 kg: A (weight 10, value 60, ratio 6.0), B (weight 20, value 100, ratio 5.0), C (weight 30, value 120, ratio 4.0). Greedy by ratio takes A, then B (total weight 30, value 160), but now C (weight 30) no longer fits, and 20 kg of capacity is wasted. The optimal 0/1 solution skips A entirely and takes B and C: weight 50, value 220. The highest-ratio item was exactly the wrong choice — which is why you should use DP for the 0/1 variant unless you can prove the greedy choice property holds.
When Greedy Works: Recognizing the Greedy Choice Property
A problem has the greedy choice property if making the locally optimal choice at each step leads to a globally optimal solution. Here are the telltale signs:
- The problem has optimal substructure: An optimal solution contains optimal solutions to sub-problems.
- A greedy choice is safe: You can prove (usually by an exchange argument) that the greedy choice does not eliminate any optimal solutions.
- After making a greedy choice, you have a smaller instance of the same problem.
Activity selection works because choosing the earliest-finishing activity leaves the most room for future activities. Fractional knapsack works because taking the highest-ratio item first always improves or maintains the total value.
Coin change with standard denominations works because of how the coins relate: 25 = 5 × 5, 10 = 2 × 5, and 5 = 5 × 1, so skipping a larger coin in favor of smaller ones never saves coins. (Systems with this property are called canonical.) The property does not hold for arbitrary denominations, which is why greedy fails for {6, 4, 1}.
💡 Retrieval Practice: Greedy Algorithms Before reading the DP section, try these: 1. For the activity selection problem, why can we not sort by start time and select greedily? (Answer: an activity that starts earliest might have a very late finish, blocking many shorter activities.) 2. In the fractional knapsack, what happens if two items have the same value-per-weight ratio? (Answer: it does not matter — take either one first, the result is the same.) 3. Is Dijkstra's algorithm greedy? (Answer: yes! It always selects the unvisited vertex with the smallest tentative distance — a locally optimal choice that turns out to be globally optimal for non-negative weights.)
25.4 Dynamic Programming (Overlapping Subproblems, Memoization, Tabulation)
Dynamic programming (DP) is the most powerful — and most misunderstood — algorithm design strategy. It applies when a problem can be broken into sub-problems that overlap (the same sub-problem is needed by multiple larger problems) and exhibits optimal substructure (the optimal solution to the whole problem contains optimal solutions to its sub-problems).
The Core Insight
Recall naive recursive Fibonacci from Chapter 22:
function FibRec(N: Integer): Int64;
begin
if N <= 1 then FibRec := N
else FibRec := FibRec(N-1) + FibRec(N-2);
end;
This is exponentially slow because it recomputes the same values over and over. FibRec(5) computes FibRec(3) twice and FibRec(2) three times.
DP says: compute each sub-problem once and store the result. There are two approaches:
Memoization (Top-Down)
Start from the top (the original problem) and recurse downward, but cache results in a table:
var
Memo: array[0..100] of Int64;
MemoInit: array[0..100] of Boolean;
function FibMemo(N: Integer): Int64;
begin
if N <= 1 then begin
FibMemo := N;
Exit;
end;
if MemoInit[N] then begin
FibMemo := Memo[N];
Exit;
end;
Memo[N] := FibMemo(N-1) + FibMemo(N-2);
MemoInit[N] := True;
FibMemo := Memo[N];
end;
Now FibMemo(50) runs instantly instead of taking eons. Each value is computed once and looked up thereafter.
Tabulation (Bottom-Up)
Build the table from the bottom (smallest sub-problems) up to the original problem, without recursion:
function FibTab(N: Integer): Int64;
var
Table: array of Int64;
I: Integer;
begin
if N <= 1 then begin FibTab := N; Exit; end;
SetLength(Table, N + 1);
Table[0] := 0;
Table[1] := 1;
for I := 2 to N do
Table[I] := Table[I-1] + Table[I-2];
FibTab := Table[N];
end;
The Fibonacci Memoization Table
Let us trace FibTab(8) and see the table being built:
Index: 0 1 2 3 4 5 6 7 8
Value: 0 1 1 2 3 5 8 13 21
Step by step:
Table[0] = 0
Table[1] = 1
Table[2] = Table[1] + Table[0] = 1 + 0 = 1
Table[3] = Table[2] + Table[1] = 1 + 1 = 2
Table[4] = Table[3] + Table[2] = 2 + 1 = 3
Table[5] = Table[4] + Table[3] = 3 + 2 = 5
Table[6] = Table[5] + Table[4] = 5 + 3 = 8
Table[7] = Table[6] + Table[5] = 8 + 5 = 13
Table[8] = Table[7] + Table[6] = 13 + 8 = 21
Both approaches yield O(n) time and O(n) space. Tabulation can sometimes be optimized to O(1) space by keeping only the last two values:
function FibOpt(N: Integer): Int64;
var
A, B, Temp: Int64;
I: Integer;
begin
if N <= 1 then begin FibOpt := N; Exit; end;
A := 0; B := 1;
for I := 2 to N do begin
Temp := A + B; A := B; B := Temp;
end;
FibOpt := B;
end;
💡 Intuition: Memoization vs. Tabulation Memoization is "lazy" — it computes only the sub-problems that are actually needed (top-down with caching). Tabulation is "eager" — it computes all sub-problems in order (bottom-up, no recursion). Memoization is easier to write (just add a cache to the recursive solution). Tabulation avoids recursion overhead and makes the computation order explicit. For most problems, either approach works; choose based on readability and the specific problem structure.
When to Choose Memoization vs. Tabulation
Choose memoization when:
- The recurrence is easy to write but the fill order is hard to determine.
- Not all sub-problems are needed (memoization skips unreachable states).
- You want a quick conversion from a recursive brute-force solution.
Choose tabulation when:
- All sub-problems will be needed anyway.
- You want to avoid recursion overhead and potential stack overflow.
- You want to optimize space by keeping only the most recent rows/columns.
- The fill order is obvious (left-to-right, bottom-to-top, etc.).
In practice, tabulation is more common in production code because it is iterative (no stack overflow risk) and often allows space optimization. Memoization is more common in prototyping and competitive programming because it is faster to code.
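To show the memoization style on a problem other than Fibonacci, here is a sketch of a top-down coin-change minimizer (the names CoinMemo, ResetCoinMemo, and MinCoinsMemo are ours, and the fixed table size of 1000 is an arbitrary assumption). In the cache, -1 marks "not yet computed" and MaxInt marks "unreachable":

```pascal
var
  CoinMemo: array[0..1000] of Integer;  { -1 = not yet computed }

procedure ResetCoinMemo;
begin
  { $FF in every byte makes every Integer equal to -1 }
  FillChar(CoinMemo, SizeOf(CoinMemo), $FF);
end;

function MinCoinsMemo(Amount: Integer; const Coins: array of Integer): Integer;
var
  I, Sub, Best: Integer;
begin
  if Amount = 0 then begin MinCoinsMemo := 0; Exit; end;
  if CoinMemo[Amount] <> -1 then begin
    MinCoinsMemo := CoinMemo[Amount];  { already solved: just look it up }
    Exit;
  end;
  Best := MaxInt;
  for I := 0 to High(Coins) do
    if Coins[I] <= Amount then begin
      Sub := MinCoinsMemo(Amount - Coins[I], Coins);
      if (Sub <> MaxInt) and (Sub + 1 < Best) then
        Best := Sub + 1;
    end;
  CoinMemo[Amount] := Best;  { cache before returning }
  MinCoinsMemo := Best;
end;
```

Call ResetCoinMemo before each new problem instance; MinCoinsMemo(8, [6, 4, 1]) then returns 2, agreeing with the tabulated version in Section 25.5.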
25.5 Classic DP Problems (Fibonacci, Longest Common Subsequence, Knapsack)
Longest Common Subsequence (LCS)
Problem: Given two strings, find the length of their longest common subsequence (not substring — characters need not be adjacent).
Example: LCS of "ABCBDAB" and "BDCAB" is "BCAB" (length 4).
Recurrence:
LCS(i, j) = 0 if i=0 or j=0
LCS(i, j) = LCS(i-1, j-1) + 1 if S1[i] = S2[j]
LCS(i, j) = max(LCS(i-1,j), LCS(i,j-1)) otherwise
Tabulation implementation:
function LCS(const S1, S2: string): Integer;
var
Table: array of array of Integer;
I, J, Len1, Len2: Integer;
begin
Len1 := Length(S1);
Len2 := Length(S2);
SetLength(Table, Len1 + 1, Len2 + 1);
{ Initialize base cases }
for I := 0 to Len1 do Table[I][0] := 0;
for J := 0 to Len2 do Table[0][J] := 0;
{ Fill table }
for I := 1 to Len1 do
for J := 1 to Len2 do begin
if S1[I] = S2[J] then
Table[I][J] := Table[I-1][J-1] + 1
else begin
if Table[I-1][J] > Table[I][J-1] then
Table[I][J] := Table[I-1][J]
else
Table[I][J] := Table[I][J-1];
end;
end;
LCS := Table[Len1][Len2];
end;
Full LCS Grid Trace
Let us trace the LCS table for S1 = "ABCB" and S2 = "BDCAB":
"" B D C A B
"" 0 0 0 0 0 0
A 0 0 0 0 1 1
B 0 1 1 1 1 2
C 0 1 1 2 2 2
B 0 1 1 2 2 3
Step-by-step explanation of key cells:
- Table[1][4] (A vs BDCA): S1[1]='A' = S2[4]='A', so Table[1][4] = Table[0][3] + 1 = 0 + 1 = 1.
- Table[2][1] (AB vs B): S1[2]='B' = S2[1]='B', so Table[2][1] = Table[1][0] + 1 = 0 + 1 = 1.
- Table[3][3] (ABC vs BDC): S1[3]='C' = S2[3]='C', so Table[3][3] = Table[2][2] + 1 = 1 + 1 = 2.
- Table[4][5] (ABCB vs BDCAB): S1[4]='B' = S2[5]='B', so Table[4][5] = Table[3][4] + 1 = 2 + 1 = 3.
The LCS has length 3. To recover the actual subsequence, trace back from Table[4][5]:
- Table[4][5]: came from the diagonal (B = B) → include 'B', go to [3][4].
- Table[3][4]: max of Table[2][4] = 1 and Table[3][3] = 2 → came from the left, go to [3][3].
- Table[3][3]: came from the diagonal (C = C) → include 'C', go to [2][2].
- Table[2][2]: max of Table[1][2] = 0 and Table[2][1] = 1 → came from the left, go to [2][1].
- Table[2][1]: came from the diagonal (B = B) → include 'B', go to [1][0].
- Table[1][0] = 0 → done.
Reading in reverse: B, C, B → the LCS is "BCB" (length 3).
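The traceback can be automated. Below is a sketch of a reconstruction function (the name LCSString is ours) that rebuilds the same table as LCS and then walks back from the bottom-right corner, collecting matched characters:

```pascal
function LCSString(const S1, S2: string): string;
var
  Table: array of array of Integer;
  I, J: Integer;
  Res: string;
begin
  { SetLength zero-fills dynamic arrays, so row 0 and column 0
    are already the base-case zeros }
  SetLength(Table, Length(S1) + 1, Length(S2) + 1);
  for I := 1 to Length(S1) do
    for J := 1 to Length(S2) do
      if S1[I] = S2[J] then
        Table[I][J] := Table[I-1][J-1] + 1
      else if Table[I-1][J] > Table[I][J-1] then
        Table[I][J] := Table[I-1][J]
      else
        Table[I][J] := Table[I][J-1];
  { Trace back from the bottom-right corner }
  Res := '';
  I := Length(S1);
  J := Length(S2);
  while (I > 0) and (J > 0) do
    if S1[I] = S2[J] then begin
      Res := S1[I] + Res;  { diagonal move: this character is in the LCS }
      Dec(I); Dec(J);
    end
    else if Table[I-1][J] >= Table[I][J-1] then
      Dec(I)   { value came from above }
    else
      Dec(J);  { value came from the left }
  LCSString := Res;
end;
```

LCSString('ABCB', 'BDCAB') returns 'BCB', matching the trace above. When several longest common subsequences exist, the tie-breaking in the else branches decides which one is reported.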
LCS Applications
LCS is not just an academic exercise. It has important real-world applications:
Version control (diff). When git diff shows you which lines changed between two versions of a file, it uses a variant of LCS to find the longest sequence of unchanged lines. The lines that are in the LCS are marked as unchanged; the rest are marked as additions or deletions.
Bioinformatics. DNA and protein sequences are compared using LCS variants to find evolutionary relationships. If two species have long common subsequences in their DNA, they are likely closely related.
Spell checking. The edit distance between two words (how many insertions, deletions, and substitutions are needed to transform one into the other) is computed by a DP closely related to LCS. When substitutions are not allowed, the edit distance of strings of lengths m and n is exactly m + n - 2 * LCS(S1, S2).
Plagiarism detection. Comparing two documents for common subsequences can reveal copied content, even if the plagiarist has rearranged sentences.
Time: O(m * n) where m and n are the string lengths. Space: O(m * n) for the table.
0/1 Knapsack
Problem: Given n items, each with a weight and a value, fill a knapsack of capacity W to maximize total value. Each item is either taken entirely or not at all (no fractions).
This is where greedy fails and DP succeeds.
Recurrence:
K(i, w) = 0 if i=0 or w=0
K(i, w) = K(i-1, w) if weight[i] > w
K(i, w) = max(K(i-1, w), K(i-1, w-weight[i]) + value[i]) otherwise
function Knapsack01(const Weights, Values: array of Integer;
N, Capacity: Integer): Integer;
var
Table: array of array of Integer;
I, W: Integer;
begin
SetLength(Table, N + 1, Capacity + 1);
for I := 0 to N do Table[I][0] := 0;
for W := 0 to Capacity do Table[0][W] := 0;
for I := 1 to N do
for W := 1 to Capacity do begin
if Weights[I-1] > W then
Table[I][W] := Table[I-1][W]
else begin
if Table[I-1][W] > Table[I-1][W - Weights[I-1]] + Values[I-1] then
Table[I][W] := Table[I-1][W]
else
Table[I][W] := Table[I-1][W - Weights[I-1]] + Values[I-1];
end;
end;
Knapsack01 := Table[N][Capacity];
end;
Full 0/1 Knapsack Grid Trace
Items: A (weight 2, value 3), B (weight 3, value 4), C (weight 4, value 5), D (weight 5, value 8). Capacity = 7.
w=0 w=1 w=2 w=3 w=4 w=5 w=6 w=7
i=0: 0 0 0 0 0 0 0 0
i=1(A): 0 0 3 3 3 3 3 3
i=2(B): 0 0 3 4 4 7 7 7
i=3(C): 0 0 3 4 5 7 8 9
i=4(D): 0 0 3 4 5 8 8 11
Key cells explained:
- Table[1][2] = 3: Item A (weight 2) fits in capacity 2. max(0, 0+3) = 3.
- Table[2][5] = 7: Item B (weight 3) fits in capacity 5. max(Table[1][5], Table[1][2]+4) = max(3, 3+4) = 7. Take A and B.
- Table[3][7] = 9: Item C (weight 4) fits in capacity 7. max(Table[2][7], Table[2][3]+5) = max(7, 4+5) = 9. Take B and C.
- Table[4][7] = 11: Item D (weight 5) fits in capacity 7. max(Table[3][7], Table[3][2]+8) = max(9, 3+8) = 11. Take A and D.
The optimal value is 11, achieved by taking items A (weight 2, value 3) and D (weight 5, value 8), total weight 7.
Time: O(n * W). Space: O(n * W) (can be optimized to O(W) with a 1D array).
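The O(W) space optimization mentioned above exploits the fact that row i reads only row i-1. A sketch under that observation (the function name is ours): keep a single array and sweep capacities from high to low, so that each Best[W - Weights[I]] read still holds the previous row's value.

```pascal
function Knapsack01Compact(const Weights, Values: array of Integer;
                           N, Capacity: Integer): Integer;
var
  Best: array of Integer;
  I, W: Integer;
begin
  SetLength(Best, Capacity + 1);  { zero-filled: this is row i=0 }
  for I := 0 to N - 1 do
    { Sweep right to left so Best[W - Weights[I]] is still "row I-1";
      a left-to-right sweep would let one item be taken twice }
    for W := Capacity downto Weights[I] do
      if Best[W - Weights[I]] + Values[I] > Best[W] then
        Best[W] := Best[W - Weights[I]] + Values[I];
  Knapsack01Compact := Best[Capacity];
end;
```

For the worked example (weights 2, 3, 4, 5; values 3, 4, 5, 8; capacity 7) this returns 11, the same answer as the 2D table. The trade-off: with only one row kept, the item traceback described under "Recovering the Solution" is no longer possible.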
DP Coin Change (Where Greedy Fails, DP Succeeds)
Let us solve the coin change problem with DP for denominations {6, 4, 1}, amount 8:
Recurrence: Let C(a) be the minimum number of coins to make amount a.
C(0) = 0
C(a) = min over all denominations d where d <= a: C(a - d) + 1
Tabulation:
function DPCoinChange(Amount: Integer; const Coins: array of Integer;
NumCoins: Integer): Integer;
var
Table: array of Integer;
A, C: Integer;
begin
SetLength(Table, Amount + 1);
Table[0] := 0;
for A := 1 to Amount do begin
Table[A] := MaxInt;
for C := 0 to NumCoins - 1 do
if (Coins[C] <= A) and (Table[A - Coins[C]] < Table[A] - 1) then
Table[A] := Table[A - Coins[C]] + 1;
end;
DPCoinChange := Table[Amount];
end;
Trace for amount 8, coins {6, 4, 1}:
a=0: C(0) = 0
a=1: min(C(0)+1) = 1 [use coin 1]
a=2: min(C(1)+1) = 2 [use coin 1]
a=3: min(C(2)+1) = 3 [use coin 1]
a=4: min(C(3)+1, C(0)+1) = min(4, 1) = 1 [use coin 4]
a=5: min(C(4)+1, C(1)+1) = min(2, 2) = 2 [both paths give 4+1]
a=6: min(C(5)+1, C(2)+1, C(0)+1) = min(3, 3, 1) = 1 [use coin 6]
a=7: min(C(6)+1, C(3)+1, C(1)+1) = min(2, 4, 2) = 2 [use coin 6+1]
a=8: min(C(7)+1, C(4)+1, C(2)+1) = min(3, 2, 3) = 2 [use coin 4+4]
DP correctly finds that amount 8 requires 2 coins (4+4), while greedy gives 3 coins (6+1+1). This is the power of DP: it considers all possibilities, not just the locally optimal one.
Recovering the Solution
The DP table tells us the value of the optimal solution. To find which items are included, trace back through the table:
procedure PrintKnapsackItems(const Table: array of array of Integer;
const Weights, Values: array of Integer;
N, Capacity: Integer);
var
I, W: Integer;
begin
W := Capacity;
for I := N downto 1 do begin
if Table[I][W] <> Table[I-1][W] then begin
WriteLn(' Item ', I, ': weight=', Weights[I-1],
', value=', Values[I-1]);
W := W - Weights[I-1];
end;
end;
end;
For our example: Item 4 (D, weight 5, value 8) was taken (Table[4][7] differs from Table[3][7]). Remaining capacity: 2. Item 1 (A, weight 2, value 3) was taken (Table[1][2] differs from Table[0][2]).
25.6 Recognizing DP Problems: A Systematic Approach
Many students struggle to recognize when DP applies. Here is a systematic approach:
Step 1: Can the Problem Be Expressed as a Decision at Each Stage?
DP problems typically involve making choices at each step: include this item or not? Match this character or not? Go left or right? If the problem can be phrased as a sequence of decisions, DP may apply.
Step 2: Do the Same Sub-Problems Appear Repeatedly?
Draw the recursion tree for a small input. If you see the same function call appearing in multiple branches, you have overlapping sub-problems. This is the strongest indicator for DP.
For Fibonacci, F(3) appears in both the F(5) and F(4) subtrees. For LCS, LCS(i-1, j-1) is needed by LCS(i, j), LCS(i-1, j), and LCS(i, j-1).
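A quick way to see the overlap is to count calls. This Python sketch (illustrative, using the standard library's `lru_cache` for memoization) contrasts naive recursion with a cached version:

```python
from functools import lru_cache

calls = 0

def fib_naive(n):
    """Plain recursion: the same sub-problems are recomputed many times."""
    global calls
    calls += 1
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Memoized: each distinct sub-problem is computed exactly once."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

fib_naive(20)   # 21891 calls: the tree recomputes F(3), F(4), ... over and over
fib_memo(20)    # only 21 distinct sub-problems (F(0)..F(20)) are ever cached
```

The gap between 21891 calls and 21 cached entries is the overlapping-sub-problems signal in its starkest form.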
Step 3: Does the Problem Have Optimal Substructure?
Can the optimal solution to the whole problem be built from optimal solutions to sub-problems? For knapsack: if the optimal solution includes item i, then the remaining items must be an optimal solution for the remaining capacity. For LCS: if the optimal LCS includes a match at position (i, j), then the remaining characters must form an optimal LCS.
Step 4: Define the Recurrence
Express the optimal value for a problem state in terms of optimal values for smaller states. This is the hardest step and requires practice. Common patterns:
- Include/exclude: OPT(i, w) = max(OPT(i-1, w), OPT(i-1, w - w_i) + v_i) (knapsack)
- Match/mismatch: OPT(i, j) = OPT(i-1, j-1) + 1 if the characters match, else max(OPT(i-1, j), OPT(i, j-1)) (LCS)
- Last choice: OPT(a) = min over denominations d of OPT(a - d) + 1 (coin change)
Step 5: Determine the Table Dimensions
Each "state variable" in the recurrence becomes a dimension of the DP table. Knapsack has two state variables (item index and remaining capacity) → 2D table. Coin change has one (remaining amount) → 1D table. LCS has two (positions in each string) → 2D table.
Step 6: Determine the Fill Order
In tabulation, you must fill the table so that each cell's dependencies are already computed. For knapsack, row i depends on row i-1, so fill rows top to bottom. For coin change, cell a depends on cells a-d for each denomination d, all of which are smaller → fill left to right.
💡 Intuition: DP Is Not Hard — It Is Unfamiliar Students often find DP mystifying because it requires a different way of thinking. But the pattern is always the same: (1) define the sub-problems, (2) write the recurrence, (3) determine the base cases, (4) fill the table. Once you have practiced this pattern on 5-10 problems, it becomes as natural as writing a for loop. The difficulty is not in the technique — it is in recognizing which problems have the right structure for DP.
25.7 Choosing the Right Strategy: A Comparison Framework
Here is a decision framework for selecting an algorithm design strategy:
- Can the problem be split into independent sub-problems of the same type? → Divide and conquer. Examples: sorting, searching, closest pair.
- Can you make a locally optimal choice that never needs revision? → Greedy. Examples: activity selection, fractional knapsack. But verify the greedy property — it fails more often than it works.
- Does the problem have overlapping sub-problems and optimal substructure? → Dynamic programming. Examples: 0/1 knapsack, LCS, shortest paths.
- None of the above? → Consider backtracking (systematic trial and error), branch and bound (backtracking with pruning), or approximation algorithms (accept a "good enough" solution).
Side-by-Side Comparison: Knapsack Three Ways
Consider the knapsack problem to see how all three strategies apply (or fail):
Fractional knapsack (greedy): Sort by value/weight ratio, take greedily. Works because fractions are allowed. O(n log n).
0/1 knapsack (DP): Build the table. Works because items are indivisible and sub-problems overlap. O(n * W).
0/1 knapsack (divide and conquer): Split items into two groups, solve each. Does not work efficiently because sub-problems are not independent — the capacity constraint creates dependencies.
This example illustrates a crucial point: the right strategy depends on the problem's mathematical structure, not on your intuition about what "should" work.
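The two strategies that do work can be placed side by side. This Python sketch is illustrative; note that the 0/1 version uses a common one-dimensional space optimization (iterating capacities in descending order so each item is used at most once), not the full 2D table shown earlier in the chapter:

```python
def fractional_knapsack(items, capacity):
    """Greedy by value/weight ratio; optimal when fractions are allowed.
    `items` is a list of (weight, value) pairs."""
    total = 0.0
    for weight, value in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        take = min(weight, capacity)            # take as much as fits
        total += value * take / weight
        capacity -= take
        if capacity == 0:
            break
    return total

def zero_one_knapsack(items, capacity):
    """DP over remaining capacity; items are indivisible."""
    best = [0] * (capacity + 1)
    for weight, value in items:
        for w in range(capacity, weight - 1, -1):   # descending: item used once
            best[w] = max(best[w], best[w - weight] + value)
    return best[capacity]
```

On the capacity-6 items discussed next (weights 1, 3, 4, 5 with values 1, 4, 5, 7), the fractional greedy reaches 7 + 4/3 ≈ 8.33 by splitting an item, while the 0/1 DP tops out at 8 with whole items only.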
When Greedy Fails: A Concrete Demonstration
Consider these items for a capacity-6 knapsack:
| Item | Weight | Value | Ratio |
|------|--------|-------|-------|
| A    | 1      | 1     | 1.00  |
| B    | 3      | 4     | 1.33  |
| C    | 4      | 5     | 1.25  |
| D    | 5      | 7     | 1.40  |
Greedy (by ratio): Take D (weight 5, value 7). Remaining capacity: 1. Take A (weight 1, value 1). Total: value 8, weight 6.
Optimal (by DP): Take B (weight 3, value 4) and C (weight 4, value 5). Wait — that exceeds capacity. Take B (weight 3) and A (weight 1) = value 5. Or take C (weight 4) and A (weight 1) = value 6. Actually, the greedy solution (D+A = 8) is optimal here!
Let us try a different example where greedy genuinely fails:
| Item | Weight | Value | Ratio |
|------|--------|-------|-------|
| A    | 3      | 4     | 1.33  |
| B    | 4      | 5     | 1.25  |
| C    | 2      | 3     | 1.50  |
Capacity = 5.
Greedy (by ratio): Take C (weight 2, value 3). Remaining capacity: 3. Take A (weight 3, value 4). Total: value 7, weight 5.
DP optimal: Take A (weight 3, value 4) and C (weight 2, value 3). Total: value 7, weight 5. Same result!
The failure case is clearer with denominations: coins {6, 4, 1}, amount 8. Greedy: 6+1+1 = 3 coins. Optimal: 4+4 = 2 coins.
⚠️ Warning: Greedy Is the Most Dangerous Strategy Greedy algorithms are simple, fast, and wrong on many problems. The coin change example shows that greedy produces optimal results for standard US denominations but fails for denomination set {6, 4, 1}. Always prove the greedy property before trusting a greedy solution. If you cannot prove it, use DP.
25.8 Project Checkpoint: PennyWise Budget Optimization
Greedy Budget Allocation
Rosa has a monthly budget of $1,500. She has identified spending categories with different "satisfaction ratings" per dollar spent. The greedy approach allocates money to the highest-satisfaction-per-dollar category first:
type
TBudgetItem = record
Category: string[20];
MaxSpend: Currency; { Maximum useful spending }
SatisfactionPerDollar: Real;
end;
procedure GreedyBudget(var Items: array of TBudgetItem;
N: Integer; Budget: Currency);
var
I: Integer;
Remaining: Currency;
Allocated: Currency;
begin
{ Items assumed sorted by SatisfactionPerDollar descending }
Remaining := Budget;
WriteLn('Greedy Budget Allocation ($', Budget:0:2, '):');
for I := 0 to N - 1 do begin
if Remaining <= 0 then Break;
if Items[I].MaxSpend <= Remaining then
Allocated := Items[I].MaxSpend
else
Allocated := Remaining;
WriteLn(' ', Items[I].Category:15, ': $', Allocated:0:2,
' (sat/$ = ', Items[I].SatisfactionPerDollar:0:2, ')');
Remaining := Remaining - Allocated;
end;
if Remaining > 0 then
WriteLn(' Unallocated: $', Remaining:0:2);
end;
Why greedy works here: This is a fractional knapsack in disguise. Rosa can spend any amount up to MaxSpend in each category — she is not forced to spend the full amount or nothing. When partial allocation is allowed, the greedy approach (highest satisfaction-per-dollar first) is provably optimal. The "capacity" is her total budget, the "items" are spending categories, and the "fractions" are the partial amounts she allocates.
Worked example:
| Category      | MaxSpend | Sat/$ |
|---------------|----------|-------|
| Food          | $400     | 0.90  |
| Entertainment | $200     | 0.85  |
| Transport     | $300     | 0.70  |
| Savings       | $500     | 0.60  |
| Clothing      | $200     | 0.40  |
Budget: $1,500. Greedy allocation:
Food: $400.00 (sat/$ = 0.90)
Entertainment: $200.00 (sat/$ = 0.85)
Transport: $300.00 (sat/$ = 0.70)
Savings: $500.00 (sat/$ = 0.60)
Clothing: $100.00 (sat/$ = 0.40) ← partial allocation
Total satisfaction: 400*0.9 + 200*0.85 + 300*0.7 + 500*0.6 + 100*0.4
= 360 + 170 + 210 + 300 + 40 = 1080
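The same allocation can be checked in a few lines. This Python sketch (illustrative; category data as in the worked example) mirrors the Pascal `GreedyBudget`:

```python
def greedy_budget(categories, budget):
    """Fractional-knapsack greedy: fund the highest satisfaction-per-dollar
    categories first, splitting the last one if the budget runs out.
    `categories` is a list of (name, max_spend, sat_per_dollar) tuples."""
    total_satisfaction = 0.0
    plan = {}
    for name, max_spend, sat_per_dollar in sorted(
            categories, key=lambda c: c[2], reverse=True):
        spend = min(max_spend, budget)      # partial allocation allowed
        plan[name] = spend
        total_satisfaction += spend * sat_per_dollar
        budget -= spend
        if budget == 0:
            break
    return plan, total_satisfaction

categories = [("Food", 400, 0.90), ("Entertainment", 200, 0.85),
              ("Transport", 300, 0.70), ("Savings", 500, 0.60),
              ("Clothing", 200, 0.40)]
plan, sat = greedy_budget(categories, 1500)
# Clothing receives only $100 of its $200 maximum; total satisfaction ≈ 1080.
```

The partial allocation to Clothing is exactly the "fraction" that makes this a fractional-knapsack instance rather than a 0/1 one.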
DP Spending Optimization
Tomas has a tighter budget ($500) and discrete spending options (he cannot spend a fraction of a meal or a fraction of a bus pass). This is the 0/1 knapsack: each spending option has a cost (weight) and a satisfaction value. DP finds the combination that maximizes total satisfaction within the budget.
procedure DPBudget(const Options: array of TBudgetItem;
N: Integer; Budget: Integer);
var
Table: array of array of Integer;
I, W: Integer;
Weights, Values: array of Integer;
begin
SetLength(Weights, N);
SetLength(Values, N);
for I := 0 to N - 1 do begin
Weights[I] := Round(Options[I].MaxSpend);
Values[I] := Round(Options[I].SatisfactionPerDollar * Options[I].MaxSpend);
end;
SetLength(Table, N + 1, Budget + 1);
for I := 0 to N do Table[I][0] := 0;
for W := 0 to Budget do Table[0][W] := 0;
for I := 1 to N do
for W := 1 to Budget do begin
Table[I][W] := Table[I-1][W];
if (Weights[I-1] <= W) and
(Table[I-1][W - Weights[I-1]] + Values[I-1] > Table[I][W]) then
Table[I][W] := Table[I-1][W - Weights[I-1]] + Values[I-1];
end;
WriteLn('DP Budget Optimization ($', Budget, '):');
WriteLn(' Maximum satisfaction: ', Table[N][Budget]);
{ Trace back to find selected items }
W := Budget;
for I := N downto 1 do
if Table[I][W] <> Table[I-1][W] then begin
WriteLn(' Selected: ', Options[I-1].Category:15,
' ($', Weights[I-1], ', satisfaction=', Values[I-1], ')');
W := W - Weights[I-1];
end;
end;
This DP approach guarantees the optimal allocation, whereas greedy might miss a better combination when items are indivisible. The tradeoff: DP uses more memory and computation than greedy, but it is always correct.
🔗 Cross-Reference: In Chapter 26, we will analyze the time complexity of both the greedy and DP approaches and discuss when the extra cost of DP is justified versus when greedy's speed is more practical.
25.9 Retrieval Practice: Strategy Identification
For each of the following problems, identify the most appropriate strategy (divide and conquer, greedy, or dynamic programming) and briefly explain why. Answers follow.
- Finding the maximum element in an unsorted array.
- Scheduling classes into rooms to minimize the number of rooms used.
- Finding the longest increasing subsequence of an array.
- Multiplying two large polynomials efficiently.
- Making change for a given amount using denominations {1, 5, 10, 25}.
- Finding the shortest path between two vertices in a weighted graph.
- Counting the number of paths from the top-left to the bottom-right of a grid.
Answers
- None of the three — a simple O(n) scan suffices. No decomposition needed.
- Greedy. Sort classes by start time. Assign each class to the first available room. This is the interval partitioning problem.
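The room-assignment greedy can be sketched with a min-heap of room end times (a Python illustration; the heap is one common implementation choice, not the only one):

```python
import heapq

def min_rooms(classes):
    """Greedy interval partitioning: sort by start time, reuse the room
    that frees up earliest. `classes` is a list of (start, end) pairs."""
    end_times = []                               # one heap entry per room in use
    for start, end in sorted(classes):
        if end_times and end_times[0] <= start:  # earliest room is already free
            heapq.heapreplace(end_times, end)
        else:
            heapq.heappush(end_times, end)       # all rooms busy: open a new one
    return len(end_times)

# Three classes, two of which overlap → 2 rooms suffice.
```

Sorting dominates the cost, so the whole procedure runs in O(n log n).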
- Dynamic programming. Sub-problems overlap (the LIS ending at position i depends on LIS values at all previous positions). O(n^2) DP or O(n log n) with binary search optimization.
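The O(n log n) variant mentioned above keeps, for each possible length, the smallest tail seen so far and binary-searches into that list. A minimal Python sketch (illustrative):

```python
from bisect import bisect_left

def lis_length(seq):
    """Length of the longest strictly increasing subsequence.
    tails[k] = smallest possible tail of an increasing subsequence
    of length k+1; tails stays sorted, so binary search applies."""
    tails = []
    for x in seq:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)      # x extends the longest subsequence found
        else:
            tails[i] = x         # x gives a smaller tail for length i+1
    return len(tails)

# lis_length([10, 9, 2, 5, 3, 7, 101, 18]) → 4   (e.g. 2, 3, 7, 18)
```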
- Divide and conquer. The Karatsuba algorithm splits each polynomial in half, recursively multiplies the pieces, and combines. Runs in O(n^1.585) instead of O(n^2).
- Greedy (for these denominations). The denominations have the property that each larger coin is a multiple of smaller combinations. For arbitrary denominations, use DP.
- Greedy (Dijkstra's algorithm for non-negative weights) or DP (Bellman-Ford for graphs with negative weights, or Floyd-Warshall for all-pairs shortest paths).
- Dynamic programming. The number of paths to cell (i, j) equals the number of paths to (i-1, j) plus the number of paths to (i, j-1). This is a classic DP problem with a 2D table.
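That grid recurrence tabulates directly (a short Python sketch, assuming moves only right and down):

```python
def grid_paths(rows, cols):
    """Number of paths from top-left to bottom-right of a rows x cols grid,
    moving only right or down: paths[i][j] = paths[i-1][j] + paths[i][j-1]."""
    paths = [[1] * cols for _ in range(rows)]    # one path along each edge
    for i in range(1, rows):
        for j in range(1, cols):
            paths[i][j] = paths[i - 1][j] + paths[i][j - 1]
    return paths[-1][-1]

# grid_paths(3, 3) → 6, which is C(4, 2): choose when to take the 2 down-moves
# among the 4 total moves.
```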
These exercises reinforce the core skill of this chapter: pattern recognition. The more problems you classify, the faster you will recognize which strategy to apply when you encounter a new problem.
A Meta-Strategy for Problem Solving
When you encounter a new algorithmic problem, follow this systematic approach:
- Understand the problem. What is the input? What is the output? Can you solve it by hand on a small example?
- Look for brute force. Can you enumerate all possibilities? What is the brute-force complexity? (This gives you a baseline to improve upon.)
- Look for structure. Does the problem decompose? Are there sub-problems? Are they independent or overlapping?
- Try each strategy mentally. Can you split the problem in half (D&C)? Can you make a greedy choice (greedy)? Can you build a table of sub-solutions (DP)?
- Implement and test. Write the solution. Test it on small examples first. Verify the complexity matches your analysis.
- Optimize if needed. If the solution is too slow, return to the structure-finding step and look for more structure.
This meta-strategy works because algorithm design is fundamentally about recognizing structure in problems. The three strategies we studied — divide and conquer, greedy, and dynamic programming — are the three most common structures that problems exhibit. Most problems in competitive programming, technical interviews, and real-world software engineering can be solved using one of these three strategies (or a combination).
25.10 Summary
This chapter introduced three fundamental algorithm design strategies:
Divide and conquer splits a problem into independent sub-problems, solves them recursively, and combines the results. It is the strategy behind merge sort, quicksort, binary search, and the closest-pair-of-points algorithm. We derived merge sort's O(n log n) complexity from its recurrence relation. The key requirement is that sub-problems be independent — solving one does not help with another.
Greedy algorithms make the locally optimal choice at each step, never backtracking. They are simple and fast but only correct when the "greedy choice property" holds. Activity selection and fractional knapsack are classic greedy successes; arbitrary coin change and 0/1 knapsack are classic greedy failures. We traced through activity selection and coin change with concrete examples.
Dynamic programming is the strategy for problems with overlapping sub-problems and optimal substructure. It computes each sub-problem once and stores the result, transforming exponential brute-force solutions into polynomial-time algorithms. Memoization (top-down) adds a cache to a recursive solution; tabulation (bottom-up) builds a table iteratively. We traced complete DP tables for Fibonacci, LCS (with a 4x5 grid), and 0/1 knapsack (with a 4x7 grid), including solution recovery through backtracking.
The comparison framework for choosing strategies: - Independent sub-problems → divide and conquer. - Local choices always lead to global optimum → greedy. - Overlapping sub-problems with optimal substructure → dynamic programming. - When in doubt, DP is the safest bet — it is slower but always correct.
The meta-strategy for problem solving (understand, brute-force, look for structure, try each strategy, implement, optimize) provides a systematic approach to any new algorithmic challenge.
Wirth's equation takes its final form: the right algorithm design strategy + the right data structure = an efficient, correct program.
📊 Key Concepts at a Glance

| Concept | Definition |
|---------|------------|
| Divide and conquer | Split into independent sub-problems, solve recursively, combine |
| Greedy | Make locally optimal choices; never backtrack |
| Greedy choice property | Local optima lead to global optimum |
| Dynamic programming | Solve overlapping sub-problems once, store results |
| Overlapping subproblems | Same sub-problem appears multiple times |
| Optimal substructure | Optimal solution contains optimal sub-solutions |
| Memoization | Top-down DP: recursion with caching |
| Tabulation | Bottom-up DP: iterative table filling |
| 0/1 Knapsack | Select items to maximize value within a weight limit |
| LCS | Longest common subsequence of two strings |
| Activity selection | Select maximum non-overlapping activities (greedy) |
| Fractional knapsack | Fill knapsack with item fractions (greedy) |
Next: Chapter 26 — Complexity Analysis: How Fast Is Your Program and Can You Make It Faster?