Learning Objectives
- Implement linear search and binary search and explain when each is appropriate
- Implement selection sort, insertion sort, merge sort, and quicksort in Pascal
- Analyze the time complexity of each searching and sorting algorithm
- Compare sorting algorithms and select the right one for a given problem
- Sort records by different fields using comparison functions
- Use Pascal's built-in sorting facilities for practical applications
In This Chapter
- 23.1 Why Study Sorting and Searching?
- 23.2 Linear Search and Binary Search
- 23.3 Selection Sort and Insertion Sort (Simple O(n^2) Sorts)
- 23.4 Merge Sort (Divide and Conquer, O(n log n))
- 23.5 Quicksort (Partition, Pivot Selection, O(n log n) Average)
- 23.6 Comparison of Sorting Algorithms (Table, When to Use Which)
- 23.7 Sorting Records by Different Fields (Comparison Functions)
- 23.8 Pascal's Built-in Sorting (TFPGList.Sort)
- 23.9 Retrieval Practice and Common Sorting Pitfalls
- 23.10 Project Checkpoint: PennyWise Sortable Expenses
- 23.11 GradeBook Pro: Sorting Students by Multiple Criteria
- 23.12 Summary
Chapter 23: Searching and Sorting: Classic Algorithms Implemented in Pascal
"Sorting is perhaps the most deeply studied problem in computer science, and for good reason: the ability to sort efficiently is the foundation of dozens of other algorithms." — Robert Sedgewick
If you have studied algorithms for even a single day, you know that searching and sorting are where the conversation begins. They are to computer science what scales are to music, what the elements are to chemistry — the foundational operations upon which nearly everything else is built. Binary search powers database indexes. Sorting underlies everything from spreadsheets to search engines. The algorithms we study in this chapter are not relics of the 1960s; they are alive in every program you use today.
This chapter takes you through the most important searching and sorting algorithms, implemented in Pascal. We begin with the two fundamental search strategies — linear and binary — and then work through sorting algorithms of increasing sophistication: selection sort, insertion sort, merge sort, and quicksort, with brief looks at Shell sort and Pascal's built-in sorting. For each algorithm, we will not just show you the code; we will trace through examples by hand, analyze the performance, and discuss when you would choose one algorithm over another.
By the end of this chapter, Rosa's PennyWise expenses will be sortable by date, amount, or category — and she will never again need to scroll through an unsorted list to find last Tuesday's coffee purchase.
Prerequisites: Arrays and array manipulation (Chapter 9) and recursion (Chapter 22) are essential. Merge sort and quicksort are recursive algorithms; if you are not comfortable with recursive function calls and base cases, review Chapter 22 first.
23.1 Why Study Sorting and Searching?
You might wonder: why implement sorting algorithms by hand when every language and library already provides a sort function? Three reasons.
First, understanding how algorithms work makes you a better programmer. When you call a library sort, you need to know its performance characteristics — is it O(n log n)? O(n^2) in the worst case? Stable or unstable? — to make informed decisions. Understanding the algorithms gives you that knowledge.
Second, real-world problems often require modified algorithms. Standard library sorts work on arrays of comparable values. But what if you need to sort records by multiple fields? Sort a linked list? Sort data that does not fit in memory? Maintain a partially sorted collection? These variations require algorithmic understanding, not just API knowledge.
Third, sorting and searching are the gateway to algorithmic thinking. The techniques you learn here — divide and conquer, invariants, choosing data structures to support operations — appear everywhere in computer science. Master them here, and you will recognize them in databases, compilers, operating systems, and machine learning.
💡 Intuition: Searching and Sorting in Daily Life You already search and sort constantly. Looking for a name in an alphabetized phone book? That is binary search. Arranging playing cards in your hand by inserting each new card into its proper position? That is insertion sort. Scanning a shelf of unsorted books for a specific title? That is linear search. Algorithms formalize what you already do intuitively.
23.2 Linear Search and Binary Search
Linear Search: The Brute-Force Approach
Linear search (also called sequential search) checks every element in the collection, one by one, until it finds the target or reaches the end. It is the simplest possible search algorithm.
function LinearSearch(const Arr: array of Integer;
Size: Integer; Target: Integer): Integer;
var
I: Integer;
begin
for I := 0 to Size - 1 do
if Arr[I] = Target then begin
LinearSearch := I; { Found: return index }
Exit;
end;
LinearSearch := -1; { Not found }
end;
Analysis:
- Best case: O(1) — target is the first element.
- Worst case: O(n) — target is the last element or not present.
- Average case: about n/2 comparisons, which is O(n).
- Requires sorted data? No. Works on any array, in any order.
Linear search is perfectly appropriate for small collections (under a few hundred elements) or unsorted data. Do not dismiss it — for a list of 20 items, the overhead of sorting first would dwarf the savings from binary search.
Tracing Linear Search
Let us trace LinearSearch on the array [38, 27, 43, 3, 9, 82, 10] searching for target 9:
Step 1: I=0, Arr[0]=38, 38 = 9? No.
Step 2: I=1, Arr[1]=27, 27 = 9? No.
Step 3: I=2, Arr[2]=43, 43 = 9? No.
Step 4: I=3, Arr[3]=3, 3 = 9? No.
Step 5: I=4, Arr[4]=9, 9 = 9? Yes! Return 4.
Five comparisons. Now let us search for target 50 (not present):
Step 1: I=0, Arr[0]=38, 38 = 50? No.
Step 2: I=1, Arr[1]=27, 27 = 50? No.
Step 3: I=2, Arr[2]=43, 43 = 50? No.
Step 4: I=3, Arr[3]=3, 3 = 50? No.
Step 5: I=4, Arr[4]=9, 9 = 50? No.
Step 6: I=5, Arr[5]=82, 82 = 50? No.
Step 7: I=6, Arr[6]=10, 10 = 50? No.
Loop ends. Return -1.
Seven comparisons — the full array was scanned. This is the worst case for an array of size 7.
Binary Search: Divide and Conquer
Binary search is dramatically faster — but it requires that the data be sorted. The idea is simple: compare the target to the middle element. If the target is smaller, search the left half. If larger, search the right half. Each comparison eliminates half the remaining elements.
function BinarySearch(const Arr: array of Integer;
Low, High: Integer; Target: Integer): Integer;
var
Mid: Integer;
begin
if Low > High then begin
BinarySearch := -1; { Not found }
Exit;
end;
Mid := (Low + High) div 2;
if Arr[Mid] = Target then
BinarySearch := Mid
else if Target < Arr[Mid] then
BinarySearch := BinarySearch(Arr, Low, Mid - 1, Target)
else
BinarySearch := BinarySearch(Arr, Mid + 1, High, Target);
end;
And the iterative version, which avoids recursion overhead:
function BinarySearchIter(const Arr: array of Integer;
Size: Integer; Target: Integer): Integer;
var
Low, High, Mid: Integer;
begin
Low := 0;
High := Size - 1;
while Low <= High do begin
Mid := (Low + High) div 2;
if Arr[Mid] = Target then begin
BinarySearchIter := Mid;
Exit;
end
else if Target < Arr[Mid] then
High := Mid - 1
else
Low := Mid + 1;
end;
BinarySearchIter := -1; { Not found }
end;
Tracing Binary Search Step-by-Step
Let us trace BinarySearchIter on the sorted array [3, 9, 10, 27, 38, 43, 82, 91] (Size = 8), searching for target 27:
Step 1: Low=0, High=7, Mid=(0+7) div 2 = 3
Arr[3]=27, 27 = 27? Yes! Return 3.
Found in one comparison. Now search for target 82:
Step 1: Low=0, High=7, Mid=3
Arr[3]=27, 82 > 27 → Low := 4
Step 2: Low=4, High=7, Mid=(4+7) div 2 = 5
Arr[5]=43, 82 > 43 → Low := 6
Step 3: Low=6, High=7, Mid=(6+7) div 2 = 6
Arr[6]=82, 82 = 82? Yes! Return 6.
Three comparisons. Now search for target 50 (not present):
Step 1: Low=0, High=7, Mid=3
Arr[3]=27, 50 > 27 → Low := 4
Step 2: Low=4, High=7, Mid=5
Arr[5]=43, 50 > 43 → Low := 6
Step 3: Low=6, High=7, Mid=6
Arr[6]=82, 50 < 82 → High := 5
Step 4: Low=6, High=5, Low > High → Not found. Return -1.
Four comparisons to determine that 50 is not in the array. At most log2(8) + 1 = 4 comparisons are ever needed for an 8-element array.
Analysis:
- Best case: O(1) — target is the middle element.
- Worst case: O(log n) — halving the search space log2(n) times.
- Requires sorted data? Yes. Binary search on unsorted data produces incorrect results.
📊 Comparison: Linear vs. Binary Search

| Array Size | Linear (worst) | Binary (worst) |
|---|---|---|
| 10 | 10 comparisons | 4 comparisons |
| 100 | 100 | 7 |
| 1,000 | 1,000 | 10 |
| 1,000,000 | 1,000,000 | 20 |
| 1,000,000,000 | 1,000,000,000 | 30 |
Binary search on a billion elements takes at most 30 comparisons. Linear search takes up to a billion. This is why databases use indexes (which are sorted structures enabling binary search).
⚠️ Warning: The Sorted Prerequisite Binary search requires sorted data. If you binary-search an unsorted array, you will get wrong answers — not an error message, not a crash, just silently incorrect results. Always ensure the data is sorted before applying binary search, or use linear search.
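To see the failure mode concretely, here is a small demonstration using the BinarySearchIter function above on a deliberately unsorted array (the array values here are chosen just for illustration):

```pascal
{ Demonstration: binary search on UNSORTED data fails silently.
  Assumes BinarySearchIter from above is in scope. }
const
  Unsorted: array[0..3] of Integer = (5, 1, 9, 3);
begin
  { 3 is present at index 3, but the search walks the wrong way:
    Mid=1 -> Arr[1]=1, 3 > 1 -> Low := 2
    Mid=2 -> Arr[2]=9, 3 < 9 -> High := 1
    Low > High -> returns -1, wrongly reporting "not found" }
  WriteLn(BinarySearchIter(Unsorted, 4, 3));  { prints -1 }
end.
```

No error is raised; the algorithm simply reasons from an invariant (sortedness) that does not hold.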
23.3 Selection Sort and Insertion Sort (Simple O(n^2) Sorts)
The two simplest sorting algorithms are selection sort and insertion sort. Both are O(n^2) in the worst case, making them impractical for large datasets. But they are easy to understand, easy to implement, and perfectly adequate for small arrays (under a few hundred elements).
Selection Sort
Idea: Find the smallest element and swap it to position 0. Find the next smallest and swap it to position 1. Continue until the array is sorted.
procedure SelectionSort(var Arr: array of Integer; Size: Integer);
var
I, J, MinIdx, Temp: Integer;
begin
for I := 0 to Size - 2 do begin
MinIdx := I;
for J := I + 1 to Size - 1 do
if Arr[J] < Arr[MinIdx] then
MinIdx := J;
{ Swap Arr[I] and Arr[MinIdx] }
if MinIdx <> I then begin
Temp := Arr[I];
Arr[I] := Arr[MinIdx];
Arr[MinIdx] := Temp;
end;
end;
end;
Tracing Selection Sort
Let us trace selection sort on [38, 27, 43, 3, 9, 82, 10, 56]:
Pass 1 (I=0): Find min in [38 27 43 3 9 82 10 56] → min is 3 at index 3
Swap Arr[0] and Arr[3]: [3 27 43 38 9 82 10 56]
Pass 2 (I=1): Find min in [27 43 38 9 82 10 56] → min is 9 at index 4
Swap Arr[1] and Arr[4]: [3 9 43 38 27 82 10 56]
Pass 3 (I=2): Find min in [43 38 27 82 10 56] → min is 10 at index 6
Swap Arr[2] and Arr[6]: [3 9 10 38 27 82 43 56]
Pass 4 (I=3): Find min in [38 27 82 43 56] → min is 27 at index 4
Swap Arr[3] and Arr[4]: [3 9 10 27 38 82 43 56]
Pass 5 (I=4): Find min in [38 82 43 56] → min is 38 at index 4
No swap needed: [3 9 10 27 38 82 43 56]
Pass 6 (I=5): Find min in [82 43 56] → min is 43 at index 6
Swap Arr[5] and Arr[6]: [3 9 10 27 38 43 82 56]
Pass 7 (I=6): Find min in [82 56] → min is 56 at index 7
Swap Arr[6] and Arr[7]: [3 9 10 27 38 43 56 82]
Result: [3 9 10 27 38 43 56 82] ✓ Sorted!
Line-by-line walkthrough of the code:
- The outer loop (for I := 0 to Size - 2) represents each "position to fill." After pass I completes, positions 0 through I contain the correct final values.
- MinIdx := I starts by assuming the current position holds the minimum.
- The inner loop (for J := I + 1 to Size - 1) scans the unsorted portion, updating MinIdx whenever a smaller element is found.
- After the inner loop, MinIdx points to the smallest element in the unsorted portion. The swap places it in position I.
- The if MinIdx <> I check avoids unnecessary swaps when the minimum is already in position.
Analysis:
- Comparisons: always n(n-1)/2 — the same in best, worst, and average case.
- Swaps: at most n - 1 (one per pass).
- Time complexity: O(n^2) in every case.
- Stable? No (the long-distance swap can change the relative order of equal elements).
Selection sort's virtue is its simplicity and its minimal number of swaps. If swapping is expensive (large records), selection sort may outperform insertion sort.
Insertion Sort
Idea: Maintain a sorted region at the beginning of the array. Take the next unsorted element and insert it into its correct position within the sorted region, shifting elements as needed.
procedure InsertionSort(var Arr: array of Integer; Size: Integer);
var
I, J, Key: Integer;
begin
for I := 1 to Size - 1 do begin
Key := Arr[I];
J := I - 1;
while (J >= 0) and (Arr[J] > Key) do begin
Arr[J + 1] := Arr[J];
Dec(J);
end;
Arr[J + 1] := Key;
end;
end;
Tracing Insertion Sort
Let us trace insertion sort on the same array [38, 27, 43, 3, 9, 82, 10, 56]:
Initial: [38 | 27 43 3 9 82 10 56] (| separates sorted from unsorted)
Pass 1 (I=1): Key = 27
Compare 27 < 38? Yes → shift 38 right
Insert 27 at position 0
Array: [27 38 | 43 3 9 82 10 56]
Pass 2 (I=2): Key = 43
Compare 43 < 38? No → stop
Insert 43 at position 2 (already there)
Array: [27 38 43 | 3 9 82 10 56]
Pass 3 (I=3): Key = 3
Compare 3 < 43? Yes → shift 43 right
Compare 3 < 38? Yes → shift 38 right
Compare 3 < 27? Yes → shift 27 right
Insert 3 at position 0
Array: [3 27 38 43 | 9 82 10 56]
Pass 4 (I=4): Key = 9
Compare 9 < 43? Yes → shift 43 right
Compare 9 < 38? Yes → shift 38 right
Compare 9 < 27? Yes → shift 27 right
Compare 9 < 3? No → stop
Insert 9 at position 1
Array: [3 9 27 38 43 | 82 10 56]
Pass 5 (I=5): Key = 82
Compare 82 < 43? No → stop
Insert 82 at position 5 (already there)
Array: [3 9 27 38 43 82 | 10 56]
Pass 6 (I=6): Key = 10
Compare 10 < 82? Yes → shift 82 right
Compare 10 < 43? Yes → shift 43 right
Compare 10 < 38? Yes → shift 38 right
Compare 10 < 27? Yes → shift 27 right
Compare 10 < 9? No → stop
Insert 10 at position 2
Array: [3 9 10 27 38 43 82 | 56]
Pass 7 (I=7): Key = 56
Compare 56 < 82? Yes → shift 82 right
Compare 56 < 43? No → stop
Insert 56 at position 6
Array: [3 9 10 27 38 43 56 82]
Result: [3 9 10 27 38 43 56 82] ✓ Sorted!
Line-by-line walkthrough:
- The outer loop (for I := 1 to Size - 1) picks each unsorted element in turn. Position 0 is trivially "sorted" by itself.
- Key := Arr[I] saves the element to be inserted. We need this because the shifting will overwrite its position.
- The while loop (while (J >= 0) and (Arr[J] > Key)) walks backward through the sorted portion, shifting each element one position right to make room.
- Arr[J + 1] := Key places the saved element in its correct position.
- The short-circuit evaluation of and is critical: J >= 0 must be checked before Arr[J] > Key, or you would access an invalid index when J = -1. (Free Pascal uses short-circuit Boolean evaluation by default; with complete evaluation enabled via {$B+}, this code would fail.)
Analysis:
- Best case: O(n) — the array is already sorted (the inner loop body never executes).
- Worst case: O(n^2) — the array is in reverse order.
- Average case: O(n^2).
- Stable? Yes (equal elements maintain their relative order).
💡 Intuition: Sorting Playing Cards Insertion sort is how most people naturally sort playing cards. You pick up cards one at a time and insert each one into its proper position among the cards already in your hand. This is why insertion sort is fast on nearly-sorted data — most cards are already close to their final position, so very little shifting is needed.
Insertion sort has a special property: it is adaptive, meaning it runs faster when the data is nearly sorted. For data that is already almost in order (with just a few elements out of place), insertion sort can be nearly O(n) — making it the best choice for maintaining an already-sorted collection when a few new elements are added.
When to Use O(n^2) Sorts
Despite their quadratic worst case, simple sorts have legitimate uses:
- Small arrays (n < 50): The constant factors of O(n log n) algorithms (function call overhead, cache behavior) can make them slower than simple sorts on small inputs. Many real-world sort implementations switch to insertion sort for small sub-arrays.
- Nearly sorted data: Insertion sort's O(n) best case makes it ideal for data that is already almost in order.
- Teaching and debugging: When you need a sort that is easy to understand and verify, simple sorts are the right choice.
23.4 Merge Sort (Divide and Conquer, O(n log n))
Merge sort is our first O(n log n) algorithm, and it is a beautiful application of the divide-and-conquer strategy that we introduced with recursion in Chapter 22.
Idea: Split the array in half. Recursively sort each half. Merge the two sorted halves into a single sorted array.
The Merge Operation
The key insight is that merging two sorted arrays into one sorted array is easy and fast — O(n) time. You simply compare the front elements of each array and take the smaller one, repeating until both arrays are exhausted.
procedure Merge(var Arr: array of Integer;
Left, Mid, Right: Integer);
var
Temp: array of Integer;
I, J, K: Integer;
TempSize: Integer;
begin
TempSize := Right - Left + 1;
SetLength(Temp, TempSize);
I := Left; { Index into left half }
J := Mid + 1; { Index into right half }
K := 0; { Index into temp array }
{ Merge the two halves }
while (I <= Mid) and (J <= Right) do begin
if Arr[I] <= Arr[J] then begin
Temp[K] := Arr[I];
Inc(I);
end
else begin
Temp[K] := Arr[J];
Inc(J);
end;
Inc(K);
end;
{ Copy remaining elements from left half }
while I <= Mid do begin
Temp[K] := Arr[I];
Inc(I);
Inc(K);
end;
{ Copy remaining elements from right half }
while J <= Right do begin
Temp[K] := Arr[J];
Inc(J);
Inc(K);
end;
{ Copy merged result back to original array }
for I := 0 to TempSize - 1 do
Arr[Left + I] := Temp[I];
end;
The Recursive Sort
procedure MergeSort(var Arr: array of Integer;
Left, Right: Integer);
var
Mid: Integer;
begin
if Left >= Right then { Base case: 0 or 1 elements }
Exit;
Mid := (Left + Right) div 2;
MergeSort(Arr, Left, Mid); { Sort left half }
MergeSort(Arr, Mid + 1, Right); { Sort right half }
Merge(Arr, Left, Mid, Right); { Merge the halves }
end;
Tracing Merge Sort in Detail
Let us trace MergeSort on the array [38, 27, 43, 3, 9, 82, 10, 56]:
Level 0: MergeSort([38 27 43 3 9 82 10 56], 0, 7)
Split at Mid=3: Left=[38 27 43 3], Right=[9 82 10 56]
Level 1a: MergeSort([38 27 43 3], 0, 3)
Split at Mid=1: Left=[38 27], Right=[43 3]
Level 2a: MergeSort([38 27], 0, 1)
Split at Mid=0: Left=[38], Right=[27]
Level 3a: MergeSort([38], 0, 0) → base case (1 element)
Level 3b: MergeSort([27], 1, 1) → base case (1 element)
Merge [38] and [27]:
Compare 38 vs 27: take 27. Compare 38 vs (empty): take 38.
Result: [27 38]
Level 2b: MergeSort([43 3], 2, 3)
Split at Mid=2: Left=[43], Right=[3]
Merge [43] and [3]:
Compare 43 vs 3: take 3. Then take 43.
Result: [3 43]
Merge [27 38] and [3 43]:
Compare 27 vs 3: take 3.
Compare 27 vs 43: take 27.
Compare 38 vs 43: take 38.
Take remaining 43.
Result: [3 27 38 43]
Level 1b: MergeSort([9 82 10 56], 4, 7)
(similar process)
[9 82] → merge → [9 82]
[10 56] → merge → [10 56]
Merge [9 82] and [10 56]:
Compare 9 vs 10: take 9.
Compare 82 vs 10: take 10.
Compare 82 vs 56: take 56.
Take remaining 82.
Result: [9 10 56 82]
Level 0: Merge [3 27 38 43] and [9 10 56 82]:
Compare 3 vs 9: take 3.
Compare 27 vs 9: take 9.
Compare 27 vs 10: take 10.
Compare 27 vs 56: take 27.
Compare 38 vs 56: take 38.
Compare 43 vs 56: take 43.
Take remaining 56, 82.
Result: [3 9 10 27 38 43 56 82] ✓
At each level, the total work is O(n) (merging). There are O(log n) levels (halving the array each time). Total: O(n log n).
Analysis:
- Best case: O(n log n)
- Worst case: O(n log n)
- Average case: O(n log n)
- Space: O(n) extra (for the temporary merge array)
- Stable? Yes (equal elements maintain their relative order, because the merge uses <=)
Merge sort's great virtue is its guaranteed O(n log n) performance — it never degrades to O(n^2) regardless of the input. Its drawback is the O(n) extra memory for the temporary array.
Natural Merge Sort (Brief Mention)
Standard merge sort always splits at the midpoint, ignoring any existing order. Natural merge sort instead identifies existing sorted runs in the data and merges those runs. On nearly-sorted data, natural merge sort can be O(n) (just one merge pass if the data is already in two sorted halves). Python's Timsort, one of the most practical sorting algorithms in use today, is based on this idea.
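The core of the idea is detecting where one sorted run ends and the next begins. Here is a minimal sketch of that step (FindRunEnd is a hypothetical helper name; a full natural merge sort would repeatedly merge adjacent runs using the Merge procedure above until one run spans the whole array):

```pascal
{ Sketch: return the last index of the sorted run that starts at Start.
  A natural merge sort calls this to find run boundaries instead of
  always splitting at the midpoint. }
function FindRunEnd(const Arr: array of Integer;
                    Start, High: Integer): Integer;
var
  I: Integer;
begin
  I := Start;
  while (I < High) and (Arr[I] <= Arr[I + 1]) do
    Inc(I);
  FindRunEnd := I;
end;
```

On [3, 9, 10, 2, 5], FindRunEnd reports runs covering indices 0..2 and 3..4; a single call to Merge then finishes the sort.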
💡 Retrieval Practice: Merge Sort
Before reading the quicksort section, try these exercises in your head or on paper:
1. Trace merge sort on the array [5, 1, 4, 2, 8]. Draw the split tree and show each merge step.
2. What is the minimum number of comparisons the merge step needs for two sorted arrays of size 3 each? (Answer: 3 — if one array's elements are all smaller than the other's.)
3. Why does merge sort use <= rather than < in its comparison? (Answer: to maintain stability — equal elements from the left half come before equal elements from the right.)
Merge Sort on a Linked List
One advantage of merge sort over quicksort is that it works naturally on linked lists. Arrays favor quicksort because partition uses random access; linked lists favor merge sort because the split and merge operations only need sequential access. In fact, many library implementations use merge sort specifically for linked list sorting. The idea is identical: find the midpoint (using the "slow and fast pointer" technique), recursively sort each half, and merge. No temporary array is needed because the merge can relink existing nodes.
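The midpoint-finding step can be sketched as follows, assuming a simple singly linked TNode record (the merge step would then relink nodes by comparing their Data fields, with no extra array):

```pascal
type
  PNode = ^TNode;
  TNode = record
    Data: Integer;
    Next: PNode;
  end;

{ Sketch: split a list at its midpoint with the slow/fast technique.
  Slow advances one node per step, Fast advances two; when Fast runs
  off the end, Slow sits at the midpoint and the list is cut there. }
procedure SplitList(Head: PNode; var Front, Back: PNode);
var
  Slow, Fast: PNode;
begin
  Slow := Head;
  Fast := Head^.Next;
  while Fast <> nil do begin
    Fast := Fast^.Next;
    if Fast <> nil then begin
      Slow := Slow^.Next;
      Fast := Fast^.Next;
    end;
  end;
  Front := Head;
  Back := Slow^.Next;
  Slow^.Next := nil;  { Cut the list in two }
end;
```

The recursive structure is then identical to the array version: split, sort each half, merge.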
🔗 Cross-Reference: Merge sort is a divide-and-conquer algorithm — a strategy we will study formally in Chapter 25 alongside greedy algorithms and dynamic programming.
23.5 Quicksort (Partition, Pivot Selection, O(n log n) Average)
Quicksort is arguably the most important sorting algorithm in practice. It is the default sort in many standard libraries, the basis of the C standard library's qsort, and the sort that Tony Hoare invented in 1960 and that computer scientists have been refining ever since.
Idea: Choose a pivot element. Partition the array so that all elements less than the pivot are to its left and all elements greater are to its right. The pivot is now in its final sorted position. Recursively sort the left and right partitions.
The Partition Operation
function Partition(var Arr: array of Integer;
Low, High: Integer): Integer;
var
Pivot, Temp: Integer;
I, J: Integer;
begin
Pivot := Arr[High]; { Choose last element as pivot }
I := Low - 1; { Index of the boundary }
for J := Low to High - 1 do begin
if Arr[J] <= Pivot then begin
Inc(I);
{ Swap Arr[I] and Arr[J] }
Temp := Arr[I];
Arr[I] := Arr[J];
Arr[J] := Temp;
end;
end;
{ Place pivot in its final position }
Temp := Arr[I + 1];
Arr[I + 1] := Arr[High];
Arr[High] := Temp;
Partition := I + 1; { Return pivot's final index }
end;
Line-by-line explanation: The partition function uses the Lomuto scheme. Variable I tracks the boundary between elements less-than-or-equal-to the pivot (indices Low through I) and elements greater than the pivot (indices I+1 through J-1). As J scans forward, any element less than or equal to the pivot is swapped into the "small" region by incrementing I and swapping. After the loop, the pivot is placed at position I+1, which is its correct sorted position.
The Recursive Sort
procedure QuickSort(var Arr: array of Integer;
Low, High: Integer);
var
PivotIndex: Integer;
begin
if Low < High then begin
PivotIndex := Partition(Arr, Low, High);
QuickSort(Arr, Low, PivotIndex - 1); { Sort left of pivot }
QuickSort(Arr, PivotIndex + 1, High); { Sort right of pivot }
end;
end;
Tracing Quicksort in Detail
Let us trace on [38, 27, 43, 3, 9, 82, 10, 56] with last-element pivot:
QuickSort([38 27 43 3 9 82 10 56], 0, 7)
Pivot = 56 (Arr[7])
Partition scan:
J=0: 38 <= 56? Yes → I becomes 0, swap Arr[0] with Arr[0] (no change)
J=1: 27 <= 56? Yes → I becomes 1, swap Arr[1] with Arr[1] (no change)
J=2: 43 <= 56? Yes → I becomes 2, swap Arr[2] with Arr[2] (no change)
J=3: 3 <= 56? Yes → I becomes 3, swap Arr[3] with Arr[3] (no change)
J=4: 9 <= 56? Yes → I becomes 4, swap Arr[4] with Arr[4] (no change)
J=5: 82 <= 56? No → skip
J=6: 10 <= 56? Yes → I becomes 5, swap Arr[5] and Arr[6]: 82↔10
Array after scan: [38 27 43 3 9 10 82 56]
Place pivot: swap Arr[6] and Arr[7]: [38 27 43 3 9 10 56 82]
Pivot index = 6
QuickSort([38 27 43 3 9 10], 0, 5)
Pivot = 10 (Arr[5])
Partition: [3 9 10 38 27 43], pivot index = 2
QuickSort([3 9], 0, 1)
Pivot = 9. Partition: [3 9], pivot index = 1
QuickSort([3], 0, 0) → base case
QuickSort([], 2, 1) → base case (Low > High)
QuickSort([38 27 43], 3, 5)
Pivot = 43. Partition: [38 27 43], pivot index = 5
QuickSort([38 27], 3, 4)
Pivot = 27. Partition: [27 38], pivot index = 3
QuickSort([], 6, 5) → base case
QuickSort([82], 7, 7) → base case
Final: [3 9 10 27 38 43 56 82] ✓
Analysis
- Best case: O(n log n) — pivot always splits the array roughly in half.
- Average case: O(n log n) — random pivots typically give balanced splits.
- Worst case: O(n^2) — pivot is always the smallest or largest element (already sorted or reverse-sorted input with last-element pivot selection).
- Space: O(log n) stack space on average (recursive calls), O(n) in worst case.
- Stable? No (the partition swaps can change relative order of equal elements).
Improving Pivot Selection
The worst case occurs when the pivot is consistently the minimum or maximum. We can mitigate this with better pivot selection:
Median-of-three: Choose the median of the first, middle, and last elements:
function MedianOfThree(var Arr: array of Integer;
Low, High: Integer): Integer;
var
Mid, Temp: Integer;
begin
Mid := (Low + High) div 2;
{ Sort Low, Mid, High so that Arr[Mid] is the median }
if Arr[Low] > Arr[Mid] then begin
Temp := Arr[Low]; Arr[Low] := Arr[Mid]; Arr[Mid] := Temp;
end;
if Arr[Low] > Arr[High] then begin
Temp := Arr[Low]; Arr[Low] := Arr[High]; Arr[High] := Temp;
end;
if Arr[Mid] > Arr[High] then begin
Temp := Arr[Mid]; Arr[Mid] := Arr[High]; Arr[High] := Temp;
end;
{ Move median to High-1 position for use as pivot }
Temp := Arr[Mid]; Arr[Mid] := Arr[High - 1]; Arr[High - 1] := Temp;
MedianOfThree := High - 1;
end;
This makes the O(n^2) worst case extremely unlikely — you would need pathologically adversarial input to trigger it.
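Random pivot selection is the other common defense. A sketch that reuses the Partition function above (remember to call Randomize once at program start):

```pascal
{ Sketch: randomized pivot selection. Swapping a randomly chosen
  element into position High lets the existing Lomuto Partition
  be reused unchanged. }
function RandomizedPartition(var Arr: array of Integer;
                             Low, High: Integer): Integer;
var
  R, Temp: Integer;
begin
  R := Low + Random(High - Low + 1);  { Random index in [Low, High] }
  Temp := Arr[R];
  Arr[R] := Arr[High];
  Arr[High] := Temp;
  RandomizedPartition := Partition(Arr, Low, High);
end;
```

With a random pivot, no fixed input can reliably trigger the worst case; the O(n^2) behavior becomes a matter of bad luck rather than bad data.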
Shell Sort (A Brief Mention)
Between the O(n^2) simple sorts and the O(n log n) recursive sorts, there is an interesting middle ground: Shell sort, invented by Donald Shell in 1959. Shell sort is essentially insertion sort applied with decreasing gap sequences. Instead of comparing adjacent elements, it compares elements that are gap positions apart, then reduces the gap until it reaches 1 (which is standard insertion sort).
procedure ShellSort(var Arr: array of Integer; Size: Integer);
var
Gap, I, J, Temp: Integer;
begin
Gap := Size div 2;
while Gap > 0 do begin
for I := Gap to Size - 1 do begin
Temp := Arr[I];
J := I;
while (J >= Gap) and (Arr[J - Gap] > Temp) do begin
Arr[J] := Arr[J - Gap];
Dec(J, Gap);
end;
Arr[J] := Temp;
end;
Gap := Gap div 2;
end;
end;
Shell sort's complexity depends on the gap sequence. With the simple halving sequence shown here, worst case is O(n^2), but average case is significantly better — roughly O(n^1.3). With more sophisticated gap sequences (such as Sedgewick's), it can achieve near O(n^(4/3)). Shell sort is useful when you need something faster than insertion sort but simpler than merge sort or quicksort, and when O(1) extra space is required.
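As one example of a better sequence, Knuth's 1, 4, 13, 40, ... gaps (each gap is 3h + 1) are a common, easy upgrade over simple halving. A sketch:

```pascal
{ Sketch: Shell sort with Knuth's 3h+1 gap sequence (1, 4, 13, 40, ...),
  which tends to outperform simple halving in practice. }
procedure ShellSortKnuth(var Arr: array of Integer; Size: Integer);
var
  Gap, I, J, Temp: Integer;
begin
  Gap := 1;
  while Gap < Size div 3 do
    Gap := 3 * Gap + 1;          { Largest gap below Size/3 }
  while Gap >= 1 do begin
    for I := Gap to Size - 1 do begin
      Temp := Arr[I];
      J := I;
      while (J >= Gap) and (Arr[J - Gap] > Temp) do begin
        Arr[J] := Arr[J - Gap];
        Dec(J, Gap);
      end;
      Arr[J] := Temp;
    end;
    Gap := Gap div 3;            { Step back down the sequence }
  end;
end;
```

Only the gap schedule changes; the inner gapped insertion is identical to ShellSort above.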
⚠️ Warning: Quicksort's Achilles Heel If you use last-element pivot selection on already-sorted data, quicksort degrades to O(n^2). This is exactly the kind of input you might encounter in practice (data that has been partially sorted by a previous operation). Always use median-of-three or random pivot selection in production code.
23.6 Comparison of Sorting Algorithms (Table, When to Use Which)
📊 Sorting Algorithm Comparison
| Algorithm | Best | Average | Worst | Space | Stable | Adaptive |
|---|---|---|---|---|---|---|
| Selection Sort | O(n^2) | O(n^2) | O(n^2) | O(1) | No | No |
| Insertion Sort | O(n) | O(n^2) | O(n^2) | O(1) | Yes | Yes |
| Shell Sort | O(n log n) | ~O(n^1.3) | O(n^2) | O(1) | No | Yes |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | No |
| Quicksort | O(n log n) | O(n log n) | O(n^2) | O(log n) | No | No |
Key Terms
- Stable: A sort is stable if equal elements retain their original relative order. If two expenses have the same amount, a stable sort preserves whichever was entered first. Merge sort and insertion sort are stable; selection sort and quicksort (as typically implemented) are not.
- Adaptive: A sort is adaptive if it runs faster on partially sorted input. Insertion sort is adaptive (O(n) on sorted data); selection sort is not (always O(n^2)).
- In-place: A sort is in-place if it uses O(1) extra memory (not counting the input array). Selection sort and insertion sort are in-place. Merge sort requires O(n) extra memory. Quicksort uses O(log n) stack space.
Stability: Why It Matters
Stability might seem like a minor detail, but it is crucial for multi-field sorting. Suppose Rosa's expenses are already sorted by date, and she now wants to sort by category (keeping the date order within each category). If she uses a stable sort for the category sort, the date ordering within each category is preserved. If she uses an unstable sort, expenses within the same category may lose their chronological order.
This is why database systems almost always use stable sorts: users expect secondary orderings to be preserved.
Decision Framework
Use this guide to choose a sorting algorithm:
- Small array (n < 50)? Use insertion sort. The overhead of recursive algorithms is not worth it.
- Nearly sorted data? Use insertion sort. Its O(n) best case exploits existing order.
- Need guaranteed O(n log n)? Use merge sort. It never degrades to O(n^2).
- Need good average performance with minimal memory? Use quicksort with median-of-three pivot. It is typically the fastest sort in practice.
- Need stability? Use merge sort or insertion sort. Quicksort and selection sort are unstable.
- Memory is extremely limited? Use selection sort or insertion sort (both O(1) extra space) or in-place quicksort.
- Need simplicity with better-than-quadratic average? Use Shell sort. In-place, no recursion, respectable performance.
In practice, most high-quality sort implementations use a hybrid approach: quicksort or merge sort for the main recursive decomposition, switching to insertion sort for sub-arrays smaller than about 16 elements. This combines the theoretical advantages of O(n log n) algorithms with the practical speed of insertion sort on small inputs.
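The hybrid idea amounts to one cutoff check added to QuickSort. A sketch, using a hypothetical InsertionSortRange helper that sorts only Arr[Low..High] (the cutoff of 16 is a typical, tunable value, not a magic constant):

```pascal
const
  Cutoff = 16;  { Typical threshold; worth tuning on real data }

{ Sketch: insertion sort restricted to the sub-array Arr[Low..High]. }
procedure InsertionSortRange(var Arr: array of Integer;
                             Low, High: Integer);
var
  I, J, Key: Integer;
begin
  for I := Low + 1 to High do begin
    Key := Arr[I];
    J := I - 1;
    while (J >= Low) and (Arr[J] > Key) do begin
      Arr[J + 1] := Arr[J];
      Dec(J);
    end;
    Arr[J + 1] := Key;
  end;
end;

{ Hybrid quicksort: recurse while sub-arrays are large,
  switch to insertion sort below the cutoff. }
procedure HybridQuickSort(var Arr: array of Integer;
                          Low, High: Integer);
var
  PivotIndex: Integer;
begin
  if High - Low + 1 <= Cutoff then
    InsertionSortRange(Arr, Low, High)
  else begin
    PivotIndex := Partition(Arr, Low, High);
    HybridQuickSort(Arr, Low, PivotIndex - 1);
    HybridQuickSort(Arr, PivotIndex + 1, High);
  end;
end;
```

The recursion handles the large-scale decomposition; insertion sort mops up the small, nearly-local work where its low overhead wins.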
Benchmark Timing Code
You can measure sorting performance empirically in Pascal:
uses SysUtils;
procedure BenchmarkSort(Size: Integer);
var
Arr1, Arr2, Arr3: array of Integer;
I: Integer;
Start: QWord;
begin
SetLength(Arr1, Size);
SetLength(Arr2, Size);
SetLength(Arr3, Size);
{ Generate random data and copy to all arrays }
for I := 0 to Size - 1 do begin
Arr1[I] := Random(Size * 10);
Arr2[I] := Arr1[I];
Arr3[I] := Arr1[I];
end;
WriteLn('Array size: ', Size);
Start := GetTickCount64;
InsertionSort(Arr1, Size);
WriteLn(' Insertion sort: ', GetTickCount64 - Start, ' ms');
Start := GetTickCount64;
MergeSort(Arr2, 0, Size - 1);
WriteLn(' Merge sort: ', GetTickCount64 - Start, ' ms');
Start := GetTickCount64;
QuickSort(Arr3, 0, Size - 1);
WriteLn(' Quicksort: ', GetTickCount64 - Start, ' ms');
end;
begin
Randomize;
BenchmarkSort(1000);
BenchmarkSort(10000);
BenchmarkSort(100000);
end.
Typical results on a modern machine:
Array size: 1000
Insertion sort: 2 ms
Merge sort: 0 ms
Quicksort: 0 ms
Array size: 10000
Insertion sort: 156 ms
Merge sort: 3 ms
Quicksort: 2 ms
Array size: 100000
Insertion sort: 15234 ms
Merge sort: 31 ms
Quicksort: 19 ms
The pattern is unmistakable: insertion sort's quadratic growth makes it impractical beyond a few thousand elements. Merge sort and quicksort remain fast even at 100,000 elements. And quicksort is slightly faster than merge sort in practice (despite the same asymptotic complexity) because it has better cache locality and no temporary array allocation.
23.7 Sorting Records by Different Fields (Comparison Functions)
Real-world sorting rarely involves simple integers. Rosa's PennyWise expenses are records with multiple fields: date, amount, category, description. She wants to sort by any of these fields. How do we write a single sort that works for any comparison criterion?
The answer is comparison functions — functions that take two elements and return which one should come first.
Defining the Record Type
type
  TExpense = record
    Date: TDateTime;
    Amount: Currency;
    Category: string[30];
    Description: string[80];
  end;
  TExpenseArray = array of TExpense;

  { Comparison function type: returns negative if A < B,
    0 if equal, positive if A > B }
  TExpenseCompare = function(const A, B: TExpense): Integer;
Comparison Functions
function CompareByAmount(const A, B: TExpense): Integer;
begin
  if A.Amount < B.Amount then CompareByAmount := -1
  else if A.Amount > B.Amount then CompareByAmount := 1
  else CompareByAmount := 0;
end;

function CompareByDate(const A, B: TExpense): Integer;
begin
  if A.Date < B.Date then CompareByDate := -1
  else if A.Date > B.Date then CompareByDate := 1
  else CompareByDate := 0;
end;

function CompareByCategory(const A, B: TExpense): Integer;
begin
  if A.Category < B.Category then CompareByCategory := -1
  else if A.Category > B.Category then CompareByCategory := 1
  else CompareByCategory := 0;
end;
Generic Insertion Sort with Comparison Function
procedure SortExpenses(var Expenses: TExpenseArray;
                       Count: Integer;
                       Compare: TExpenseCompare);
var
  I, J: Integer;
  Key: TExpense;
begin
  for I := 1 to Count - 1 do begin
    Key := Expenses[I];
    J := I - 1;
    while (J >= 0) and (Compare(Expenses[J], Key) > 0) do begin
      Expenses[J + 1] := Expenses[J];
      Dec(J);
    end;
    Expenses[J + 1] := Key;
  end;
end;
Now Rosa can sort by any criterion:
SortExpenses(Expenses, Count, @CompareByAmount); { Sort by amount }
SortExpenses(Expenses, Count, @CompareByDate); { Sort by date }
SortExpenses(Expenses, Count, @CompareByCategory); { Sort by category }
This is the strategy pattern in action: the sorting algorithm is the same, but the comparison strategy is pluggable. We pass the comparison function as a parameter using the @ operator to get its address.
Multi-Field Sorting
What if Rosa wants to sort by category first, then by amount within each category? Compose the comparisons:
function CompareByCategoryThenAmount(const A, B: TExpense): Integer;
begin
  Result := CompareByCategory(A, B);
  if Result = 0 then
    Result := CompareByAmount(A, B);
end;
This is a powerful pattern. You can compose any number of comparison criteria, each serving as a tiebreaker for the previous one. For example, sort by category, then by date within each category, then by amount within the same category and date:
function CompareMultiField(const A, B: TExpense): Integer;
begin
  Result := CompareByCategory(A, B);
  if Result = 0 then begin
    Result := CompareByDate(A, B);
    if Result = 0 then
      Result := CompareByAmount(A, B);
  end;
end;
This compositional approach is far cleaner than writing a single monolithic comparison function that handles all the logic at once.
23.8 Pascal's Built-in Sorting (TFPGList.Sort)
Free Pascal provides sorting facilities through its generics library. The TFPGList class (from the fgl unit) has a Sort method that accepts a comparison function:
{$mode objfpc}  { specialize requires ObjFPC (or Delphi) mode }
uses
  fgl;

type
  TIntegerList = specialize TFPGList<Integer>;

function CompareIntegers(const A, B: Integer): Integer;
begin
  if A < B then CompareIntegers := -1
  else if A > B then CompareIntegers := 1
  else CompareIntegers := 0;
end;

var
  Numbers: TIntegerList;
  I: Integer;
begin
  Numbers := TIntegerList.Create;
  try
    Numbers.Add(38);
    Numbers.Add(27);
    Numbers.Add(43);
    Numbers.Add(3);
    Numbers.Sort(@CompareIntegers);
    for I := 0 to Numbers.Count - 1 do
      WriteLn(Numbers[I]);
  finally
    Numbers.Free;
  end;
end.
The TFPGList.Sort method uses a variant of quicksort internally. For most practical purposes, using the built-in sort is the right choice — it is well-tested, optimized, and handles edge cases correctly. Implement your own sort only when you need specific properties (stability, adaptive behavior) or when you are learning.
✅ Best Practice: Use Library Sorts in Production Implement sorting algorithms by hand for learning and for situations requiring specific guarantees. For production code, prefer library implementations — they are tested on millions of inputs and handle edge cases (empty arrays, single-element arrays, equal elements) that hand-written sorts often miss.
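The same facility applies to record types, which is how Rosa could let the library do the sorting in PennyWise. The following is a sketch, using a cut-down TExpense (only two fields, for brevity) rather than the full record from Section 23.7; the list type name and sample data are illustrative:

```pascal
{$mode objfpc}
uses
  fgl;

type
  TExpense = record
    Amount: Currency;
    Category: string[30];
  end;
  TExpenseList = specialize TFPGList<TExpense>;

function CompareExpenseAmount(const A, B: TExpense): Integer;
begin
  if A.Amount < B.Amount then CompareExpenseAmount := -1
  else if A.Amount > B.Amount then CompareExpenseAmount := 1
  else CompareExpenseAmount := 0;
end;

var
  Expenses: TExpenseList;
  E: TExpense;
  I: Integer;
begin
  Expenses := TExpenseList.Create;
  try
    E.Amount := 12.50; E.Category := 'Food';   Expenses.Add(E);
    E.Amount := 3.25;  E.Category := 'Coffee'; Expenses.Add(E);
    E.Amount := 48.00; E.Category := 'Books';  Expenses.Add(E);
    Expenses.Sort(@CompareExpenseAmount);  { ascending by amount }
    for I := 0 to Expenses.Count - 1 do
      WriteLn(Expenses[I].Category, ': ', Expenses[I].Amount: 0: 2);
  finally
    Expenses.Free;
  end;
end.
```

The comparison functions written for our hand-rolled sorts plug into the library sort unchanged, because both use the same negative/zero/positive convention.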
23.9 Retrieval Practice and Common Sorting Pitfalls
Before we move to the PennyWise project checkpoint, let us consolidate what we have learned with some retrieval practice and common mistakes.
Retrieval Practice Questions
Try to answer these without looking back. Then check your answers against the relevant sections.
- What is the key prerequisite for binary search? The array must be sorted. Binary search on unsorted data gives wrong answers silently.
- Which O(n^2) sort is adaptive? Insertion sort. It runs in O(n) on already-sorted data.
- Which O(n log n) sort is stable? Merge sort. Quicksort (as typically implemented) is not.
- What is quicksort's worst case, and how do you avoid it? O(n^2), triggered by consistently bad pivots (e.g., already-sorted data with last-element pivot). Use median-of-three or random pivot selection.
- Can you sort an array of records by any field without rewriting the sort? Yes, using comparison functions passed as parameters.
Common Sorting Pitfalls
Pitfall 1: Off-by-one errors in partition. The Lomuto partition scheme (our implementation) is particularly tricky. The boundary variable I starts at Low - 1, which is confusing. Many students accidentally start at Low, which causes incorrect partitioning.
Pitfall 2: Unstable sort destroying secondary ordering. If your data is sorted by date and you re-sort by category using an unstable sort (quicksort, selection sort), the date ordering within each category is lost. Use merge sort or insertion sort when stability matters.
Pitfall 3: Forgetting the base case in recursive sorts. Both merge sort and quicksort need a base case (if Left >= Right or if Low >= High). Without it, infinite recursion ensues.
Pitfall 4: Modifying the comparison function mid-sort. If the comparison function is inconsistent (sometimes says A < B, sometimes says A > B for the same values), the sort may loop forever or produce corrupted output. Comparison functions must define a strict total ordering.
Pitfall 5: Sorting a subset but accessing the full array. When sorting a sub-array (as in the recursive calls of quicksort), ensure your indices stay within the sub-array bounds. Accessing Arr[High + 1] or Arr[Low - 1] is an out-of-range access: a runtime error with range checking enabled ({$R+}), and silent memory corruption without it.
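Pitfall 4 is easy to stumble into with floating-point data. The following comparator (a deliberately broken illustration, with an exaggerated tolerance for clarity) treats "close enough" values as equal, which breaks transitivity and therefore the total ordering a sort requires:

```pascal
{ BAD: tolerance-based equality is not a total order.
  With Eps = 0.5: BadCompare(1.0, 1.4) = 0 and BadCompare(1.4, 1.8) = 0,
  yet BadCompare(1.0, 1.8) = -1. Equality is not transitive, so a sort
  using this comparator can produce inconsistent, order-dependent output. }
function BadCompare(const A, B: Real): Integer;
const
  Eps = 0.5;
begin
  if Abs(A - B) < Eps then BadCompare := 0
  else if A < B then BadCompare := -1
  else BadCompare := 1;
end;
```

The fix is to compare the raw values directly, as CompareByAmount does; apply tolerances when interpreting results, never inside the comparator.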
23.10 Project Checkpoint: PennyWise Sortable Expenses
Rosa's PennyWise application now stores expenses as an array of records. She wants to:
- Sort expenses by amount (ascending or descending) to find her biggest and smallest purchases.
- Sort expenses by date to see her spending history chronologically.
- Sort expenses by category to group related expenses together.
- Search for a specific expense by amount or date.
Implementation Overview
We combine the comparison function approach from Section 23.7 with a quicksort that uses Hoare-style partitioning and a middle-element pivot:
procedure QuickSortExpenses(var Arr: TExpenseArray;
                            Low, High: Integer;
                            Compare: TExpenseCompare);
var
  I, J: Integer;
  Pivot, Temp: TExpense;
begin
  if Low >= High then Exit;
  Pivot := Arr[(Low + High) div 2];
  I := Low;
  J := High;
  while I <= J do begin
    while Compare(Arr[I], Pivot) < 0 do Inc(I);
    while Compare(Arr[J], Pivot) > 0 do Dec(J);
    if I <= J then begin
      Temp := Arr[I];
      Arr[I] := Arr[J];
      Arr[J] := Temp;
      Inc(I);
      Dec(J);
    end;
  end;
  if Low < J then QuickSortExpenses(Arr, Low, J, Compare);
  if I < High then QuickSortExpenses(Arr, I, High, Compare);
end;
Once the expenses are sorted by amount, Rosa can also use binary search to find a transaction quickly:
function FindExpenseByAmount(const Arr: TExpenseArray;
                             Count: Integer;
                             Target: Currency): Integer;
var
  Low, High, Mid: Integer;
begin
  Low := 0;
  High := Count - 1;
  while Low <= High do begin
    Mid := (Low + High) div 2;
    if Arr[Mid].Amount = Target then begin
      FindExpenseByAmount := Mid;
      Exit;
    end
    else if Arr[Mid].Amount < Target then
      Low := Mid + 1
    else
      High := Mid - 1;
  end;
  FindExpenseByAmount := -1;  { not found }
end;
With these additions, PennyWise is no longer just a data entry tool — it is a data analysis tool. Rosa can sort her expenses by any field, search for specific transactions in logarithmic time, and group expenses by category for reporting.
🔗 Cross-Reference: In Chapter 24, we will use a binary search tree to provide dynamic sorting — a structure that maintains sorted order as expenses are added and removed, without needing to re-sort the entire array.
23.11 GradeBook Pro: Sorting Students by Multiple Criteria
To see these algorithms in a second context beyond PennyWise, consider a GradeBook application where a teacher needs to sort students by various criteria.
type
  TStudent = record
    LastName: string[30];
    FirstName: string[20];
    GPA: Real;
    Credits: Integer;
  end;
  TStudentArray = array of TStudent;
  TStudentCompare = function(const A, B: TStudent): Integer;
The teacher wants several views of the same data:
Honor roll (by GPA descending):
function CompareByGPADesc(const A, B: TStudent): Integer;
begin
  if A.GPA > B.GPA then CompareByGPADesc := -1
  else if A.GPA < B.GPA then CompareByGPADesc := 1
  else CompareByGPADesc := 0;
end;
Alphabetical roster (by last name, then first name):
function CompareByName(const A, B: TStudent): Integer;
begin
  if A.LastName < B.LastName then CompareByName := -1
  else if A.LastName > B.LastName then CompareByName := 1
  else begin
    { Last names equal — sort by first name }
    if A.FirstName < B.FirstName then CompareByName := -1
    else if A.FirstName > B.FirstName then CompareByName := 1
    else CompareByName := 0;
  end;
end;
Credits remaining (ascending, to identify students nearing graduation):
function CompareByCredits(const A, B: TStudent): Integer;
begin
  { The subtraction trick works here because credit counts are small;
    for values near the Integer limits the subtraction could overflow,
    so prefer explicit comparisons in the general case. }
  CompareByCredits := A.Credits - B.Credits;
end;
The same merge sort or quicksort implementation handles all three cases — the only difference is which comparison function is passed. This is the power of abstracting the comparison: the sort algorithm is completely independent of what is being compared.
A particularly useful trick is descending sort without a separate comparison function. Simply negate the result of the ascending comparison:
function CompareByGPAAsc(const A, B: TStudent): Integer;
begin
  if A.GPA < B.GPA then CompareByGPAAsc := -1
  else if A.GPA > B.GPA then CompareByGPAAsc := 1
  else CompareByGPAAsc := 0;
end;

{ To sort descending, wrap the ascending comparator: }
function CompareByGPADescWrap(const A, B: TStudent): Integer;
begin
  CompareByGPADescWrap := -CompareByGPAAsc(A, B);
end;
This pattern — write one comparison function per field, then negate or compose as needed — scales to any number of fields and orderings. It is the same pattern used by database query engines when you write ORDER BY gpa DESC, last_name ASC.
23.12 Summary
This chapter covered the foundational algorithms of computer science: searching and sorting.
Searching:
- Linear search examines every element sequentially. It works on unsorted data and has O(n) time complexity.
- Binary search eliminates half the search space with each comparison. It requires sorted data and has O(log n) time complexity. The difference between O(n) and O(log n) is the difference between a million comparisons and twenty.

Sorting — O(n^2) algorithms:
- Selection sort finds the minimum and places it at the front, repeating for each position. Always O(n^2), minimal swaps.
- Insertion sort inserts each element into its correct position in the sorted portion. O(n) on sorted input, O(n^2) worst case. Stable and adaptive.

Sorting — O(n log n) algorithms:
- Merge sort divides the array in half, sorts each half recursively, and merges the sorted halves. Guaranteed O(n log n) but requires O(n) extra memory. Stable.
- Quicksort partitions the array around a pivot, then recursively sorts the partitions. O(n log n) on average, O(n^2) worst case (mitigated by good pivot selection). In-place (O(log n) stack space). Typically the fastest sort in practice.

Other sorts:
- Shell sort extends insertion sort with gap sequences, achieving better-than-quadratic average performance with O(1) extra space.
- Natural merge sort exploits existing order in the data, forming the basis of practical hybrid sorts like Timsort.

Practical considerations:
- Use comparison functions to sort records by different fields.
- Compose comparison functions for multi-field sorting.
- Use library sorts in production; implement your own for learning and special requirements.
- Choose your algorithm based on data size, existing order, stability requirements, and memory constraints.
- Benchmark your sorting choices with realistic data to validate theoretical predictions.
Wirth's equation comes alive in this chapter: Algorithms + Data Structures = Programs. The data structure (an array of records) determines what operations are possible. The algorithm (sort, search) determines how efficiently those operations execute. Choosing the right combination is the essence of programming.
📊 Key Concepts at a Glance

| Concept | Definition |
|---------|-----------|
| Linear search | Sequential scan through all elements; O(n) |
| Binary search | Halving search on sorted data; O(log n) |
| Selection sort | Repeatedly find minimum; O(n^2) always |
| Insertion sort | Insert each element in sorted position; O(n) best, O(n^2) worst |
| Shell sort | Gap-based insertion sort; sub-quadratic average |
| Merge sort | Divide, sort halves, merge; O(n log n) guaranteed, O(n) space |
| Quicksort | Partition around pivot, sort partitions; O(n log n) average |
| Stable sort | Preserves relative order of equal elements |
| Adaptive sort | Runs faster on partially sorted input |
| Comparison function | A pluggable function that defines sort order |
Next: Chapter 24 — Trees and Graphs: Hierarchical and Network Data Structures