Chapter 10 Key Takeaways
Core Concepts
- Pascal has multiple string types. ShortString is stored inline with no separate heap allocation (on the stack for local variables) and has a 255-character limit; AnsiString (enabled via {$H+}) is heap-allocated, dynamically sized, and reference-counted. For most programs, AnsiString is the practical choice.
- Strings are 1-indexed. The first character of a string s is s[1], not s[0]. This holds for ShortString and AnsiString alike, and for the positions used by Pos, Copy, Insert, and Delete; it differs from C-family languages.
- String comparison is lexicographic and case-sensitive. 'Zebra' < 'apple' is True because uppercase letters have lower ordinal values than lowercase. Use LowerCase or UpperCase before comparison for case-insensitive matching.
- Know your functions vs. procedures. Copy, Pos, UpperCase, LowerCase, and Trim are functions: they return a value and leave their arguments unchanged (Pos returns an integer position; the others return a new string). Insert and Delete are procedures that modify the string variable in place.
- Pos returns 0 on failure. This is the standard "not found" signal in Pascal string searching. Always check for 0 before using the result as an index.
- Val gives you error position; StrToInt/StrToFloat give you convenience. Use Val when parsing untrusted data (files, user input) where you need to report precisely what went wrong. Use StrToIntDef/StrToFloatDef when you want a default value on failure.
- Characters are ordinal values. Ord and Chr bridge the gap between characters and integers. Character arithmetic (Ord(c) - Ord('A')) is the foundation for case conversion, digit parsing, and simple ciphers.
- State machines are the reliable way to parse text. Whether you are parsing commands, CSV lines, or any structured text, tracking your current state (inWord, inQuotes, etc.) and transitioning based on each character produces robust, readable parsers.
- The normalize-tokenize-classify-dispatch pattern is universal for command processing. Normalize the input (trim, lowercase, collapse spaces), tokenize it (split into parts), classify the parts (verb, object, preposition), and dispatch to the appropriate handler.
- CSV is harder than "split on commas." Quoted fields, escaped quotes, empty fields, and malformed rows all require careful handling. A state machine parser handles these edge cases cleanly.
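The state-machine approach to CSV can be sketched as follows. This is a minimal illustration rather than a complete CSV implementation: it assumes Free Pascal with {$mode objfpc}{$H+}, handles a single line (no embedded line breaks), and is deliberately lenient about malformed input. The names TParseState, TFieldList, AppendField, and SplitCsvLine are hypothetical and chosen for this example only.

```pascal
program CsvStateMachine;
{$mode objfpc}{$H+}

type
  { Hypothetical state names for this sketch }
  TParseState = (psUnquoted, psQuoted, psQuoteSeen);
  TFieldList = array of string;

{ Appends Field to Fields and clears Field for the next one. }
procedure AppendField(var Fields: TFieldList; var Field: string);
begin
  SetLength(Fields, Length(Fields) + 1);
  Fields[High(Fields)] := Field;
  Field := '';
end;

{ Splits one CSV line into fields, examining one character at a time. }
function SplitCsvLine(const Line: string): TFieldList;
var
  State: TParseState;
  Field: string;
  i: Integer;
  c: Char;
begin
  Result := nil;
  Field := '';
  State := psUnquoted;
  for i := 1 to Length(Line) do            { strings are 1-indexed }
  begin
    c := Line[i];
    case State of
      psUnquoted:
        begin
          if c = ',' then
            AppendField(Result, Field)     { comma ends the field, even an empty one }
          else if (c = '"') and (Field = '') then
            State := psQuoted              { opening quote at the start of a field }
          else
            Field := Field + c;
        end;
      psQuoted:
        begin
          if c = '"' then
            State := psQuoteSeen           { closing quote, or first half of "" }
          else
            Field := Field + c;            { commas are literal inside quotes }
        end;
      psQuoteSeen:
        begin
          if c = '"' then
          begin
            Field := Field + '"';          { "" inside quotes is one escaped quote }
            State := psQuoted;
          end
          else if c = ',' then
          begin
            AppendField(Result, Field);
            State := psUnquoted;
          end
          else
          begin
            Field := Field + c;            { lenient: keep stray text after a closing quote }
            State := psUnquoted;
          end;
        end;
    end;
  end;
  AppendField(Result, Field);              { the last field has no trailing comma }
end;

var
  Fields: TFieldList;
  k: Integer;
begin
  Fields := SplitCsvLine('42,"Smith, Jane","She said ""hi""",,end');
  for k := 0 to High(Fields) do            { dynamic arrays are 0-indexed }
    WriteLn(k, ': [', Fields[k], ']');
end.
```

For the sample line, the program prints five fields: 42, then Smith, Jane (comma preserved), then She said "hi" (escaped quotes collapsed), then an empty field, then end. An unterminated quote is accepted silently here; a stricter parser would check whether the final state is psQuoted and report an error.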
Common Pitfalls
- Off-by-one errors with Copy and Pos. When combining the two (e.g., finding all occurrences), keep careful track of whether positions are relative to the original string or to a substring.
- Forgetting that Insert and Delete modify in place. They do not return a new string; they change the variable you pass to them.
- Assuming case-insensitive comparison. String comparison is case-sensitive. Always explicitly convert both operands to a common case first.
- Building long strings by repeated concatenation in a loop. This works but can be slow for very long strings. For performance-critical code, use SetLength and index assignment.
- Using StrToInt on untrusted input without error handling. It raises an exception on invalid input. Use Val or StrToIntDef instead.
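Several of these pitfalls can be seen in one short program. The sketch below assumes Free Pascal with {$mode objfpc}{$H+} and the SysUtils unit; FindAll and Reversed are hypothetical helpers written for this illustration, not library routines. FindAll shows the offset bookkeeping needed when Pos is applied to a substring produced by Copy, and Reversed shows SetLength with index assignment as an alternative to repeated concatenation.

```pascal
program PitfallDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;  { IntToStr, StrToIntDef, LowerCase }

{ Returns every 1-based position of Sub in S as a comma-separated list. }
function FindAll(const Sub, S: string): string;
var
  Offset, P: Integer;
begin
  Result := '';
  if Sub = '' then
    Exit;
  Offset := 0;                          { characters of S already skipped }
  repeat
    { Pos sees only the remaining tail, so P is relative to that tail }
    P := Pos(Sub, Copy(S, Offset + 1, Length(S) - Offset));
    if P > 0 then
    begin
      if Result <> '' then
        Result := Result + ',';
      Result := Result + IntToStr(Offset + P);  { convert back to a position in S }
      Offset := Offset + P;             { resume just past the start of this match }
    end;
  until P = 0;
end;

{ Builds the result with one allocation instead of repeated concatenation. }
function Reversed(const S: string): string;
var
  i: Integer;
begin
  SetLength(Result, Length(S));
  for i := 1 to Length(S) do
    Result[i] := S[Length(S) - i + 1];
end;

var
  S: string;
  N, Code: Integer;
begin
  WriteLn(FindAll('an', 'banana'));     { prints 2,4 }
  WriteLn(Reversed('abc'));             { prints cba }

  S := 'Hello World';
  Delete(S, 1, 6);                      { S itself is now 'World' }
  Insert('Big ', S, 1);                 { S itself is now 'Big World' }
  WriteLn(S);                           { prints Big World }

  { Convert both sides to a common case before comparing }
  WriteLn(LowerCase('Zebra') = LowerCase('ZEBRA'));  { prints TRUE }

  Val('12x4', N, Code);                 { Code is the position of the first bad character }
  if Code <> 0 then
    WriteLn('bad character at position ', Code);     { prints position 3 }
  WriteLn(StrToIntDef('12x4', -1));     { prints -1, no exception raised }
end.
```

Forgetting to add Offset back in FindAll would report the second match at position 2 instead of 4, which is exactly the relative-versus-absolute confusion described in the first pitfall.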
Threshold Concept
Strings are both arrays and abstractions. At the lowest level, a string is just an array of bytes. But the string functions (Pos, Copy, Insert, Delete, etc.) let you think at a higher level — searching for patterns, extracting substrings, transforming text. The ability to shift between these levels of abstraction — seeing the characters when you need to, seeing the meaning when you need to — is a skill that transfers to every programming language and every text-processing task you will ever encounter.
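The two levels can be placed side by side in a short sketch, assuming Free Pascal with {$mode objfpc}{$H+} and SysUtils. CaesarUpper and FirstWord are hypothetical helpers invented for this illustration. The first works at the character level with indexing, Ord, and Chr; the second works at the function level with Trim, Pos, and Copy.

```pascal
program TwoLevels;
{$mode objfpc}{$H+}

uses
  SysUtils;  { UpperCase, Trim }

{ Character-level view: uppercase S, then shift each letter A..Z by Shift places. }
function CaesarUpper(const S: string; Shift: Integer): string;
var
  i, Letter: Integer;
begin
  Result := UpperCase(S);
  for i := 1 to Length(Result) do
    if Result[i] in ['A'..'Z'] then
    begin
      Letter := Ord(Result[i]) - Ord('A');            { 0..25 }
      { the double mod keeps the index in 0..25 even for negative shifts }
      Letter := ((Letter + Shift) mod 26 + 26) mod 26;
      Result[i] := Chr(Ord('A') + Letter);
    end;
end;

{ Function-level view: the text up to the first space, ignoring outer whitespace. }
function FirstWord(const S: string): string;
var
  P: Integer;
  T: string;
begin
  T := Trim(S);
  P := Pos(' ', T);
  if P = 0 then
    Result := T                       { no space found: the whole string is one word }
  else
    Result := Copy(T, 1, P - 1);
end;

begin
  WriteLn(CaesarUpper('Hello, World', 3));   { prints KHOOR, ZRUOG }
  WriteLn(FirstWord('  take lamp '));        { prints take }
end.
```

CaesarUpper never asks what the text means, and FirstWord never looks at an individual character; choosing the level that matches the task is the skill the paragraph above describes.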