Chapter 10 Exercises: Strings — Text Processing in Pascal
Part A: Fundamentals (Knowledge / Comprehension)
Exercise A1. What is the maximum number of characters that a ShortString can hold? Where is the length stored?
Exercise A2. Explain the difference between ShortString and AnsiString in terms of (a) maximum length, (b) memory allocation, and (c) what happens when you assign one variable to another.
Exercise A3. What does the compiler directive {$H+} do, and why do we use it in this textbook?
Exercise A4. Given s := 'Pascal';, what are the values of:
- s[1]
- s[6]
- Length(s)
- Pos('cal', s)
- Copy(s, 4, 3)
Exercise A5. What value does Pos return when the substring is not found in the target string? Why is this a useful design choice compared to, say, returning -1?
Exercise A6. Explain why the following comparison yields True:
WriteLn('Zebra' < 'apple');
Exercise A7. What is the difference between Insert and Copy? Which one modifies the original string variable?
Part B: Guided Practice (Application)
Exercise B1. Write a function Initials(const fullName: String): String that takes a full name like 'Ada Byron Lovelace' and returns the initials 'ABL'. Assume words are separated by single spaces.
Exercise B2. Write a function CensorWord(const text, word: String): String that replaces every occurrence of word in text with asterisks of the same length. For example, CensorWord('The bad cat did bad things', 'bad') should return 'The *** cat did *** things'.
Exercise B3. Write a procedure PrintReversed(const s: String) that prints the string in reverse without creating a reversed copy. Use a downto loop that writes one character at a time.
Exercise B4. Write a function CountVowels(const s: String): Integer that counts the number of vowels (a, e, i, o, u, both upper and lower case) in a string.
Exercise B5. Write a function RemoveSpaces(const s: String): String that removes all spaces from a string. For example, 'Hello World' becomes 'HelloWorld'.
Exercise B6. Write a function PadRight(const s: String; width: Integer; padChar: Char): String that pads a string on the right to the given width. If the string is already at least width characters, return it unchanged. For example, PadRight('Hi', 5, '.') returns 'Hi...'.
Exercise B7. Using Val, write a function IsValidInteger(const s: String): Boolean that returns True if and only if the entire string represents a valid integer (no extra characters, no trailing spaces after the number).
Exercise B8. Write a function ReplaceFirst(const s, old, replacement: String): String that replaces only the first occurrence of old in s with replacement. Do not use StringReplace — implement it with Pos, Copy, and concatenation.
Part C: Independent Problem Solving (Analysis / Synthesis)
Exercise C1. Write a function TitleCase(const s: String): String that capitalizes the first letter of every word and lowercases all other letters. For example, 'hello WORLD from pascal' becomes 'Hello World From Pascal'. Handle multiple consecutive spaces correctly (do not crash or produce garbage).
Exercise C2. Write a function CompressSpaces(const s: String): String that replaces every run of consecutive spaces with a single space and trims leading/trailing spaces. For example, ' Hello World ' becomes 'Hello World'.
Exercise C3. Write a procedure WordFrequency(const text: String) that prints each unique word in the text along with how many times it appears. Use case-insensitive comparison. For simplicity, assume words are separated by spaces and contain only letters.
Hint: Use parallel arrays — one for unique words, one for counts. Scan each word in the text, check if it already exists in the unique-words array, and either increment its count or add it as new.
Exercise C4. Write a function IsAnagram(const s1, s2: String): Boolean that returns True if the two strings are anagrams of each other (contain exactly the same letters in any order, ignoring spaces and case). For example, 'Listen' and 'Silent' are anagrams.
Hint: Count the frequency of each letter (a–z) in both strings and compare the counts.
Exercise C5. Write a function ExtractEmails(const text: String): String that scans through a block of text and extracts anything that looks like an email address (contains @ with non-space characters on both sides). Return all found emails separated by semicolons.
Exercise C6. Implement a ROT13 function: function ROT13(const s: String): String. ROT13 shifts each letter by 13 positions in the alphabet, wrapping around. Non-letter characters are unchanged. ROT13 has the nice property that applying it twice returns the original text — verify this in your test code.
Exercise C7. Write a function ExpandTabs(const s: String; tabWidth: Integer): String that replaces each tab character (#9) with enough spaces to reach the next tab stop. Tab stops are at columns that are multiples of tabWidth. For example, with tabWidth = 4, a tab at column 1 inserts 3 spaces (to reach column 4), a tab at column 3 inserts 1 space, and a tab at column 4 inserts 4 spaces (to reach column 8).
Part D: Challenge Problems (Synthesis / Evaluation)
Exercise D1: Simple Markup Parser. Write a program that reads text containing simple markup tags: **bold**, _italic_, and `code`. Convert these to plain text labels for a console display:
**bold text**→[BOLD: bold text]_italic text_→[ITALIC: italic text]`code text`→[CODE: code text]
Handle nested and overlapping tags gracefully (if overlapping, just process them left to right).
Exercise D2: Text Wrapping. Write a function WordWrap(const text: String; lineWidth: Integer): String that inserts line breaks (#13#10) into a block of text so that no line exceeds lineWidth characters. Break only at spaces — never in the middle of a word. If a single word exceeds lineWidth, place it on its own line.
Exercise D3: CSV to Fixed-Width. Write a program that reads a CSV file and outputs a fixed-width formatted table. The program should: 1. Read all rows first to determine the maximum width of each column. 2. Output a header separator line. 3. Output each row with columns padded to their maximum width.
Handle quoted CSV fields that contain commas.
Exercise D4: Pig Latin Translator. Write a function ToPigLatin(const word: String): String that converts an English word to Pig Latin. Rules:
- If the word begins with a vowel, append 'yay' (e.g., 'apple' → 'appleyay').
- If the word begins with one or more consonants, move the consonant cluster to the end and append 'ay' (e.g., 'string' → 'ingstray', 'chair' → 'airchay').
- Preserve the original capitalization pattern (if the original starts with uppercase, the result should too).
Then write a TranslateSentence function that applies Pig Latin to each word while preserving punctuation and spacing.
Exercise D5: Text Diff. Write a program that takes two strings and identifies the differences between them. Output the position and nature of each difference:
String 1: "The quick brown fox"
String 2: "The quack brown box"
Difference at position 7: 'i' vs 'a'
Difference at position 8: 'c' vs 'c' (same)
Difference at position 17: 'f' vs 'b'
Part E: Conceptual and Analytical Questions
Exercise E1. The AnsiString type uses copy-on-write semantics. Explain what this means and give an example scenario where it provides a significant performance benefit.
Exercise E2. Why does Pascal index strings starting at 1 instead of 0 (as C and many modern languages do)? What are the advantages and disadvantages of each convention?
Exercise E3. Our SplitString procedure in Section 10.4 uses an open array parameter. What would happen if the caller passes an array that is too small to hold all the parts? How does our implementation handle this? Suggest an improvement.
Exercise E4. Compare the ParseCSVField function from Section 10.9 with the simpler SplitString from Section 10.4. What specific CSV edge cases does ParseCSVField handle that SplitString does not? Why is CSV parsing harder than it first appears?
Exercise E5. The wildcard matching algorithm in Section 10.8 uses backtracking. Explain what backtracking means in this context. What happens when the pattern is 'a*b*c*d' and the text is 'aXXbYYcZZd'? Walk through the algorithm's steps.
Exercise E6. Theme 5 states "Algorithms + Data Structures = Programs." Identify three specific examples from this chapter where a particular algorithm is tightly coupled to the string data structure in a way that would work differently (or not at all) with a different data structure.
Part M: Mastery Problems
Exercise M1: Full-Featured Command Parser. Extend the Crypts of Pascalia command parser from Section 10.7 to handle:
- Synonyms: get, grab, take, and pick up all map to the same action.
- Articles: The parser should ignore the, a, and an in object references (take the sword works the same as take sword).
- Abbreviations: The player can type n for north, s for south, inv for inventory, etc.
- Error suggestions: If the verb is not recognized but is close to a valid verb (e.g., tke for take), suggest the correct verb. Implement a simple Levenshtein distance function or a simpler "differs by one character" check.
Test with at least 15 different inputs covering normal commands, abbreviations, articles, synonyms, and error cases.
Exercise M2: Simple Template Engine. Build a template engine that reads a template string containing {{variable}} placeholders and replaces them with values from a list of name-value pairs. For example:
Template: "Dear {{name}}, your order #{{orderNum}} totaling ${{total}} has shipped."
Variables: name=Alice, orderNum=4521, total=89.99
Output: "Dear Alice, your order #4521 totaling $89.99 has shipped."
Support:
- Multiple variables in a single template.
- Variables appearing more than once.
- Nested braces (e.g., {{{var}}} should produce {value}).
- Missing variables: output {{variableName}} unchanged.
Exercise M3: String Compression. Implement run-length encoding (RLE) for strings:
- Compress('aaabbbccca') → '3a3b3c1a'
- Decompress('3a3b3c1a') → 'aaabbbccca'
Handle edge cases: single characters (count of 1), counts greater than 9 (multi-digit), and the case where "compression" makes the string longer (in which case, return the original string with a prefix marker).
Exercise M4: PennyWise Report Generator. Using the transaction data structure from Section 10.9, write a GenerateReport procedure that:
1. Groups transactions by category.
2. Calculates subtotals for each category.
3. Outputs a formatted report with headers, aligned columns, category subtotals, and a grand total.
4. Includes a date range (earliest and latest transaction dates) in the report header.
Use Format and StringOfChar for alignment. The report should look professional enough to print.