Exercises — Chapter 17: String Handling

Exercise 17.1: Basic STRING Concatenation

Difficulty: Beginner

Write a program that takes four separate address fields (street, city, state, and ZIP code) and uses STRING to produce a single-line formatted address. Handle the case where any field might be blank, so that no stray separators appear in the output.

Input fields:

01  WS-STREET   PIC X(30) VALUE '123 MAIN STREET'.
01  WS-CITY     PIC X(20) VALUE 'SPRINGFIELD'.
01  WS-STATE    PIC X(02) VALUE 'IL'.
01  WS-ZIP      PIC X(10) VALUE '62704'.

Expected output: 123 MAIN STREET, SPRINGFIELD, IL 62704
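
One pitfall worth seeing before you start: DELIMITED BY SPACE would cut '123 MAIN STREET' off after '123'. The sketch below is one possible starting point, not a full solution. It joins only the street and city, and WS-LINE and WS-PTR are hypothetical names introduced for illustration. It assumes the data in each field never contains two consecutive spaces.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ADDRDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-STREET   PIC X(30) VALUE '123 MAIN STREET'.
       01  WS-CITY     PIC X(20) VALUE 'SPRINGFIELD'.
      *    WS-LINE and WS-PTR are hypothetical work fields.
       01  WS-LINE     PIC X(80) VALUE SPACES.
       01  WS-PTR      PIC 9(03) VALUE 1.
       PROCEDURE DIVISION.
      *    A double space marks the end of each field's data, so
      *    the single spaces inside '123 MAIN STREET' survive.
           STRING WS-STREET DELIMITED BY '  '
                  ', '      DELIMITED BY SIZE
                  WS-CITY   DELIMITED BY '  '
               INTO WS-LINE
               WITH POINTER WS-PTR
               ON OVERFLOW
                   DISPLAY 'ADDRESS LINE TRUNCATED'
           END-STRING
           DISPLAY WS-LINE
           STOP RUN.
```

A blank field contributes zero characters here, but the ', ' separator is still written, so the blank-field case still needs its own logic.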

Exercise 17.2: UNSTRING Parsing

Difficulty: Beginner

Write a program that parses a date string in the format "Month DD, YYYY" (e.g., "January 15, 2024") using UNSTRING. Extract the month name, day, and year into separate fields. Handle the comma and space as delimiters.
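
As a hedged starting point, the sketch below shows one way to combine a two-character delimiter with ALL SPACE. The field names and sizes are assumptions chosen for illustration, and validation of the extracted values is left to you.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DATEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Field names and sizes are illustrative assumptions.
       01  WS-DATE-TEXT  PIC X(20) VALUE 'January 15, 2024'.
       01  WS-MONTH      PIC X(09) VALUE SPACES.
       01  WS-DAY        PIC X(02) VALUE SPACES.
       01  WS-YEAR       PIC X(04) VALUE SPACES.
       PROCEDURE DIVISION.
      *    The two-character delimiter ', ' consumes the comma and
      *    the space after it as a single delimiter.
           UNSTRING WS-DATE-TEXT
               DELIMITED BY ', ' OR ALL SPACE
               INTO WS-MONTH WS-DAY WS-YEAR
           END-UNSTRING
           DISPLAY 'MONTH=' WS-MONTH
           DISPLAY 'DAY='   WS-DAY
           DISPLAY 'YEAR='  WS-YEAR
           STOP RUN.
```

A single-digit day such as "5" arrives left-justified and space-padded in WS-DAY, which your solution may want to normalize.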

Exercise 17.3: INSPECT TALLYING and REPLACING

Difficulty: Beginner

Write a program that takes a text string and:

1. Counts the number of vowels (A, E, I, O, U, in both cases)
2. Counts the number of digits
3. Replaces all exclamation marks with periods
4. Converts all lowercase letters to uppercase

Display the counts and the modified string.
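
The sketch below illustrates the three INSPECT forms involved, under the assumption of a hypothetical 30-character field WS-TEXT. Digit counting is deliberately omitted so that part of the exercise remains open.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INSPDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    WS-TEXT and WS-VOWELS are hypothetical names.
       01  WS-TEXT     PIC X(30) VALUE 'Hello, World! 2024'.
       01  WS-VOWELS   PIC 9(03) VALUE ZERO.
       PROCEDURE DIVISION.
      *    TALLYING adds to the counter, so reset it before use.
           MOVE ZERO TO WS-VOWELS
           INSPECT WS-TEXT TALLYING WS-VOWELS
               FOR ALL 'A' ALL 'E' ALL 'I' ALL 'O' ALL 'U'
                   ALL 'a' ALL 'e' ALL 'i' ALL 'o' ALL 'u'
           INSPECT WS-TEXT REPLACING ALL '!' BY '.'
      *    CONVERTING maps each character in the first literal to
      *    the character at the same position in the second.
           INSPECT WS-TEXT CONVERTING
               'abcdefghijklmnopqrstuvwxyz'
               TO 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
           DISPLAY 'VOWELS: ' WS-VOWELS
           DISPLAY WS-TEXT
           STOP RUN.
```

Counting vowels before the case conversion is a design choice, not a requirement. Converting first would halve the number of ALL phrases needed.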

Exercise 17.4: Reference Modification — Date Formatting

Difficulty: Beginner

Write a program that converts dates between formats using reference modification:

1. YYYYMMDD to MM/DD/YYYY
2. YYYYMMDD to DD-MON-YYYY (where MON is a 3-letter month abbreviation)
3. MM/DD/YYYY to YYYYMMDD

Test with at least 5 different dates.
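
For orientation, here is a minimal sketch of the first conversion only. The field names are assumptions, and no validation of the input date is attempted.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REFMDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Both field names are hypothetical.
       01  WS-YYYYMMDD  PIC X(08) VALUE '20240115'.
       01  WS-US-DATE   PIC X(10) VALUE SPACES.
       PROCEDURE DIVISION.
      *    name(start:length) addresses a substring; positions
      *    are 1-based.
           MOVE WS-YYYYMMDD(5:2) TO WS-US-DATE(1:2)
           MOVE '/'              TO WS-US-DATE(3:1)
           MOVE WS-YYYYMMDD(7:2) TO WS-US-DATE(4:2)
           MOVE '/'              TO WS-US-DATE(6:1)
           MOVE WS-YYYYMMDD(1:4) TO WS-US-DATE(7:4)
           DISPLAY WS-US-DATE
           STOP RUN.
```

With the value shown, the program displays 01/15/2024.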

Exercise 17.5: CSV File Parser

Difficulty: Intermediate

Write a complete program that reads a CSV file containing employee records:

EMP_ID,FIRST_NAME,LAST_NAME,DEPARTMENT,SALARY,HIRE_DATE
E001,Maria,Chen,IT,95000,2015-03-15
E002,Derek,Washington,IT,72000,2022-08-01

Parse each line using UNSTRING, skip the header line, and write a fixed-format output file. Count commas first (INSPECT TALLYING) to validate the field count before parsing.
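
The validate-then-parse step can be sketched as follows for a single hard-coded line. File handling and header skipping are left out, and the receiving field names and sizes are assumptions made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Receiving field names and sizes are hypothetical.
       01  WS-LINE       PIC X(80) VALUE
           'E001,Maria,Chen,IT,95000,2015-03-15'.
       01  WS-COMMAS     PIC 9(03) VALUE ZERO.
       01  WS-EMP-ID     PIC X(04).
       01  WS-FIRST      PIC X(15).
       01  WS-LAST       PIC X(15).
       01  WS-DEPT       PIC X(10).
       01  WS-SALARY     PIC X(07).
       01  WS-HIRE-DATE  PIC X(10).
       PROCEDURE DIVISION.
           MOVE ZERO TO WS-COMMAS
           INSPECT WS-LINE TALLYING WS-COMMAS FOR ALL ','
      *    Six fields are separated by exactly five commas.
           IF WS-COMMAS = 5
               UNSTRING WS-LINE DELIMITED BY ','
                   INTO WS-EMP-ID WS-FIRST WS-LAST
                        WS-DEPT WS-SALARY WS-HIRE-DATE
               END-UNSTRING
               DISPLAY WS-LAST ' ' WS-SALARY
           ELSE
               DISPLAY 'BAD FIELD COUNT: ' WS-COMMAS
           END-IF
           STOP RUN.
```

The salary is kept alphanumeric here to sidestep numeric conversion. A full solution would convert and validate it before writing the output record.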

Exercise 17.6: Name Formatting Library

Difficulty: Intermediate

Write a program that reads names in "FIRST MIDDLE LAST" format and produces all of the following formats:

1. Last, First Middle
2. Last, First M.
3. F.M. Last
4. LAST, FIRST (uppercase, no middle)
5. First Last (no middle)

Handle names with no middle name gracefully. Use INSPECT TALLYING to count the spaces between name parts (not the trailing pad spaces of the fixed-length field) to determine whether a middle name or initial is present.
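
A plain count of spaces in a fixed-length field also counts the padding. One way around this, sketched below under the assumption that name parts are separated by single spaces, is to limit the count with BEFORE INITIAL. WS-NAME and WS-SPACES are hypothetical names.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NAMEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    WS-NAME and WS-SPACES are hypothetical names.
       01  WS-NAME     PIC X(40) VALUE 'MARIA ANNE CHEN'.
       01  WS-SPACES   PIC 9(02) VALUE ZERO.
       PROCEDURE DIVISION.
      *    Count only the spaces before the first double space,
      *    which is where the trailing padding begins.
           MOVE ZERO TO WS-SPACES
           INSPECT WS-NAME TALLYING WS-SPACES
               FOR ALL SPACE BEFORE INITIAL '  '
           IF WS-SPACES = 2
               DISPLAY 'MIDDLE NAME PRESENT'
           ELSE
               DISPLAY 'NO MIDDLE NAME'
           END-IF
           STOP RUN.
```

Names with more than three parts, or with doubled spaces between parts, would need additional handling.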

Exercise 17.7: Phone Number Formatter

Difficulty: Intermediate

Write a program that accepts phone numbers in any of these formats and normalizes them:

- Input: "5551234567", "(555) 123-4567", "555-123-4567", "555.123.4567", "1-555-123-4567"
- Output: "(555) 123-4567"

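Use INSPECT REPLACING to change the punctuation characters to spaces, then collect the remaining digits into a contiguous field. INSPECT cannot delete characters, and the "1-555-123-4567" form carries a leading country code that must be dropped. Then use reference modification to extract the area code, exchange, and subscriber number, and STRING to format the output.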
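
The digit-collection and formatting steps might look like the sketch below. Note that it swaps in a character loop with a NUMERIC class test instead of INSPECT REPLACING, it does not check that exactly ten digits remain, and all field names are assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PHONDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    All field names here are hypothetical.
       01  WS-RAW      PIC X(20) VALUE '1-555-123-4567'.
       01  WS-DIGITS   PIC X(20) VALUE SPACES.
       01  WS-LEN      PIC 9(02) VALUE ZERO.
       01  WS-I        PIC 9(02) VALUE ZERO.
       01  WS-START    PIC 9(02) VALUE 1.
       01  WS-OUT      PIC X(14) VALUE SPACES.
       PROCEDURE DIVISION.
      *    Copy only the digits, left to right, into WS-DIGITS.
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 20
               IF WS-RAW(WS-I:1) IS NUMERIC
                   ADD 1 TO WS-LEN
                   MOVE WS-RAW(WS-I:1) TO WS-DIGITS(WS-LEN:1)
               END-IF
           END-PERFORM
      *    Skip a leading country code of 1 on 11-digit input.
           IF WS-LEN = 11 AND WS-DIGITS(1:1) = '1'
               MOVE 2 TO WS-START
           END-IF
           STRING '('
                  WS-DIGITS(WS-START:3)
                  ') '
                  WS-DIGITS(WS-START + 3:3)
                  '-'
                  WS-DIGITS(WS-START + 6:4)
                  DELIMITED BY SIZE
               INTO WS-OUT
           END-STRING
           DISPLAY WS-OUT
           STOP RUN.
```

Using a start offset rather than shifting WS-DIGITS onto itself avoids an overlapping MOVE, whose result is not dependable.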

Exercise 17.8: Pipe-Delimited Data Transformer

Difficulty: Intermediate

Write a program that reads a pipe-delimited file and produces CSV output with proper quoting. Any field that contains a comma must be enclosed in double quotes in the CSV output:

Input:  ACCT001|Smith, John|NEW YORK|12500.00
Output: ACCT001,"Smith, John",NEW YORK,12500.00

Use INSPECT TALLYING to detect commas in each field before building the output.
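
For a single field, the detect-then-quote step could be sketched as below. The names are hypothetical, and the sketch assumes the field data never contains two consecutive spaces or an embedded double quote.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. QUOTDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    All field names here are hypothetical.
       01  WS-FIELD    PIC X(30) VALUE 'Smith, John'.
       01  WS-COMMAS   PIC 9(02) VALUE ZERO.
       01  WS-OUT      PIC X(40) VALUE SPACES.
       01  WS-PTR      PIC 9(03) VALUE 1.
       PROCEDURE DIVISION.
           MOVE ZERO TO WS-COMMAS
           INSPECT WS-FIELD TALLYING WS-COMMAS FOR ALL ','
      *    Quote the field only when it contains a comma.
           IF WS-COMMAS > 0
               STRING '"'      DELIMITED BY SIZE
                      WS-FIELD DELIMITED BY '  '
                      '"'      DELIMITED BY SIZE
                   INTO WS-OUT WITH POINTER WS-PTR
               END-STRING
           ELSE
               STRING WS-FIELD DELIMITED BY '  '
                   INTO WS-OUT WITH POINTER WS-PTR
               END-STRING
           END-IF
           DISPLAY WS-OUT
           STOP RUN.
```

Because WS-PTR is carried forward, the same pattern can append the next field and its comma separator to the same output line.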

Exercise 17.9: Multi-Pass UNSTRING Parser

Difficulty: Advanced

Write a program that parses log file entries using UNSTRING with POINTER for multi-pass parsing. Each log entry has the format:

YYYY-MM-DD HH:MM:SS [LEVEL] PROGRAM: Message text here

Parse into: date, time, level, program name, and message. Use the POINTER phrase to parse the structured prefix first, then extract the remaining free-form message.
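
The two-pass idea can be sketched as follows. The sample log entry and all field names are invented for illustration, and the sketch assumes every entry is well formed, so WS-PTR never runs past the end of the field.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOGDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    The sample entry and field names are hypothetical.
       01  WS-ENTRY    PIC X(120) VALUE
           '2024-01-15 08:30:00 [ERROR] CLMPROC: File not found'.
       01  WS-DATE     PIC X(10).
       01  WS-TIME     PIC X(08).
       01  WS-LEVEL    PIC X(08).
       01  WS-PROG     PIC X(08).
       01  WS-MSG      PIC X(80).
       01  WS-PTR      PIC 9(03) VALUE 1.
       PROCEDURE DIVISION.
      *    Pass 1: split the structured prefix. WS-PTR is left
      *    pointing at the first character after ': '.
           UNSTRING WS-ENTRY
               DELIMITED BY ' [' OR '] ' OR ': ' OR SPACE
               INTO WS-DATE WS-TIME WS-LEVEL WS-PROG
               WITH POINTER WS-PTR
           END-UNSTRING
      *    Pass 2: everything from WS-PTR onward is the message.
           MOVE WS-ENTRY(WS-PTR:) TO WS-MSG
           DISPLAY WS-LEVEL ' ' WS-PROG ' ' WS-MSG
           STOP RUN.
```

The colons inside the time are not treated as delimiters because ': ' requires a following space. A production version should verify WS-PTR is within range before the reference modification.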

Exercise 17.10: ICD-10 Code Validator

Difficulty: Advanced

Based on the MedClaim case study, write a complete ICD-10 diagnosis code validator and parser that:

1. Validates the format (letter + 2 digits + optional dot + up to 4 characters)
2. Extracts category, etiology, and detail components
3. Looks up the category letter in a table to produce a description
4. Handles both ICD-10 and legacy ICD-9 formats (numeric only)
5. Produces a report of valid codes, invalid codes, and format distribution
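
A first cut at the format check might use class tests on reference-modified positions, as in the sketch below. It examines only the first three characters, the field name and sample value are assumptions, and the dot and trailing characters are left for your solution.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ICDDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    WS-CODE and its sample value are hypothetical.
       01  WS-CODE     PIC X(08) VALUE 'E11.65'.
       PROCEDURE DIVISION.
      *    ALPHABETIC-UPPER also accepts a space, so rule it out.
           IF WS-CODE(1:1) IS ALPHABETIC-UPPER
              AND WS-CODE(1:1) NOT = SPACE
              AND WS-CODE(2:2) IS NUMERIC
               DISPLAY 'ICD-10 PREFIX OK: ' WS-CODE(1:3)
           ELSE
               IF WS-CODE(1:3) IS NUMERIC
                   DISPLAY 'POSSIBLE ICD-9: ' WS-CODE
               ELSE
                   DISPLAY 'INVALID: ' WS-CODE
               END-IF
           END-IF
           STOP RUN.
```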

Exercise 17.11: XML Builder

Difficulty: Advanced

Write a program that reads a fixed-format customer file and produces well-formed XML output:

<?xml version="1.0" encoding="UTF-8"?>
<customers>
  <customer id="C001">
    <name>MARIA CHEN</name>
    <address>
      <street>123 MAIN ST</street>
      <city>SPRINGFIELD</city>
      <state>IL</state>
      <zip>62704</zip>
    </address>
  </customer>
</customers>

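Use STRING with POINTER to build each XML line, and escape any special characters (& to &amp;, < to &lt;) that might appear in data fields. INSPECT TALLYING can detect these characters, but INSPECT REPLACING cannot perform the escape because the replacement must be the same length as the text it replaces, so build the escaped value with STRING.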
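
One way to escape a field, sketched here under assumed field names and sizes, is to walk it one character at a time and append either the character or its entity with STRING and a running POINTER. Only & and < are handled.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLESC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Field names and sizes are hypothetical. WS-OUT allows
      *    five output characters per input character.
       01  WS-IN       PIC X(20)  VALUE 'SMITH & SONS'.
       01  WS-OUT      PIC X(100) VALUE SPACES.
       01  WS-I        PIC 9(03)  VALUE ZERO.
       01  WS-PTR      PIC 9(03)  VALUE 1.
       PROCEDURE DIVISION.
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 20
               EVALUATE WS-IN(WS-I:1)
                   WHEN '&'
                       STRING '&amp;' DELIMITED BY SIZE
                           INTO WS-OUT WITH POINTER WS-PTR
                       END-STRING
                   WHEN '<'
                       STRING '&lt;' DELIMITED BY SIZE
                           INTO WS-OUT WITH POINTER WS-PTR
                       END-STRING
                   WHEN OTHER
                       STRING WS-IN(WS-I:1) DELIMITED BY SIZE
                           INTO WS-OUT WITH POINTER WS-PTR
                       END-STRING
               END-EVALUATE
           END-PERFORM
           DISPLAY WS-OUT
           STOP RUN.
```

The loop copies the trailing pad spaces as well. Trimming them before wrapping the value in tags is left as part of the exercise.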

Exercise 17.12: Production Data Cleansing Program

Difficulty: Challenge

Write a program that cleanses a customer data file by:

1. Trimming leading and trailing spaces from all fields
2. Standardizing phone numbers to (999) 999-9999 format
3. Converting state abbreviations to uppercase
4. Validating email addresses (must contain @ and at least one . after @)
5. Replacing special characters in names with spaces
6. Detecting and flagging duplicate records (by name + zip)
7. Producing a cleansed output file and an error report

This exercise integrates STRING, UNSTRING, INSPECT, and reference modification in a realistic data quality scenario.
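
As a small illustration of how the pieces combine, the sketch below removes leading spaces by pairing INSPECT TALLYING with reference modification. The field names and the 30-character size are assumptions, and the other cleansing steps are not shown.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRIMDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *    Field names and the 30-byte size are hypothetical.
       01  WS-FIELD    PIC X(30) VALUE '   MARIA CHEN'.
       01  WS-CLEAN    PIC X(30) VALUE SPACES.
       01  WS-LEAD     PIC 9(02) VALUE ZERO.
       PROCEDURE DIVISION.
           MOVE ZERO TO WS-LEAD
           INSPECT WS-FIELD TALLYING WS-LEAD FOR LEADING SPACE
      *    An all-blank field has 30 leading spaces; skip the
      *    move so the reference modification stays in range.
           IF WS-LEAD < 30
               MOVE WS-FIELD(WS-LEAD + 1:) TO WS-CLEAN
           END-IF
           DISPLAY '[' WS-CLEAN ']'
           STOP RUN.
```

Moving into a separate field avoids an overlapping MOVE. In a fixed-length field the trailing spaces remain as padding, so "trimming" them in practice means tracking the length of the significant data.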