Exercises — Chapter 13: Relative Files and RRDS

Exercise 13.1: Basic Relative File Operations

Difficulty: Beginner

Create two programs that demonstrate basic relative file CRUD operations.

Program 1 — REL-LOAD.cbl (Loader):

  1. Create a relative file with at least 50 slots
  2. Write records to specific slot numbers (e.g., slots 1, 3, 5, 7, ..., 49)
  3. Leave even-numbered slots empty to demonstrate sparse files
  4. Each record should contain: a product code (PIC X(10)), product name (PIC X(30)), price (PIC 9(05)V99), and quantity on hand (PIC 9(05))
  5. Display each successful write with its slot number

Program 2 — REL-ACCESS.cbl (Accessor):

  1. Open the file from Program 1 for I-O
  2. Read a specific occupied slot — display the record
  3. Read an empty slot — display a "slot empty" message
  4. Update one record (REWRITE with a new price)
  5. Delete one record
  6. Read sequentially — display all remaining records with their slot numbers
  7. Display the count of occupied vs. empty slots
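
The slot arithmetic behind both programs can be sketched in Python (the exercise itself calls for COBOL; the `RelativeFile` class and the sample record fields here are illustrative stand-ins, modeling a relative file as a fixed array of slots where RRN = index + 1):

```python
SLOTS = 50

class RelativeFile:
    def __init__(self, slots):
        self.slots = [None] * slots          # None = empty slot

    def write(self, rrn, record):            # WRITE to a specific RRN
        self.slots[rrn - 1] = record

    def read(self, rrn):                     # READ; None means "slot empty"
        return self.slots[rrn - 1]

    def rewrite(self, rrn, record):          # REWRITE an occupied slot
        assert self.slots[rrn - 1] is not None, "cannot REWRITE an empty slot"
        self.slots[rrn - 1] = record

    def delete(self, rrn):                   # DELETE frees the slot
        self.slots[rrn - 1] = None

    def occupied(self):
        return sum(1 for s in self.slots if s is not None)

f = RelativeFile(SLOTS)
for rrn in range(1, SLOTS, 2):               # load odd slots 1, 3, ..., 49
    f.write(rrn, {"code": f"PRD{rrn:07d}", "price": 9.99, "qty": 10})

f.rewrite(3, {"code": "PRD0000003", "price": 12.49, "qty": 10})  # price update
f.delete(5)
print(f.read(2))                             # empty slot -> None
print(f.occupied(), SLOTS - f.occupied())    # occupied vs. empty counts
```

The key point the sketch captures: a relative file addresses records purely by position, so reading an empty slot is a normal, testable outcome rather than an error.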


Exercise 13.2: Division/Remainder Hash Implementation

Difficulty: Intermediate

Build a hash-based lookup table for a student directory.

Requirements:

  1. Student records: Student ID (PIC X(08), format "STU00001"), Name (PIC X(30)), Major (PIC X(20)), GPA (PIC 9V99)

  2. Relative file with 1,300 slots (capacity for ~1,000 students at 77% load factor)

  3. Implement the division/remainder hash function:
     - Extract the numeric portion of the student ID
     - Divide by the prime 1297
     - Use the remainder + 1 as the RRN

  4. Implement linear probing for collision handling with a maximum of 20 probes

  5. Write a loader program that:
     - Loads 800 student records
     - Tracks total collisions and the maximum probe chain length
     - Reports load statistics at the end

  6. Write a lookup program that:
     - Accepts a student ID
     - Computes the hash
     - Probes until the record is found or an empty slot is reached
     - Displays the record or "not found"
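
The hash-and-probe logic behind the loader and lookup programs can be sketched as follows (Python for brevity; your COBOL version would use DIVIDE ... REMAINDER. The table size and divisor come from the exercise; function names are illustrative):

```python
TABLE_SLOTS = 1300
DIVISOR = 1297          # largest prime <= the table size
MAX_PROBES = 20

def home_rrn(student_id):
    """Division/remainder hash: numeric part of 'STU00001' -> RRN 1..1297."""
    return int(student_id[3:]) % DIVISOR + 1

def insert(table, student_id, record):
    """Linear probing: try the home slot, then successive slots, wrapping."""
    rrn = home_rrn(student_id)
    for probe in range(MAX_PROBES):
        slot = (rrn - 1 + probe) % TABLE_SLOTS
        if table[slot] is None:
            table[slot] = (student_id, record)
            return probe                      # collisions for this key
    raise RuntimeError("probe chain exceeded MAX_PROBES")

def lookup(table, student_id):
    rrn = home_rrn(student_id)
    for probe in range(MAX_PROBES):
        slot = (rrn - 1 + probe) % TABLE_SLOTS
        if table[slot] is None:
            return None                       # empty slot ends the search
        if table[slot][0] == student_id:
            return table[slot][1]
    return None

table = [None] * TABLE_SLOTS
for n in range(1, 801):                       # load 800 students
    insert(table, f"STU{n:05d}", {"name": f"Student {n}"})
print(lookup(table, "STU00042"))
print(lookup(table, "STU00999"))              # not loaded -> None
```

Note that an empty slot terminates an unsuccessful search immediately; that property is exactly what a deleted-record strategy must preserve (or work around) in later exercises.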


Exercise 13.3: Hash Function Comparison

Difficulty: Intermediate-Advanced

Compare the effectiveness of three hash functions on the same data set.

Requirements:

  1. Generate (or hardcode) 500 key values with various patterns:
     - 200 sequential keys (000001-000200)
     - 150 clustered keys (keys that share common prefixes)
     - 150 random keys

  2. For each hash function below, compute the hash for all 500 keys into a table of 750 slots:
     - Division/remainder (divide by the prime 743)
     - Folding (split the key into 2-digit groups, sum them, mod 750)
     - Mid-square (square the key, extract the middle digits, mod 750)

  3. For each function, track:
     - Number of collisions
     - Average probe length (using linear probing)
     - Maximum probe chain
     - Distribution histogram (how many slots have 0, 1, 2, or 3+ records mapped to them)

  4. Display a comparison table and recommend which function works best for this data
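
A sketch of the three hash functions and the distribution counts, in Python (the key patterns and table parameters follow the exercise; the mid-square digit slice varies by textbook, so the middle-4-digits choice here is one reasonable assumption, as is the multiplier used to fake "random" keys):

```python
from collections import Counter

SLOTS = 750

def h_division(key):                 # divide by the prime 743
    return int(key) % 743

def h_folding(key):                  # 2-digit groups, summed, mod 750
    groups = [int(key[i:i + 2]) for i in range(0, len(key), 2)]
    return sum(groups) % SLOTS

def h_midsquare(key):                # square, take the middle digits, mod 750
    sq = str(int(key) ** 2).zfill(12)
    mid = len(sq) // 2
    return int(sq[mid - 2:mid + 2]) % SLOTS   # middle 4 digits

keys = [f"{n:06d}" for n in range(1, 201)]            # 200 sequential keys
keys += [f"9900{n:02d}" for n in range(100)] + \
        [f"7700{n:02d}" for n in range(50)]           # 150 clustered keys
keys += [f"{(n * 104729) % 1000000:06d}" for n in range(150)]  # 150 "random"

for name, h in [("division", h_division), ("folding", h_folding),
                ("mid-square", h_midsquare)]:
    hits = Counter(h(k) for k in keys)
    collisions = sum(c - 1 for c in hits.values())
    histogram = Counter(min(c, 3) for c in hits.values())
    print(f"{name:10s} collisions={collisions:3d} "
          f"slots with 1/2/3+ keys: {histogram[1]}/{histogram[2]}/{histogram[3]}")
```

Running this makes the pattern sensitivity visible: folding collapses many keys that differ only in digit order, which is precisely why the comparison table in step 4 is worth building.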


Exercise 13.4: Cache-Ahead Pattern

Difficulty: Advanced

Implement the GlobalBank quick lookup pattern from Section 13.6.

Requirements:

  1. Create a KSDS "master" file with 5,000 records (account number, name, balance, status)
  2. Create a sequential "hot accounts" extract with the top 1,000 accounts
  3. Build an RRDS cache (1,500 slots) from the hot accounts extract using a hash function
  4. Write a transaction processing program that:
     - Reads transaction requests from a sequential file
     - For each transaction, first checks the RRDS cache
     - If the account is not found in the cache, reads the KSDS master
     - Tracks cache hits vs. misses
     - Reports the hit ratio at the end of processing

  5. Generate 2,000 test transactions where 70% hit cached accounts and 30% hit non-cached accounts. Verify that your hit ratio matches expectations.
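
The dual-lookup flow can be sketched in Python, with in-memory structures standing in for the VSAM files (the slot count comes from the exercise; the prime 1499, the hash, and all names are illustrative assumptions):

```python
CACHE_SLOTS = 1500

def cache_rrn(acct):                       # hash the account number into the cache
    return int(acct) % 1499 + 1            # 1499: a prime near the slot count

master = {f"{n:08d}": {"balance": n * 10} for n in range(1, 5001)}  # KSDS stand-in
hot = [f"{n:08d}" for n in range(1, 1001)]                          # hot extract

cache = [None] * CACHE_SLOTS
for acct in hot:                           # build the RRDS cache (linear probing)
    rrn = cache_rrn(acct)
    while cache[rrn - 1] is not None:
        rrn = rrn % CACHE_SLOTS + 1
    cache[rrn - 1] = (acct, master[acct])

def lookup(acct):
    """Cache first (one direct read), then the KSDS master; returns (record, hit)."""
    rrn = cache_rrn(acct)
    while cache[rrn - 1] is not None:
        if cache[rrn - 1][0] == acct:
            return cache[rrn - 1][1], True
        rrn = rrn % CACHE_SLOTS + 1
    return master.get(acct), False         # cache miss: read the master

hits = misses = 0
for n in range(1, 2001):                   # 70% of requests target hot accounts
    acct = f"{n % 1000 + 1:08d}" if n % 10 < 7 else f"{1000 + n % 4000 + 1:08d}"
    _, hit = lookup(acct)
    hits, misses = hits + hit, misses + (not hit)
print(f"hit ratio: {hits / (hits + misses):.0%}")
```

With this request mix the hit ratio lands at exactly 70%, which is the verification target step 5 asks for.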


Exercise 13.5: Sparse File Analysis

Difficulty: Intermediate

Explore the effects of sparsity on relative files.

  1. Create a relative file with 10,000 slots
  2. Write records at the following densities:
     - Phase 1: Load 1,000 records (10% density) into random slots
     - Phase 2: Add 3,000 more records (40% density)
     - Phase 3: Add 3,000 more records (70% density)

  3. After each phase, sequentially read all records and time the operation (ACCEPT FROM TIME before and after)

  4. Report for each phase: - Number of records - Density percentage - Time to sequentially read all records

  5. Discussion: How does density affect sequential read performance? At what density does a sequential scan become impractical?
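
The experiment can be sketched in Python before you build the COBOL version (an in-memory list stands in for the relative file, and `time.perf_counter` stands in for ACCEPT FROM TIME; the timings are illustrative only). The sketch makes the core point concrete: a sequential READ must still visit every slot, so scan time tracks file size, not record count:

```python
import random
import time

SLOTS = 10_000
slots = [None] * SLOTS

def load(count):
    """Scatter `count` new records into random empty slots."""
    empties = [i for i in range(SLOTS) if slots[i] is None]
    for i in random.sample(empties, count):
        slots[i] = f"REC{i + 1:05d}"

def sequential_read():
    """Visit every slot, skipping empties, as a sequential READ loop does."""
    t0 = time.perf_counter()
    found = [r for r in slots if r is not None]
    return len(found), time.perf_counter() - t0

for phase, add in enumerate([1000, 3000, 3000], start=1):
    load(add)
    count, secs = sequential_read()
    print(f"phase {phase}: {count} records, "
          f"{count / SLOTS:.0%} density, {secs * 1000:.2f} ms")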


Exercise 13.6: MedClaim Provider Cache

Difficulty: Advanced (Project)

Build a complete provider cache system modeled on Section 13.7.

  1. Provider KSDS master (PROVIDER-KSDS.cbl):
     - Load 500 provider records (Provider ID PRV00001 through PRV00500)
     - Include: Provider ID, NPI, Name, Specialty, Contract Status, Max Allowed Amount

  2. Provider RRDS cache builder (PROVIDER-CACHE.cbl):
     - Read the KSDS master sequentially
     - Map the PRV-ID numeric portion directly to the RRN (no hashing needed)
     - Load all 500 records into the RRDS

  3. Claims processor with dual lookup (CLAIMS-PROC.cbl):
     - Read claims from sequential input; each claim has a provider ID
     - Look up the provider in the RRDS first (1 I/O)
     - If not found (provider ID > 500 or invalid), fall back to the KSDS
     - Validate the provider: check contract status, date range, and max allowed amount
     - Write results to the output file

  4. Test data: Create 200 test claims — 150 for providers 1-500, 50 for providers 501-600 (testing the fallback)
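
The direct-mapped lookup with fallback can be sketched in Python (data structures stand in for the VSAM files; field names are illustrative). Because IDs PRV00001 through PRV00500 are dense, the numeric portion maps straight to the RRN and no hash function is needed:

```python
MASTER = {f"PRV{n:05d}": {"npi": 1000000000 + n, "status": "ACTIVE"}
          for n in range(1, 601)}            # KSDS stand-in (includes 501-600)

CACHE_SLOTS = 500
cache = [None] * CACHE_SLOTS
for n in range(1, 501):                      # cache builder: direct RRN = n
    cache[n - 1] = MASTER[f"PRV{n:05d}"]

def lookup_provider(prv_id):
    """RRDS first (one direct read), KSDS fallback; returns (record, source)."""
    num = int(prv_id[3:])
    if 1 <= num <= CACHE_SLOTS and cache[num - 1] is not None:
        return cache[num - 1], "RRDS"
    return MASTER.get(prv_id), "KSDS"        # fallback for IDs > 500 or invalid

print(lookup_provider("PRV00007"))           # served from the RRDS cache
print(lookup_provider("PRV00555"))           # falls back to the KSDS master
```

The range check before the cache read is the part most often forgotten: without it, a claim for PRV00555 would attempt an out-of-range RRN instead of falling through to the master.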


Exercise 13.7: Relative File Reorganization

Difficulty: Intermediate

Write a reorganization utility for a relative file.

Scenario: A relative file has been in use for months. Many records have been deleted, leaving it 40% sparse. Write a program that:

  1. Sequentially reads the old file, capturing all occupied records
  2. Computes a new hash for each record based on its business key
  3. Writes records to a new relative file with optimal distribution
  4. Compares the old and new load factors
  5. Verifies all records were transferred by counting

Extra credit: After reorganization, run 100 random lookups on both old and new files. Compare the average probe counts.
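
The reorganization pass can be sketched in Python (the old file is simulated with records at arbitrary slots, a simplification of real post-deletion probe chains; the new file size, the prime 1597, and the key layout are illustrative assumptions, sizing the new file for roughly a 75% load factor):

```python
import random

OLD_SLOTS, NEW_SLOTS = 2000, 1600
DIVISOR = 1597          # prime near the new table size

def probed_insert(table, key, record):
    """Hash the business key, then linear-probe to the first empty slot."""
    rrn = int(key) % DIVISOR + 1
    probes = 0
    while table[(rrn - 1 + probes) % len(table)] is not None:
        probes += 1
    table[(rrn - 1 + probes) % len(table)] = (key, record)
    return probes

# Simulate the aged old file: 1,200 surviving records scattered in 2,000 slots.
random.seed(13)
keys = random.sample(range(1, 100000), 1200)
old = [None] * OLD_SLOTS
for key, slot in zip(keys, random.sample(range(OLD_SLOTS), 1200)):
    old[slot] = (f"{key:06d}", {"data": key})

# Reorganize: one sequential pass over the old file, rehashing into the new one.
new = [None] * NEW_SLOTS
transferred = 0
for record in old:
    if record is not None:
        probed_insert(new, record[0], record[1])
        transferred += 1

print(f"transferred {transferred} records")
print(f"old load factor {transferred / OLD_SLOTS:.0%}, "
      f"new load factor {transferred / NEW_SLOTS:.0%}")
```

Counting `transferred` against the occupied count of the new file is the verification step 5 asks for; returning the probe count from `probed_insert` gives you the raw data for the extra-credit comparison.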