Exercises — Chapter 13: Relative Files and RRDS

Exercise 13.1: Basic Relative File Operations

Difficulty: Beginner

Create two programs that demonstrate basic relative file CRUD operations.

Program 1 — REL-LOAD.cbl (Loader):

  1. Create a relative file with at least 50 slots
  2. Write records to specific slot numbers (e.g., slots 1, 3, 5, 7, ..., 49)
  3. Leave even-numbered slots empty to demonstrate sparse files
  4. Each record should contain: a product code (PIC X(10)), product name (PIC X(30)), price (PIC 9(05)V99), and quantity on hand (PIC 9(05))
  5. Display each successful write with its slot number

Program 2 — REL-ACCESS.cbl (Accessor):

  1. Open the file from Program 1 for I-O
  2. Read a specific occupied slot — display the record
  3. Read an empty slot — display a "slot empty" message
  4. Update one record (REWRITE with a new price)
  5. Delete one record
  6. Read sequentially — display all remaining records with their slot numbers
  7. Display the count of occupied vs. empty slots
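
The slot arithmetic behind both programs can be sketched in Python (the exercise itself calls for COBOL; the `RelativeFile` class and the sample record fields here are illustrative stand-ins, modeling a relative file as a fixed array of slots where RRN = index + 1):

```python
SLOTS = 50

class RelativeFile:
    def __init__(self, slots):
        self.slots = [None] * slots          # None = empty slot

    def write(self, rrn, record):            # WRITE to a specific RRN
        self.slots[rrn - 1] = record

    def read(self, rrn):                     # READ; None means "slot empty"
        return self.slots[rrn - 1]

    def rewrite(self, rrn, record):          # REWRITE an occupied slot
        assert self.slots[rrn - 1] is not None, "cannot REWRITE an empty slot"
        self.slots[rrn - 1] = record

    def delete(self, rrn):                   # DELETE frees the slot
        self.slots[rrn - 1] = None

    def occupied(self):
        return sum(1 for s in self.slots if s is not None)

f = RelativeFile(SLOTS)
for rrn in range(1, SLOTS, 2):               # load odd slots 1, 3, ..., 49
    f.write(rrn, {"code": f"PRD{rrn:07d}", "price": 9.99, "qty": 10})

f.rewrite(3, {"code": "PRD0000003", "price": 12.49, "qty": 10})  # price update
f.delete(5)
print(f.read(2))                             # empty slot -> None
print(f.occupied(), SLOTS - f.occupied())    # occupied vs. empty counts
```

The key point the sketch captures: a relative file addresses records purely by position, so reading an empty slot is a normal, testable outcome rather than an error.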


Exercise 13.2: Division/Remainder Hash Implementation

Difficulty: Intermediate

Build a hash-based lookup table for a student directory.

Requirements:

  1. Student records: Student ID (PIC X(08), format "STU00001"), Name (PIC X(30)), Major (PIC X(20)), GPA (PIC 9V99)

  2. Relative file with 1,300 slots (capacity for ~1,000 students at 77% load factor)

  3. Implement the division/remainder hash function:
     - Extract the numeric portion of the student ID
     - Divide by the prime 1297
     - Use the remainder + 1 as the RRN

  4. Implement linear probing for collision handling with a maximum of 20 probes

  5. Write a loader program that:
     - Loads 800 student records
     - Tracks total collisions and the maximum probe chain length
     - Reports load statistics at the end

  6. Write a lookup program that:
     - Accepts a student ID
     - Computes the hash
     - Probes until the record is found or an empty slot is reached
     - Displays the record or "not found"
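
The hash-and-probe logic behind the loader and lookup programs can be sketched as follows (Python for brevity; your COBOL version would use DIVIDE ... REMAINDER. The table size and divisor come from the exercise; function names are illustrative):

```python
TABLE_SLOTS = 1300
DIVISOR = 1297          # largest prime <= the table size
MAX_PROBES = 20

def home_rrn(student_id):
    """Division/remainder hash: numeric part of 'STU00001' -> RRN 1..1297."""
    return int(student_id[3:]) % DIVISOR + 1

def insert(table, student_id, record):
    """Linear probing: try the home slot, then successive slots, wrapping."""
    rrn = home_rrn(student_id)
    for probe in range(MAX_PROBES):
        slot = (rrn - 1 + probe) % TABLE_SLOTS
        if table[slot] is None:
            table[slot] = (student_id, record)
            return probe                      # collisions for this key
    raise RuntimeError("probe chain exceeded MAX_PROBES")

def lookup(table, student_id):
    rrn = home_rrn(student_id)
    for probe in range(MAX_PROBES):
        slot = (rrn - 1 + probe) % TABLE_SLOTS
        if table[slot] is None:
            return None                       # empty slot ends the search
        if table[slot][0] == student_id:
            return table[slot][1]
    return None

table = [None] * TABLE_SLOTS
for n in range(1, 801):                       # load 800 students
    insert(table, f"STU{n:05d}", {"name": f"Student {n}"})
print(lookup(table, "STU00042"))
print(lookup(table, "STU00999"))              # not loaded -> None
```

Note that an empty slot terminates an unsuccessful search immediately; that property is exactly what a deleted-record strategy must preserve (or work around) in later exercises.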


Exercise 13.3: Hash Function Comparison

Difficulty: Intermediate-Advanced

Compare the effectiveness of three hash functions on the same data set.

Requirements:

  1. Generate (or hardcode) 500 key values with various patterns:
     - 200 sequential keys (000001-000200)
     - 150 clustered keys (keys that share common prefixes)
     - 150 random keys

  2. For each hash function below, compute the hash for all 500 keys into a table of 750 slots:
     - Division/remainder (divide by the prime 743)
     - Folding (split the key into 2-digit groups, sum them, mod 750)
     - Mid-square (square the key, extract the middle digits, mod 750)

  3. For each function, track:
     - Number of collisions
     - Average probe length (using linear probing)
     - Maximum probe chain
     - Distribution histogram (how many slots have 0, 1, 2, or 3+ records mapped to them)

  4. Display a comparison table and recommend which function works best for this data
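
A sketch of the three hash functions and the distribution counts, in Python (the key patterns and table parameters follow the exercise; the mid-square digit slice varies by textbook, so the middle-4-digits choice here is one reasonable assumption, as is the multiplier used to fake "random" keys):

```python
from collections import Counter

SLOTS = 750

def h_division(key):                 # divide by the prime 743
    return int(key) % 743

def h_folding(key):                  # 2-digit groups, summed, mod 750
    groups = [int(key[i:i + 2]) for i in range(0, len(key), 2)]
    return sum(groups) % SLOTS

def h_midsquare(key):                # square, take the middle digits, mod 750
    sq = str(int(key) ** 2).zfill(12)
    mid = len(sq) // 2
    return int(sq[mid - 2:mid + 2]) % SLOTS   # middle 4 digits

keys = [f"{n:06d}" for n in range(1, 201)]            # 200 sequential keys
keys += [f"9900{n:02d}" for n in range(100)] + \
        [f"7700{n:02d}" for n in range(50)]           # 150 clustered keys
keys += [f"{(n * 104729) % 1000000:06d}" for n in range(150)]  # 150 "random"

for name, h in [("division", h_division), ("folding", h_folding),
                ("mid-square", h_midsquare)]:
    hits = Counter(h(k) for k in keys)
    collisions = sum(c - 1 for c in hits.values())
    histogram = Counter(min(c, 3) for c in hits.values())
    print(f"{name:10s} collisions={collisions:3d} "
          f"slots with 1/2/3+ keys: {histogram[1]}/{histogram[2]}/{histogram[3]}")
```

Running this makes the pattern sensitivity visible: folding collapses many keys that differ only in digit order, which is precisely why the comparison table in step 4 is worth building.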


Exercise 13.4: Cache-Ahead Pattern

Difficulty: Advanced

Implement the GlobalBank quick lookup pattern from Section 13.6.

Requirements:

  1. Create a KSDS "master" file with 5,000 records (account number, name, balance, status)
  2. Create a sequential "hot accounts" extract with the top 1,000 accounts
  3. Build an RRDS cache (1,500 slots) from the hot accounts extract using a hash function
  4. Write a transaction processing program that:
     - Reads transaction requests from a sequential file
     - For each transaction, first checks the RRDS cache
     - If the account is not found in the cache, reads the KSDS master
     - Tracks cache hits vs. misses
     - Reports the hit ratio at the end of processing

  5. Generate 2,000 test transactions where 70% hit cached accounts and 30% hit non-cached accounts. Verify that your hit ratio matches expectations.
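
The dual-lookup flow can be sketched in Python, with in-memory structures standing in for the VSAM files (the slot count comes from the exercise; the prime 1499, the hash, and all names are illustrative assumptions):

```python
CACHE_SLOTS = 1500

def cache_rrn(acct):                       # hash the account number into the cache
    return int(acct) % 1499 + 1            # 1499: a prime near the slot count

master = {f"{n:08d}": {"balance": n * 10} for n in range(1, 5001)}  # KSDS stand-in
hot = [f"{n:08d}" for n in range(1, 1001)]                          # hot extract

cache = [None] * CACHE_SLOTS
for acct in hot:                           # build the RRDS cache (linear probing)
    rrn = cache_rrn(acct)
    while cache[rrn - 1] is not None:
        rrn = rrn % CACHE_SLOTS + 1
    cache[rrn - 1] = (acct, master[acct])

def lookup(acct):
    """Cache first (one direct read), then the KSDS master; returns (record, hit)."""
    rrn = cache_rrn(acct)
    while cache[rrn - 1] is not None:
        if cache[rrn - 1][0] == acct:
            return cache[rrn - 1][1], True
        rrn = rrn % CACHE_SLOTS + 1
    return master.get(acct), False         # cache miss: read the master

hits = misses = 0
for n in range(1, 2001):                   # 70% of requests target hot accounts
    acct = f"{n % 1000 + 1:08d}" if n % 10 < 7 else f"{1000 + n % 4000 + 1:08d}"
    _, hit = lookup(acct)
    hits, misses = hits + hit, misses + (not hit)
print(f"hit ratio: {hits / (hits + misses):.0%}")
```

With this request mix the hit ratio lands at exactly 70%, which is the verification target step 5 asks for.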


Exercise 13.5: Sparse File Analysis

Difficulty: Intermediate

Explore the effects of sparsity on relative files.

  1. Create a relative file with 10,000 slots
  2. Write records at the following densities:
     - Phase 1: Load 1,000 records (10% density) into random slots
     - Phase 2: Add 3,000 more records (40% density)
     - Phase 3: Add 3,000 more records (70% density)

  3. After each phase, sequentially read all records and time the operation (ACCEPT FROM TIME before and after)

  4. Report for each phase: - Number of records - Density percentage - Time to sequentially read all records

  5. Discussion: How does density affect sequential read performance? At what density does a sequential scan become impractical?
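
The experiment can be sketched in Python before you build the COBOL version (an in-memory list stands in for the relative file, and `time.perf_counter` stands in for ACCEPT FROM TIME; the timings are illustrative only). The sketch makes the core point concrete: a sequential READ must still visit every slot, so scan time tracks file size, not record count:

```python
import random
import time

SLOTS = 10_000
slots = [None] * SLOTS

def load(count):
    """Scatter `count` new records into random empty slots."""
    empties = [i for i in range(SLOTS) if slots[i] is None]
    for i in random.sample(empties, count):
        slots[i] = f"REC{i + 1:05d}"

def sequential_read():
    """Visit every slot, skipping empties, as a sequential READ loop does."""
    t0 = time.perf_counter()
    found = [r for r in slots if r is not None]
    return len(found), time.perf_counter() - t0

for phase, add in enumerate([1000, 3000, 3000], start=1):
    load(add)
    count, secs = sequential_read()
    print(f"phase {phase}: {count} records, "
          f"{count / SLOTS:.0%} density, {secs * 1000:.2f} ms")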


Exercise 13.6: MedClaim Provider Cache

Difficulty: Advanced (Project)

Build a complete provider cache system modeled on Section 13.7.

  1. Provider KSDS master (PROVIDER-KSDS.cbl):
     - Load 500 provider records (Provider ID PRV00001 through PRV00500)
     - Include: Provider ID, NPI, Name, Specialty, Contract Status, Max Allowed Amount

  2. Provider RRDS cache builder (PROVIDER-CACHE.cbl):
     - Read the KSDS master sequentially
     - Map the PRV-ID numeric portion directly to the RRN (no hashing needed)
     - Load all 500 records into the RRDS

  3. Claims processor with dual lookup (CLAIMS-PROC.cbl):
     - Read claims from sequential input; each claim has a provider ID
     - Look up the provider in the RRDS first (1 I/O)
     - If not found (provider ID > 500 or invalid), fall back to the KSDS
     - Validate the provider: check contract status, date range, and max allowed amount
     - Write results to the output file

  4. Test data: Create 200 test claims — 150 for providers 1-500, 50 for providers 501-600 (testing the fallback)
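
The direct-mapped lookup with fallback can be sketched in Python (data structures stand in for the VSAM files; field names are illustrative). Because IDs PRV00001 through PRV00500 are dense, the numeric portion maps straight to the RRN and no hash function is needed:

```python
MASTER = {f"PRV{n:05d}": {"npi": 1000000000 + n, "status": "ACTIVE"}
          for n in range(1, 601)}            # KSDS stand-in (includes 501-600)

CACHE_SLOTS = 500
cache = [None] * CACHE_SLOTS
for n in range(1, 501):                      # cache builder: direct RRN = n
    cache[n - 1] = MASTER[f"PRV{n:05d}"]

def lookup_provider(prv_id):
    """RRDS first (one direct read), KSDS fallback; returns (record, source)."""
    num = int(prv_id[3:])
    if 1 <= num <= CACHE_SLOTS and cache[num - 1] is not None:
        return cache[num - 1], "RRDS"
    return MASTER.get(prv_id), "KSDS"        # fallback for IDs > 500 or invalid

print(lookup_provider("PRV00007"))           # served from the RRDS cache
print(lookup_provider("PRV00555"))           # falls back to the KSDS master
```

The range check before the cache read is the part most often forgotten: without it, a claim for PRV00555 would attempt an out-of-range RRN instead of falling through to the master.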


Exercise 13.7: Relative File Reorganization

Difficulty: Intermediate

Write a reorganization utility for a relative file.

Scenario: A relative file has been in use for months. Many records have been deleted, leaving it 40% sparse. Write a program that:

  1. Sequentially reads the old file, capturing all occupied records
  2. Computes a new hash for each record based on its business key
  3. Writes records to a new relative file with optimal distribution
  4. Compares the old and new load factors
  5. Verifies all records were transferred by counting

Extra credit: After reorganization, run 100 random lookups on both old and new files. Compare the average probe counts.
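
The reorganization pass can be sketched in Python (the old file is simulated with records at arbitrary slots, a simplification of real post-deletion probe chains; the new file size, the prime 1597, and the key layout are illustrative assumptions, sizing the new file for roughly a 75% load factor):

```python
import random

OLD_SLOTS, NEW_SLOTS = 2000, 1600
DIVISOR = 1597          # prime near the new table size

def probed_insert(table, key, record):
    """Hash the business key, then linear-probe to the first empty slot."""
    rrn = int(key) % DIVISOR + 1
    probes = 0
    while table[(rrn - 1 + probes) % len(table)] is not None:
        probes += 1
    table[(rrn - 1 + probes) % len(table)] = (key, record)
    return probes

# Simulate the aged old file: 1,200 surviving records scattered in 2,000 slots.
random.seed(13)
keys = random.sample(range(1, 100000), 1200)
old = [None] * OLD_SLOTS
for key, slot in zip(keys, random.sample(range(OLD_SLOTS), 1200)):
    old[slot] = (f"{key:06d}", {"data": key})

# Reorganize: one sequential pass over the old file, rehashing into the new one.
new = [None] * NEW_SLOTS
transferred = 0
for record in old:
    if record is not None:
        probed_insert(new, record[0], record[1])
        transferred += 1

print(f"transferred {transferred} records")
print(f"old load factor {transferred / OLD_SLOTS:.0%}, "
      f"new load factor {transferred / NEW_SLOTS:.0%}")
```

Counting `transferred` against the occupied count of the new file is the verification step 5 asks for; returning the probe count from `probed_insert` gives you the raw data for the extra-credit comparison.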