Quiz — Chapter 14: Advanced File Techniques
Multiple Choice
1. What is the critical prerequisite for the balanced-line algorithm?
a) Both files must be the same length b) Both files must have the same record layout c) Both files must be sorted on the same key d) Both files must be VSAM KSDS
2. In the balanced-line algorithm, what does HIGH-VALUES represent?
a) The largest valid account number b) A sentinel value indicating end-of-file c) The highest balance in the file d) An error condition
3. Why does the balanced-line algorithm write to a new master file instead of updating the old one in place?
a) COBOL does not support in-place updates b) It is faster than random updates c) It provides recovery, audit trail, and simplicity d) The compiler requires it
4. In control break processing, when a Region changes, which breaks must be triggered?
a) Region break only b) Region break and Grand total c) Branch break, then Region break d) Branch break only
5. What is the "grandfather-father-son" backup scheme?
a) Three generations of backup tapes for recovery b) A three-level control break pattern c) Three levels of VSAM index d) Three copies of the program source
6. In a file comparison program, what category describes a record found in File A but not in File B?
a) Match b) Mismatch c) A-only d) Orphan
7. What is the primary purpose of checkpoint/restart in batch processing?
a) To speed up processing b) To resume from the last checkpoint after a failure c) To save disk space d) To validate data integrity
8. In the balanced-line algorithm, when Master Key > Transaction Key, what does this indicate?
a) The transaction has no matching master record b) The master has no matching transaction c) Both files are at end-of-file d) A sort error has occurred
9. How frequently should checkpoints typically be written?
a) Every record b) Every 5,000-50,000 records or every 5-15 minutes c) Once at the end of the program d) Only when an error occurs
10. Which of the following is NOT a benefit of the old-master/new-master pattern?
a) Easy recovery if the job fails b) Audit trail between old and new c) Reduced disk space usage d) Clean sequential writing
True or False
11. The balanced-line algorithm can only handle two input files.
12. HIGH-VALUES is the highest possible collating value in both EBCDIC and ASCII.
13. In control break processing, data must be sorted by the break fields.
14. A checkpoint must save ALL program state that affects the final output.
15. In a multi-file merge, all input files must have the same record layout.
16. The balanced-line algorithm reads each input file exactly once.
17. James Okafor's rule states that it is acceptable to silently skip records that cannot be processed.
18. Control break processing can be implemented on unsorted data by using a hash table.
19. A three-file merge requires comparing three keys to find the lowest.
20. The REWRITE statement can be used to update a sequential master file in the balanced-line algorithm.
Short Answer
21. Explain why the HIGH-VALUES sentinel technique works. What would happen if you used SPACES instead of HIGH-VALUES?
22. A balanced-line update processes 1,000,000 master records and 150,000 transactions. The resulting new master has 1,005,000 records. Show the reconciliation arithmetic that validates this result (assuming some transactions are adds and some are deletes).
23. Describe the difference between the balanced-line algorithm's handling of a "matched" record and a "transaction-only" record. What business scenarios does each represent?
24. In multi-level control break processing, explain why a higher-level break must trigger all lower-level breaks. Give an example of what goes wrong if you skip a level.
25. Your checkpoint/restart program saves the last key processed but forgets to save the running total of interest calculated. What happens when the program restarts?
Code Analysis
26. Find the bug in this balanced-line loop:
    PERFORM UNTIL WS-BOTH-EOF
        IF MASTER-KEY < TXN-KEY
            PERFORM PROCESS-MASTER-ONLY
            PERFORM READ-MASTER
        END-IF
        IF MASTER-KEY = TXN-KEY
            PERFORM PROCESS-MATCH
            PERFORM READ-MASTER
            PERFORM READ-TXN
        END-IF
        IF MASTER-KEY > TXN-KEY
            PERFORM PROCESS-TXN-ONLY
            PERFORM READ-TXN
        END-IF
    END-PERFORM
27. This control break has a logic error. What is it?
    2000-PROCESS-RECORDS.
        IF AR-BRANCH-CODE NOT = WS-PREV-BRANCH
            PERFORM 2500-BRANCH-BREAK
        END-IF
        ADD 1 TO WS-BRANCH-COUNT
        ADD AR-BALANCE TO WS-BRANCH-TOTAL
        READ ACCOUNT-FILE
            AT END
                SET WS-EOF TO TRUE
        END-READ.
28. What happens if the transaction file is not sorted correctly in the balanced-line algorithm? Does the program detect this error?
Answer Key
1. c: Both files must be sorted on the same key.
2. b: HIGH-VALUES serves as a sentinel value indicating end-of-file.
3. c: Recovery (old master intact), audit trail, and simpler sequential writing.
4. c: Branch break first, then Region break. Higher-level breaks trigger all lower levels.
5. a: Three generations of master file backups for recovery.
6. c: A-only describes a record in File A but not File B.
7. b: To resume from the last checkpoint instead of restarting from scratch.
8. a: The transaction has no matching master record (possible new record or error).
9. b: Every 5,000-50,000 records or 5-15 minutes is the common guideline.
10. c: The old-master/new-master pattern actually uses MORE disk space (two copies), not less.
11. False: The concept extends to three or more files (multi-way merge).
12. True: HIGH-VALUES is the highest value in the program's collating sequence. Under the native EBCDIC and ASCII sequences that is X'FF' in every byte.
13. True: Control break processing requires sorted data; unsorted data produces incorrect subtotals.
14. True: All accumulated state (counters, totals, flags) must be saved for correct restart.
15. False: Input files can have different layouts as long as they share a common key field for comparison.
16. True: Each file is read sequentially from beginning to end, exactly once.
17. False: James Okafor's rule states that every record must exit through a defined path (updated output, exception file, or report) and must never be silently skipped.
18. False: Control break processing requires sorted data. A hash table is a different approach (grouping/aggregation, not sequential break processing).
19. True: All three keys must be compared to determine which file provides the next output record.
20. False: The balanced-line algorithm writes every record to a NEW master file with WRITE. REWRITE replaces a record in place in a file opened I-O. On a sequential file it can only replace the record just read and cannot insert or delete records, and it is exactly the in-place update that the old-master/new-master pattern avoids.
21. HIGH-VALUES works because it is guaranteed to be greater than any real key value (X'FF' repeated). When one file reaches EOF, its key becomes HIGH-VALUES, so all comparisons favor processing from the other file. SPACES would not work because most real key values are greater than spaces: the exhausted file's sentinel would compare LOWER than the keys still waiting in the other file, so the algorithm would keep trying to process the exhausted file and would never reach the remaining records.
22. New master = Old master + Adds - Physical deletes. So: 1,005,000 = 1,000,000 + Adds - Deletes, which means Adds - Deletes must equal 5,000. One split that satisfies this is 6,000 adds and 1,000 physical deletes: 1,000,000 + 6,000 - 1,000 = 1,005,000. The remaining 143,000 transactions (150,000 - 6,000 adds - 1,000 deletes) would be updates (deposits, withdrawals, etc.) to existing records, or rejected transactions written to the exception file.
23. A "matched" record means a transaction applies to an existing master, such as a deposit to an existing account or a GPA update for an existing student. The master is modified and written to the new master. A "transaction-only" record means a transaction has no matching master. If it is an "Add" type, a new master record is created. For any other type, it is an error (exception) because you cannot deposit to, withdraw from, or close an account that does not exist.
24. If Region changes, the current branch subtotals must be printed first (branch break), then the current region subtotals (region break). If you skip the branch break, the last branch's records are added to the next region's branch subtotals instead of being closed out. Example: Branch BR001 has $50K in Region East. Region changes to West. Without triggering the branch break, that $50K rolls into the first branch of Region West, making East's total $50K too low and West's first branch $50K too high.
25. The program correctly positions to the restart key and processes the remaining records, but the running total starts at zero instead of the value accumulated before the checkpoint. The final total will include interest only from records processed after the checkpoint, not from the full file. For example, if the job failed after 3 million of 5 million records, the reported total would cover only the last 2 million records, roughly 40% of the correct value if interest is spread evenly across the file.
26. The three IF statements are independent, not mutually exclusive. After the first IF advances the master (READ-MASTER), the second IF compares the NEW master key with the original transaction key, so a single pass through the loop can process two or three records under the wrong conditions. Use EVALUATE TRUE with WHEN clauses (or IF/ELSE IF) to make the comparisons mutually exclusive. Also, in the MATCH case, it reads a new master after applying only one transaction, but there may be multiple transactions for the same master key.
27. The paragraph reads the next record AFTER processing the current one, and the branch-break test runs only when another record arrives. When EOF is reached, the loop ends without a break, so the last branch's subtotals are never printed. A FINAL-BREAK paragraph performed after the main loop ends is needed to print the last group's subtotals and the grand totals.
28. The balanced-line algorithm does NOT detect unsorted input. It will produce incorrect results silently: some transactions will appear to have no matching master (written to exception file), and some master records will appear to have no transactions (copied unchanged). The output will be wrong but will look structurally correct. This is why production programs should verify sort order by comparing each key to the previous key and aborting if a sequence error is detected.
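The fixes described in the answers to questions 26 and 28 can be sketched together in one small program. This is a hypothetical illustration, not the chapter's own code: the program name BALLINE, the WS- tables, and the 8000/8100 paragraph names are all assumptions, and in-memory tables stand in for the sorted master and transaction files so the sketch needs no external data sets. It shows a mutually exclusive EVALUATE TRUE comparison, a match branch that reads only the next transaction (so several transactions can apply to one master), and a sequence check on the transaction key.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BALLINE.
      * Sketch only: in-memory tables stand in for the sorted
      * master and transaction files. All names are hypothetical.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-MASTER-DATA      PIC X(12) VALUE "100200300400".
       01  WS-MASTER-TABLE REDEFINES WS-MASTER-DATA.
           05  WS-MST-ENTRY    PIC X(3) OCCURS 4 TIMES.
       01  WS-TXN-DATA         PIC X(15) VALUE "100200200250400".
       01  WS-TXN-TABLE REDEFINES WS-TXN-DATA.
           05  WS-TXN-ENTRY    PIC X(3) OCCURS 5 TIMES.
       01  WS-MST-IDX          PIC 9(2) VALUE 0.
       01  WS-TXN-IDX          PIC 9(2) VALUE 0.
       01  MASTER-KEY          PIC X(3).
       01  TXN-KEY             PIC X(3).
       01  WS-PREV-TXN-KEY     PIC X(3) VALUE LOW-VALUES.
       PROCEDURE DIVISION.
       0000-MAIN.
           PERFORM 8000-READ-MASTER
           PERFORM 8100-READ-TXN
      * Both keys equal HIGH-VALUES only when both inputs are done.
           PERFORM UNTIL MASTER-KEY = HIGH-VALUES
                     AND TXN-KEY = HIGH-VALUES
      * Exactly one WHEN branch runs per pass through the loop.
               EVALUATE TRUE
                   WHEN MASTER-KEY < TXN-KEY
      * No more transactions for this master: write it out.
                       DISPLAY "WRITE MASTER " MASTER-KEY
                       PERFORM 8000-READ-MASTER
                   WHEN MASTER-KEY = TXN-KEY
      * Keep the master; the next transaction may match it too.
                       DISPLAY "APPLY TXN TO " MASTER-KEY
                       PERFORM 8100-READ-TXN
                   WHEN OTHER
      * Transaction has no master: an add or an exception.
                       DISPLAY "TXN-ONLY     " TXN-KEY
                       PERFORM 8100-READ-TXN
               END-EVALUATE
           END-PERFORM
           STOP RUN.
       8000-READ-MASTER.
           ADD 1 TO WS-MST-IDX
           IF WS-MST-IDX > 4
               MOVE HIGH-VALUES TO MASTER-KEY
           ELSE
               MOVE WS-MST-ENTRY (WS-MST-IDX) TO MASTER-KEY
           END-IF.
       8100-READ-TXN.
           ADD 1 TO WS-TXN-IDX
           IF WS-TXN-IDX > 5
               MOVE HIGH-VALUES TO TXN-KEY
           ELSE
               MOVE WS-TXN-ENTRY (WS-TXN-IDX) TO TXN-KEY
      * Equal keys are allowed; a lower key is a sort error.
               IF TXN-KEY < WS-PREV-TXN-KEY
                   DISPLAY "SEQUENCE ERROR AT TXN " TXN-KEY
                   MOVE 16 TO RETURN-CODE
                   STOP RUN
               END-IF
               MOVE TXN-KEY TO WS-PREV-TXN-KEY
           END-IF.
```

With the sample keys, both transactions for key 200 are applied before master 200 is written, and key 250 is reported as transaction-only. Rearranging the transaction data so that a key is lower than its predecessor stops the run with a sequence error instead of producing a silently wrong new master. A real program would apply the same check to the master file (rejecting equal keys as well) and would track a delete flag so that deleted masters are not written.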