Case Study 1: Control Break Reporting at State Revenue Department
Background
The State Revenue Department of a mid-Atlantic state processes property tax payments from 5 regions, each containing 3-8 districts, with each district having 2-12 local tax offices. Every month, the department generates a comprehensive report summarizing tax collections at all three levels: by office, by district, and by region, with a statewide grand total.
The original report program was written in 1987 and has been modified dozens of times over the decades. The most recent project, led by senior COBOL developer Patricia Chen, involved a complete rewrite to replace the aging GO TO-based code with clean, structured PERFORM logic.
This case study examines the multi-level control break report implementation, focusing on how nested PERFORM loops and accumulator patterns work together to produce a complex hierarchical report.
The Business Requirements
The monthly tax collection report must show:
- Detail lines for each office showing office ID, office name, and monthly collection amount.
- Office subtotals at each office change (though typically there is only one line per office, making this a simple accumulation).
- District subtotals whenever the district code changes, showing the total collections for all offices in that district.
- Region subtotals whenever the region code changes, showing the total collections for all districts in that region.
- Grand total at the end of the report showing statewide collections.
- Variance analysis comparing current month to the same month last year.
The input data is a sequential file sorted by region, then by district within region, then by office within district.
The Data Structures
      *================================================================*
      * Input record layout                                            *
      *================================================================*
       01  TAX-COLLECTION-RECORD.
           05  TCR-REGION-CODE         PIC X(2).
           05  TCR-REGION-NAME         PIC X(20).
           05  TCR-DISTRICT-CODE       PIC X(3).
           05  TCR-DISTRICT-NAME       PIC X(25).
           05  TCR-OFFICE-CODE         PIC X(4).
           05  TCR-OFFICE-NAME         PIC X(30).
           05  TCR-MONTH-AMOUNT        PIC S9(9)V99.
           05  TCR-PRIOR-YEAR-AMT      PIC S9(9)V99.
           05  TCR-YTD-AMOUNT          PIC S9(11)V99.
      *================================================================*
      * Control break tracking variables                               *
      *================================================================*
       01  WS-PREVIOUS-KEYS.
           05  WS-PREV-REGION          PIC X(2)  VALUE HIGH-VALUES.
           05  WS-PREV-DISTRICT        PIC X(3)  VALUE HIGH-VALUES.
           05  WS-PREV-OFFICE          PIC X(4)  VALUE HIGH-VALUES.
       01  WS-SAVE-NAMES.
           05  WS-SAVE-REGION-NAME     PIC X(20).
           05  WS-SAVE-DIST-NAME       PIC X(25).
      *================================================================*
      * Accumulators (one set for each break level)                    *
      *================================================================*
       01  WS-DISTRICT-ACCUM.
           05  WS-DIST-CURRENT         PIC S9(11)V99 VALUE ZEROS.
           05  WS-DIST-PRIOR-YR        PIC S9(11)V99 VALUE ZEROS.
           05  WS-DIST-OFFICE-COUNT    PIC 9(3)      VALUE ZEROS.
       01  WS-REGION-ACCUM.
           05  WS-REG-CURRENT          PIC S9(12)V99 VALUE ZEROS.
           05  WS-REG-PRIOR-YR         PIC S9(12)V99 VALUE ZEROS.
           05  WS-REG-DIST-COUNT       PIC 9(3)      VALUE ZEROS.
           05  WS-REG-OFFICE-COUNT     PIC 9(4)      VALUE ZEROS.
       01  WS-GRAND-ACCUM.
           05  WS-GRAND-CURRENT        PIC S9(13)V99 VALUE ZEROS.
           05  WS-GRAND-PRIOR-YR       PIC S9(13)V99 VALUE ZEROS.
           05  WS-GRAND-REG-COUNT      PIC 9(2)      VALUE ZEROS.
           05  WS-GRAND-DIST-COUNT     PIC 9(3)      VALUE ZEROS.
           05  WS-GRAND-OFFICE-COUNT   PIC 9(4)      VALUE ZEROS.
      *================================================================*
      * Variance calculation fields                                    *
      *================================================================*
       01  WS-VARIANCE-AMOUNT          PIC S9(12)V99 VALUE ZEROS.
       01  WS-VARIANCE-PERCENT         PIC S9(3)V99  VALUE ZEROS.
The PERFORM Structure
The heart of the program is the control break processing loop. Here is the paragraph hierarchy Patricia designed:
MAIN-PROCESS
├── PERFORM INITIALIZATION
│   ├── PERFORM OPEN-FILES
│   ├── PERFORM INITIALIZE-ACCUMULATORS
│   └── PERFORM PRINT-REPORT-HEADER
├── PERFORM PROCESS-ALL-RECORDS
│   ├── PERFORM READ-TAX-RECORD (priming read)
│   └── PERFORM UNTIL END-OF-FILE
│       ├── PERFORM CHECK-CONTROL-BREAKS
│       │   ├── (if region changed)
│       │   │   ├── PERFORM DISTRICT-BREAK
│       │   │   ├── PERFORM REGION-BREAK
│       │   │   ├── PERFORM START-NEW-REGION
│       │   │   └── PERFORM START-NEW-DISTRICT
│       │   ├── (else if district changed)
│       │   │   ├── PERFORM DISTRICT-BREAK
│       │   │   └── PERFORM START-NEW-DISTRICT
│       │   └── (else: same district, no break)
│       ├── PERFORM ACCUMULATE-DETAIL
│       ├── PERFORM PRINT-DETAIL-LINE
│       └── PERFORM READ-TAX-RECORD
├── PERFORM FINAL-BREAKS
│   ├── PERFORM DISTRICT-BREAK
│   ├── PERFORM REGION-BREAK
│   └── PERFORM PRINT-GRAND-TOTAL
└── PERFORM WRAP-UP
    ├── PERFORM PRINT-REPORT-FOOTER
    └── PERFORM CLOSE-FILES
The Control Break Logic
The control break detection follows a specific pattern. When the outermost key changes, all inner breaks must fire first (in inner-to-outer order), and then the new group begins:
      *================================================================*
      * PROCESS-ALL-RECORDS: Main processing loop with priming read.   *
      *================================================================*
       PROCESS-ALL-RECORDS.
           PERFORM READ-TAX-RECORD
           PERFORM UNTIL END-OF-FILE
               PERFORM CHECK-CONTROL-BREAKS
               PERFORM ACCUMULATE-DETAIL
               PERFORM PRINT-DETAIL-LINE
               PERFORM READ-TAX-RECORD
           END-PERFORM
           .
      *================================================================*
      * CHECK-CONTROL-BREAKS: Detects changes in control keys.         *
      * CRITICAL: Breaks must fire from innermost to outermost.        *
      * When a higher-level key changes, all lower-level breaks        *
      * must fire first to print their subtotals.                      *
      *================================================================*
       CHECK-CONTROL-BREAKS.
      * Check for region change (outermost break)
           IF TCR-REGION-CODE NOT = WS-PREV-REGION
      * Fire inner breaks first
               IF WS-PREV-REGION NOT = HIGH-VALUES
                   PERFORM DISTRICT-BREAK
                   PERFORM REGION-BREAK
               END-IF
               PERFORM START-NEW-REGION
               PERFORM START-NEW-DISTRICT
           ELSE
      * Check for district change (inner break)
               IF TCR-DISTRICT-CODE NOT = WS-PREV-DISTRICT
                   IF WS-PREV-DISTRICT NOT = HIGH-VALUES
                       PERFORM DISTRICT-BREAK
                   END-IF
                   PERFORM START-NEW-DISTRICT
               END-IF
           END-IF
           .
The Break Paragraphs
Each break paragraph follows the same pattern: print the subtotal, roll the accumulators up to the next level, and reset the current level:
      *================================================================*
      * DISTRICT-BREAK: Print district subtotal and roll up to region. *
      *================================================================*
       DISTRICT-BREAK.
           PERFORM CALCULATE-VARIANCE-DISTRICT
           DISPLAY " District Total: "
               WS-SAVE-DIST-NAME
           DISPLAY " Offices: " WS-DIST-OFFICE-COUNT
           DISPLAY " Current: " WS-DIST-CURRENT
           DISPLAY " Prior Yr: " WS-DIST-PRIOR-YR
           DISPLAY " Variance: " WS-VARIANCE-AMOUNT
               " (" WS-VARIANCE-PERCENT "%)"
           DISPLAY " " WS-REPORT-SEPARATOR
      * Roll district accumulators up to region
           ADD WS-DIST-CURRENT TO WS-REG-CURRENT
           ADD WS-DIST-PRIOR-YR TO WS-REG-PRIOR-YR
           ADD WS-DIST-OFFICE-COUNT TO WS-REG-OFFICE-COUNT
           ADD 1 TO WS-REG-DIST-COUNT
      * Reset district accumulators. INITIALIZE zeroes each numeric
      * field by its own type; a group MOVE ZEROS is a character fill.
           INITIALIZE WS-DISTRICT-ACCUM
           .
      *================================================================*
      * REGION-BREAK: Print region subtotal and roll up to grand.      *
      *================================================================*
       REGION-BREAK.
           PERFORM CALCULATE-VARIANCE-REGION
           DISPLAY " ==============================="
           DISPLAY " Region Total: "
               WS-SAVE-REGION-NAME
           DISPLAY " Districts: " WS-REG-DIST-COUNT
           DISPLAY " Offices: " WS-REG-OFFICE-COUNT
           DISPLAY " Current: " WS-REG-CURRENT
           DISPLAY " Prior Yr: " WS-REG-PRIOR-YR
           DISPLAY " Variance: " WS-VARIANCE-AMOUNT
               " (" WS-VARIANCE-PERCENT "%)"
           DISPLAY " ==============================="
      * Roll region accumulators up to grand
           ADD WS-REG-CURRENT TO WS-GRAND-CURRENT
           ADD WS-REG-PRIOR-YR TO WS-GRAND-PRIOR-YR
           ADD WS-REG-DIST-COUNT TO WS-GRAND-DIST-COUNT
           ADD WS-REG-OFFICE-COUNT TO WS-GRAND-OFFICE-COUNT
           ADD 1 TO WS-GRAND-REG-COUNT
      * Reset region accumulators (INITIALIZE, as in DISTRICT-BREAK)
           INITIALIZE WS-REGION-ACCUM
           .
The Accumulation Pattern
Each detail record contributes to the innermost accumulator (district level). The accumulators cascade upward only during breaks:
      *================================================================*
      * ACCUMULATE-DETAIL: Add current record to district totals.      *
      * Note: We only accumulate at the district level. Region and     *
      * grand totals are built by rolling up during breaks.            *
      *================================================================*
       ACCUMULATE-DETAIL.
           ADD TCR-MONTH-AMOUNT TO WS-DIST-CURRENT
           ADD TCR-PRIOR-YEAR-AMT TO WS-DIST-PRIOR-YR
           ADD 1 TO WS-DIST-OFFICE-COUNT
           .
This "accumulate at the lowest level, roll up during breaks" pattern is the standard COBOL approach. It ensures that:
- Each amount is counted exactly once.
- Subtotals at each level are correct.
- The grand total equals the sum of all region totals, which equals the sum of all district totals.
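The roll-up can be traced end to end in a miniature program. The following is a sketch under stated assumptions, not the department's code: a four-row in-memory table (WS-SAMPLE-TABLE, with invented values) stands in for the sorted input file, the names are trimmed for brevity, and a single amount replaces the three amount fields of the real record.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ROLLUP-DEMO.
      * Hypothetical miniature: an in-memory table stands in for
      * the sorted input file. All sample values are invented.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Each row: region (2), district (3), amount (3).
       01  WS-SAMPLE-VALUES.
           05  FILLER            PIC X(8) VALUE "01001100".
           05  FILLER            PIC X(8) VALUE "01001250".
           05  FILLER            PIC X(8) VALUE "01002300".
           05  FILLER            PIC X(8) VALUE "02001400".
       01  WS-SAMPLE-TABLE REDEFINES WS-SAMPLE-VALUES.
           05  WS-SAMPLE-REC OCCURS 4 TIMES.
               10  WS-S-REGION   PIC X(2).
               10  WS-S-DISTRICT PIC X(3).
               10  WS-S-AMOUNT   PIC 9(3).
       01  WS-IDX                PIC 9(2) VALUE 1.
       01  WS-PREV-REGION        PIC X(2) VALUE HIGH-VALUES.
       01  WS-PREV-DISTRICT      PIC X(3) VALUE HIGH-VALUES.
       01  WS-DIST-TOTAL         PIC 9(5) VALUE ZEROS.
       01  WS-REG-TOTAL          PIC 9(5) VALUE ZEROS.
       01  WS-GRAND-TOTAL        PIC 9(5) VALUE ZEROS.
       PROCEDURE DIVISION.
       MAIN-PROCESS.
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 4
               PERFORM CHECK-CONTROL-BREAKS
      * Detail amounts touch only the district accumulator.
               ADD WS-S-AMOUNT (WS-IDX) TO WS-DIST-TOTAL
           END-PERFORM
      * Final breaks: the last group is still pending here.
           IF WS-PREV-REGION NOT = HIGH-VALUES
               PERFORM DISTRICT-BREAK
               PERFORM REGION-BREAK
           END-IF
           DISPLAY "GRAND TOTAL " WS-GRAND-TOTAL
           STOP RUN.
       CHECK-CONTROL-BREAKS.
           IF WS-S-REGION (WS-IDX) NOT = WS-PREV-REGION
               IF WS-PREV-REGION NOT = HIGH-VALUES
                   PERFORM DISTRICT-BREAK
                   PERFORM REGION-BREAK
               END-IF
               MOVE WS-S-REGION (WS-IDX)   TO WS-PREV-REGION
               MOVE WS-S-DISTRICT (WS-IDX) TO WS-PREV-DISTRICT
           ELSE
               IF WS-S-DISTRICT (WS-IDX) NOT = WS-PREV-DISTRICT
                   PERFORM DISTRICT-BREAK
                   MOVE WS-S-DISTRICT (WS-IDX) TO WS-PREV-DISTRICT
               END-IF
           END-IF.
       DISTRICT-BREAK.
           DISPLAY "  DISTRICT " WS-PREV-DISTRICT " " WS-DIST-TOTAL
           ADD WS-DIST-TOTAL TO WS-REG-TOTAL
           MOVE ZEROS TO WS-DIST-TOTAL.
       REGION-BREAK.
           DISPLAY "REGION " WS-PREV-REGION " " WS-REG-TOTAL
           ADD WS-REG-TOTAL TO WS-GRAND-TOTAL
           MOVE ZEROS TO WS-REG-TOTAL.
```

With the sample rows, the district totals (350, 300, 400) sum to the region totals (650, 400), which in turn sum to the grand total of 1,050. Each amount is added exactly once, in the loop body, and moves upward only inside a break paragraph.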
The Final Breaks
After the main loop ends (end-of-file), the final record's group has not yet been totaled. The final breaks must be fired explicitly:
      *================================================================*
      * FINAL-BREAKS: Fire all remaining breaks after end-of-file.     *
      * This is easy to forget and is a common source of bugs in       *
      * control break programs!                                        *
      *================================================================*
       FINAL-BREAKS.
           IF WS-PREV-REGION NOT = HIGH-VALUES
               PERFORM DISTRICT-BREAK
               PERFORM REGION-BREAK
           END-IF
           PERFORM PRINT-GRAND-TOTAL
           .
Variance Calculation
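The variance calculation shows how one piece of logic is reused across levels. Each level has its own paragraph (CALCULATE-VARIANCE-DISTRICT, CALCULATE-VARIANCE-REGION) that applies the same calculation to that level's accumulators. The district-level version is: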
      *================================================================*
      * CALCULATE-VARIANCE-DISTRICT: Compute district-level variance.  *
      *================================================================*
       CALCULATE-VARIANCE-DISTRICT.
           SUBTRACT WS-DIST-PRIOR-YR FROM WS-DIST-CURRENT
               GIVING WS-VARIANCE-AMOUNT
           IF WS-DIST-PRIOR-YR NOT = ZEROS
               COMPUTE WS-VARIANCE-PERCENT ROUNDED =
                   (WS-VARIANCE-AMOUNT / WS-DIST-PRIOR-YR)
                   * 100
           ELSE
               MOVE ZEROS TO WS-VARIANCE-PERCENT
           END-IF
           .
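An alternative to one variance paragraph per level is a single shared paragraph that works on common input fields, which each break loads before the PERFORM. The sketch below assumes this variation; it is not taken from the rewritten program. WS-VAR-CURRENT, WS-VAR-PRIOR, WS-PERCENT-OUT, and CALCULATE-VARIANCE are hypothetical names, and the amounts are invented. It also multiplies before dividing, a common precaution against losing decimal places in the intermediate result.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VARIANCE-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical input fields; each caller loads them first.
       01  WS-VAR-CURRENT        PIC S9(13)V99 VALUE ZEROS.
       01  WS-VAR-PRIOR          PIC S9(13)V99 VALUE ZEROS.
       01  WS-VARIANCE-AMOUNT    PIC S9(13)V99 VALUE ZEROS.
       01  WS-VARIANCE-PERCENT   PIC S9(3)V99  VALUE ZEROS.
      * Edited field so the sign and decimal point print cleanly.
       01  WS-PERCENT-OUT        PIC -ZZ9.99.
       PROCEDURE DIVISION.
       MAIN-PROCESS.
      * Invented sample: 1,100.00 now against 1,000.00 last year.
           MOVE 1100.00 TO WS-VAR-CURRENT
           MOVE 1000.00 TO WS-VAR-PRIOR
           PERFORM CALCULATE-VARIANCE
           MOVE WS-VARIANCE-PERCENT TO WS-PERCENT-OUT
           DISPLAY "VARIANCE PCT:" WS-PERCENT-OUT
      * No prior-year base: the percentage falls back to zero.
           MOVE 500.00 TO WS-VAR-CURRENT
           MOVE ZEROS  TO WS-VAR-PRIOR
           PERFORM CALCULATE-VARIANCE
           MOVE WS-VARIANCE-PERCENT TO WS-PERCENT-OUT
           DISPLAY "VARIANCE PCT:" WS-PERCENT-OUT
           STOP RUN.
       CALCULATE-VARIANCE.
           SUBTRACT WS-VAR-PRIOR FROM WS-VAR-CURRENT
               GIVING WS-VARIANCE-AMOUNT
           IF WS-VAR-PRIOR NOT = ZEROS
      * Multiply first so the division keeps its precision.
               COMPUTE WS-VARIANCE-PERCENT ROUNDED =
                   (WS-VARIANCE-AMOUNT * 100) / WS-VAR-PRIOR
           ELSE
               MOVE ZEROS TO WS-VARIANCE-PERCENT
           END-IF.
```

In a break paragraph, the caller would move its own accumulators (for example WS-DIST-CURRENT and WS-DIST-PRIOR-YR) into the input fields before the PERFORM. The trade-off is two extra MOVE statements per break in exchange for a single copy of the calculation.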
Lessons Learned
Patricia Chen documented several key lessons from the project:
1. The Final Break Problem
The most common bug in control break programs is forgetting to fire the final breaks after the main loop ends. Patricia's team caught this during testing when the last district's and last region's subtotals were missing from the report. The solution was the explicit FINAL-BREAKS paragraph.
2. HIGH-VALUES Initialization
Initializing the previous-key variables to HIGH-VALUES (rather than spaces or zeros) provides a clean first-record detection mechanism. Since no real key value matches HIGH-VALUES, the first record always registers as a key change. The NOT = HIGH-VALUES guards in CHECK-CONTROL-BREAKS then skip the break paragraphs, so no empty subtotal is printed, while the START-NEW paragraphs still run to initialize the first region and district.
3. Break Ordering
When a higher-level key changes, the inner breaks must fire before the outer break. In a region-district-office hierarchy, when the region changes:
1. First, fire the district break (to total the last district of the old region).
2. Then, fire the region break (to total the old region).
3. Then, start the new region and new district.
Getting this ordering wrong produces incorrect subtotals. Patricia's team created a standard "break template" that enforces the correct ordering.
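The case study does not show the team's template, so the following is only one plausible shape for it. It first reduces the key comparison to a single break level, then fires breaks from the innermost level up to that level, which makes it impossible to run a region break without the district break before it. WS-BREAK-LEVEL, WS-CURR-KEYS, and PROCESS-KEY-CHANGE are hypothetical names, and DISPLAY statements stand in for the real break paragraphs.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BREAK-LEVEL-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-PREV-KEYS.
           05  WS-PREV-REGION    PIC X(2) VALUE "01".
           05  WS-PREV-DISTRICT  PIC X(3) VALUE "001".
       01  WS-CURR-KEYS.
           05  WS-CURR-REGION    PIC X(2).
           05  WS-CURR-DISTRICT  PIC X(3).
      * 0 = no break, 1 = district break, 2 = region break
       01  WS-BREAK-LEVEL        PIC 9    VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PROCESS.
      * Same region, new district: only the district break fires.
           MOVE "01002" TO WS-CURR-KEYS
           PERFORM PROCESS-KEY-CHANGE
      * New region, repeated district code: both breaks still fire.
           MOVE "02002" TO WS-CURR-KEYS
           PERFORM PROCESS-KEY-CHANGE
           STOP RUN.
       PROCESS-KEY-CHANGE.
      * The outermost key is tested first, so it wins.
           EVALUATE TRUE
               WHEN WS-CURR-REGION NOT = WS-PREV-REGION
                   MOVE 2 TO WS-BREAK-LEVEL
               WHEN WS-CURR-DISTRICT NOT = WS-PREV-DISTRICT
                   MOVE 1 TO WS-BREAK-LEVEL
               WHEN OTHER
                   MOVE 0 TO WS-BREAK-LEVEL
           END-EVALUATE
           DISPLAY "KEYS " WS-CURR-KEYS " LEVEL " WS-BREAK-LEVEL
      * Innermost first; a higher level implies every lower one.
           IF WS-BREAK-LEVEL >= 1
               DISPLAY "  DISTRICT-BREAK"
           END-IF
           IF WS-BREAK-LEVEL >= 2
               DISPLAY "  REGION-BREAK"
           END-IF
           MOVE WS-CURR-KEYS TO WS-PREV-KEYS.
```

The second call illustrates why the region test comes first: district code 002 repeats under the new region, so comparing district codes alone would miss the break entirely.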
4. Accumulator Discipline
The rule "accumulate at the lowest level, roll up during breaks" eliminates the risk of double-counting. Patricia's team initially tried accumulating at all levels simultaneously and encountered discrepancies caused by timing issues (when exactly does a break fire relative to accumulation?).
5. Testing Strategy
The team developed test data that specifically targets control break edge cases:
- A file with only one record (tests all breaks firing for a single-record group)
- A file with one record per district per region (tests single-record groups at every level)
- A file where the last record is the only record in its district/region (tests final breaks for small groups)
- A file with maximum-size groups to test accumulator overflow
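For the overflow case, one option is to make the condition detectable rather than silent by adding an ON SIZE ERROR phrase to the accumulating ADD. The case study does not say whether the team did this, so the sketch below is an assumption. WS-SMALL-ACCUM and WS-OVERFLOW-FLAG are hypothetical names, and the accumulator is deliberately tiny so the overflow is easy to trigger.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. OVERFLOW-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Deliberately tiny accumulator so overflow is easy to hit.
       01  WS-SMALL-ACCUM        PIC 9(3) VALUE 990.
       01  WS-OVERFLOW-FLAG      PIC X    VALUE "N".
           88  ACCUM-OVERFLOWED           VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PROCESS.
      * 990 + 5 fits in three digits, so no size error is raised.
           ADD 5 TO WS-SMALL-ACCUM
               ON SIZE ERROR SET ACCUM-OVERFLOWED TO TRUE
           END-ADD
           DISPLAY "AFTER +5:  " WS-SMALL-ACCUM " " WS-OVERFLOW-FLAG
      * 995 + 10 does not fit: the flag is set and the
      * accumulator keeps its previous value.
           ADD 10 TO WS-SMALL-ACCUM
               ON SIZE ERROR SET ACCUM-OVERFLOWED TO TRUE
           END-ADD
           DISPLAY "AFTER +10: " WS-SMALL-ACCUM " " WS-OVERFLOW-FLAG
           STOP RUN.
```

Without the ON SIZE ERROR phrase, the second ADD would typically truncate the high-order digit and leave a plausible-looking but wrong total, which is exactly the failure the maximum-size test file is meant to expose.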
The PERFORM Pattern Summary
The control break report demonstrates several key PERFORM patterns:
| Pattern | Where Used |
|---|---|
| IPT (Init-Process-Terminate) | MAIN-PROCESS structure |
| Priming read | PROCESS-ALL-RECORDS |
| PERFORM UNTIL with condition name | Main processing loop |
| Nested PERFORM (paragraphs calling paragraphs) | Break detection calling break processing |
| Reusable paragraphs | Variance calculation used at multiple levels |
| Accumulator cascade | Detail to district to region to grand |
| Inline PERFORM | Short utility calculations |
Discussion Questions
- Why is the "accumulate at lowest level, roll up during breaks" pattern preferable to accumulating at all levels simultaneously?
- The original 1987 program used GO TO statements to jump between break paragraphs. What specific problems would this cause compared to the structured PERFORM approach?
- If a new level were added to the hierarchy (e.g., "zone" between region and district), what paragraphs would need to be added and what existing paragraphs would need modification?
- How would you modify this program to handle the case where the input file is not sorted correctly? What validation would you add, and where in the PERFORM hierarchy would it go?
- The control break pattern uses persistent WORKING-STORAGE variables for the accumulators. In a modern language, you might use function-local variables. What are the advantages and disadvantages of COBOL's approach?