Case Study 1: Control Break Reporting at State Revenue Department

Background

The State Revenue Department of a mid-Atlantic state processes property tax payments from 5 regions, each containing 3-8 districts, with each district having 2-12 local tax offices. Every month, the department generates a comprehensive report summarizing tax collections at all three levels: by office, by district, and by region, with a statewide grand total.

The original report program was written in 1987 and has been modified dozens of times over the decades. The most recent project, led by senior COBOL developer Patricia Chen, involved a complete rewrite to replace the aging GO TO-based code with clean, structured PERFORM logic.

This case study examines the multi-level control break report implementation, focusing on how nested PERFORM loops and accumulator patterns work together to produce a complex hierarchical report.

The Business Requirements

The monthly tax collection report must show:

  1. Detail lines for each office showing office ID, office name, and monthly collection amount.
  2. Office subtotals at each office change. Because the input typically carries one line per office, the detail line itself serves as the office subtotal and no separate office break paragraph is needed.
  3. District subtotals whenever the district code changes, showing the total collections for all offices in that district.
  4. Region subtotals whenever the region code changes, showing the total collections for all districts in that region.
  5. Grand total at the end of the report showing statewide collections.
  6. Variance analysis comparing current month to the same month last year.

The input data is a sequential file sorted by region, then by district within region, then by office within district.

The Data Structures

      *================================================================*
      * Input record layout                                            *
      *================================================================*
       01  TAX-COLLECTION-RECORD.
           05 TCR-REGION-CODE      PIC X(2).
           05 TCR-REGION-NAME      PIC X(20).
           05 TCR-DISTRICT-CODE    PIC X(3).
           05 TCR-DISTRICT-NAME    PIC X(25).
           05 TCR-OFFICE-CODE      PIC X(4).
           05 TCR-OFFICE-NAME      PIC X(30).
           05 TCR-MONTH-AMOUNT     PIC S9(9)V99.
           05 TCR-PRIOR-YEAR-AMT   PIC S9(9)V99.
           05 TCR-YTD-AMOUNT       PIC S9(11)V99.

      *================================================================*
      * Control break tracking variables                               *
      *================================================================*
       01  WS-PREVIOUS-KEYS.
           05 WS-PREV-REGION       PIC X(2) VALUE HIGH-VALUES.
           05 WS-PREV-DISTRICT     PIC X(3) VALUE HIGH-VALUES.
           05 WS-PREV-OFFICE       PIC X(4) VALUE HIGH-VALUES.

       01  WS-SAVE-NAMES.
           05 WS-SAVE-REGION-NAME  PIC X(20).
           05 WS-SAVE-DIST-NAME    PIC X(25).

      *================================================================*
      * Accumulators (one set for each break level)                    *
      *================================================================*
       01  WS-DISTRICT-ACCUM.
           05 WS-DIST-CURRENT      PIC S9(11)V99 VALUE ZEROS.
           05 WS-DIST-PRIOR-YR     PIC S9(11)V99 VALUE ZEROS.
           05 WS-DIST-OFFICE-COUNT PIC 9(3)      VALUE ZEROS.

       01  WS-REGION-ACCUM.
           05 WS-REG-CURRENT       PIC S9(12)V99 VALUE ZEROS.
           05 WS-REG-PRIOR-YR      PIC S9(12)V99 VALUE ZEROS.
           05 WS-REG-DIST-COUNT    PIC 9(3)      VALUE ZEROS.
           05 WS-REG-OFFICE-COUNT  PIC 9(4)      VALUE ZEROS.

       01  WS-GRAND-ACCUM.
           05 WS-GRAND-CURRENT     PIC S9(13)V99 VALUE ZEROS.
           05 WS-GRAND-PRIOR-YR    PIC S9(13)V99 VALUE ZEROS.
           05 WS-GRAND-REG-COUNT   PIC 9(2)      VALUE ZEROS.
           05 WS-GRAND-DIST-COUNT  PIC 9(3)      VALUE ZEROS.
           05 WS-GRAND-OFFICE-COUNT PIC 9(4)     VALUE ZEROS.

      *================================================================*
      * Variance calculation fields                                    *
      *================================================================*
       01  WS-VARIANCE-AMOUNT      PIC S9(12)V99 VALUE ZEROS.
       01  WS-VARIANCE-PERCENT     PIC S9(3)V99  VALUE ZEROS.

The PERFORM Structure

The heart of the program is the control break processing loop. Here is the paragraph hierarchy Patricia designed:

MAIN-PROCESS
├── PERFORM INITIALIZATION
│   ├── PERFORM OPEN-FILES
│   ├── PERFORM INITIALIZE-ACCUMULATORS
│   └── PERFORM PRINT-REPORT-HEADER
├── PERFORM PROCESS-ALL-RECORDS
│   ├── PERFORM READ-TAX-RECORD              (priming read)
│   └── PERFORM UNTIL END-OF-FILE
│       ├── PERFORM CHECK-CONTROL-BREAKS
│       │   ├── (if region changed)
│       │   │   ├── PERFORM DISTRICT-BREAK
│       │   │   ├── PERFORM REGION-BREAK
│       │   │   ├── PERFORM START-NEW-REGION
│       │   │   └── PERFORM START-NEW-DISTRICT
│       │   ├── (if district changed)
│       │   │   ├── PERFORM DISTRICT-BREAK
│       │   │   └── PERFORM START-NEW-DISTRICT
│       │   └── (else: same district)
│       ├── PERFORM ACCUMULATE-DETAIL
│       ├── PERFORM PRINT-DETAIL-LINE
│       └── PERFORM READ-TAX-RECORD
├── PERFORM FINAL-BREAKS
│   ├── PERFORM DISTRICT-BREAK
│   ├── PERFORM REGION-BREAK
│   └── PERFORM PRINT-GRAND-TOTAL
└── PERFORM WRAP-UP
    ├── PERFORM PRINT-REPORT-FOOTER
    └── PERFORM CLOSE-FILES

The Control Break Logic

The control break detection follows a specific pattern. When the outermost key changes, all inner breaks must fire first (in inner-to-outer order), and then the new group begins:

      *================================================================*
      * PROCESS-ALL-RECORDS: Main processing loop with priming read.   *
      *================================================================*
       PROCESS-ALL-RECORDS.
           PERFORM READ-TAX-RECORD

           PERFORM UNTIL END-OF-FILE
               PERFORM CHECK-CONTROL-BREAKS
               PERFORM ACCUMULATE-DETAIL
               PERFORM PRINT-DETAIL-LINE
               PERFORM READ-TAX-RECORD
           END-PERFORM
           .
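
The loop above relies on READ-TAX-RECORD and on the END-OF-FILE condition name, neither of which the case study lists. A minimal sketch of both follows, under these assumptions: TAX-COLLECTION-FILE is a hypothetical file name, TAX-COLLECTION-RECORD is the record described under that file's FD, and WS-EOF-SWITCH is an illustrative flag rather than a name from the original program.

      *================================================================*
      * Assumed WORKING-STORAGE item; the names are illustrative.      *
      *================================================================*
       01  WS-EOF-SWITCH           PIC X    VALUE "N".
           88 END-OF-FILE                   VALUE "Y".

      *================================================================*
      * READ-TAX-RECORD: Read the next record into the FD record area. *
      * TAX-COLLECTION-FILE is a hypothetical file name.               *
      *================================================================*
       READ-TAX-RECORD.
           READ TAX-COLLECTION-FILE
               AT END
                   SET END-OF-FILE TO TRUE
           END-READ
           .

Because PERFORM UNTIL tests its condition before each pass by default, an empty file sets END-OF-FILE on the priming read and the loop body never runs.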

      *================================================================*
      * CHECK-CONTROL-BREAKS: Detects changes in control keys.         *
      * CRITICAL: Breaks must fire from innermost to outermost.        *
      * When a higher-level key changes, all lower-level breaks        *
      * must fire first to print their subtotals.                      *
      *================================================================*
       CHECK-CONTROL-BREAKS.
      *    Check for region change (outermost break)
           IF TCR-REGION-CODE NOT = WS-PREV-REGION
      *        Fire inner breaks first
               IF WS-PREV-REGION NOT = HIGH-VALUES
                   PERFORM DISTRICT-BREAK
                   PERFORM REGION-BREAK
               END-IF
               PERFORM START-NEW-REGION
               PERFORM START-NEW-DISTRICT
           ELSE
      *        Check for district change (inner break)
               IF TCR-DISTRICT-CODE NOT = WS-PREV-DISTRICT
                   IF WS-PREV-DISTRICT NOT = HIGH-VALUES
                       PERFORM DISTRICT-BREAK
                   END-IF
                   PERFORM START-NEW-DISTRICT
               END-IF
           END-IF
           .
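
CHECK-CONTROL-BREAKS performs START-NEW-REGION and START-NEW-DISTRICT, but the case study does not list them. The sketch below is one plausible shape, not the department's actual code: each paragraph records the new key in WS-PREVIOUS-KEYS and saves the group name, because by the time a break fires the current record already belongs to the next group. The heading DISPLAY lines are an assumption about the report layout.

      *================================================================*
      * START-NEW-REGION: Remember the new region key and name.        *
      * (Assumed implementation; not shown in the case study.)         *
      *================================================================*
       START-NEW-REGION.
           MOVE TCR-REGION-CODE   TO WS-PREV-REGION
           MOVE TCR-REGION-NAME   TO WS-SAVE-REGION-NAME
           DISPLAY "Region: " TCR-REGION-CODE " "
               TCR-REGION-NAME
           .

      *================================================================*
      * START-NEW-DISTRICT: Remember the new district key and name.    *
      * (Assumed implementation; not shown in the case study.)         *
      *================================================================*
       START-NEW-DISTRICT.
           MOVE TCR-DISTRICT-CODE TO WS-PREV-DISTRICT
           MOVE TCR-DISTRICT-NAME TO WS-SAVE-DIST-NAME
           DISPLAY "  District: " TCR-DISTRICT-CODE " "
               TCR-DISTRICT-NAME
           .

A region change must also refresh the district key and saved name, even when the first district of the new region happens to carry the same code as the last district of the old one. That is why the region branch performs both paragraphs.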

The Break Paragraphs

Each break paragraph follows the same pattern: print the subtotal, roll the accumulators up to the next level, and reset the current level:

      *================================================================*
      * DISTRICT-BREAK: Print district subtotal and roll up to region. *
      *================================================================*
       DISTRICT-BREAK.
           PERFORM CALCULATE-VARIANCE-DISTRICT

           DISPLAY "    District Total: "
               WS-SAVE-DIST-NAME
           DISPLAY "      Offices:  " WS-DIST-OFFICE-COUNT
           DISPLAY "      Current:  " WS-DIST-CURRENT
           DISPLAY "      Prior Yr: " WS-DIST-PRIOR-YR
           DISPLAY "      Variance: " WS-VARIANCE-AMOUNT
               " (" WS-VARIANCE-PERCENT "%)"
           DISPLAY "    " WS-REPORT-SEPARATOR

      *    Roll district accumulators up to region
           ADD WS-DIST-CURRENT    TO WS-REG-CURRENT
           ADD WS-DIST-PRIOR-YR   TO WS-REG-PRIOR-YR
           ADD WS-DIST-OFFICE-COUNT TO WS-REG-OFFICE-COUNT
           ADD 1 TO WS-REG-DIST-COUNT

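      *    Reset district accumulators.
      *    INITIALIZE zeroes each numeric field by its own definition;
      *    a group-level MOVE ZEROS is an alphanumeric fill.
           INITIALIZE WS-DISTRICT-ACCUM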
           .

      *================================================================*
      * REGION-BREAK: Print region subtotal and roll up to grand.      *
      *================================================================*
       REGION-BREAK.
           PERFORM CALCULATE-VARIANCE-REGION

           DISPLAY "  ==============================="
           DISPLAY "  Region Total: "
               WS-SAVE-REGION-NAME
           DISPLAY "    Districts: " WS-REG-DIST-COUNT
           DISPLAY "    Offices:   " WS-REG-OFFICE-COUNT
           DISPLAY "    Current:   " WS-REG-CURRENT
           DISPLAY "    Prior Yr:  " WS-REG-PRIOR-YR
           DISPLAY "    Variance:  " WS-VARIANCE-AMOUNT
               " (" WS-VARIANCE-PERCENT "%)"
           DISPLAY "  ==============================="

      *    Roll region accumulators up to grand
           ADD WS-REG-CURRENT     TO WS-GRAND-CURRENT
           ADD WS-REG-PRIOR-YR    TO WS-GRAND-PRIOR-YR
           ADD WS-REG-DIST-COUNT  TO WS-GRAND-DIST-COUNT
           ADD WS-REG-OFFICE-COUNT TO WS-GRAND-OFFICE-COUNT
           ADD 1 TO WS-GRAND-REG-COUNT

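      *    Reset region accumulators.
      *    INITIALIZE zeroes each numeric field by its own definition;
      *    a group-level MOVE ZEROS is an alphanumeric fill.
           INITIALIZE WS-REGION-ACCUM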
           .
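
The DISPLAY statements above print the accumulators as stored. Depending on the compiler, a signed USAGE DISPLAY item may appear without a decimal point and with its sign overpunched on the last digit, which makes the totals hard to read. One common remedy is to MOVE each value to a numeric-edited field first. The sketch below uses hypothetical names (WS-EDIT-AMOUNT, WS-EDIT-PERCENT, PRINT-REGION-AMOUNTS) that do not appear in the original program.

      *================================================================*
      * Assumed numeric-edited work fields (hypothetical names).       *
      *================================================================*
       01  WS-EDIT-AMOUNT          PIC -Z,ZZZ,ZZZ,ZZZ,ZZ9.99.
       01  WS-EDIT-PERCENT         PIC -ZZ9.99.

      *================================================================*
      * PRINT-REGION-AMOUNTS: Edit region values before display.       *
      *================================================================*
       PRINT-REGION-AMOUNTS.
           MOVE WS-REG-CURRENT      TO WS-EDIT-AMOUNT
           DISPLAY "    Current:   " WS-EDIT-AMOUNT
           MOVE WS-REG-PRIOR-YR     TO WS-EDIT-AMOUNT
           DISPLAY "    Prior Yr:  " WS-EDIT-AMOUNT
           MOVE WS-VARIANCE-AMOUNT  TO WS-EDIT-AMOUNT
           MOVE WS-VARIANCE-PERCENT TO WS-EDIT-PERCENT
           DISPLAY "    Variance:  " WS-EDIT-AMOUNT
               " (" WS-EDIT-PERCENT "%)"
           .

WS-EDIT-AMOUNT carries thirteen integer digit positions, so the same field can hold district, region, and grand totals.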

The Accumulation Pattern

Each detail record contributes to the innermost accumulator (district level). The accumulators cascade upward only during breaks:

      *================================================================*
      * ACCUMULATE-DETAIL: Add current record to district totals.      *
      * Note: We only accumulate at the district level. Region and     *
      * grand totals are built by rolling up during breaks.            *
      *================================================================*
       ACCUMULATE-DETAIL.
           ADD TCR-MONTH-AMOUNT   TO WS-DIST-CURRENT
           ADD TCR-PRIOR-YEAR-AMT TO WS-DIST-PRIOR-YR
           ADD 1                  TO WS-DIST-OFFICE-COUNT
           .

This "accumulate at the lowest level, roll up during breaks" pattern is the standard COBOL approach. It ensures that:

  - Each amount is counted exactly once.
  - Subtotals at each level are correct.
  - The grand total equals the sum of all region totals, which equals the sum of all district totals.

The Final Breaks

After the main loop ends (end-of-file), the final record's group has not yet been totaled. The final breaks must be fired explicitly:

      *================================================================*
      * FINAL-BREAKS: Fire all remaining breaks after end-of-file.     *
      * This is easy to forget and is a common source of bugs in       *
      * control break programs!                                        *
      *================================================================*
       FINAL-BREAKS.
           IF WS-PREV-REGION NOT = HIGH-VALUES
               PERFORM DISTRICT-BREAK
               PERFORM REGION-BREAK
           END-IF
           PERFORM PRINT-GRAND-TOTAL
           .

Variance Calculation

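The variance calculation applies the same logic at each level, just with different input values. In the listings here, each level has its own paragraph (CALCULATE-VARIANCE-DISTRICT, CALCULATE-VARIANCE-REGION); the district version is shown below. Note the test for a zero prior-year total, which avoids a divide-by-zero: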

      *================================================================*
      * CALCULATE-VARIANCE-DISTRICT: Compute district-level variance.  *
      *================================================================*
       CALCULATE-VARIANCE-DISTRICT.
           SUBTRACT WS-DIST-PRIOR-YR FROM WS-DIST-CURRENT
               GIVING WS-VARIANCE-AMOUNT
           IF WS-DIST-PRIOR-YR NOT = ZEROS
               COMPUTE WS-VARIANCE-PERCENT ROUNDED =
                   (WS-VARIANCE-AMOUNT / WS-DIST-PRIOR-YR)
                   * 100
           ELSE
               MOVE ZEROS TO WS-VARIANCE-PERCENT
           END-IF
           .
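
The listing shows a district-specific paragraph, and REGION-BREAK performs a separate CALCULATE-VARIANCE-REGION. One way to share the logic instead is to stage the inputs in common work fields and perform a single paragraph. The sketch below assumes two hypothetical fields, WS-VAR-CURRENT and WS-VAR-PRIOR, and a hypothetical CALCULATE-VARIANCE paragraph. It also adds ON SIZE ERROR, because WS-VARIANCE-PERCENT (PIC S9(3)V99) cannot hold a swing beyond 999.99 percent. Reporting zero in that case is an arbitrary choice made for illustration.

      *================================================================*
      * Assumed staging fields for a shared variance paragraph.        *
      *================================================================*
       01  WS-VAR-CURRENT          PIC S9(13)V99 VALUE ZEROS.
       01  WS-VAR-PRIOR            PIC S9(13)V99 VALUE ZEROS.

      *================================================================*
      * Caller side, e.g. at the top of DISTRICT-BREAK.                *
      *================================================================*
           MOVE WS-DIST-CURRENT  TO WS-VAR-CURRENT
           MOVE WS-DIST-PRIOR-YR TO WS-VAR-PRIOR
           PERFORM CALCULATE-VARIANCE

      *================================================================*
      * CALCULATE-VARIANCE: Level-independent variance (hypothetical). *
      *================================================================*
       CALCULATE-VARIANCE.
           SUBTRACT WS-VAR-PRIOR FROM WS-VAR-CURRENT
               GIVING WS-VARIANCE-AMOUNT
           IF WS-VAR-PRIOR NOT = ZEROS
               COMPUTE WS-VARIANCE-PERCENT ROUNDED =
                   (WS-VARIANCE-AMOUNT / WS-VAR-PRIOR) * 100
                   ON SIZE ERROR
                       DISPLAY "Variance percent out of range"
                       MOVE ZEROS TO WS-VARIANCE-PERCENT
               END-COMPUTE
           ELSE
               MOVE ZEROS TO WS-VARIANCE-PERCENT
           END-IF
           .

The trade-off is two extra MOVE statements per call in exchange for a single copy of the arithmetic. If the same paragraph were used for the grand total, WS-VARIANCE-AMOUNT (PIC S9(12)V99) would be one digit narrower than the grand accumulators and might need widening.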

Lessons Learned

Patricia Chen documented several key lessons from the project:

1. The Final Break Problem

The most common bug in control break programs is forgetting to fire the final breaks after the main loop ends. Patricia's team caught this during testing when the last district's and last region's subtotals were missing from the report. The solution was the explicit FINAL-BREAKS paragraph.

2. HIGH-VALUES Initialization

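Initializing the previous-key variables to HIGH-VALUES (rather than spaces or zeros) provides a clean first-record detection mechanism. Since no real key value matches HIGH-VALUES, the first record always takes the region-change branch. The inner test for HIGH-VALUES then suppresses DISTRICT-BREAK and REGION-BREAK, because there is no previous group to total, while START-NEW-REGION and START-NEW-DISTRICT still run to set up the first group. The same test in FINAL-BREAKS keeps an empty input file from printing spurious subtotals.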

3. Break Ordering

When a higher-level key changes, the inner breaks must fire before the outer break. In a region-district-office hierarchy, when the region changes:

  1. First, fire the district break (to total the last district of the old region).
  2. Then, fire the region break (to total the old region).
  3. Then, start the new region and new district.

Getting this ordering wrong produces incorrect subtotals. Patricia's team created a standard "break template" that enforces the correct ordering.

4. Accumulator Discipline

The rule "accumulate at the lowest level, roll up during breaks" eliminates the risk of double-counting. Patricia's team initially tried accumulating at all levels simultaneously and encountered discrepancies caused by timing issues (when exactly does a break fire relative to accumulation?).

5. Testing Strategy

The team developed test data that specifically targets control break edge cases:

  - A file with only one record (tests all breaks firing for a single-record group).
  - A file with one record per district per region (tests single-record groups at every level).
  - A file where the last record is the only record in its district/region (tests final breaks for small groups).
  - A file with maximum-size groups to test accumulator overflow.

The PERFORM Pattern Summary

The control break report demonstrates several key PERFORM patterns:

Pattern                                         Where Used
IPT (Init-Process-Terminate)                    MAIN-PROCESS structure
Priming read                                    PROCESS-ALL-RECORDS
PERFORM UNTIL with condition name               Main processing loop
Nested PERFORM (paragraphs calling paragraphs)  Break detection calling break processing
Reusable paragraphs                             Variance calculation used at multiple levels
Accumulator cascade                             Detail to district to region to grand
Inline PERFORM                                  Short utility calculations

Discussion Questions

  1. Why is the "accumulate at lowest level, roll up during breaks" pattern preferable to accumulating at all levels simultaneously?

  2. The original 1987 program used GO TO statements to jump between break paragraphs. What specific problems would this cause compared to the structured PERFORM approach?

  3. If a new level were added to the hierarchy (e.g., "zone" between region and district), what paragraphs would need to be added and what existing paragraphs would need modification?

  4. How would you modify this program to handle the case where the input file is not sorted correctly? What validation would you add, and where in the PERFORM hierarchy would it go?

  5. The control break pattern uses persistent WORKING-STORAGE variables for the accumulators. In a modern language, you might use function-local variables. What are the advantages and disadvantages of COBOL's approach?