Chapter 35: Code Review and Static Analysis

"Any fool can write code that a computer can understand. Good programmers write code that humans can understand." — Martin Fowler

Maria Chen keeps a folder on her workstation labeled "Hall of Shame." It contains printouts of COBOL code she has encountered over her twenty years at GlobalBank — code that compiled and ran correctly but was so poorly written that modifying it was an exercise in archaeological excavation. Her favorite specimen is a 1,200-line paragraph with no comments, 47 levels of nested IF statements, and variable names like WS-A1, WS-A2, through WS-A37.

"This program works," Maria tells new team members, showing them the printout. "It has worked for twenty-two years. It processes $14 million in transactions every night. And nobody — nobody — wants to touch it, because nobody understands it. That's the real cost of bad code."

This chapter is about the practices and tools that prevent code from ending up in Maria's Hall of Shame: code review, coding standards, and static analysis. These practices catch a fundamentally different class of defects than testing — not errors in behavior (the program produces wrong output), but errors in quality (the program is fragile, unreadable, or unmaintainable).

35.1 Why Code Review Matters for COBOL

Code review — the practice of having another developer examine your code before it enters production — is one of the most effective defect-detection techniques in software engineering. Studies consistently show that code review catches 60-90% of defects, compared to 25-40% for testing alone. The combination of review and testing catches more defects than either practice alone.

For COBOL specifically, code review is critical because:

Longevity of Code

COBOL code lives for decades. A poorly written paragraph introduced today will still be confusing developers in 2050. The cost of a quality defect compounds over time — every developer who reads, modifies, or debugs the code pays the price of its opacity.

Knowledge Transfer

With the average COBOL developer approaching retirement age, code review is one of the few mechanisms for transferring institutional knowledge from senior to junior developers. When Maria Chen reviews Derek Washington's code, she doesn't just catch bugs — she teaches Derek the patterns, conventions, and domain knowledge that make COBOL maintainable.

Regulatory Requirements

In financial services and healthcare — the primary domains of COBOL — regulatory frameworks often require documented code review as part of the software development lifecycle. SOX compliance, HIPAA, and PCI DSS all expect evidence of review processes.

💡 Readability is a Feature: In COBOL, readability isn't a nice-to-have — it's a structural requirement. A program that nobody can understand is a program that nobody can safely modify. And a program that can't be safely modified is a program that becomes a liability, not an asset. Code review is the primary mechanism for ensuring readability.

35.2 Common COBOL Defect Patterns

Before we discuss how to conduct reviews, let's catalog the defects we're looking for. These are the patterns that appear repeatedly in COBOL code reviews — the "usual suspects" that experienced reviewers learn to spot automatically.

Pattern 1: Uninitialized Fields

One of the most common and dangerous COBOL defects. Unlike languages with automatic initialization, COBOL WORKING-STORAGE fields contain whatever was in memory unless explicitly initialized.

      * BAD: WS-TOTAL not initialized before accumulation
       01  WS-TOTAL    PIC 9(9)V99.
       ...
       2000-PROCESS-RECORDS.
           READ INPUT-FILE INTO WS-RECORD
               AT END SET EOF TO TRUE
           END-READ
           ADD WS-AMOUNT TO WS-TOTAL.

The fix:

      * GOOD: Initialize before first use
       01  WS-TOTAL    PIC 9(9)V99  VALUE 0.
       ...
       1000-INITIALIZE.
           MOVE 0 TO WS-TOTAL.
       ...
       2000-PROCESS-RECORDS.
           READ INPUT-FILE INTO WS-RECORD
               AT END SET EOF TO TRUE
           END-READ
           ADD WS-AMOUNT TO WS-TOTAL.

⚠️ Caution: The VALUE clause only sets the initial value when the program first loads. If a paragraph is PERFORMed multiple times within a run (e.g., processing multiple account groups), you must explicitly re-initialize fields at the start of each iteration. Relying on VALUE clauses for re-initialization is a common source of accumulation bugs.

Pattern 2: Missing Scope Terminators

Before the 1985 standard, COBOL had no explicit scope terminators. The period (.) ended all open scopes. This led to subtle bugs when code was modified:

      * DANGEROUS: Period-delimited IF
       2000-CHECK-BALANCE.
           IF ACCT-BALANCE < 0
               MOVE "OVERDRAWN" TO ACCT-STATUS
               PERFORM 3000-SEND-ALERT.
           MOVE "PROCESSED" TO WS-RESULT.

Is MOVE "PROCESSED" inside or outside the IF? It's outside — the period after 3000-SEND-ALERT ends the IF scope. But a careless modifier might add a statement between the PERFORM and the period, thinking they're inside the IF:

      * BUG: Developer added a line thinking it was inside the IF
       2000-CHECK-BALANCE.
           IF ACCT-BALANCE < 0
               MOVE "OVERDRAWN" TO ACCT-STATUS
               PERFORM 3000-SEND-ALERT
               ADD 1 TO WS-OVERDRAFT-COUNT.
           MOVE "PROCESSED" TO WS-RESULT.

Now the period after WS-OVERDRAFT-COUNT ends the IF, and MOVE "PROCESSED" always executes. The safe version uses explicit scope terminators:

      * SAFE: Explicit END-IF
       2000-CHECK-BALANCE.
           IF ACCT-BALANCE < 0
               MOVE "OVERDRAWN" TO ACCT-STATUS
               PERFORM 3000-SEND-ALERT
               ADD 1 TO WS-OVERDRAFT-COUNT
           END-IF
           MOVE "PROCESSED" TO WS-RESULT.

Pattern 3: Wrong PICTURE Clause

PIC clause errors are insidious because they often produce almost correct results:

      * BUG: PIC 9(5)V99 cannot hold values over 99,999.99
       01  WS-ANNUAL-SALARY   PIC 9(5)V99.
       ...
           MOVE 125000.00 TO WS-ANNUAL-SALARY.
      *    Result: WS-ANNUAL-SALARY = 25000.00 (truncated!)
      * BUG: Missing V causes decimal position error
       01  WS-TAX-RATE        PIC 9V99.
      *    Developer intended PIC V999 for rates like 0.065
      *    PIC 9V99 allows 0.00-9.99, wasting the integer digit
      * BUG: Unsigned PIC cannot hold negative result
       01  WS-NET-CHANGE      PIC 9(7)V99.
       ...
           SUBTRACT WS-DEBITS FROM WS-CREDITS
               GIVING WS-NET-CHANGE.
      *    If DEBITS > CREDITS, result is wrong (no sign storage)
      * FIX: PIC S9(7)V99 (add the S for signed)

Pattern 4: Unreachable Code After STOP RUN or GOBACK

      * BUG: Code after STOP RUN never executes
       9000-ERROR-HANDLER.
           DISPLAY "FATAL ERROR: " WS-ERROR-MSG
           STOP RUN.
           PERFORM 9100-WRITE-AUDIT-LOG.
      *    ^^^ This line NEVER executes

Pattern 5: Incorrect PERFORM THRU Range

      * DANGEROUS: PERFORM THRU includes all paragraphs in between
       PERFORM 2000-START THRU 2999-END.

      * If someone later adds paragraph 2500-NEW-LOGIC between
      * 2000-START and 2999-END, it will be included in the
      * PERFORM range -- possibly unintentionally.

Pattern 6: Decimal Alignment Errors in Arithmetic

       01  WS-PRICE       PIC 9(5)V99.
       01  WS-QUANTITY    PIC 9(3).
       01  WS-TOTAL       PIC 9(5)V99.
       ...
      * BUG: MULTIPLY without GIVING silently truncates
           MULTIPLY WS-PRICE BY WS-QUANTITY.
      *    WS-QUANTITY is PIC 9(3) -- the result replaces WS-QUANTITY
      *    and is truncated to 3 integer digits with no decimal!

      * CORRECT:
           MULTIPLY WS-PRICE BY WS-QUANTITY
               GIVING WS-TOTAL.

Pattern 7: File Status Not Checked

      * BAD: No check after I/O operation
           OPEN INPUT MASTER-FILE
           READ MASTER-FILE INTO WS-MASTER-REC
           ...

      * GOOD: Always check file status
           OPEN INPUT MASTER-FILE
           IF WS-FILE-STATUS NOT = "00"
               DISPLAY "OPEN ERROR: " WS-FILE-STATUS
               PERFORM 9000-ABORT-PROGRAM
           END-IF
           READ MASTER-FILE INTO WS-MASTER-REC
           IF WS-FILE-STATUS NOT = "00"
               AND WS-FILE-STATUS NOT = "10"
               DISPLAY "READ ERROR: " WS-FILE-STATUS
               PERFORM 9000-ABORT-PROGRAM
           END-IF

Pattern 8: GO TO Spaghetti

      * UNMAINTAINABLE: GO TO creates tangled control flow
       2000-PROCESS.
           IF WS-TYPE = "A"
               GO TO 2100-TYPE-A
           END-IF
           IF WS-TYPE = "B"
               GO TO 2200-TYPE-B
           END-IF
           GO TO 2900-DEFAULT.

       2100-TYPE-A.
           ...
           IF WS-SPECIAL-FLAG = "Y"
               GO TO 2300-SPECIAL
           END-IF
           GO TO 3000-CONTINUE.

       2200-TYPE-B.
           ...
           GO TO 3000-CONTINUE.

       2300-SPECIAL.
           ...
           GO TO 2100-TYPE-A.
      *    ^^^ Creates a loop!

The structured equivalent:

      * MAINTAINABLE: EVALUATE with PERFORM
       2000-PROCESS.
           EVALUATE WS-TYPE
               WHEN "A"
                   PERFORM 2100-TYPE-A
               WHEN "B"
                   PERFORM 2200-TYPE-B
               WHEN OTHER
                   PERFORM 2900-DEFAULT
           END-EVALUATE.

📊 By the Numbers: A study of 200 production COBOL defects at a major insurance company found the following distribution: uninitialized fields (18%), wrong PIC clause (15%), missing scope terminators (12%), file status not checked (11%), boundary condition errors (10%), decimal alignment errors (8%), dead/unreachable code (7%), other (19%). Note that most of these are detectable by either code review or static analysis — they don't require execution to find.

35.3 Coding Standards for COBOL

Coding standards serve two purposes: they prevent known defect patterns, and they ensure consistency across a codebase so that any developer can read any program. A good standard is specific enough to be enforceable but flexible enough to avoid unnecessary bureaucracy.

Naming Conventions

      *================================================================*
      * NAMING CONVENTION STANDARD
      *================================================================*
      *
      * Program IDs: up to 8 characters, format: SYS-FUNC
      *   SYS = system prefix (GB=GlobalBank, MC=MedClaim)
      *   FUNC = function (BALC=balance calc, ADJD=adjudicate)
      *   Example: GB-BALC, MC-ADJD
      *
      * Paragraphs: NNNN-DESCRIPTIVE-NAME
      *   NNNN = 4-digit number for ordering
      *   0000 = main driver
      *   1000-1999 = initialization
      *   2000-7999 = processing logic
      *   8000-8999 = reporting/output
      *   9000-9999 = error handling/cleanup
      *
      * Working-Storage: WS-DESCRIPTIVE-NAME
      * File fields: No prefix (use FD/record name for context)
      * Linkage: LS-DESCRIPTIVE-NAME
      * 88 levels: Condition name (no prefix needed)
      *   Example: 88 ACCOUNT-ACTIVE VALUE "A"
      *
      * Copybooks: CPY-DESCRIPTIVE-NAME or system-specific
      *   Example: CPY-ACCT-REC, CPY-ERROR-CODES
      *================================================================*

Structure Rules

  • Maximum paragraph length: 50 lines. Rationale: longer paragraphs are difficult to understand.
  • Maximum nesting depth: 3 levels. Rationale: deep nesting indicates logic that should be refactored.
  • All IF/EVALUATE must have explicit scope terminators. Rationale: prevents the missing-scope-terminator defect pattern.
  • No GO TO except in error-handling exit paragraphs. Rationale: prevents spaghetti control flow.
  • Every I/O operation must check FILE STATUS. Rationale: prevents silent I/O failures.
  • Every COMPUTE must have ON SIZE ERROR. Rationale: prevents silent arithmetic overflow.
  • All numeric fields used in arithmetic must be COMP or COMP-3. Rationale: prevents inefficient display arithmetic.

Documentation Requirements

      *================================================================*
      * Program: GB-BALC (Balance Calculation)
      * Author:  Maria Chen
      * Date:    2025-10-15
      * Purpose: Calculate daily interest for all account types.
      *
      * Input:   ACCT-MASTER (VSAM KSDS)
      * Output:  ACCT-MASTER (updated), INTEREST-REPORT (sequential)
      *
      * Modification History:
      * Date       Author    Description
      * ---------- --------- ----------------------------------------
      * 2025-10-15 M.Chen    Initial version (refactored from BAL-CALC)
      * 2025-11-01 D.Wash    Added money market tier logic
      *================================================================*

Each paragraph should have a one-line comment explaining its purpose:

      *---------------------------------------------------------------
      * Calculate compound daily interest for the current account.
      * Input:  ACCT-BALANCE, ACCT-ANNUAL-RATE, WS-DAYS-IN-PERIOD
      * Output: WS-INTEREST-AMT
      *---------------------------------------------------------------
       3110-COMPOUND-DAILY.

35.4 Static Analysis Tools for COBOL

Static analysis examines source code without executing it, identifying potential defects, standards violations, and structural problems automatically. For COBOL, several commercial and open-source tools are available.

SonarQube for COBOL

SonarQube, the widely-used code quality platform, has a COBOL plugin (commercial) that analyzes COBOL programs against a configurable rule set. It reports issues in categories:

  • Bugs: Likely defects (uninitialized variables, dead code, unreachable statements)
  • Vulnerabilities: Security issues (SQL injection in embedded SQL, hardcoded credentials)
  • Code Smells: Maintainability issues (long paragraphs, deep nesting, copy-paste)
  • Security Hotspots: Code that needs human security review

A typical SonarQube report for a COBOL program:

File: BAL-CALC.cbl
Lines of Code: 3,847
Duplicated Lines: 12.3%

Issues Found:
  CRITICAL (2):
    Line 847: Variable WS-TEMP-RATE used before initialization
    Line 2103: Arithmetic overflow possible in COMPUTE (no SIZE ERROR)

  MAJOR (7):
    Line 156: Paragraph 2000-PROCESS exceeds 50-line limit (87 lines)
    Line 302: IF nesting depth exceeds 3 (depth: 5)
    Line 445: Missing END-IF scope terminator
    Line 678: GO TO used outside error-handling context
    Line 901: File status not checked after READ
    Line 1205: PICTURE clause mismatch in MOVE
    Line 1890: Commented-out code block (47 lines)

  MINOR (15):
    Line 45: Non-standard paragraph naming (missing numeric prefix)
    Line 89: Missing program header documentation
    ...

Quality Gate: FAILED
  - Reliability: C (2 critical issues)
  - Maintainability: D (12.3% duplication)
  - Coverage: Not measured

IBM Application Discovery and Delivery Intelligence (ADDI)

IBM's ADDI tool provides deep analysis of COBOL applications, including:

  • Cross-reference analysis: Which programs call which, which copybooks are used where, which files are accessed by which programs.
  • Impact analysis: If you change field X in copybook Y, which programs are affected?
  • Data flow analysis: How does a field value propagate through the program?
  • Dead code detection: Paragraphs that are never PERFORMed, variables that are never referenced.

Micro Focus Enterprise Analyzer

Micro Focus provides visualization and analysis of COBOL applications, including control flow graphs, data flow diagrams, and complexity analysis.

GnuCOBOL Compiler Warnings

Even without commercial tools, the GnuCOBOL compiler provides useful static analysis through its warning flags:

cobc -Wall -Wextra -x PROGRAM.cbl

Key warnings:

  • -Wunreachable: unreachable code
  • -Wunused: unused data items
  • -Wimplicit-define: implicitly defined data items
  • -Wtruncate: possible truncation in MOVE operations
  • -Wdialect: non-standard COBOL extensions

Try It Yourself: Compile one of your previous exercise programs with cobc -Wall -Wextra and examine the warnings. You may be surprised how many potential issues the compiler identifies that you missed. Fix all warnings, then recompile to verify a clean build.

35.5 Complexity Metrics

Complexity metrics quantify how difficult a program is to understand, test, and maintain. While no single metric tells the whole story, they provide objective data to guide refactoring decisions.

Cyclomatic Complexity

Cyclomatic complexity, developed by Thomas McCabe in 1976, measures the number of independent paths through a program. For COBOL, each of the following adds one to the cyclomatic complexity:

  • Each IF statement
  • Each WHEN clause in EVALUATE
  • Each PERFORM UNTIL / VARYING (the loop condition)
  • Each AND / OR in a compound condition
  • The program entry point (starts at 1)

      *---------------------------------------------------------------
      * Cyclomatic complexity analysis for 3000-CALC-INTEREST
      *---------------------------------------------------------------
       3000-CALC-INTEREST.                        *  Base: 1
           EVALUATE ACCT-TYPE                     *
               WHEN "CHK"                         *  +1 = 2
                   IF ACCT-BALANCE >              *  +1 = 3
                      WS-CHK-INT-THRESHOLD
                       PERFORM 3100-APPLY-RATE
                   ELSE
                       MOVE 0 TO WS-INTEREST-AMT
                   END-IF
               WHEN "SAV"                         *  +1 = 4
                   PERFORM 3100-APPLY-RATE
               WHEN "CD"                          *  +1 = 5
                   PERFORM 3100-APPLY-RATE
               WHEN "MMA"                         *  +1 = 6
                   PERFORM 3200-CALC-TIERED-RATE
                   PERFORM 3100-APPLY-RATE
               WHEN OTHER                         *  +1 = 7
                   MOVE "UNKNOWN-TYPE"
                       TO WS-ERROR-MSG
                   PERFORM 9100-LOG-ERROR
           END-EVALUATE.
      *    Cyclomatic complexity = 7

Interpretation guidelines:

Complexity   Risk Level   Recommendation
1-10         Low          Simple, easy to test
11-20        Moderate     Reasonable, but review for simplification
21-50        High         Difficult to test and maintain; refactor
51+          Very High    Untestable; immediate refactoring needed
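These counting rules are mechanical enough to automate. The sketch below is a rough, illustrative Python counter (the function name and regex heuristics are ours, not a real tool's): it scans a paragraph's lines for the decision points listed above. A production analyzer would parse the source rather than pattern-match it, and would handle literals and inline comments properly.

```python
import re

def cyclomatic_complexity(paragraph_lines):
    """Estimate cyclomatic complexity of one COBOL paragraph.

    Counts the decision points listed in the text: IF statements
    (END-IF excluded), WHEN clauses, PERFORM UNTIL/VARYING loop
    conditions, and AND/OR in compound conditions. Fixed-format
    comment lines (a '*' in column 7) are skipped.
    """
    complexity = 1  # the entry point starts at 1
    for line in paragraph_lines:
        if len(line) > 6 and line[6] == "*":
            continue  # comment line
        code = line.upper()
        complexity += len(re.findall(r"(?<!END-)\bIF\b", code))
        complexity += len(re.findall(r"\bWHEN\b", code))
        complexity += len(re.findall(r"\bUNTIL\b|\bVARYING\b", code))
        complexity += len(re.findall(r"\bAND\b|\bOR\b", code))
    return complexity
```

Run against the 3000-CALC-INTEREST paragraph above, this counter reports 7, matching the hand annotation.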

Lines of Code (LOC)

While crude, LOC metrics for individual paragraphs correlate strongly with defect density:

Paragraph Size   Defect Likelihood
1-25 lines       Low
26-50 lines      Moderate
51-100 lines     High
100+ lines       Very high; refactoring recommended

Halstead Metrics

Halstead complexity metrics, while less commonly used than cyclomatic complexity, provide additional insight:

  • Program vocabulary (n): The number of distinct operators and operands
  • Program length (N): The total number of operators and operands
  • Volume (V): N * log2(n) — a measure of program size adjusted for vocabulary
  • Difficulty (D): A measure of how prone to error the program is
  • Effort (E): V * D — an estimate of the mental effort required to understand the program

Maintainability Index

The Maintainability Index (MI) combines several metrics into a single score:

MI = 171 - 5.2 * ln(V) - 0.23 * CC - 16.2 * ln(LOC)

Where:
  V   = Halstead Volume
  CC  = Cyclomatic Complexity
  LOC = Lines of Code

MI Score    Maintainability
85-171      Highly maintainable
65-84       Moderately maintainable
Below 65    Difficult to maintain
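Both formulas are easy to evaluate directly. The following Python sketch (function names are ours, for illustration) computes the Halstead volume from the four counts defined above and feeds it into the MI formula:

```python
import math

def halstead_volume(distinct_operators, distinct_operands,
                    total_operators, total_operands):
    """Volume V = N * log2(n), where n is vocabulary and N is length."""
    n = distinct_operators + distinct_operands   # vocabulary
    length = total_operators + total_operands    # program length N
    return length * math.log2(n)

def maintainability_index(volume, complexity, loc):
    """MI = 171 - 5.2*ln(V) - 0.23*CC - 16.2*ln(LOC)."""
    return (171
            - 5.2 * math.log(volume)
            - 0.23 * complexity
            - 16.2 * math.log(loc))
```

For example, a 45-line paragraph with Halstead volume 1,500 and cyclomatic complexity 7 scores an MI just under 70, in the "moderately maintainable" band.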

35.6 Dead Code Identification

Dead code is code that exists in the source but is never executed. In COBOL programs that have been maintained for decades, dead code is remarkably common — studies suggest that 10-30% of lines in large COBOL applications are dead code.

Types of Dead Code

Unreachable paragraphs: Paragraphs that no PERFORM, GO TO, or fall-through path can reach:

       PROCEDURE DIVISION.
       0000-MAIN.
           PERFORM 1000-INIT
           PERFORM 2000-PROCESS
           PERFORM 9000-CLEANUP
           STOP RUN.

       1000-INIT.
           ...

       1500-OLD-INIT.
      *    This paragraph is never PERFORMed or referenced
      *    It's dead code from a previous version
           ...

       2000-PROCESS.
           ...

Unreachable branches: Conditions that can never be true:

       01  WS-STATUS   PIC X.
           88 STATUS-ACTIVE   VALUE "A".
           88 STATUS-INACTIVE VALUE "I".
       ...
           IF WS-STATUS = "A"
               ...
           ELSE IF WS-STATUS = "I"
               ...
           ELSE IF WS-STATUS = "X"
      *        ^^^ Can this ever be true? If WS-STATUS is only
      *        set to "A" or "I" elsewhere, this is dead code.
               ...
           END-IF.

Commented-out code: Large blocks of code hidden behind comment markers:

      *    PERFORM 3500-OLD-CALCULATION
      *    IF WS-OLD-FLAG = "Y"
      *        MOVE WS-OLD-RATE TO WS-RATE
      *    END-IF
      *    These 47 lines were commented out in 2003
      *    Nobody knows why or whether they're still relevant

Unused data items: WORKING-STORAGE fields that are defined but never referenced in the PROCEDURE DIVISION:

       01  WS-OLD-COUNTER     PIC 9(5).
      *    This field is defined but never used anywhere.
      *    It may have been used by deleted code.

Detection Strategies

  1. Compiler cross-reference listing: The XREF compiler option (Enterprise COBOL) or -Xref (GnuCOBOL) produces a cross-reference listing showing every data item and paragraph, with the line numbers where they are referenced. Items with no references (other than their definition) are dead.

  2. Static analysis tools: SonarQube and ADDI automatically identify dead code.

  3. Manual review: Search for paragraphs that don't appear in any PERFORM, GO TO, or Section header.
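Strategy 3 can be partly scripted. The sketch below (Python; the function name and the fixed-format assumptions are ours) implements a crude cross-reference: it collects NNNN-NAME paragraph labels and flags any that never appear after PERFORM, GO TO, or THRU. It does not model fall-through or dynamic reachability, so its output is a list of candidates for review, not a verdict.

```python
import re

def find_dead_paragraphs(source):
    """Flag NNNN-NAME paragraphs never referenced by PERFORM/GO TO/THRU.

    Heuristic: assumes fixed-format source with numbered paragraph
    labels in area A. Fall-through reachability and dynamic calls
    are NOT modeled, so treat results as candidates, not facts.
    """
    lines = [l for l in source.splitlines()
             if not (len(l) > 6 and l[6] == "*")]   # drop comment lines
    text = "\n".join(lines).upper()
    # A label: up to 10 leading spaces, NNNN-NAME, a period, end of line
    defined = re.findall(r"^ {0,10}(\d{4}-[A-Z0-9-]+)\.\s*$",
                         text, re.MULTILINE)
    dead = []
    for para in defined:
        pattern = r"(?:PERFORM|GO TO|THRU)\s+" + re.escape(para) + r"\b"
        if not re.search(pattern, text):
            dead.append(para)
    return sorted(dead)
```

Note that the main driver (e.g. 0000-MAIN) is flagged too, since nothing PERFORMs it; a real check would exclude the entry paragraph by convention.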

The Case for Removing Dead Code

Dead code has real costs:

  • Confusion: Developers read dead code and try to understand its purpose, wasting time.
  • False dependencies: Dead code may reference copybooks or data items, creating the illusion that they are needed.
  • Compilation overhead: Dead code increases compile time and listing size.
  • Audit risk: Auditors may flag dead code as a maintenance risk.

However, removing dead code also has risks:

  • Hidden activation: Some "dead" code may be reached through dynamic calls or fall-through paths that are difficult to trace statically.
  • Documentation value: Commented-out code sometimes serves as documentation of past approaches.
  • Regression risk: Removing code, even dead code, changes the program. In risk-averse environments, any change requires testing.

⚖️ Debate: Should Dead Code Be Removed? The conservative position says "if it ain't broke, don't fix it" — dead code isn't hurting anything, and removing it introduces change risk. The progressive position says dead code is technical debt that accumulates interest — every developer who encounters it wastes time understanding it, and its presence makes the codebase larger and harder to navigate. Most experienced COBOL developers land somewhere in the middle: remove dead code opportunistically (when you're already modifying the program for another reason) rather than undertaking a dedicated cleanup project.

35.7 Copy/Paste Detection

Copy/paste duplication is endemic in COBOL. Before copybooks and subprograms were widely used, developers routinely copied blocks of code and modified them slightly. The result is code that looks similar across multiple programs — and bugs that must be fixed in multiple places.

Detecting Duplication

Static analysis tools measure duplication as a percentage of total lines. SonarQube, for example, reports:

Duplication Analysis: ACCT-MAINT.cbl
Total Lines: 4,200
Duplicated Lines: 630 (15.0%)
Duplicated Blocks: 12

Duplicate Block 1 (Lines 450-487 ≈ Lines 1200-1237):
  Account validation logic copied with minor changes
  Similarity: 94%

Duplicate Block 2 (Lines 890-920 ≈ Lines 2100-2130):
  Error handling boilerplate
  Similarity: 100%
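Under the hood, exact-clone detection is straightforward. The sketch below (Python, illustrative; names are ours) normalizes whitespace and case, then records every sliding window of lines and reports windows that start at more than one place. Commercial tools work on token streams instead, which also catches clones with renamed variables:

```python
from collections import defaultdict

def duplicate_blocks(lines, window=4):
    """Find runs of `window` identical normalized lines.

    Returns {block: [start_indices]} for every block of `window`
    consecutive lines that occurs at more than one position.
    """
    norm = [" ".join(line.split()).upper() for line in lines]
    starts = defaultdict(list)
    for i in range(len(norm) - window + 1):
        key = tuple(norm[i:i + window])
        if all(key):  # skip windows containing blank lines
            starts[key].append(i)
    return {k: v for k, v in starts.items() if len(v) > 1}
```

Merging overlapping hits into maximal blocks and dividing by total lines yields the duplication percentage reported above.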

Refactoring Duplication

When you find duplicated code, the refactoring path depends on the type of duplication:

Identical code: Extract into a copybook or subprogram.

      * BEFORE: Duplicated error handling in 5 paragraphs
       2000-PROCESS-CHECKING.
           ...
           IF WS-FILE-STATUS NOT = "00"
               MOVE "READ-ERROR" TO WS-ERR-TYPE
               MOVE WS-FILE-STATUS TO WS-ERR-CODE
               MOVE "ACCT-MASTER" TO WS-ERR-FILE
               PERFORM 9000-LOG-ERROR
               PERFORM 9100-ABORT
           END-IF.

      * AFTER: Extracted into a reusable paragraph
       9500-CHECK-FILE-STATUS.
           IF WS-FILE-STATUS NOT = "00"
               MOVE WS-FILE-STATUS TO WS-ERR-CODE
               PERFORM 9000-LOG-ERROR
               PERFORM 9100-ABORT
           END-IF.

      * Called as:
       2000-PROCESS-CHECKING.
           ...
           MOVE "READ-ERROR"  TO WS-ERR-TYPE
           MOVE "ACCT-MASTER" TO WS-ERR-FILE
           PERFORM 9500-CHECK-FILE-STATUS.

Similar code with minor variations: Use a parameterized subprogram or copybook with REPLACING:

      * COPY with REPLACING for parameterized inclusion
           COPY CPY-VALIDATE-FIELD
               REPLACING ==:FIELD-NAME:== BY ==ACCT-NUMBER==
                         ==:FIELD-SIZE:== BY ==10==
                         ==:FIELD-TYPE:== BY =="NUMERIC"==.

35.8 The Code Review Process

An effective code review process for COBOL has three stages: preparation, review, and follow-up.

Stage 1: Preparation

The developer submitting code for review should provide:

  1. The changed code (ideally with a diff showing what changed)
  2. The reason for the change (defect fix, new feature, refactoring)
  3. The test results (unit test output, integration test comparison)
  4. The impact assessment (what other programs or files might be affected)

Stage 2: Review

The reviewer examines the code against the review checklist (see below) and provides feedback. Reviews should be:

  • Specific: "Line 847: WS-TEMP-RATE is used before being initialized" — not "check your variables."
  • Constructive: "Consider using EVALUATE instead of nested IFs here for clarity" — not "this code is confusing."
  • Prioritized: Distinguish between critical issues (bugs), major issues (standards violations), and minor issues (style preferences).

Stage 3: Follow-up

The developer addresses all critical and major issues, responds to minor issues, and resubmits if necessary. The reviewer verifies the fixes.

The COBOL Code Review Checklist

COBOL CODE REVIEW CHECKLIST
============================

DATA DIVISION:
[ ] All numeric fields used in arithmetic are COMP or COMP-3
[ ] PIC clauses are appropriate for expected data ranges
[ ] Signed fields (PIC S...) used where negative values possible
[ ] VALUE clauses present for all accumulators and counters
[ ] 88-level conditions used for flag fields
[ ] No unused data items
[ ] Copybooks used for shared record layouts

PROCEDURE DIVISION:
[ ] All IF/EVALUATE/PERFORM have explicit scope terminators
[ ] No GO TO (except in error-handling exit patterns)
[ ] All I/O operations check FILE STATUS
[ ] All COMPUTE statements have ON SIZE ERROR
[ ] INITIALIZE used before first use of group items
[ ] Paragraphs are 50 lines or fewer
[ ] Nesting depth 3 levels or fewer
[ ] No PERFORM THRU (or justified if used)

LOGIC:
[ ] Boundary conditions handled (zero, max, exactly-at-threshold)
[ ] Error paths return meaningful messages
[ ] All EVALUATE branches have WHEN OTHER
[ ] Numeric comparisons account for decimal precision

DOCUMENTATION:
[ ] Program header complete (purpose, inputs, outputs, history)
[ ] Each paragraph has a purpose comment
[ ] Complex logic has inline explanations
[ ] Modification history updated

TESTING:
[ ] Unit tests cover new/changed logic
[ ] Regression suite passes
[ ] Test data includes boundary values and error cases

🧪 Lab Exercise: Take a program you wrote for a previous chapter and review it against the checklist above. How many items fail? Create a list of specific fixes, prioritized by severity, and refactor the program to address them.

35.9 GlobalBank Case Study: Reviewing Legacy ACCT-MAINT

When GlobalBank decided to modernize their account maintenance program (ACCT-MAINT), the first step was a thorough code review. The program had been in production since 1992 and had been modified by at least fifteen different developers. At 4,200 lines, it was one of the largest programs in the GLOBALBANK-CORE system.

Maria Chen and Derek Washington conducted the review together, combining Maria's deep domain knowledge with Derek's fresh eyes and formal training in code quality.

What They Found

Critical Issues (4):

  1. Line 847: WS-TEMP-RATE used in a COMPUTE before initialization. In normal operation, the field happened to contain zero from a previous MOVE, but a change in execution order could produce wrong results.
  2. Line 1456: VSAM WRITE with no FILE STATUS check. A duplicate-key condition would cause a silent failure.
  3. Line 2103: COMPUTE without ON SIZE ERROR for an annual fee calculation. Accounts with extremely high balances could overflow the result field.
  4. Line 3012: A nested IF structure where the period-based scope terminator was on the wrong line, causing the ELSE clause to associate with the wrong IF.

Major Issues (11):

  • Three paragraphs exceeding 100 lines (the longest was 187 lines)
  • Four instances of GO TO creating non-obvious control flow
  • Two blocks of code duplicated across different paragraphs (copy/paste)
  • One missing WHEN OTHER in an EVALUATE with 8 branches
  • One instance of a PIC 9(5)V99 field storing values that could exceed 99,999.99

Minor Issues (23):

  • Inconsistent paragraph naming (some with numeric prefixes, some without)
  • Missing paragraph header comments (8 paragraphs)
  • Commented-out code blocks (3 blocks totaling 94 lines)
  • Unused WORKING-STORAGE fields (6 fields)

Dead Code:

  • 3 paragraphs never referenced (from a 2004 feature that was later removed)
  • 94 lines of commented-out code
  • 6 unused data items

The Remediation Plan

Rather than fixing everything at once (which would require retesting the entire program), Maria proposed a phased remediation:

Phase 1 (Immediate): Fix the 4 critical issues. These represented potential production defects.

Phase 2 (Next Sprint): Address the 11 major issues. Break long paragraphs into smaller ones, replace GO TOs, eliminate duplication.

Phase 3 (Opportunistic): Fix minor issues whenever a developer is modifying nearby code. Remove dead code during Phase 2 refactoring.

Metrics Before and After

Metric                                   Before   After Phase 2
Lines of code                            4,200    3,650
Cyclomatic complexity (max paragraph)    47       12
Dead code lines                          194      0
Duplicated lines                         15%      3%
Paragraphs > 50 lines                    7        0
Nesting depth (max)                      7        3
GO TO statements                         4        0
Code review issues                       38       2 (minor only)

Derek noted that the refactored program was actually shorter than the original, despite being more thoroughly documented. "Dead code removal alone gave us 194 lines back," he said. "And breaking up the long paragraphs actually reduced total lines because we eliminated the duplication."

35.10 MedClaim Case Study: Applying Standards to CLM-ADJUD

When MedClaim hired three new COBOL developers to handle the regulatory change backlog, James Okafor realized they needed coding standards — not just for new code, but to establish a baseline for how CLM-ADJUD should look after modifications.

The Standards Document

James, Sarah Kim, and Tomás Rivera collaborated on a 12-page coding standards document. Key decisions:

  1. Paragraph naming: NNNN-VERB-NOUN format (e.g., 3000-VALIDATE-CLAIM, 4500-CALCULATE-COPAY). Numbering in increments of 100 within a section to allow insertions.

  2. Error handling pattern: Every paragraph that could fail must set WS-ERROR-CODE and WS-ERROR-MSG, then PERFORM 9000-ERROR-HANDLER. No inline error handling — centralize it.

  3. DB2 interaction pattern: All EXEC SQL must be followed by a check of SQLCODE. Use a standard 9500-CHECK-SQLCODE paragraph.

  4. Comment standard: Block header for every paragraph. Inline comments for non-obvious logic (but not for self-documenting code like MOVE 0 TO WS-COUNTER).
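Conventions like these are easiest to sustain when they can be checked mechanically. As an illustrative sketch (not part of MedClaim's actual toolchain), a few lines of Python can flag paragraph names that break the NNNN-VERB-NOUN rule; the regex and the helper name are assumptions for this example:

```python
import re

# Hypothetical checker for the NNNN-VERB-NOUN paragraph-naming standard:
# four digits, a hyphen, then at least two hyphen-separated uppercase words.
PARA_NAME = re.compile(r"^\d{4}-[A-Z]+(-[A-Z]+)+$")

def check_paragraph_names(names):
    """Return the paragraph names that violate the naming standard."""
    return [n for n in names if not PARA_NAME.match(n)]

violations = check_paragraph_names([
    "3000-VALIDATE-CLAIM",    # compliant
    "4500-CALCULATE-COPAY",   # compliant
    "PROCESS-STUFF",          # no numeric prefix
    "9000-CLEANUP",           # only one word after the number
])
print(violations)  # ['PROCESS-STUFF', '9000-CLEANUP']
```

A checker like this can run in the same pipeline that gathers the other metrics in this chapter, turning a 12-page standards document into an enforceable rule.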

Enforcing Standards on New Code

The new CLM-ADJUD module for telehealth processing was the first program written under the new standards. James used SonarQube to enforce compliance:

SonarQube Quality Gate: CLM-ADJUD-TELE
Status: PASSED

Metrics:
  Lines of Code: 1,847
  Duplicated Lines: 0.0%
  Cyclomatic Complexity: 32 (max paragraph: 8)
  Code Smells: 0
  Bugs: 0
  Vulnerabilities: 0

All 47 rules passed.

The contrast with the legacy CLM-ADJUD (which failed 23 of 47 rules) was striking. James presented both reports to management as evidence that the standards investment was paying off.

Gradual Migration of Legacy Code

James adopted a "boy scout rule" — leave the code better than you found it. Any time a developer modified a paragraph in the legacy CLM-ADJUD, they were required to:

  1. Bring that paragraph into compliance with the new standards
  2. Add a unit test for the paragraph (as discussed in Chapter 34)
  3. Remove any dead code in the immediate vicinity

Over six months, approximately 40% of CLM-ADJUD's paragraphs had been touched and brought into compliance, without a dedicated refactoring project.

35.11 Technical Debt Measurement

Technical debt is the accumulated cost of shortcuts, workarounds, and quality compromises in a codebase. Static analysis tools can quantify technical debt in terms of remediation effort:

Technical Debt Report: GLOBALBANK-CORE
============================================
Total Programs: 147
Total Lines of Code: 1,247,000

Technical Debt Summary:
  Estimated Remediation Effort: 4,200 person-hours
  Technical Debt Ratio: 8.3%
    (Remediation cost / development cost)

Breakdown by Category:
  Duplicated code:        1,400 hours (33%)
  Complexity:               900 hours (21%)
  Dead code:                650 hours (15%)
  Missing error handling:   500 hours (12%)
  Standards violations:     450 hours (11%)
  Documentation gaps:       300 hours  (7%)

Top 10 Most Indebted Programs:
  1. ACCT-MAINT.cbl    - 340 hours
  2. TXN-PROC.cbl      - 280 hours
  3. RPT-DAILY.cbl     - 220 hours
  ...
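The ratio in the report is simply remediation effort divided by estimated development effort. The report does not state the development estimate, so the sketch below back-solves it from the 8.3% figure; treat the ~50,600-hour value as an illustrative assumption, not data from the report:

```python
def technical_debt_ratio(remediation_hours, development_hours):
    """Technical debt ratio = remediation cost / development cost, as a %."""
    return remediation_hours / development_hours * 100

# 4,200 remediation hours is from the report; the development estimate is
# an assumed value chosen to be consistent with the reported 8.3% ratio.
remediation = 4_200
development = 50_600
print(f"{technical_debt_ratio(remediation, development):.1f}%")  # 8.3%
```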

Using Technical Debt to Prioritize

Technical debt metrics help managers make informed decisions:

  • Which programs need attention first? The ones with the highest debt and the most frequent modifications.
  • How much investment is needed? The total remediation estimate provides a budget number.
  • Are we getting better or worse? Tracking debt over time reveals whether new development is adding or reducing debt.

🔗 Cross-Reference: Technical debt measurement directly informs the modernization decisions we'll explore in Chapter 37 (Migration and Modernization). A program with low technical debt is a good candidate for wrapping or extending. A program with high technical debt may be better served by rewriting.

35.12 Automated Analysis Pipeline

In mature COBOL shops, static analysis is not a one-time activity — it is embedded in the development workflow as an automated pipeline. Every time code changes, the pipeline re-analyzes the program and reports new issues before they reach production.

Pipeline Architecture

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│ Developer│    │  Source  │    │  Static  │    │ Quality  │
│  Commits │───►│  Control │───►│ Analysis │───►│   Gate   │
│  Code    │    │  (Git/   │    │(SonarQube│    │  Pass/   │
│          │    │  Endevor)│    │  /ADDI)  │    │  Fail    │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
                                                     │
                                     ┌───────────────┼───────────────┐
                                     │               │               │
                                     ▼               ▼               ▼
                                ┌─────────┐   ┌──────────┐   ┌──────────┐
                                │ Pass:   │   │ Warn:    │   │ Fail:    │
                                │ Promote │   │ Review   │   │ Block    │
                                │ to Test │   │ Required │   │ Promotion│
                                └─────────┘   └──────────┘   └──────────┘

JCL for Automated Analysis

The following JCL invokes GnuCOBOL's compiler warnings and a custom analysis program in sequence:

//ANALYZE  JOB (ACCT),'STATIC ANALYSIS',CLASS=A,
//         MSGCLASS=X,MSGLEVEL=(1,1)
//*
//* Step 1: Compile with maximum warnings
//COMPILE  EXEC PGM=COBC,
//         PARM='-Wall -Wextra -fxref -x'
//SYSIN    DD DSN=PROD.SOURCE(BALCALC),DISP=SHR
//SYSLIB   DD DSN=PROD.COPYLIB,DISP=SHR
//SYSPRINT DD DSN=ANALYSIS.WARNINGS(BALCALC),DISP=SHR
//*
//* Step 2: Run cross-reference analysis
//XREF     EXEC PGM=XREFANLZ
//STEPLIB  DD DSN=TOOLS.LOAD,DISP=SHR
//INPUT    DD DSN=ANALYSIS.WARNINGS(BALCALC),DISP=SHR
//OUTPUT   DD DSN=ANALYSIS.XREF(BALCALC),DISP=SHR
//*
//* Step 3: Check for dead code
//DEADCODE EXEC PGM=DEADCHEK
//STEPLIB  DD DSN=TOOLS.LOAD,DISP=SHR
//SOURCE   DD DSN=PROD.SOURCE(BALCALC),DISP=SHR
//XREF     DD DSN=ANALYSIS.XREF(BALCALC),DISP=SHR
//REPORT   DD SYSOUT=*
//*
//* Step 4: Quality gate check
//QGATE    EXEC PGM=QLTYGAT
//STEPLIB  DD DSN=TOOLS.LOAD,DISP=SHR
//WARNINGS DD DSN=ANALYSIS.WARNINGS(BALCALC),DISP=SHR
//RULES    DD DSN=ANALYSIS.RULES(STANDARD),DISP=SHR
//RESULT   DD SYSOUT=*
//* RC=0: Pass  RC=4: Warnings  RC=8: Failed

Integrating with Jenkins

For shops using Jenkins for mainframe CI/CD, the analysis pipeline can be triggered automatically:

// Jenkinsfile stage for static analysis
stage('Static Analysis') {
    steps {
        sh 'submit JCL/ANALYZE-ALL.jcl'
        sh 'wait-for-job ANALYZE-ALL'
        sh 'parse-analysis-results.sh > analysis-report.json'
    }
    post {
        always {
            archiveArtifacts 'analysis-report.json'
            script {
                def report = readJSON file: 'analysis-report.json'
                if (report.critical_issues > 0) {
                    error "Static analysis found ${report.critical_issues} critical issues"
                }
                if (report.major_issues > 5) {
                    unstable "Static analysis found ${report.major_issues} major issues"
                }
            }
        }
    }
}

💡 Shift Left: The earlier you catch a defect, the cheaper it is to fix. A defect caught by static analysis during development costs approximately $50 to fix. The same defect caught in integration testing costs $500. In production, it costs $5,000-$50,000. Automated static analysis pipelines catch defects at the cheapest possible point.

35.13 Dead Code Detection Patterns in Depth

Dead code detection requires more than simply scanning for unreferenced paragraphs. Sophisticated detection must account for COBOL's unique control flow mechanisms.

Pattern A: Dynamic CALL Targets

Static analysis may flag a paragraph or program as dead when it is actually reached through a dynamic CALL:

      * This subprogram appears unreferenced in static analysis
      * because it's called dynamically
       01  WS-PROGRAM-NAME    PIC X(8).
       ...
           MOVE "RPTGEN01" TO WS-PROGRAM-NAME
           CALL WS-PROGRAM-NAME USING WS-REPORT-DATA

The program RPTGEN01 does not appear literally in any CALL statement — only the variable WS-PROGRAM-NAME does. A naive cross-reference analysis would mark RPTGEN01 as dead. The detection tool must trace the values assigned to WS-PROGRAM-NAME to identify the actual call targets.
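A tool can approximate this tracing by collecting every literal MOVEd into a variable that later appears as a CALL target. The following is a deliberately simplified sketch; a production analyzer must also follow copybooks, group MOVEs, REDEFINES, and values built at runtime:

```python
import re

def dynamic_call_targets(source):
    """Trace literals MOVEd into variables that are used in CALL <variable>,
    returning the set of program names reachable that way."""
    assigned = {}  # variable name -> literals moved into it
    for lit, var in re.findall(r'MOVE\s+"([A-Z0-9]+)"\s+TO\s+([A-Z0-9-]+)',
                               source):
        assigned.setdefault(var, set()).add(lit)
    targets = set()
    # CALL followed by an identifier (a quoted literal will not match here)
    for var in re.findall(r'CALL\s+([A-Z][A-Z0-9-]*)', source):
        targets |= assigned.get(var, set())
    return targets

snippet = '''
    MOVE "RPTGEN01" TO WS-PROGRAM-NAME
    CALL WS-PROGRAM-NAME USING WS-REPORT-DATA
'''
print(dynamic_call_targets(snippet))  # {'RPTGEN01'}
```

A dead-code detector would then treat every name in this set as "referenced," even though it never appears in a literal CALL statement.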

Pattern B: PERFORM THRU Hidden Inclusion

       PERFORM 2000-START THRU 2999-END.

       2000-START.
           ...

       2500-MIDDLE.
      *    This paragraph is NOT dead — it's included in the
      *    PERFORM THRU range. But it appears unreferenced in
      *    a simple paragraph-reference search.
           ...

       2999-END.
           EXIT.

Static analysis must account for PERFORM THRU ranges: any paragraph that lies between the start and end paragraphs in physical source order is potentially active, regardless of its name or number. COBOL executes the range in source order, not numeric order, so a misplaced paragraph can silently join (or fall out of) the range.
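A liveness analysis handles this by marking every paragraph from the start through the end of each THRU range, in source order. A minimal sketch over an ordered paragraph list (the names and helper are illustrative):

```python
def live_paragraphs(paragraph_order, perform_thru_ranges):
    """Mark every paragraph reachable through PERFORM THRU ranges.
    paragraph_order:      paragraph names in physical source order.
    perform_thru_ranges:  (start, end) pairs from PERFORM ... THRU."""
    index = {name: i for i, name in enumerate(paragraph_order)}
    live = set()
    for start, end in perform_thru_ranges:
        # Everything between start and end, inclusive, is potentially active.
        live.update(paragraph_order[index[start]:index[end] + 1])
    return live

order = ["2000-START", "2500-MIDDLE", "2999-END", "3000-OTHER"]
print(live_paragraphs(order, [("2000-START", "2999-END")]))
# 2500-MIDDLE is live even though it is never named directly
```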

Pattern C: Fall-Through Execution

In COBOL, when control reaches the end of a paragraph that is not the terminus of a PERFORM, and there is no explicit transfer such as GO TO, GOBACK, or STOP RUN, execution "falls through" into the next paragraph in source order:

       2000-PROCESS-A.
           MOVE "A" TO WS-TYPE
           PERFORM 3000-CALCULATE.
      *    No period-terminated STOP RUN here — execution falls
      *    through to 2100-PROCESS-B

       2100-PROCESS-B.
      *    This paragraph appears unreferenced but is reached
      *    by fall-through from 2000-PROCESS-A
           MOVE "B" TO WS-TYPE
           PERFORM 3000-CALCULATE.

Building a Dead Code Inventory

When conducting a dead code audit, build a structured inventory:

DEAD CODE INVENTORY: ACCT-MAINT.cbl
Audit Date: 2025-11-01
Auditor: Derek Washington

PARAGRAPHS (never PERFORMed or referenced):
  Line  Paragraph            Last Modified  Confidence
  ----  ---------            -------------  ----------
  1250  2500-OLD-VALIDATE    2004-03-15     HIGH
  1380  2600-LEGACY-FORMAT   2003-08-22     HIGH
  2810  7500-TEMP-CALC       2019-01-10     MEDIUM
        (may be called dynamically — investigate)

DATA ITEMS (defined but never referenced in PROCEDURE DIVISION):
  Line  Field                PIC            Confidence
  ----  -----                ---            ----------
   45   WS-OLD-COUNTER       PIC 9(5)       HIGH
   67   WS-TEMP-FLAG-2       PIC X          HIGH
   89   WS-ARCHIVE-DATE      PIC X(10)      MEDIUM
        (may be referenced via MOVE CORRESPONDING)
  112   WS-RESERVED-01       PIC X(50)      LOW
        (filler/reserved field — check if intentional)

COMMENTED-OUT CODE:
  Lines      Description            Last Modified
  -----      -----------            -------------
  890-936    Old fee calculation     2003-08-22
  1560-1575  Debug display stmts     2018-05-14
  2200-2241  Unused error handler    2011-11-30

TOTAL: 3 dead paragraphs, 4 unused fields, 94 commented lines
ESTIMATED CLEANUP EFFORT: 4 hours

Try It Yourself: Choose a COBOL program you have written (or one from a previous chapter exercise). Compile it with -fxref (GnuCOBOL) to produce a cross-reference listing. Examine the listing for data items and paragraphs that appear only once (at their definition). These are candidates for dead code. Document each one and determine whether it is truly dead or reached through an indirect mechanism.

35.14 Complexity Metrics Calculation: A Worked Example

Let us walk through a complete complexity analysis of a realistic COBOL paragraph. This exercise demonstrates how to calculate cyclomatic complexity, assess maintainability, and make data-driven refactoring decisions.

The Paragraph Under Analysis

       3000-PROCESS-CLAIM.
           IF CLM-STATUS = "N"                           *> +1 (IF)
               IF CLM-MEMBER-ACTIVE = "Y"                *> +1 (IF)
                   IF CLM-PROVIDER-TYPE = "I"            *> +1 (IF)
                      AND CLM-NETWORK-STATUS = "Y"       *> +1 (AND)
                       PERFORM 3100-IN-NETWORK-CALC
                   ELSE
                       IF CLM-PROVIDER-TYPE = "O"        *> +1 (IF)
                           PERFORM 3200-OUT-NETWORK-CALC
                       ELSE
                           MOVE "INVALID-PROV-TYPE"
                               TO CLM-ERROR-MSG
                           PERFORM 9000-ERROR-HANDLER
                       END-IF
                   END-IF
               ELSE
                   EVALUATE CLM-INACTIVE-REASON
                       WHEN "T"                          *> +1 (WHEN)
                           PERFORM 3300-TERMED-MEMBER
                       WHEN "C"                          *> +1 (WHEN)
                           PERFORM 3400-COBRA-PROCESS
                       WHEN "S"                          *> +1 (WHEN)
                           PERFORM 3500-SUSPENDED-MEMBER
                       WHEN OTHER                        *> +1 (WHEN)
                           MOVE "UNKNOWN-INACTIVE"
                               TO CLM-ERROR-MSG
                           PERFORM 9000-ERROR-HANDLER
                   END-EVALUATE
               END-IF
           ELSE
               IF CLM-STATUS = "P"                       *> +1 (IF)
                   PERFORM 3600-REPROCESS-PENDING
               ELSE IF CLM-STATUS = "H"                  *> +1 (IF)
                   PERFORM 3700-RELEASE-HOLD
               ELSE
                   MOVE "INVALID-STATUS" TO CLM-ERROR-MSG
                   PERFORM 9000-ERROR-HANDLER
               END-IF
           END-IF.

Calculating Cyclomatic Complexity

Starting from the base of 1 (program entry), we count each decision point:

Decision Point                  Line   Type          Running Total
─────────────────────────────   ────   ───────────   ─────────────
Base                              —    Entry point               1
CLM-STATUS = "N"                  2    IF                        2
CLM-MEMBER-ACTIVE = "Y"           3    IF                        3
CLM-PROVIDER-TYPE = "I"           4    IF                        4
AND CLM-NETWORK-STATUS = "Y"      5    AND                       5
CLM-PROVIDER-TYPE = "O"           8    IF                        6
WHEN "T"                         16    WHEN                      7
WHEN "C"                         18    WHEN                      8
WHEN "S"                         20    WHEN                      9
WHEN OTHER                       22    WHEN                     10
CLM-STATUS = "P"                 28    IF                       11
CLM-STATUS = "H"                 30    IF                       12

Cyclomatic Complexity = 12 — in the "moderate" range, but approaching "high." This paragraph is a candidate for refactoring.
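The counting rule can be automated: start at 1 and add one for each IF, WHEN, AND, OR, or UNTIL. Below is a rough textual sketch; a real tool parses the source instead, so that it does not miscount keywords appearing inside literals, comments, or data names:

```python
import re

# Decision keywords per the counting rule above.  The (?<!END-) guard
# keeps the IF inside END-IF from being counted as another decision.
DECISION = re.compile(r'\b(?:(?<!END-)IF|WHEN|AND|OR|UNTIL)\b')

def cyclomatic_complexity(paragraph_source):
    """Approximate McCabe complexity: 1 (entry) plus one per decision point."""
    return 1 + len(DECISION.findall(paragraph_source))

snippet = '''
    IF A = 1 AND B = 2
        PERFORM X
    ELSE
        EVALUATE C
            WHEN "1" PERFORM Y
            WHEN OTHER PERFORM Z
        END-EVALUATE
    END-IF
'''
print(cyclomatic_complexity(snippet))  # 5: base + IF + AND + two WHENs
```

Run over 3000-PROCESS-CLAIM, this rule reproduces the count of 12 tabulated above.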

Refactoring for Reduced Complexity

We can reduce the complexity by extracting the inner decision trees into separate paragraphs:

       3000-PROCESS-CLAIM.
           EVALUATE CLM-STATUS
               WHEN "N"
                   PERFORM 3010-PROCESS-NEW-CLAIM
               WHEN "P"
                   PERFORM 3600-REPROCESS-PENDING
               WHEN "H"
                   PERFORM 3700-RELEASE-HOLD
               WHEN OTHER
                   MOVE "INVALID-STATUS" TO CLM-ERROR-MSG
                   PERFORM 9000-ERROR-HANDLER
           END-EVALUATE.
      *    Cyclomatic complexity: 4

       3010-PROCESS-NEW-CLAIM.
           IF CLM-MEMBER-ACTIVE = "Y"
               PERFORM 3020-ACTIVE-MEMBER-CLAIM
           ELSE
               PERFORM 3030-INACTIVE-MEMBER-CLAIM
           END-IF.
      *    Cyclomatic complexity: 2

       3020-ACTIVE-MEMBER-CLAIM.
           EVALUATE CLM-PROVIDER-TYPE
               WHEN "I"
                   IF CLM-NETWORK-STATUS = "Y"
                       PERFORM 3100-IN-NETWORK-CALC
                   ELSE
                       MOVE "INVALID-PROV-TYPE"
                           TO CLM-ERROR-MSG
                       PERFORM 9000-ERROR-HANDLER
                   END-IF
               WHEN "O"
                   PERFORM 3200-OUT-NETWORK-CALC
               WHEN OTHER
                   MOVE "INVALID-PROV-TYPE"
                       TO CLM-ERROR-MSG
                   PERFORM 9000-ERROR-HANDLER
           END-EVALUATE.
      *    Cyclomatic complexity: 5

       3030-INACTIVE-MEMBER-CLAIM.
           EVALUATE CLM-INACTIVE-REASON
               WHEN "T"
                   PERFORM 3300-TERMED-MEMBER
               WHEN "C"
                   PERFORM 3400-COBRA-PROCESS
               WHEN "S"
                   PERFORM 3500-SUSPENDED-MEMBER
               WHEN OTHER
                   MOVE "UNKNOWN-INACTIVE"
                       TO CLM-ERROR-MSG
                   PERFORM 9000-ERROR-HANDLER
           END-EVALUATE.
      *    Cyclomatic complexity: 4

The refactored version has a maximum paragraph complexity of 5 (down from 12). Each paragraph is independently understandable and testable. The total logic is the same, but the cognitive load on the reader is dramatically lower.

Calculating the Maintainability Index

For the original paragraph (before refactoring):

Lines of Code (LOC) = 35
Cyclomatic Complexity (CC) = 12
Halstead Volume (V) ≈ 450 (estimated)

MI = 171 - 5.2 * ln(450) - 0.23 * 12 - 16.2 * ln(35)
   = 171 - 5.2 * 6.11 - 2.76 - 16.2 * 3.56
   = 171 - 31.77 - 2.76 - 57.67
   = 78.8

Maintainability Index of 78.8 falls in the "moderately maintainable" range. After refactoring, with each paragraph under 15 lines and complexity under 5, the MI for each paragraph would be in the "highly maintainable" range (above 85).
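Scripting the formula makes it practical to compute the MI for every paragraph in a program. A sketch using the same constants (it yields 78.9 rather than 78.8 only because it does not round the intermediate terms):

```python
import math

def maintainability_index(halstead_volume, cyclomatic, loc):
    """MI = 171 - 5.2*ln(V) - 0.23*CC - 16.2*ln(LOC)."""
    return (171
            - 5.2 * math.log(halstead_volume)
            - 0.23 * cyclomatic
            - 16.2 * math.log(loc))

# Figures for the original 3000-PROCESS-CLAIM paragraph
print(round(maintainability_index(450, 12, 35), 1))  # 78.9
```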

📊 By the Numbers: A study of 50 COBOL programs at a major financial institution found that paragraphs with cyclomatic complexity above 15 had a defect rate 4.2 times higher than paragraphs with complexity below 10. The correlation between complexity and defect density was 0.78 — a strong positive correlation. Refactoring high-complexity paragraphs reduced defect injection rates by 60% in subsequent modification cycles.

35.15 Code Review Workflow in Practice

A formal code review process in a COBOL shop involves specific roles, artifacts, and decision points. Let us trace a complete review workflow through the GlobalBank process.

The Review Workflow

┌─────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐
│Developer│   │  Submit  │   │ Reviewer │   │ Approved │
│ Writes  │──►│  Review  │──►│ Examines │──►│    or    │
│ Code    │   │  Package │   │  Code    │   │ Rejected │
└─────────┘   └──────────┘   └──────────┘   └──────────┘
     ▲                                           │
     │             ┌──────────┐  (if rejected)   │
     └─────────────│  Rework  │◄─────────────────┘
                   │   and    │
                   │ Resubmit │
                   └──────────┘

The Review Package

At GlobalBank, Derek Washington submits a review package containing:

  1. Change Request Reference: CR-2025-0847 — "Add money market penalty calculation for early withdrawal"
  2. Diff Listing: Generated by Endevor or SCLM showing exactly which lines changed
  3. Impact Assessment: "Changes ACCT-MAINT paragraphs 3200-3250. No copybook changes. No DB2 changes. Affects money market accounts only."
  4. Test Results: "Unit tests: 47 passed, 0 failed. Integration test: nightly batch comparison clean."
  5. Static Analysis: "SonarQube: 0 critical, 0 major, 1 minor (paragraph comment style)"

Reviewer Checklist with Specific Findings

Maria Chen reviews Derek's code against the standard checklist and documents her findings:

CODE REVIEW FINDINGS
Reviewer: Maria Chen
Date: 2025-11-05
Program: ACCT-MAINT
Change Request: CR-2025-0847

CRITICAL (must fix before promotion):
  None

MAJOR (should fix before promotion):
  1. Line 3215: COMPUTE without ON SIZE ERROR
     Risk: Penalty amount could overflow PIC S9(7)V99
           for very high balance accounts
     Fix: Add ON SIZE ERROR clause

  2. Line 3230: Missing boundary check
     Risk: If WS-WITHDRAWAL-DATE < ACCT-MATURITY-DATE
           is false AND dates are equal, penalty
           is incorrectly applied
     Fix: Change < to <= for maturity date comparison

MINOR (fix opportunistically):
  1. Line 3200: Paragraph header comment references
     "early termination" but paragraph name says
     "penalty calculation" — terminology inconsistent
  2. Line 3245: WS-PENALTY-RATE declared as DISPLAY
     instead of COMP-3 — performance impact negligible
     for current volumes but inconsistent with standards

APPROVED: With conditions (fix major issues 1 and 2)

This structured feedback gives Derek specific, actionable items with clear prioritization. The major issues represent real defect risks that would affect production behavior. The minor issues improve code quality but do not block promotion.

⚖️ Debate: Formal vs. Informal Reviews: Formal reviews (structured process, documented findings, sign-off required) are more thorough but slower. Informal reviews (desk check, pair programming, quick walkthrough) are faster but less rigorous. Most COBOL shops use formal reviews for production code changes and informal reviews for development-phase code. The critical factor is that some form of review happens for every change — the formality level can vary based on risk.

35.16 Additional Defect Patterns: Advanced Examples

Beyond the eight common patterns in Section 35.2, experienced reviewers learn to spot subtler defects that arise from COBOL's specific language semantics. These patterns are less common but often more damaging when they occur.

Pattern 9: MOVE Truncation Without Warning

When moving data between fields of different sizes, COBOL silently truncates without raising an error:

      * SILENT BUG: Alphanumeric truncation
       01  WS-FULL-NAME       PIC X(40).
       01  WS-SHORT-NAME      PIC X(20).
       ...
           MOVE "WASHINGTON-MONTGOMERY, DEREK JAMES III"
               TO WS-FULL-NAME
           MOVE WS-FULL-NAME TO WS-SHORT-NAME
      *    WS-SHORT-NAME = "WASHINGTON-MONTGOMER"
      *    Last 20 characters lost — no error, no warning

      * SILENT BUG: Numeric truncation on the left
       01  WS-LARGE-AMOUNT    PIC S9(11)V99 COMP-3.
       01  WS-SMALL-AMOUNT    PIC S9(7)V99  COMP-3.
       ...
           MOVE 12345678.90 TO WS-LARGE-AMOUNT
           MOVE WS-LARGE-AMOUNT TO WS-SMALL-AMOUNT
      *    WS-SMALL-AMOUNT = 2345678.90
      *    High-order digit "1" lost — $10 million vanished!

Reviewers should verify that destination PIC clauses are always large enough to hold the maximum value from the source field. This is especially critical in financial calculations where intermediate results may exceed the size of output fields.
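These truncation rules are mechanical, which means they can be simulated and checked automatically. Below is a simplified Python model of the two MOVEs above; real MOVE semantics also cover editing, justification, and sign handling:

```python
def move_alphanumeric(value, dest_size):
    """COBOL alphanumeric MOVE: left-justify, truncate or pad on the right."""
    return value[:dest_size].ljust(dest_size)

def move_numeric(value_cents, dest_int_digits, dest_dec_digits=2):
    """COBOL numeric MOVE: align on the decimal point and silently drop
    high-order digits that do not fit (value passed as an integer in cents)."""
    modulus = 10 ** (dest_int_digits + dest_dec_digits)
    return value_cents % modulus

print(move_alphanumeric("WASHINGTON-MONTGOMERY, DEREK JAMES III", 20))
# 'WASHINGTON-MONTGOMER'
print(move_numeric(1234567890, 7) / 100)
# 2345678.9 — the leading digit of 12,345,678.90 is lost
```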

Pattern 10: Improper Subscript Handling

      * BUG: Subscript not validated before use
       01  WS-TABLE.
           05 WS-ENTRY OCCURS 100 TIMES  PIC X(20).
       01  WS-INDEX            PIC 9(4).
       ...
           ACCEPT WS-INDEX FROM ENVIRONMENT "ENTRY-INDEX"
           MOVE WS-ENTRY(WS-INDEX) TO WS-OUTPUT
      *    If WS-INDEX = 0 or > 100, this causes a storage
      *    overlay — corrupting adjacent memory silently
      *    (or an S0C4 ABEND if SSRANGE is active)

      * FIX: Always validate subscripts
           ACCEPT WS-INDEX FROM ENVIRONMENT "ENTRY-INDEX"
           IF WS-INDEX >= 1 AND WS-INDEX <= 100
               MOVE WS-ENTRY(WS-INDEX) TO WS-OUTPUT
           ELSE
               MOVE "INVALID INDEX" TO WS-ERROR-MSG
               PERFORM 9000-ERROR-HANDLER
           END-IF

Pattern 11: Incorrect String Handling with INSPECT

      * SUBTLE BUG: INSPECT REPLACING counts wrong
       01  WS-ACCOUNT-NUM     PIC X(10) VALUE "0012345678".
       01  WS-ZERO-COUNT      PIC 9(3).
       ...
      *    Developer wants to strip leading zeros
           INSPECT WS-ACCOUNT-NUM
               TALLYING WS-ZERO-COUNT
               FOR LEADING "0"
      *    WS-ZERO-COUNT = 2 (correct)

           INSPECT WS-ACCOUNT-NUM
               REPLACING LEADING "0" BY SPACES
      *    WS-ACCOUNT-NUM = "  12345678" (correct)

      *    BUT: If the developer forgot to initialize WS-ZERO-COUNT
      *    before the TALLYING, it accumulates across calls!
      *    Second call: WS-ZERO-COUNT starts at 2, not 0
      *    After tallying: WS-ZERO-COUNT = 4 (wrong!)
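The accumulating tally is easy to demonstrate outside COBOL. The helper below is a hypothetical model of INSPECT ... TALLYING FOR LEADING; the key point it mirrors is that COBOL adds to the counter rather than replacing it:

```python
def inspect_tallying_leading(counter, text, char):
    """Model of INSPECT ... TALLYING ... FOR LEADING: the number of
    leading occurrences is ADDED to the existing counter, never reset."""
    leading = len(text) - len(text.lstrip(char))
    return counter + leading

count = 0
count = inspect_tallying_leading(count, "0012345678", "0")
print(count)  # 2 — correct on the first call

# Second call without re-initializing the counter:
count = inspect_tallying_leading(count, "0012345678", "0")
print(count)  # 4 — the tally accumulated, exactly as in the COBOL bug
```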

Pattern 12: REDEFINES Alignment Mismatch

      * DANGEROUS: REDEFINES with different interpretations
       01  WS-DATE-FIELD.
           05 WS-DATE-NUM      PIC 9(8).
       01  WS-DATE-PARTS REDEFINES WS-DATE-FIELD.
           05 WS-DATE-YEAR     PIC 9(4).
           05 WS-DATE-MONTH    PIC 9(2).
           05 WS-DATE-DAY      PIC 9(2).
       ...
           MOVE 20251015 TO WS-DATE-NUM
           IF WS-DATE-MONTH > 12
               DISPLAY "Invalid month"
           END-IF
      *    Works if format is always YYYYMMDD
      *    FAILS SILENTLY if someone moves MMDDYYYY format:
           MOVE 10152025 TO WS-DATE-NUM
      *    Now WS-DATE-YEAR = 1015, WS-DATE-MONTH = 20
      *    The month check catches it, but YEAR is corrupted

Reviewers should verify that every REDEFINES has clear documentation of the expected format, and that all code paths that populate the redefined field use the correct format.
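A positional model makes the hazard explicit: the REDEFINES simply cuts the same eight digits at fixed offsets, with no knowledge of what format was stored. A sketch of the two interpretations:

```python
def date_parts(date_num):
    """Interpret an 8-digit field as YYYYMMDD, as the REDEFINES does."""
    digits = f"{date_num:08d}"
    return int(digits[0:4]), int(digits[4:6]), int(digits[6:8])

print(date_parts(20251015))  # (2025, 10, 15) — intended YYYYMMDD layout
print(date_parts(10152025))  # (1015, 20, 25) — MMDDYYYY silently misread
```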

🧪 Lab Exercise: Create a COBOL program that contains at least one instance of each defect pattern from Patterns 9-12. Then write a reviewer's report identifying each defect, explaining the risk, and proposing a fix. Compile the program and observe which defects the compiler catches (with -Wall -Wextra) and which it misses. This exercise demonstrates why code review and static analysis are complementary — neither catches everything alone.

35.17 Measuring Review Effectiveness

How do you know if your code review process is actually working? Measuring review effectiveness requires tracking both leading indicators (review activity) and lagging indicators (production defects).

Metrics to Track

Metric                 What It Measures                                            Target
─────────────────────  ──────────────────────────────────────────────────────────  ───────────────────────────
Review coverage        % of code changes that receive review                       100%
Defect density         Issues found per 1,000 lines reviewed                       5-15 (lower = cleaner code)
Review latency         Time from submission to review completion                   < 2 business days
Rework rate            % of reviews requiring re-submission                        < 20%
Escape rate            Production defects that should have been caught in review   < 10% of production defects
Critical escape rate   Critical production defects missed by review                0%

GlobalBank's Review Metrics Dashboard

After six months of formal code reviews, Maria Chen presented the following data to management:

CODE REVIEW EFFECTIVENESS REPORT
Period: April 2025 - September 2025
Programs Reviewed: 89
Total Changes Reviewed: 234
Lines of Code Reviewed: 47,800

DEFECT DETECTION:
  Issues Found in Review:     412
    Critical:                   18
    Major:                     127
    Minor:                     267

  Production Defects (same period): 4
    Defects that review should have caught: 1
    Escape rate: 25% (1 of 4)
    Critical escape rate: 0%

COMPARISON TO PRE-REVIEW PERIOD:
  Production defects (previous 6 months): 14
  Reduction: 71%

REVIEW EFFICIENCY:
  Average review time: 45 minutes per review
  Average defects found: 1.8 per review
  Cost of review: ~$150 per review (45 min * $200/hr)
  Cost of production defect: ~$15,000 per defect
  Defects prevented: ~10 (estimated)
  Savings: ~$150,000 - $35,100 review cost = $114,900 net savings

The data made an unambiguous case: code review was saving GlobalBank approximately $115,000 per six-month period, with a 327% return on investment. Management approved expanding the program to include MedClaim's codebase.
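The dashboard's headline numbers are simple arithmetic over the review counts, which makes them easy to recompute or audit. A sketch using the report's figures (all dollar values are the report's own estimates):

```python
reviews = 234
minutes_per_review = 45
rate_per_hour = 200
defects_prevented = 10           # report's estimate
cost_per_production_defect = 15_000

review_cost = reviews * minutes_per_review / 60 * rate_per_hour
savings = defects_prevented * cost_per_production_defect
net = savings - review_cost
roi = net / review_cost * 100

print(f"Review cost: ${review_cost:,.0f}")  # $35,100
print(f"Net savings: ${net:,.0f}")          # $114,900
print(f"ROI: {roi:.0f}%")                   # 327%

# Defect density: issues found per 1,000 lines reviewed
print(f"Density: {412 / 47.8:.1f}/KLOC")    # 8.6 — within the 5-15 target
```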

Beyond point-in-time metrics, tracking trends reveals whether code quality is improving or degrading:

Month     Reviews   Issues/Review   Critical   Rework%
───────   ───────   ─────────────   ────────   ───────
April        35          2.8           4         28%
May          38          2.4           3         24%
June         42          2.1           2         20%
July         40          1.8           1         16%
August       39          1.6           0         14%
September    40          1.4           0         12%

The downward trend in issues-per-review indicates that developers are learning from review feedback and writing better code initially. The elimination of critical issues by month five shows that the most dangerous defect patterns are being internalized. The declining rework rate means reviews are finding fewer issues that require re-submission — developers are getting it right the first time.

Maria Chen called this the "review dividend" — the investment in reviews pays off not just by catching bugs, but by teaching developers to avoid bugs in the first place. "The best code review," she told her team, "is the one that finds nothing — because the developer already caught everything before submitting."

Setting Up Review Metrics Collection

To collect review metrics systematically, GlobalBank built a simple COBOL program that parsed review reports and produced the monthly dashboard:

      *================================================================*
      * REVW-METRICS: Aggregate code review findings for reporting.
      * Input: Sequential file of review records
      * Output: Monthly summary report
      *================================================================*
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REVW-METRICS.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT REVIEW-FILE ASSIGN TO REVWIN
               ORGANIZATION IS SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  REVIEW-FILE.
       01  REVIEW-RECORD.
           05 RVW-DATE            PIC X(10).
           05 RVW-PROGRAM         PIC X(8).
           05 RVW-REVIEWER        PIC X(20).
           05 RVW-DEVELOPER       PIC X(20).
           05 RVW-CRITICAL-COUNT  PIC 9(3).
           05 RVW-MAJOR-COUNT     PIC 9(3).
           05 RVW-MINOR-COUNT     PIC 9(3).
           05 RVW-RESULT          PIC X.
               88 RVW-APPROVED    VALUE "A".
               88 RVW-REWORK      VALUE "R".

       WORKING-STORAGE SECTION.
       01  WS-EOF-FLAG            PIC X     VALUE "N".
           88 END-OF-FILE                   VALUE "Y".
       01  WS-MONTH-TOTALS.
           05 WS-REVIEW-COUNT     PIC 9(5)  VALUE 0.
           05 WS-TOTAL-CRITICAL   PIC 9(5)  VALUE 0.
           05 WS-TOTAL-MAJOR      PIC 9(5)  VALUE 0.
           05 WS-TOTAL-MINOR      PIC 9(5)  VALUE 0.
           05 WS-REWORK-COUNT     PIC 9(5)  VALUE 0.
           05 WS-AVG-ISSUES       PIC 9(3)V99.
           05 WS-REWORK-PCT       PIC 9(3)V9.

       PROCEDURE DIVISION.
       0000-MAIN.
           OPEN INPUT REVIEW-FILE
           PERFORM 1000-READ-REVIEW
           PERFORM 2000-ACCUMULATE
               UNTIL END-OF-FILE
           PERFORM 3000-CALCULATE-AVERAGES
           PERFORM 4000-PRINT-REPORT
           CLOSE REVIEW-FILE
           STOP RUN.

      *    Paragraphs 1000-READ-REVIEW through 4000-PRINT-REPORT
      *    omitted for brevity.

This program exemplifies the principle that quality processes themselves can be supported by COBOL tooling — the language's strength in batch processing and report generation makes it well-suited for analyzing its own quality metrics.

📊 By the Numbers: The software engineering literature consistently finds that code review is one of the highest-ROI quality practices available. A landmark study by Fagan (1976) found that formal inspections removed 60-90% of defects before testing. More recent studies confirm these findings: Google reports that code review catches approximately 15% of all bugs discovered in their codebase, with a much higher rate for design-level issues. For COBOL specifically, the high cost of production defects in financial and healthcare systems makes the ROI of review even more compelling.

35.18 MedClaim Case Study: Static Analysis Catches a Critical Defect

Six months after implementing SonarQube for COBOL, James Okafor's static analysis pipeline caught a defect that could have cost MedClaim millions. During a routine analysis run on a modified version of CLM-ADJUD, the tool flagged a critical issue:

CRITICAL: Line 2847 — Variable WS-COORD-BENEFIT-AMT
used before initialization in paragraph 4200-CALC-COB.
This variable is used in a SUBTRACT operation at line 2852.
If the program reaches line 2847 without executing
paragraph 4100-LOAD-COB-DATA, WS-COORD-BENEFIT-AMT
contains unpredictable residual data.

Investigation revealed that a recent code change had added a new path through the adjudication logic that bypassed paragraph 4100-LOAD-COB-DATA under certain conditions — specifically, when a claim had coordination of benefits (COB) but the secondary payer had already paid in full. In this case, the program jumped directly to 4200-CALC-COB, where WS-COORD-BENEFIT-AMT was subtracted from the total payment. With residual data in the field, the subtraction could produce any result — including negative payment amounts or massively inflated payments.

The defect had been introduced two weeks earlier and had passed code review because the reviewer focused on the new logic path and did not trace the data flow back to the initialization point. Static analysis caught what human review missed — exactly the complementary relationship between the two practices.

"This is the defect that justified our entire SonarQube investment," James told his team. "If this had reached production during a high-COB batch run, we could have overpaid or underpaid millions of dollars in claims before anyone noticed."

The fix was simple — add an INITIALIZE statement for WS-COORD-BENEFIT-AMT at the beginning of 4200-CALC-COB — but the consequences of the unfixed defect could have been severe. This is the value proposition of static analysis in a single example: a $50 fix prevented a potential $5 million error.

🔗 Cross-Reference: This case study illustrates why the testing, review, and analysis practices in Chapters 34-35 work best as a three-layered defense. Unit tests (Chapter 34) would have caught this defect only if a test case exercised the specific path that bypassed 4100-LOAD-COB-DATA — and since the path was newly introduced, no such test existed yet. Code review (this chapter) missed it because the reviewer did not trace data flow across paragraphs. Static analysis caught it because automated tools systematically trace every possible path through the code. No single practice catches everything — but together, they catch almost everything.

35.19 Integrating Static Analysis with Source Control

Modern COBOL shops increasingly use source control systems like Git or Endevor to manage code changes. Integrating static analysis with source control creates a feedback loop where quality checks happen automatically on every change.

Pre-Commit Analysis

Before code is committed to the repository, a pre-commit hook runs the static analysis pipeline:

#!/bin/bash
# Pre-commit hook for COBOL static analysis
# Place in .git/hooks/pre-commit

CHANGED_FILES=$(git diff --cached --name-only --diff-filter=ACM \
    | grep '\.cbl$')

if [ -z "$CHANGED_FILES" ]; then
    exit 0
fi

# Use a private temp file rather than a fixed /tmp path,
# and clean it up when the hook exits.
WARNINGS=$(mktemp)
trap 'rm -f "$WARNINGS"' EXIT

echo "Running static analysis on changed COBOL files..."

for FILE in $CHANGED_FILES; do
    echo "Analyzing: $FILE"
    cobc -Wall -Wextra -fsyntax-only "$FILE" 2> "$WARNINGS"

    CRITICAL=$(grep -c "error:" "$WARNINGS")
    if [ "$CRITICAL" -gt 0 ]; then
        echo "BLOCKED: $FILE has $CRITICAL critical issues"
        cat "$WARNINGS"
        exit 1
    fi
done

echo "Static analysis passed."
exit 0

This hook prevents any COBOL file with critical static analysis issues from being committed. Developers must fix the issues before their changes enter the codebase — enforcing quality at the earliest possible point.

Trend Reports from Source History

By analyzing the static analysis results stored alongside each commit, you can generate trend reports showing how code quality evolves with each change:

Program: ACCT-MAINT.cbl
Quality Trend by Commit:
──────────────────────────────────────────────────────────
Commit    Date         Issues  CC(max)  Dead Lines  Dupl%
──────────────────────────────────────────────────────────
a3f2c9    2025-08-01     38      47       194       15.0%
b7d4e1    2025-09-15     31      12       194       12.3%
c9a1f3    2025-10-01     24      12         0        3.0%
d2b5f7    2025-10-20     18      10         0        2.8%
e4c8a2    2025-11-05     12       8         0        2.5%
──────────────────────────────────────────────────────────
Trend: Improving (68% reduction in issues over 3 months)

This trend shows the cumulative effect of the "boy scout rule" — leaving code better than you found it. Each commit reduces the issue count, complexity, dead code, and duplication, even though no dedicated cleanup project was undertaken.
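Generating such a report is mostly bookkeeping: capture a metrics snapshot per commit, render the rows, and compute the overall issue reduction. A minimal Python sketch, using illustrative values taken from the table above (the commit hashes are hypothetical):

```python
# Sketch: render a quality-trend table from per-commit metric snapshots.
# In practice each snapshot would come from running the analysis pipeline
# against that commit's checkout; here two snapshots are inlined.

snapshots = [
    # (commit, date, issues, max cyclomatic complexity, dead lines, dupl %)
    ("a3f2c9", "2025-08-01", 38, 47, 194, 15.0),
    ("e4c8a2", "2025-11-05", 12,  8,   0,  2.5),
]

def issue_reduction(snaps):
    """Percentage drop in issue count from first to last snapshot."""
    first, last = snaps[0][2], snaps[-1][2]
    return round(100 * (first - last) / first)

def render(snaps):
    """Format the snapshots as a fixed-width trend table."""
    header = (f"{'Commit':<10}{'Date':<13}{'Issues':>6}"
              f"{'CC(max)':>9}{'Dead':>6}{'Dupl%':>7}")
    rows = [f"{c:<10}{d:<13}{i:>6}{cc:>9}{dead:>6}{dup:>7.1f}"
            for c, d, i, cc, dead, dup in snaps]
    return "\n".join([header] + rows)

print(render(snapshots))
print(f"Trend: {issue_reduction(snapshots)}% reduction in issues")
```

Hooking this to real history is a loop over `git log`, checking out each commit and feeding the analyzer's output into a snapshot tuple.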

35.20 Summary

Code review and static analysis form the quality foundation of professional COBOL development. Together with the unit testing practices from Chapter 34, they create a three-layered defense against defects:

  1. Static analysis catches mechanical defects — the patterns that tools can detect automatically.
  2. Code review catches design defects — the problems that require human judgment to identify.
  3. Testing catches behavioral defects — the bugs that only manifest at runtime.

The key concepts from this chapter:

  • Common defect patterns in COBOL include uninitialized fields, missing scope terminators, wrong PIC clauses, unchecked file status, and GO TO spaghetti. Learn to recognize these on sight.
  • Coding standards prevent defects and ensure consistency. They should cover naming, structure, documentation, and error handling.
  • Static analysis tools (SonarQube, ADDI, compiler warnings) automate the detection of known defect patterns and standards violations.
  • Complexity metrics (cyclomatic complexity, lines of code, maintainability index) quantify how difficult code is to understand and maintain.
  • Dead code should be identified and removed opportunistically to reduce confusion and maintenance burden.
  • Copy/paste detection reveals duplication that should be refactored into copybooks or subprograms.
  • Technical debt measurement provides data-driven prioritization for quality improvement investments.

As Maria Chen tells her team: "We can't fix thirty years of technical debt overnight. But we can make sure that every program we touch leaves our hands in better shape than when we found it."

In the next chapter, we'll tackle performance tuning — ensuring that our well-structured, thoroughly tested, carefully reviewed code also runs fast.