Key Takeaways — Chapter 41: Legacy Code Archaeology

Core Concepts

  1. Never read undocumented code from top to bottom. Use a systematic approach: external evidence first, then data survey, then control flow, then data flow, then business rules.

  2. External evidence reveals purpose before you read a single line of code. JCL, file names, scheduling position, and program names all provide context.

  3. Build a call graph early. The PERFORM hierarchy shows you the program's structure and narrative.

  4. Data flow analysis is often more valuable than procedural analysis. Understanding where data comes from and where it goes reveals the program's purpose.

  5. Business rules hide in EVALUATE and IF statements. Extract them systematically into a business rule catalog with a standard format (ID, source, description, conditions, actions).

  6. Impact analysis maps the iceberg. The visible change is only the tip; the downstream effects, cross-program impacts, and system-level consequences are much larger.

  7. Tribal knowledge is fragile. Knowledge that exists only in people's heads is lost when they leave. Capture it proactively.

  8. Code archaeology is a team sport. Technical people read the code; domain experts verify the business logic. Neither can produce a complete understanding alone.

  9. Documentation recovery builds from the bottom up: data dictionary, program inventory, call graph, business rules, job stream map, and finally the system overview.

  10. Do not judge the original developers. They worked under constraints you may not understand. The goal is to understand the code, not to criticize it.
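
The catalog format named in takeaway 5 can be sketched as a small record type. This is only an illustration in Python, not the chapter's own tooling: the class name BusinessRule, the render method, and the sample rule (program PAYCALC, paragraph 2100-CALC-OVERTIME, and the WS- fields) are all hypothetical and chosen just to show the shape of one entry.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessRule:
    """One catalog entry, using the five fields named in takeaway 5."""
    rule_id: str
    source: str            # program and paragraph where the rule was found
    description: str
    conditions: list[str] = field(default_factory=list)
    actions: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Format the entry as plain text for review by a domain expert."""
        lines = [
            f"Rule ID:     {self.rule_id}",
            f"Source:      {self.source}",
            f"Description: {self.description}",
            "Conditions:",
            *[f"  - {c}" for c in self.conditions],
            "Actions:",
            *[f"  - {a}" for a in self.actions],
        ]
        return "\n".join(lines)


# Hypothetical rule, invented purely to show the shape of an entry.
rule = BusinessRule(
    rule_id="BR-001",
    source="PAYCALC, paragraph 2100-CALC-OVERTIME",
    description="Overtime pay applies above a weekly hours threshold.",
    conditions=["WS-HOURS-WORKED > 40"],
    actions=["COMPUTE WS-OT-PAY = (WS-HOURS-WORKED - 40) * WS-RATE * 1.5"],
)
print(rule.render())
```

Keeping the source field precise (program plus paragraph) lets a domain expert and a developer point at the same lines when they verify a rule together.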

Common Anti-Patterns to Recognize

Anti-Pattern          | What It Looks Like                        | Archaeology Approach
GO TO spaghetti       | GO TO statements creating non-linear flow | Draw flow diagram with all GO TO targets
PERFORM THRU          | PERFORM 2000 THRU 2999                    | Map every paragraph in the range
Generic work fields   | WS-WORK-1 through WS-WORK-N               | Trace each field through every paragraph
Numeric status codes  | MOVE 3 TO WS-STATUS                       | Search for every test of WS-STATUS, build legend
Implicit record types | REDEFINES on FD record                    | Map record type codes to REDEFINES
Dead code             | Paragraphs never PERFORMed                | Build call graph, identify orphans
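
Several of these approaches (mapping a PERFORM THRU range, following GO TO targets, finding orphan paragraphs) come down to building a call graph. The Python sketch below is a rough illustration under simplifying assumptions, not a real COBOL parser: it assumes sequence columns are already stripped, paragraph headers are unindented and statements are indented, and it ignores sections, copybooks, and fall-through from one paragraph into the next. The function names and the SAMPLE program are hypothetical.

```python
import re

NAME = r"[A-Z0-9][A-Z0-9-]*"
PARAGRAPH = re.compile(rf"^({NAME})\.\s*$")
PERFORM = re.compile(rf"\bPERFORM\s+({NAME})(?:\s+(?:THRU|THROUGH)\s+({NAME}))?")
GO_TO = re.compile(rf"\bGO\s+TO\s+({NAME})")


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each paragraph to the paragraphs it PERFORMs or branches to."""
    order: list[str] = []                 # paragraph names in source order
    bodies: dict[str, list[str]] = {}
    current = None
    for line in source.upper().splitlines():
        header = PARAGRAPH.match(line)    # unindented "NAME." starts a paragraph
        if header:
            current = header.group(1)
            order.append(current)
            bodies[current] = []
        elif current is not None:
            bodies[current].append(line)

    graph: dict[str, set[str]] = {name: set() for name in order}
    for name, lines in bodies.items():
        text = "\n".join(lines)
        for start, end in PERFORM.findall(text):
            if start not in graph:
                continue                  # inline PERFORM (UNTIL, VARYING, ...)
            if end in graph:
                # PERFORM A THRU B covers every paragraph from A to B.
                lo, hi = order.index(start), order.index(end)
                graph[name].update(order[lo:hi + 1])
            else:
                graph[name].add(start)
        for target in GO_TO.findall(text):
            if target in graph:
                graph[name].add(target)
    return graph


def find_orphans(graph: dict[str, set[str]], entry: str) -> list[str]:
    """Return paragraphs that cannot be reached from the entry paragraph."""
    seen, stack = set(), [entry]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node] - seen)
    return sorted(set(graph) - seen)


# Hypothetical program fragment, invented for illustration.
SAMPLE = """\
0000-MAIN.
    PERFORM 1000-INIT
    PERFORM 2000-PROCESS THRU 2999-EXIT
    STOP RUN.
1000-INIT.
    MOVE 0 TO WS-STATUS.
2000-PROCESS.
    IF WS-STATUS = 3
        GO TO 2999-EXIT.
2500-VALIDATE.
    MOVE 3 TO WS-STATUS.
2999-EXIT.
    EXIT.
9000-OLD-REPORT.
    DISPLAY 'UNUSED'.
"""

graph = build_call_graph(SAMPLE)
print(find_orphans(graph, "0000-MAIN"))   # prints ['9000-OLD-REPORT']
```

Treat the orphan list as candidates only. Because fall-through is not modeled here, a paragraph reported as unreachable may still execute, so confirm against the source before calling it dead code.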

Archaeology Toolkit

  • grep / ISPF FIND: Trace field references
  • Call graph: Understand program structure
  • Data flow diagram: Trace data from input to output
  • Business rule catalog: Document decisions
  • Impact analysis matrix: Map change effects
  • IBM ADDI / vendor analyzers: Automated analysis for large systems
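
As a rough illustration of the first toolkit item, the Python sketch below traces references to one field. It is a stand-in for grep or ISPF FIND, not a replacement for them, and it assumes the source is available as plain text. The function name trace_field and the sample lines are hypothetical. The one detail worth noting is that COBOL names contain hyphens, so a naive search for WS-STATUS also hits WS-STATUS-OLD.

```python
import re


def trace_field(source: str, field: str) -> list[tuple[int, str]]:
    """Return (line number, text) for every whole-name reference to field."""
    # COBOL names contain hyphens, so \b would also match WS-STATUS-OLD.
    # Lookarounds reject any match that continues into a longer name.
    pattern = re.compile(
        rf"(?<![A-Z0-9-]){re.escape(field.upper())}(?![A-Z0-9-])"
    )
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if pattern.search(line.upper()):
            hits.append((number, line.strip()))
    return hits


# Hypothetical lines, invented for illustration.
SAMPLE = """\
    MOVE 3 TO WS-STATUS.
    MOVE WS-STATUS TO WS-STATUS-OLD.
    MOVE 0 TO WS-STATUS-OLD.
    IF WS-STATUS = 3
        PERFORM 9000-ERROR.
"""

for number, text in trace_field(SAMPLE, "WS-STATUS"):
    print(f"{number:>4}: {text}")
```

Line 3 is skipped because it touches only WS-STATUS-OLD. The same whole-name matching helps when building a legend for numeric status codes, since it keeps similarly named fields out of the results.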