Key Takeaways — Chapter 41: Legacy Code Archaeology
Core Concepts
- Never read undocumented code from top to bottom. Use a systematic approach: external evidence first, then data survey, then control flow, then data flow, then business rules.
- External evidence reveals purpose before you read a single line of code. JCL, file names, scheduling position, and program names all provide context.
- Build a call graph early. The PERFORM hierarchy shows you the program's structure and narrative.
- Data flow analysis is often more valuable than procedural analysis. Understanding where data comes from and where it goes reveals the program's purpose.
- Business rules hide in EVALUATE and IF statements. Extract them systematically into a business rule catalog with a standard format (ID, source, description, conditions, actions).
- Impact analysis maps the iceberg. The visible change is only the tip; downstream effects, cross-program impacts, and system-level consequences are much larger.
- Tribal knowledge is fragile. Knowledge that exists only in people's heads is lost when they leave. Capture it proactively.
- Code archaeology is a team sport. Technical people read the code; domain experts verify the business logic. Neither can produce a complete understanding alone.
- Documentation recovery builds from the bottom up: data dictionary, program inventory, call graph, business rules, job stream map, system overview.
- Do not judge the original developers. They worked under constraints you may not understand. The goal is to understand the code, not to criticize it.
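
The business rule catalog format listed above (ID, source, description, conditions, actions) can be kept as structured records rather than free text. The following is a minimal sketch, not a prescribed schema: the class name `BusinessRule`, the `BR-001` numbering scheme, and the example program, paragraph, and field names are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessRule:
    """One catalog entry with the five fields of the standard format."""
    rule_id: str       # catalog identifier (numbering scheme is an assumption)
    source: str        # program and paragraph where the rule was found
    description: str   # plain-language statement a domain expert can verify
    conditions: list = field(default_factory=list)  # when the rule applies
    actions: list = field(default_factory=list)     # what the code does

    def to_text(self) -> str:
        """Render the entry as a plain-text block for review."""
        lines = [
            f"ID:          {self.rule_id}",
            f"Source:      {self.source}",
            f"Description: {self.description}",
            "Conditions:",
        ]
        lines += [f"  - {c}" for c in self.conditions]
        lines.append("Actions:")
        lines += [f"  - {a}" for a in self.actions]
        return "\n".join(lines)


# Hypothetical entry: the program, paragraph, and rule are invented.
example_rule = BusinessRule(
    rule_id="BR-001",
    source="PAYCALC, paragraph 3200-CALC-OVERTIME",
    description="Hours above 40 in a week are paid at 1.5 times the base rate.",
    conditions=["WS-HOURS-WORKED > 40"],
    actions=["COMPUTE WS-OT-PAY = (WS-HOURS-WORKED - 40) * WS-RATE * 1.5"],
)

if __name__ == "__main__":
    print(example_rule.to_text())
```

Keeping the source location in every entry lets a domain expert's correction be traced back to the exact paragraph it came from.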
Common Anti-Patterns to Recognize
| Anti-Pattern | What It Looks Like | Archaeology Approach |
|---|---|---|
| GO TO spaghetti | GO TO statements creating non-linear flow | Draw flow diagram with all GO TO targets |
| PERFORM THRU | PERFORM 2000 THRU 2999 | Map every paragraph in the range |
| Generic work fields | WS-WORK-1 through WS-WORK-N | Trace each field through every paragraph |
| Numeric status codes | MOVE 3 TO WS-STATUS | Search for every test of WS-STATUS, build legend |
| Implicit record types | REDEFINES on FD record | Map record type codes to REDEFINES |
| Dead code | Paragraphs never PERFORMed | Build call graph, identify orphans |
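
Several rows of the table above (GO TO targets, PERFORM THRU ranges, dead code) come down to building a call graph and inspecting it. The sketch below shows one rough way to do that. It assumes fixed-format source where paragraph names start in Area A (columns 8-11), it is given only the PROCEDURE DIVISION, and it ignores sections and fall-through between paragraphs, so the orphans it reports are candidates to check by hand, not confirmed dead code. The sample program and all function names are hypothetical.

```python
import re

NAME = r"[A-Z0-9][A-Z0-9-]*"
PERFORM = re.compile(rf"\bPERFORM\s+({NAME})(?:\s+(?:THRU|THROUGH)\s+({NAME}))?")
GO_TO = re.compile(rf"\bGO\s+TO\s+({NAME})")
HEADER = re.compile(rf"({NAME})\s*\.")


def paragraph_name(line):
    """Return the paragraph name if `line` starts one in Area A, else None."""
    if len(line) < 8 or line[6] != " ":
        return None  # short line, or a comment or continuation line
    body = line[7:72]
    if len(body) - len(body.lstrip()) > 3:
        return None  # first token is in Area B, so this is a statement
    match = HEADER.match(body.lstrip())
    return match.group(1) if match else None


def build_call_graph(lines):
    """Return (paragraphs in source order, {paragraph: sorted targets})."""
    order, raw, thru = [], {}, []
    current = None
    for text in lines:
        line = text.upper()
        name = paragraph_name(line)
        if name:
            current = name
            order.append(name)
            raw[name] = set()
        if current is None or len(line) < 8 or line[6] != " ":
            continue
        body = line[7:72]
        for m in PERFORM.finditer(body):
            raw[current].add(m.group(1))
            if m.group(2):
                thru.append((current, m.group(1), m.group(2)))
        for m in GO_TO.finditer(body):
            raw[current].add(m.group(1))
    # PERFORM A THRU B executes every paragraph physically between A and B.
    for caller, first, last in thru:
        if first in order and last in order:
            start, end = order.index(first), order.index(last)
            raw[caller].update(order[start:end + 1])
    # Dropping unknown names also discards inline PERFORM UNTIL / VARYING.
    known = set(order)
    return order, {p: sorted(raw[p] & known) for p in order}


def find_orphans(order, graph):
    """Paragraphs, other than the first, that nothing PERFORMs or GOes TO."""
    referenced = set().union(*graph.values())
    return [p for p in order[1:] if p not in referenced]


# Hypothetical fixed-format sample (7 leading spaces = Area A, 11 = Area B).
SAMPLE = """\
       0000-MAIN.
           PERFORM 1000-INIT
           PERFORM 2000-PROCESS THRU 2999-EXIT
           STOP RUN.
       1000-INIT.
           MOVE 0 TO WS-STATUS.
       2000-PROCESS.
           IF WS-STATUS = 3 GO TO 2999-EXIT.
       2500-VALIDATE.
           CONTINUE.
       2999-EXIT.
           EXIT.
       9000-OLD-REPORT.
           DISPLAY 'NEVER CALLED'.
""".splitlines()

if __name__ == "__main__":
    order, graph = build_call_graph(SAMPLE)
    for paragraph in order:
        print(paragraph, "->", graph[paragraph])
    print("Orphan candidates:", find_orphans(order, graph))
```

Note that `2500-VALIDATE` is never named in a PERFORM, yet it is not reported as an orphan because it lies inside the THRU range. This is why every paragraph in a range has to be mapped.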
Archaeology Toolkit
- grep / ISPF FIND: Trace field references
- Call graph: Understand program structure
- Data flow diagram: Trace data from input to output
- Business rule catalog: Document decisions
- Impact analysis matrix: Map change effects
- IBM ADDI / vendor analyzers: Automated analysis for large systems
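
Where ISPF FIND or grep is not convenient, the same field trace can be scripted. The sketch below is a rough heuristic, not a COBOL parser: it reports every line that mentions a field as a whole word and labels the line "set" only when the field is the target of a `MOVE ... TO`; everything else (tests, sources of a MOVE, arithmetic) is labeled "use". The function name `trace_field` and the sample lines are hypothetical.

```python
import re


def trace_field(lines, field_name):
    """Return (line_number, kind, text) for each line mentioning `field_name`."""
    name = re.escape(field_name.upper())
    # Lookarounds keep WS-STATUS from matching inside WS-STATUS-OLD.
    mention = re.compile(rf"(?<![A-Z0-9-]){name}(?![A-Z0-9-])")
    assigned = re.compile(rf"\bMOVE\b.*\bTO\s+{name}(?![A-Z0-9-])")
    hits = []
    for number, text in enumerate(lines, start=1):
        line = text.upper()
        if not mention.search(line):
            continue
        kind = "set" if assigned.search(line) else "use"
        hits.append((number, kind, text.strip()))
    return hits


# Hypothetical sample lines, invented for illustration.
SAMPLE = [
    "    MOVE 3 TO WS-STATUS.",
    "    MOVE 0 TO WS-STATUS-OLD.",
    "    IF WS-STATUS = 3",
    "        PERFORM 8000-REJECT",
    "    END-IF.",
    "    MOVE WS-STATUS TO RPT-STATUS.",
]

if __name__ == "__main__":
    for number, kind, text in trace_field(SAMPLE, "WS-STATUS"):
        print(f"{number:>4} {kind:<4} {text}")
```

Listing the "set" lines next to the "use" lines is a quick way to start the legend for a numeric status code: each literal moved into the field should have a matching test somewhere.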