Chapter 10: Tables and Arrays -- Key Takeaways
Chapter Summary
Tables, known as arrays in most other programming languages, are essential data structures in COBOL. They allow programs to store and process collections of related data items, such as lists of account numbers, monthly sales totals, or rate tables used for calculations. This chapter covered the OCCURS clause that defines table structures, the distinction between subscripts and indexes for accessing table elements, and the SEARCH and SEARCH ALL statements for locating specific entries within a table.
COBOL tables are defined in the DATA DIVISION using the OCCURS clause, which specifies that a data item repeats a certain number of times. Unlike dynamically sized arrays in languages like Java or Python, COBOL tables are typically fixed-size, with the number of occurrences determined at compile time. The OCCURS DEPENDING ON clause provides a limited form of variable-length table: the actual number of active entries is controlled by a separate numeric data item, but the maximum size is still fixed at compile time. Tables can have one, two, three, or more dimensions (the standard permits up to seven), with each dimension defined by a nested OCCURS clause.
Accessing table elements requires either subscripts or indexes. Subscripts are ordinary numeric data items or literals used in parentheses after the table name, while indexes are special items declared with the INDEXED BY phrase and manipulated with the SET statement. The SEARCH statement performs a sequential scan through a table, testing each element against a condition. SEARCH ALL performs a more efficient binary search but requires the table to be sorted on the search key and the key field to be declared with the KEY IS phrase. Both forms require the table to declare an index with INDEXED BY. Understanding when to use sequential versus binary search, and how to structure tables for efficient access, is critical knowledge for COBOL programmers working with reference data and lookup tables.
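The difference between subscripts and indexes is easier to see side by side. The following is a minimal sketch, not taken from the chapter: the program name SUBSIDX and the items WS-RATE-TABLE, WS-RATE, WS-RATE-IDX, WS-SUB, and WS-POSITION are hypothetical names chosen for illustration. It loads a small table through a subscript, then reads it back through an index.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SUBSIDX.
      * Illustrative sketch; every data name here is hypothetical.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-RATE-TABLE.
           05  WS-RATE OCCURS 5 TIMES
                   INDEXED BY WS-RATE-IDX
                   PIC 9(03).
      * An ordinary numeric item used as a subscript.
       01  WS-SUB              PIC 9(02).
      * Receives an occurrence number converted from the index.
       01  WS-POSITION         PIC 9(02).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Load the table with a subscript. Arithmetic on WS-SUB is
      * allowed because it is a normal data item.
           PERFORM VARYING WS-SUB FROM 1 BY 1 UNTIL WS-SUB > 5
               COMPUTE WS-RATE(WS-SUB) = WS-SUB * 10
           END-PERFORM
      * Position the index with SET, never with MOVE or ADD.
           SET WS-RATE-IDX TO 3
           DISPLAY "Rate 3 via index:     " WS-RATE(WS-RATE-IDX)
           SET WS-RATE-IDX UP BY 1
      * Convert the index to an occurrence number before display.
           SET WS-POSITION TO WS-RATE-IDX
           DISPLAY "Index now at entry:   " WS-POSITION
           DISPLAY "Rate via subscript 4: " WS-RATE(4)
           STOP RUN.
```

The SET WS-POSITION TO WS-RATE-IDX step is the usual way to turn an index back into an occurrence number, since an index holds a displacement rather than a count.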
Key Concepts
- The OCCURS clause defines a table by specifying that a data item or group repeats a fixed number of times; it cannot be used on 01-level or 77-level items.
- A subscript is an integer data item or literal in parentheses that identifies which occurrence of a table element to access; subscripts are 1-based in COBOL.
- An index is a special displacement-based item declared with INDEXED BY on the OCCURS clause; it can provide more efficient access than subscripts because the element offset does not have to be recomputed from an occurrence number on each reference.
- The SET statement is used to initialize, increment, and decrement index items; you cannot use MOVE, ADD, or SUBTRACT with indexes.
- SET index-name TO integer sets an index to point to a specific occurrence; SET index-name UP BY 1 advances the index to the next occurrence.
- SEARCH performs a serial (sequential) search through a table, starting from the current index position and advancing until a WHEN condition is satisfied or the end of the table is reached.
- The AT END clause of SEARCH specifies actions to take when the search reaches the end of the table without finding a match.
- SEARCH ALL performs a binary search, which is significantly faster for large tables but requires the table to be sorted in ascending or descending order on the specified key.
- The KEY IS clause on the OCCURS item declares the sort key for SEARCH ALL and specifies whether the table is in ASCENDING or DESCENDING KEY order.
- OCCURS DEPENDING ON defines a variable-length table where the number of active occurrences is determined at runtime by a separate numeric data item.
- Multi-dimensional tables are created by nesting OCCURS clauses; a two-dimensional table has an OCCURS at the group level and another OCCURS on a subordinate item.
- Accessing a multi-dimensional table element requires one subscript or index per dimension, listed in order from outermost to innermost: WS-TABLE(row, col).
- Table elements can be initialized using VALUE clauses on the OCCURS item (where the compiler permits), the INITIALIZE statement, data loaded from a file, or a hardcoded 01-level structure overlaid with REDEFINES.
- The REDEFINES clause is commonly used to overlay a table definition on a series of hardcoded VALUE literals, providing a way to initialize lookup tables with compile-time data.
- PERFORM VARYING is the standard mechanism for iterating through all elements of a table, with nested VARYING AFTER for multi-dimensional tables.
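As an illustration of the last few points, here is a minimal sketch of walking a two-dimensional table with PERFORM ... VARYING ... AFTER. The program name MATRIX2D, the paragraph names, the 3-by-4 dimensions, and WS-TOTAL are assumptions made for this example. The loop is written out-of-line (performing a paragraph) because some COBOL-85 compilers do not accept the AFTER phrase on an inline PERFORM.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MATRIX2D.
      * Illustrative sketch; dimensions and names are hypothetical.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-MATRIX.
           05  WS-ROW OCCURS 3 TIMES
                   INDEXED BY WS-ROW-IDX.
               10  WS-COL OCCURS 4 TIMES
                       INDEXED BY WS-COL-IDX
                       PIC 9(03).
       01  WS-TOTAL            PIC 9(05) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Zero every cell, then store two sample values.
           INITIALIZE WS-MATRIX
           SET WS-ROW-IDX TO 2
           SET WS-COL-IDX TO 3
           MOVE 125 TO WS-COL(WS-ROW-IDX, WS-COL-IDX)
           MOVE 75 TO WS-COL(1, 1)
      * The row index is the outer loop and the column index the
      * inner loop, so cells are visited row by row.
           PERFORM ADD-CELL
               VARYING WS-ROW-IDX FROM 1 BY 1 UNTIL WS-ROW-IDX > 3
               AFTER WS-COL-IDX FROM 1 BY 1 UNTIL WS-COL-IDX > 4
           DISPLAY "Matrix total: " WS-TOTAL
           STOP RUN.
       ADD-CELL.
           ADD WS-COL(WS-ROW-IDX, WS-COL-IDX) TO WS-TOTAL.
```

Each reference to WS-COL supplies one index per dimension, outermost first, matching the order of the nested OCCURS clauses.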
Common Pitfalls
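- Subscript out of range: Accessing a table element with a subscript less than 1 or greater than the OCCURS count is not checked by default on many compilers; the program may silently read or overwrite adjacent storage, or abend later (for example with S0C4 or S0C7 on z/OS). Always validate subscripts before use, and compile with bounds-checking options (such as SSRANGE on IBM Enterprise COBOL) during testing.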
- Forgetting to SET the index before SEARCH: The SEARCH statement begins searching from the current index position. If the index is not set to 1 before the search, elements before the current position are skipped, and the search may fail to find a matching entry.
- Using SEARCH ALL on an unsorted table: SEARCH ALL assumes the table is sorted on the declared key. If the data is not actually sorted, binary search produces incorrect results without any error message.
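- Confusing subscripts and indexes: An index holds a displacement rather than an occurrence number, so it cannot be used with MOVE or arithmetic statements, and a numeric data item cannot be used where an index-name is required (for example as the index of a SEARCH). Use SET to convert between an index and a numeric data item. Older compilers may reject a table reference that mixes subscripts and indexes; newer ones generally allow it, but mixing the two still hurts readability.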
- OCCURS DEPENDING ON with incorrect ODO object: The ODO object (the data item controlling the table size) must be set to the correct value before accessing the table. If it contains zero, no elements are accessible. If it exceeds the maximum OCCURS value, behavior is undefined.
- Hardcoding table data without REDEFINES: VALUE clauses placed directly on items with OCCURS are restricted in some COBOL standards, and where they are allowed they give every occurrence the same value. The standard approach is to define the raw data as a group of FILLER items with VALUE clauses and then redefine that group with the table structure.
- Nested OCCURS exceeding three dimensions: While COBOL supports up to seven dimensions according to the standard, tables beyond three dimensions become extremely difficult to understand and maintain. Most real-world applications use one or two dimensions.
- Not accounting for empty table entries: When a table is partially filled (common with OCCURS DEPENDING ON), processing all OCCURS entries instead of only the active entries leads to processing garbage data in the unused slots.
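Several of these pitfalls can be avoided with a few defensive habits. The sketch below is one possible arrangement, under assumptions of its own: the program name ODOGUARD, the five-entry maximum, the sample values, and WS-REQUESTED (standing in for a subscript supplied from outside the program) are all hypothetical. It sets the ODO object before touching the table, loops only over the active entries, and validates a subscript before using it.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ODOGUARD.
      * Illustrative sketch; sizes, names and values are hypothetical.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-SUB              PIC 9(03).
      * Stands in for a subscript received from outside the program.
       01  WS-REQUESTED        PIC 9(03) VALUE 4.
       01  WS-VARIABLE-TABLE.
           05  WS-ITEM-COUNT   PIC 9(03).
           05  WS-ITEM OCCURS 1 TO 5 TIMES
                   DEPENDING ON WS-ITEM-COUNT
                   PIC X(10).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Set the ODO object before any reference to the table.
           MOVE 3 TO WS-ITEM-COUNT
           MOVE "ALPHA" TO WS-ITEM(1)
           MOVE "BRAVO" TO WS-ITEM(2)
           MOVE "CHARLIE" TO WS-ITEM(3)
      * Process only the active entries, not the OCCURS maximum.
           PERFORM VARYING WS-SUB FROM 1 BY 1
                   UNTIL WS-SUB > WS-ITEM-COUNT
               DISPLAY "Item " WS-SUB ": " WS-ITEM(WS-SUB)
           END-PERFORM
      * Validate a subscript against the active count before use.
           IF WS-REQUESTED >= 1 AND WS-REQUESTED <= WS-ITEM-COUNT
               DISPLAY "Requested: " WS-ITEM(WS-REQUESTED)
           ELSE
               DISPLAY "Subscript " WS-REQUESTED " is out of range"
           END-IF
           STOP RUN.
```

With the sample values, WS-REQUESTED is 4 while only three entries are active, so the range check takes the ELSE branch instead of reading an unused slot.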
Quick Reference
      * One-dimensional table with index
       01  WS-MONTH-TABLE.
           05  WS-MONTH-ENTRY OCCURS 12 TIMES
                   INDEXED BY WS-MONTH-IDX.
               10  WS-MONTH-NAME   PIC X(09).
               10  WS-MONTH-DAYS   PIC 9(02).
      * Initializing a table via REDEFINES
      * (entries are in ascending WS-STATE-CODE order for SEARCH ALL)
       01  WS-STATE-DATA.
           05  FILLER PIC X(07) VALUE "AKAlask".
           05  FILLER PIC X(07) VALUE "ALAlaba".
           05  FILLER PIC X(07) VALUE "AZArizo".
       01  WS-STATE-TABLE REDEFINES WS-STATE-DATA.
           05  WS-STATE-ENTRY OCCURS 3 TIMES
                   ASCENDING KEY IS WS-STATE-CODE
                   INDEXED BY WS-STATE-IDX.
               10  WS-STATE-CODE   PIC X(02).
               10  WS-STATE-ABBREV PIC X(05).
      * OCCURS DEPENDING ON
       01  WS-VARIABLE-TABLE.
           05  WS-ITEM-COUNT   PIC 9(03).
           05  WS-ITEM OCCURS 1 TO 500 TIMES
                   DEPENDING ON WS-ITEM-COUNT
                   PIC X(20).
      * Two-dimensional table
       01  WS-MATRIX.
           05  WS-ROW OCCURS 10 TIMES
                   INDEXED BY WS-ROW-IDX.
               10  WS-COL OCCURS 5 TIMES
                       INDEXED BY WS-COL-IDX
                       PIC 9(05)V99.
      * SET index and access element
           SET WS-MONTH-IDX TO 6
           DISPLAY WS-MONTH-NAME(WS-MONTH-IDX)
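      * SEARCH (serial search)
      * WS-POSITION is assumed to be a numeric item such as PIC 9(02);
      * an index-name cannot be DISPLAYed on every compiler.
           SET WS-MONTH-IDX TO 1
           SEARCH WS-MONTH-ENTRY
               AT END
                   DISPLAY "Month not found"
               WHEN WS-MONTH-NAME(WS-MONTH-IDX) =
                       WS-SEARCH-NAME
                   SET WS-POSITION TO WS-MONTH-IDX
                   DISPLAY "Found at position "
                       WS-POSITION
           END-SEARCH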
      * SEARCH ALL (binary search)
           SEARCH ALL WS-STATE-ENTRY
               AT END
                   DISPLAY "State not found"
               WHEN WS-STATE-CODE(WS-STATE-IDX) =
                       WS-SEARCH-CODE
                   DISPLAY WS-STATE-ABBREV(WS-STATE-IDX)
           END-SEARCH
      * PERFORM VARYING through a table
           PERFORM VARYING WS-MONTH-IDX
                   FROM 1 BY 1
                   UNTIL WS-MONTH-IDX > 12
               DISPLAY WS-MONTH-NAME(WS-MONTH-IDX)
                   " has "
                   WS-MONTH-DAYS(WS-MONTH-IDX)
                   " days"
           END-PERFORM
What's Next
Chapter 11 moves into Part 3 of the textbook and introduces sequential file processing, one of COBOL's most important capabilities. You will learn how to define files with SELECT and FD entries, open and close files, read records sequentially, and write output records. Tables covered in this chapter often serve as lookup structures loaded from files, and the PERFORM loops from Chapter 8 provide the iteration framework for reading files record by record, bringing together everything you have learned so far.