Chapter 1: The World of COBOL -- History, Relevance, and the Mainframe Ecosystem


Opening Vignette: Trillions Before Breakfast

It is 6:47 a.m. on a Tuesday. Most of the world is still asleep, but the machines are wide awake.

Deep inside a nondescript building in Poughkeepsie, New York, rows of IBM z16 mainframes hum at a pitch barely audible to the human ear. Their cooling systems circulate chilled water through copper channels, dissipating the heat generated by processors executing billions of instructions per second. There are no monitors glowing with colorful dashboards, no developers hunched over keyboards. Just blinking lights, spinning tape libraries, and the quiet, relentless rhythm of computation.

At this hour, a batch processing window is closing. Overnight, these machines have processed the previous day's transactions for three of the nation's ten largest banks. Credit card authorizations, mortgage payment postings, interbank transfers, and account reconciliations -- all flowing through programs written in a language that first saw the light of day in 1959. That language is COBOL.

By the time you sit down to breakfast, COBOL programs running on mainframes around the world will have processed an estimated $3 trillion in commerce. The ATM that dispensed your cash this morning? There is a 95% chance a COBOL program authorized that transaction. The direct deposit that landed in your checking account? COBOL. The insurance claim your doctor filed after your last visit? Almost certainly COBOL. The freight logistics that ensured your grocery store was stocked with fresh produce? Quite likely COBOL as well.

This is not a legacy technology gasping its last breath. This is the invisible backbone of the modern economy, by widely cited estimates handling many times more transactions each day than Google handles searches. And as you are about to discover, learning this language is one of the most strategically valuable decisions you can make in your technology career.

Welcome to the world of COBOL.


1.1 The Birth of COBOL: A Language for Business

The Problem That Needed Solving

In the late 1950s, the computing world was fractured. Every computer manufacturer had its own programming language, its own instruction set, its own way of doing things. A program written for a UNIVAC could not run on an IBM machine. A program written for a Burroughs system was incomprehensible to a Honeywell compiler. Businesses that invested heavily in software for one platform found themselves locked in, unable to switch vendors without rewriting everything from scratch.

The U.S. Department of Defense, one of the largest consumers of computing power in the world, was particularly frustrated by this state of affairs. They were spending millions of dollars on software that could not be moved from one machine to another. Something had to change.

Grace Hopper and the Dream of English-Like Programming

The story of COBOL cannot be told without Grace Murray Hopper. A mathematician by training who would retire from the United States Navy as a rear admiral, Hopper was one of the first people to envision a world where computers could understand something closer to human language. In 1952, she developed A-0, a program widely regarded as the first compiler, which translated mathematical notation into machine code. She went on to lead the development of FLOW-MATIC, one of the first programming languages to use English-like syntax.

Hopper was famous for her pragmatism. When colleagues told her that computers could not understand English words, she replied that she had proven otherwise -- and that the real obstacle was not technology but the unwillingness of people to try something new. "The most dangerous phrase in the English language," she was fond of saying, "is 'We've always done it this way.'"

Hopper's work on FLOW-MATIC directly influenced the design of COBOL. She served as a technical adviser to the committee that created the language and championed its core philosophy: programs should be readable by managers, auditors, and business professionals, not just by programmers.

CODASYL and the Conference That Changed Everything

In May 1959, a group of computer manufacturers, government agencies, and academic researchers gathered at the Pentagon for a conference that would change the trajectory of business computing. This group, which came to be known as the Conference on Data Systems Languages (CODASYL), set out to design a common business-oriented programming language.

The committee faced enormous pressure. The Department of Defense wanted a language that was:

  • Portable across different hardware platforms
  • English-like so that non-programmers could read and understand the code
  • Business-oriented with strong support for file handling, report generation, and decimal arithmetic
  • Self-documenting so that programs could serve as their own specification

The initial specification was completed in a remarkable six months. COBOL-60, as it came to be known, was released in April 1960. It was not perfect -- no first version ever is -- but it was revolutionary. For the first time, a program written on one manufacturer's hardware could, in principle, be compiled and run on another's.

The Department of Defense played a crucial role in COBOL's adoption. In 1960, the DoD mandated that any computer purchased by the federal government must be able to compile and run COBOL programs. This single policy decision ensured that every major hardware manufacturer would support the language, creating a virtuous cycle of adoption and improvement.

The Name Itself

The name "COBOL" stands for COmmon Business-Oriented Language. Each word in the acronym reflects a design goal:

  • Common: Portable across hardware platforms
  • Business: Designed for commercial data processing, not scientific computing
  • Oriented: Focused on practical business problems
  • Language: A true programming language, not a set of machine codes

The Other Contenders

COBOL was not the only language proposed at the Pentagon conference. The CODASYL committee considered several alternatives. IBM had COMTRAN (Commercial Translator), developed by Bob Bemer. Honeywell had FACT (Fully Automated Compiling Technique). Sperry Rand had FLOW-MATIC, Grace Hopper's creation. Each had strengths, but none was universally acceptable.

The committee's genius was in synthesizing the best ideas from all three. FLOW-MATIC contributed the English-like syntax and the concept of self-documenting code. COMTRAN contributed algebraic expressions and the PICTURE clause concept. FACT contributed ideas about data description and file handling. The result was a new language that was more than the sum of its parts.

The language's portability was first demonstrated on December 6 and 7, 1960, when the same COBOL source program was compiled and run on two different computers -- a UNIVAC II and an RCA 501. This demonstration was a landmark moment in computing history, showing that software could be independent of hardware.

Early Adoption and Growing Pains

COBOL's adoption was not without friction. Many academic computer scientists dismissed the language as inelegant and verbose. Edsger Dijkstra, the influential Dutch computer scientist, was famously hostile to COBOL, writing in 1975 that "the use of COBOL cripples the mind." FORTRAN programmers, accustomed to compact mathematical notation, found COBOL's English-like syntax tedious.

But the business world had a different perspective. Corporate data processing departments, which were expanding rapidly in the 1960s, found COBOL approachable and practical. Programmers could be trained more quickly because the language read like English. Programs were easier to maintain because the code was self-explanatory. And the Department of Defense mandate meant that COBOL compilers were available on every major platform.

By the mid-1960s, COBOL had become the most widely used programming language in the world -- a position it would hold for decades. The language that academics dismissed as unsophisticated was quietly becoming the foundation of the global economy.


1.2 COBOL's Design Philosophy

COBOL was designed with a set of principles that distinguished it sharply from languages like FORTRAN, which was oriented toward scientific and mathematical computation. Understanding these principles is essential to understanding why COBOL code looks the way it does.

English-Like Syntax

COBOL was deliberately designed to read like English prose. Where FORTRAN might express an operation as:

X = A + B

COBOL would express the same operation as:

ADD FIRST-NUMBER TO SECOND-NUMBER GIVING TOTAL-AMOUNT.

This verbosity was not an accident or a flaw. It was the central design goal. The CODASYL committee believed that programs should be understandable by business managers and auditors who had no programming training. A payroll program written in COBOL could, in theory, be read and validated by the payroll department manager.

Business Data Orientation

COBOL was built from the ground up for business data processing. This meant:

  • Decimal arithmetic: COBOL uses packed decimal and display numeric formats that avoid the rounding errors that arise when binary floating-point types, such as float and double in C and Java, are used for currency. When a COBOL program says a balance is $1,234.56, it means exactly $1,234.56 -- not $1,234.5600000001.

  • Fixed-point precision: The PICTURE clause (PIC) gives programmers exact control over the size and precision of every data field. PIC 9(7)V99 means exactly seven integer digits and two decimal places, with the V marking an implied decimal point that takes up no storage.

  • Record-oriented data: COBOL's DATA DIVISION allows programmers to define complex, hierarchical record structures that map directly to file layouts and database records. This made it natural to work with the structured data formats used in business -- customer records, invoice lines, ledger entries.

  • Report generation: Early versions of COBOL included a Report Writer feature that could generate formatted business reports with headers, footers, subtotals, and page breaks.
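
For readers arriving from a language such as Python, the difference between binary floating point and decimal arithmetic can be seen in a few lines. The sketch below is not COBOL. It uses Python's standard decimal module as a stand-in, and the helper name to_pic_9_7_v99 is hypothetical. It assumes an unsigned field, truncation of extra decimal places (what COBOL does when ROUNDED is not specified), and it treats an out-of-range value as an error, which is a simplification.

```python
from decimal import Decimal, ROUND_DOWN

# Binary floating point cannot represent 0.10 or 0.20 exactly,
# so their sum drifts away from 0.30.
float_total = 0.10 + 0.20
print(float_total == 0.30)               # False

# Decimal arithmetic keeps every cent exact.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total == Decimal("0.30"))  # True

# Largest value an unsigned PIC 9(7)V99 field can hold.
PIC_9_7_V99_MAX = Decimal("9999999.99")


def to_pic_9_7_v99(value: str) -> Decimal:
    """Hypothetical helper: fit a decimal string into a PIC 9(7)V99-style field."""
    # Keep exactly two decimal places, truncating any extra digits.
    amount = Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_DOWN)
    # Assumption: reject values the field cannot hold rather than
    # silently dropping high-order digits.
    if amount < 0 or amount > PIC_9_7_V99_MAX:
        raise ValueError(f"{value} does not fit in PIC 9(7)V99")
    return amount


print(to_pic_9_7_v99("1234.567"))        # 1234.56
```

In COBOL none of this needs a helper: the PICTURE clause on the data item fixes the scale once, and every arithmetic statement that stores into the item respects it.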

Self-Documenting Code

COBOL's verbose syntax serves a documentary purpose. A well-written COBOL program is, to a significant degree, its own documentation. Variable names like EMPLOYEE-GROSS-PAY, FEDERAL-TAX-WITHHELD, and NET-PAY-AMOUNT tell you exactly what the data represents. Procedure names like CALCULATE-OVERTIME-PAY and PRINT-MONTHLY-SUMMARY describe exactly what the code does.

This self-documenting quality has proven to be one of COBOL's greatest strengths over time. Programs written decades ago can often be understood and maintained by developers who were not yet born when the code was first written.

Consider the difference. In a contemporary language, you might see:

net = gross - fed - state - ss - med

The equivalent COBOL code reads:

SUBTRACT FEDERAL-TAX-AMOUNT
         STATE-TAX-AMOUNT
         SOCIAL-SECURITY-AMOUNT
         MEDICARE-AMOUNT
    FROM GROSS-PAY-AMOUNT
    GIVING NET-PAY-AMOUNT
END-SUBTRACT

The COBOL version is longer, but it is unambiguous. There is no need to consult a data dictionary to understand what fed, ss, or med represent. The variable names tell the entire story. For a payroll system that must be maintained for decades by developers who may never meet the original programmer, this clarity is invaluable.

Portability

From its inception, COBOL was designed to be hardware-independent. The ENVIRONMENT DIVISION, which specifies the hardware configuration, was deliberately separated from the PROCEDURE DIVISION, which contains the business logic. In theory, moving a COBOL program from one machine to another required changes only in the ENVIRONMENT DIVISION, leaving the business logic untouched.

In practice, vendor extensions and platform-specific features have sometimes compromised this portability. But the principle of separating environmental concerns from business logic was well ahead of its time, anticipating the separation of concerns that later became a core tenet of software engineering.

Structured Data Handling

One of COBOL's most powerful features -- and one that is often underappreciated by programmers coming from other languages -- is its hierarchical data definition system. COBOL's DATA DIVISION allows you to define complex, nested record structures using level numbers:

01  CUSTOMER-RECORD.
    05  CUST-NAME.
        10  CUST-FIRST-NAME        PIC X(15).
        10  CUST-MIDDLE-INITIAL    PIC X(1).
        10  CUST-LAST-NAME         PIC X(20).
    05  CUST-ADDRESS.
        10  CUST-STREET            PIC X(30).
        10  CUST-CITY              PIC X(20).
        10  CUST-STATE             PIC X(2).
        10  CUST-ZIP               PIC 9(5).
    05  CUST-PHONE                 PIC 9(10).
    05  CUST-ACCOUNT-BALANCE       PIC S9(9)V99.

This structure maps directly to a file record layout. The COBOL compiler knows exactly how many bytes each field occupies and where each field begins within the record. You can refer to the entire record as CUSTOMER-RECORD, the name group as CUST-NAME, or an individual field like CUST-LAST-NAME. This hierarchical data model was remarkably well-suited to the record-oriented data processing that dominated business computing, and it foreshadowed the structured data types of modern languages by decades.
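
To make the byte arithmetic concrete, here is a minimal Python sketch (not COBOL) that derives each field's starting offset from the widths in CUSTOMER-RECORD and slices a sample record the same way. It assumes one byte per character, the default DISPLAY usage, and a sign carried inside the last digit of the balance rather than in a separate byte. The names FIELDS and split_record and the sample data are invented for illustration.

```python
from itertools import accumulate

# (field name, width in bytes), in the order declared in CUSTOMER-RECORD.
FIELDS = [
    ("CUST-FIRST-NAME", 15),
    ("CUST-MIDDLE-INITIAL", 1),
    ("CUST-LAST-NAME", 20),
    ("CUST-STREET", 30),
    ("CUST-CITY", 20),
    ("CUST-STATE", 2),
    ("CUST-ZIP", 5),
    ("CUST-PHONE", 10),
    ("CUST-ACCOUNT-BALANCE", 11),  # S9(9)V99: 11 digits, no byte for the V
]

WIDTHS = [width for _, width in FIELDS]
# Each field starts where the previous one ends.
OFFSETS = [0] + list(accumulate(WIDTHS))[:-1]
RECORD_LENGTH = sum(WIDTHS)


def split_record(record: str) -> dict:
    """Slice one fixed-width record into its elementary fields."""
    if len(record) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} bytes, got {len(record)}")
    return {
        name: record[start:start + width]
        for (name, width), start in zip(FIELDS, OFFSETS)
    }


# Invented sample record, padded with spaces the way COBOL pads PIC X fields.
sample = (
    "JANE".ljust(15) + "Q" + "DOE".ljust(20)
    + "100 MAIN STREET".ljust(30) + "SPRINGFIELD".ljust(20) + "IL" + "62701"
    + "2175550100" + "00000123456"   # balance digits; sign decoding omitted
)

fields = split_record(sample)
cust_name_group = sample[OFFSETS[0]:OFFSETS[3]]  # the CUST-NAME group: bytes 0-35

print(RECORD_LENGTH)                       # 114
print(fields["CUST-LAST-NAME"].rstrip())   # DOE
print(len(cust_name_group))                # 36
```

The COBOL compiler performs this bookkeeping automatically from the level numbers and PICTURE clauses, which is why a group name such as CUST-NAME can be used as a single 36-byte field.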


1.3 The Evolution of COBOL Standards

COBOL has been continuously refined and updated over more than six decades. Each revision of the standard has added new capabilities while maintaining backward compatibility with earlier versions.

COBOL-60 (1960)

The original specification, produced by the CODASYL Short-Range Committee. It defined the four-division program structure, the PICTURE clause, basic file handling, and the English-like syntax that would become COBOL's hallmark. This first version was intentionally limited -- the committee prioritized getting a working standard out the door quickly.

COBOL-61 and COBOL-61 Extended (1961-1963)

Rapid revisions that corrected deficiencies in the original specification and added features like the SORT verb and the Report Writer module. These early revisions reflect the intense pace of development as the language was deployed in production environments.

COBOL-68 (1968)

The first version standardized by the American National Standards Institute (ANSI), then known as the USA Standards Institute. This was a landmark moment: COBOL became one of the first programming languages to be formally standardized by a national standards body, two years after FORTRAN. The standard provided a definitive reference that all compiler vendors could implement, greatly improving portability across platforms.

COBOL-74 (1974)

A significant revision that added new features and tightened the language specification. Key additions included:

  • Improved file handling capabilities
  • The INSPECT statement for string manipulation
  • Enhanced inter-program communication
  • Standardized debugging features

COBOL-85 (1985)

Widely regarded as the most important revision in COBOL's history, COBOL-85 introduced structured programming constructs that brought COBOL in line with modern software engineering practices:

  • Scope terminators: END-IF, END-PERFORM, END-READ, and other explicit scope delimiters replaced the problematic period-based scope rules of earlier versions
  • Inline PERFORM: Allowed loop bodies to be written at the point of use rather than in separate paragraphs
  • EVALUATE statement: A powerful multi-branch conditional (similar to switch/case in other languages)
  • Nested programs: Allowed programs to contain subprograms
  • Reference modification: Substring operations on alphanumeric fields

COBOL-85 transformed the language from one that encouraged tangled, GO TO-heavy "spaghetti code" into one that supported clean, structured programming. Most production COBOL code in use today was written to the COBOL-85 standard or later.

COBOL 2002

The first revision to be named with a four-digit year rather than a two-digit suffix. COBOL 2002 was an ambitious update that added many features inspired by object-oriented and modern programming languages:

  • Object-oriented programming: Classes, methods, interfaces, and inheritance
  • User-defined functions: Programmer-written functions that can be invoked in the same way as the built-in intrinsic functions
  • Unicode support: International character set handling
  • Free-format source: Liberation from the rigid column-based format inherited from punch cards
  • Binary and floating-point data types: Expanded numeric representations
  • Improved interoperability: Better support for calling programs written in other languages

COBOL 2014

A revision, since superseded by COBOL 2023 (ISO/IEC 1989:2023), which refined the features introduced in COBOL 2002 and added:

  • Dynamic-capacity tables: Arrays that can be resized at runtime
  • Enhanced VALIDATE statement: Improved data validation capabilities
  • Additional intrinsic functions: Date, time, and mathematical functions
  • Improved XML and JSON support: Native handling of modern data formats

It is important to note that not all compilers implement every feature of the latest standard. IBM's Enterprise COBOL, Micro Focus COBOL, and GnuCOBOL each support different subsets of the COBOL 2014 standard. In practice, most production COBOL programming uses features from COBOL-85 and selected features from later standards.


1.4 The Mainframe Ecosystem

To understand COBOL's enduring relevance, you must understand the ecosystem in which it operates. COBOL does not exist in isolation -- it is one component of a tightly integrated technology stack that has been refined over decades for reliability, throughput, and security.

The IBM Mainframe: z/OS and Its Predecessors

The IBM mainframe is the platform most closely associated with COBOL. The current generation, the IBM z16, is a far cry from the room-sized machines of the 1960s, but it is their direct descendant. The operating system, z/OS, traces its lineage through OS/390, MVS, and OS/360 back to 1964.

Key characteristics of the mainframe environment include:

  • Massive throughput: A single z16 can process up to 19 billion transactions per day
  • Extreme reliability: Mainframes achieve 99.999% uptime (less than 5.26 minutes of downtime per year)
  • Hardware-level security: Cryptographic processing is built into the processor chips
  • Workload management: The system automatically balances resources across thousands of concurrent users and batch jobs
  • Backward compatibility: Programs compiled decades ago can run unmodified on current hardware
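
The "five nines" figure in the list above is simple arithmetic, and working it through shows how little slack it leaves. The Python sketch below (an illustration only; the function name is hypothetical) converts an availability percentage into minutes of downtime per non-leap year, using decimal arithmetic so the result is exact.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability_percent: str) -> Decimal:
    """Return the yearly downtime implied by an availability percentage."""
    unavailable_fraction = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable_fraction * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year("99.999")   # a little over 5 minutes
three_nines = downtime_minutes_per_year("99.9")    # nearly 9 hours

print(five_nines, three_nines)
```

Each additional nine cuts the allowable downtime by a factor of ten, which is why the gap between 99.9% and 99.999% matters so much to a bank.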

JCL: Job Control Language

JCL (Job Control Language) is the scripting language used to submit and manage batch jobs on z/OS. It tells the operating system what programs to run, what files to use, and how to handle errors. A typical JCL job stream might compile a COBOL program, link it with subroutines, and execute it against a production database -- all as an automated, scheduled process.

JCL looks unlike any modern scripting language:

//PAYROLL  JOB (ACCT),'MONTHLY PAYROLL',CLASS=A,MSGCLASS=X
//STEP01   EXEC PGM=PAYRCALC
//INFILE   DD DSN=PROD.EMPLOYEE.MASTER,DISP=SHR
//OUTFILE  DD DSN=PROD.PAYROLL.MONTHLY,DISP=(NEW,CATLG),
//            SPACE=(CYL,(50,10)),DCB=(RECFM=FB,LRECL=200)
//SYSOUT   DD SYSOUT=*

While JCL can be intimidating at first, it is a powerful job scheduling and resource management language that has no direct equivalent in the distributed computing world.

CICS: Customer Information Control System

CICS (pronounced "kicks") is IBM's online transaction processing system. While batch processing handles large volumes of data overnight, CICS handles real-time, interactive transactions -- the kind that happen when a bank teller looks up your account or a customer service representative processes a return.

CICS provides:

  • Transaction management: ACID-compliant transaction processing
  • Terminal handling: Communication with thousands of concurrent users
  • Program management: Loading, executing, and managing COBOL programs
  • Security: User authentication and resource-level access control
  • Recovery: Automatic transaction rollback in case of failure

Most online COBOL programs in banking, insurance, and government run under CICS. When you use an ATM, there is a good chance that a CICS transaction on a mainframe is processing your request.

DB2: The Relational Database

DB2 is IBM's relational database management system for mainframes. COBOL programs interact with DB2 using embedded SQL statements. A COBOL program can SELECT, INSERT, UPDATE, and DELETE data in DB2 tables just as a Java program might interact with Oracle or PostgreSQL.

The combination of COBOL and DB2 is one of the most common technology pairings in enterprise computing. Billions of database records -- customer accounts, policy details, transaction histories -- are managed by COBOL programs communicating with DB2.

A typical embedded SQL statement in a COBOL program looks like this:

EXEC SQL
    SELECT CUSTOMER_NAME, ACCOUNT_BALANCE
    INTO :WS-CUST-NAME, :WS-ACCT-BALANCE
    FROM CUSTOMER_MASTER
    WHERE ACCOUNT_NUMBER = :WS-ACCT-NUM
END-EXEC

The colons before variable names indicate COBOL host variables. The SQL statement is preprocessed by a DB2 precompiler before the COBOL compiler processes the rest of the program. This seamless integration between COBOL and SQL has been in production use since the 1980s and remains the backbone of data access in mainframe applications.
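
Readers who know SQL from other languages may recognize host variables as a form of parameter binding. The Python sketch below is a loose analogy, not a depiction of how Db2 or the precompiler works: it uses the standard sqlite3 module with an in-memory database as a stand-in, and the table contents and the lookup_customer function are invented for illustration.

```python
import sqlite3
from decimal import Decimal

# In-memory SQLite database standing in for Db2 (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER_MASTER ("
    " ACCOUNT_NUMBER TEXT PRIMARY KEY,"
    " CUSTOMER_NAME TEXT,"
    " ACCOUNT_BALANCE TEXT)"
)
# The balance is stored as text so the decimal value stays exact.
conn.execute(
    "INSERT INTO CUSTOMER_MASTER VALUES (?, ?, ?)",
    ("0001234567", "JANE DOE", "1234.56"),
)

# :acct_num is a named parameter, playing the role of the host variable
# :WS-ACCT-NUM in the embedded SQL above.
QUERY = (
    "SELECT CUSTOMER_NAME, ACCOUNT_BALANCE "
    "FROM CUSTOMER_MASTER "
    "WHERE ACCOUNT_NUMBER = :acct_num"
)


def lookup_customer(acct_num: str):
    """Return (name, balance) for an account, or None when no row matches."""
    row = conn.execute(QUERY, {"acct_num": acct_num}).fetchone()
    if row is None:
        # A COBOL program would detect this case by checking for SQLCODE +100.
        return None
    name, balance = row
    return name, Decimal(balance)


print(lookup_customer("0001234567"))
print(lookup_customer("9999999999"))
```

The main difference is where results land: here they come back as a return value, whereas in COBOL the INTO clause moves them directly into working-storage fields.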

VSAM: Virtual Storage Access Method

Before relational databases became widespread, much mainframe data was stored in VSAM files. VSAM (Virtual Storage Access Method) provides indexed, sequential, and relative record access. Many legacy COBOL programs still use VSAM files, and understanding VSAM is essential for maintaining and modernizing these systems.

VSAM file types include:

  • KSDS (Key-Sequenced Data Set): Records accessed by a unique key, similar to a database table with a primary key
  • ESDS (Entry-Sequenced Data Set): Records stored in the order they were written, like a log file
  • RRDS (Relative Record Data Set): Records accessed by their relative position in the file
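
As a rough mental model only, the three access patterns above resemble familiar in-memory data structures. The Python sketch below is an analogy under that assumption; real VSAM data sets live on disk and are managed by the operating system, and every name and record here is invented.

```python
# KSDS: records are found by key and can also be browsed in key order.
ksds = {}
ksds["0002"] = "SMITH"
ksds["0001"] = "DOE"
in_key_order = [ksds[key] for key in sorted(ksds)]

# ESDS: records are only ever added at the end, in arrival order.
esds = []
esds.append("DEPOSIT    0002")
esds.append("WITHDRAWAL 0001")

# RRDS: a fixed set of numbered slots; relative record numbers start at 1.
SLOT_COUNT = 5
rrds = [None] * SLOT_COUNT


def _slot_index(relative_record_number: int) -> int:
    """Convert a 1-based relative record number to a list index."""
    if not 1 <= relative_record_number <= SLOT_COUNT:
        raise IndexError(f"no slot {relative_record_number}")
    return relative_record_number - 1


def rrds_write(relative_record_number: int, record: str) -> None:
    rrds[_slot_index(relative_record_number)] = record


def rrds_read(relative_record_number: int):
    return rrds[_slot_index(relative_record_number)]


rrds_write(3, "RECORD IN SLOT 3")

print(in_key_order)   # ['DOE', 'SMITH']
print(esds[0])        # first record written
print(rrds_read(3))   # RECORD IN SLOT 3
```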

Batch Processing

Batch processing is the workhorse of mainframe computing. While online systems like CICS handle individual transactions in real time, batch processing handles massive volumes of data in scheduled runs -- typically overnight or during low-usage periods.

A typical batch processing cycle at a bank might include:

  1. End-of-day processing: Post all of the day's transactions to master files
  2. Interest calculation: Calculate interest on millions of accounts
  3. Statement generation: Produce monthly statements for all customers
  4. Regulatory reporting: Generate reports required by banking regulators
  5. Data warehouse loading: Extract and transform data for analytical systems

Each of these steps is typically implemented as a series of COBOL programs, orchestrated by JCL job streams, and monitored by operations staff. A single nightly batch window might execute thousands of COBOL programs in a carefully sequenced order.

Understanding the difference between batch and online processing is fundamental to working in the mainframe environment. A COBOL programmer may write programs for both modes, but the programming patterns are quite different. Batch programs typically process entire files from beginning to end, reading millions of records in a single run. Online CICS programs process one transaction at a time, responding to individual user requests in milliseconds. Both types of programs are written in COBOL, but they interact with the system in fundamentally different ways.
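
The contrast between the two patterns can be sketched in a few lines of Python. This is a deliberately simplified illustration, not mainframe code: the master file is a dictionary, the transaction file is a list, and all names and amounts are hypothetical.

```python
from decimal import Decimal

# Hypothetical master file: account number -> balance.
master = {"0001": Decimal("100.00"), "0002": Decimal("250.50")}

# Hypothetical transaction file for the day: (account number, amount).
transactions = [
    ("0001", Decimal("25.00")),
    ("0002", Decimal("-50.50")),
    ("0001", Decimal("-10.00")),
]


def run_batch_posting(master: dict, transactions: list) -> int:
    """Batch pattern: read the whole transaction file, front to back, in one run."""
    posted = 0
    for account, amount in transactions:
        master[account] += amount
        posted += 1
    return posted


def handle_balance_inquiry(master: dict, account: str):
    """Online pattern: answer one request and return immediately."""
    return master.get(account)


posted_count = run_batch_posting(master, transactions)
print(posted_count)                             # 3
print(handle_balance_inquiry(master, "0001"))   # 115.00
```

The batch function is driven by the data: it runs until the input is exhausted. The online function is driven by the user: it runs once per request and must finish quickly.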

The Integration Layer

What makes the mainframe ecosystem so powerful is not any single component but the integration between components. A customer checking their balance at an ATM triggers a CICS transaction that executes a COBOL program, which queries a DB2 table, checks a VSAM file for recent pending transactions, and returns a response -- all in less than a second. That evening, a batch COBOL program will reconcile the day's ATM transactions, updating the master files and generating reports for the operations team. JCL orchestrates the entire batch sequence, while RACF (Resource Access Control Facility), IBM's security manager for z/OS, ensures that only authorized programs and users can access sensitive data.

This deeply integrated stack has been refined over decades. Every component trusts every other component. Error handling, recovery procedures, and security policies are baked in at every level. Replicating this level of integration and reliability in a distributed system is one of the reasons mainframe replacement projects are so challenging and so frequently unsuccessful.


1.5 COBOL by the Numbers

The scale of COBOL's deployment in the global economy is staggering. While exact figures are difficult to verify independently, the following statistics are widely cited by industry analysts, IBM, and organizations like the COBOL Working Group:

  • 220+ billion lines of COBOL code are in active production use worldwide
  • 95% of ATM transactions are processed by COBOL programs
  • 80% of in-person transactions at financial institutions involve COBOL
  • $3 trillion in daily commerce flows through COBOL systems
  • 43% of all banking systems are built on COBOL
  • 80% of business transactions worldwide touch COBOL at some point
  • COBOL systems process roughly 200 times as many transactions each day as Google handles searches
  • Every Fortune 500 company uses COBOL in some capacity

These numbers are not relics of a bygone era. New COBOL code is being written every day. IBM's Enterprise COBOL compiler continues to receive regular updates, with version 6.4 released to take advantage of the z16's hardware capabilities. Micro Focus, now part of OpenText, maintains COBOL compilers for Windows, Linux, and cloud platforms. GnuCOBOL, the open-source COBOL compiler, enables COBOL development on virtually any platform.

The COVID-19 pandemic in 2020 brought COBOL's continued importance into sharp public focus. When unemployment claims surged to unprecedented levels, state governments across the United States discovered that their claims processing systems -- written in COBOL decades earlier -- could not handle the volume. The problem was not the language itself (COBOL can scale enormously) but the shortage of developers who knew how to modify and optimize these systems. Several states, including New Jersey, Connecticut, and Kansas, issued public calls for COBOL programmers, making international headlines. New Jersey Governor Phil Murphy publicly appealed for volunteers with COBOL skills. The story was covered by the New York Times, Reuters, CNN, and hundreds of other outlets worldwide.

The pandemic episode was a wake-up call, but it should not have been a surprise. COBOL's dominance in transaction processing has been well documented for decades. The language's continued prevalence is not the result of inertia or neglect -- it is the result of deliberate choices by organizations that depend on systems that must not fail. When 95% of ATM transactions flow through a technology, it is not legacy -- it is infrastructure.

Where COBOL Lives Today

To make these statistics concrete, consider a typical day in the life of COBOL-powered systems:

  • 6:00 AM: A bank's overnight batch processing completes, having calculated interest on 12 million savings accounts, processed 500,000 mortgage payments, and generated regulatory reports for three federal agencies.
  • 8:30 AM: An insurance company's COBOL system processes 15,000 new claims from the previous day, cross-referencing each against policy databases and fraud detection rules.
  • 10:00 AM: A federal government agency's COBOL programs process 40,000 Social Security benefit payments, calculating each one based on complex eligibility rules encoded over 50 years of legislation.
  • 12:00 PM: A major retailer's mainframe processes 2.5 million credit card authorizations from its 3,000 stores.
  • 3:00 PM: An airline's reservation system, running on COBOL, manages 200,000 seat assignments, pricing calculations, and booking confirmations.
  • 11:00 PM: The cycle begins again as batch windows open at financial institutions across the country.

This is not the profile of a dying technology. This is the profile of a technology so deeply embedded in the world's economic infrastructure that it has become invisible -- like plumbing or electrical wiring. You do not think about it until it stops working. And when it stops working, the consequences are measured in billions of dollars.


1.6 The Skills Gap Crisis

An Aging Workforce

The COBOL skills gap is one of the most pressing challenges in enterprise technology. The majority of experienced COBOL programmers entered the workforce in the 1970s and 1980s. Many of these developers are now in their 60s and 70s, and they are retiring in large numbers.

According to industry surveys:

  • The average age of a COBOL programmer is estimated to be over 55
  • An estimated 75% of the current COBOL workforce will retire within the next decade
  • Universities have largely stopped teaching COBOL, reducing the pipeline of new talent
  • Many organizations have lost institutional knowledge about their COBOL systems as senior developers retire without transferring their expertise

The Retirement Wave

The challenge is not simply that developers are retiring. The deeper problem is that decades of business logic -- the rules that govern how a bank calculates interest, how an insurance company processes claims, how a government agency determines benefit eligibility -- are encoded in COBOL programs that only a shrinking number of people understand.

When a senior COBOL developer retires, they take with them not just knowledge of the COBOL language but an understanding of the business processes, data flows, regulatory requirements, and system interdependencies that their programs implement. This institutional knowledge is often undocumented and cannot be easily replaced.

Demand Exceeding Supply

The result of this demographic shift is a classic supply-and-demand imbalance. Organizations that depend on COBOL systems -- which includes virtually every large bank, insurance company, government agency, and retailer -- need COBOL developers. But the supply of qualified COBOL developers is shrinking faster than the demand.

This imbalance manifests in several ways:

  • Rising salaries: Experienced COBOL developers command premium compensation
  • Extended retirements: Organizations offer incentives for senior developers to delay retirement or return as consultants
  • Outsourcing challenges: Even offshore development firms are finding it difficult to recruit and retain COBOL talent
  • Project delays: Modernization and maintenance projects are delayed due to staffing shortages
  • Increased risk: Systems are maintained by smaller teams, creating single-point-of-failure risks when key individuals leave


1.7 Why Learn COBOL Today

Career Opportunities

The COBOL skills gap is your opportunity. While thousands of computer science graduates enter the job market each year with Java, Python, and JavaScript skills, very few can claim COBOL expertise. This scarcity translates directly into career advantages:

  • High demand: Job postings for COBOL developers consistently appear on major job boards, with thousands of open positions at any given time
  • Low competition: For every COBOL position, there are far fewer applicants than for equivalent Java or Python roles
  • Industry diversity: COBOL skills are needed in banking, insurance, healthcare, government, retail, transportation, and telecommunications
  • Geographic flexibility: While many COBOL positions have traditionally been concentrated in financial centers, remote work has expanded geographic options significantly

Salary Data

COBOL developers are among the best-compensated programmers in the industry. While salaries vary by region, experience, and industry, the following ranges are representative of the U.S. market:

  • Entry-level COBOL developer: $55,000 - $75,000
  • Mid-level COBOL developer (3-5 years): $75,000 - $110,000
  • Senior COBOL developer (5-10 years): $100,000 - $140,000
  • COBOL architect / technical lead: $130,000 - $170,000
  • COBOL consultants and contractors: $75 - $150+ per hour

These figures often exceed the compensation for equivalent experience levels in more "popular" languages. The premium reflects the scarcity of COBOL talent and the critical nature of the systems these developers maintain.

Job Security

COBOL offers a level of job security that few other technologies can match. Consider:

  • The systems that run on COBOL are too large, too critical, and too expensive to simply replace. A major bank's core banking system might contain 50 million lines of COBOL code representing decades of business logic and regulatory compliance.
  • Replacement projects are measured in years and billions of dollars, and many have failed. The history of enterprise IT is littered with ambitious "rip and replace" projects that went over budget, over schedule, or were abandoned entirely.
  • Even as organizations modernize, they typically do so incrementally, wrapping COBOL programs in APIs or migrating individual components while the core COBOL systems continue to run. This means COBOL developers are needed both to maintain existing systems and to participate in modernization efforts.
  • Regulatory requirements in banking and insurance mandate stability and continuity. Regulators are skeptical of large-scale system replacements that could introduce risk into financial systems.

The bottom line: COBOL systems will be running for decades to come, and the developers who maintain and modernize them will be in demand for the foreseeable future.

Industries That Need COBOL Developers

The demand for COBOL skills spans a wide range of industries. Here are the primary sectors and the types of COBOL work they require:

Banking and Financial Services: Core banking systems, payment processing, credit card management, mortgage servicing, and regulatory compliance. This is the largest single employer of COBOL developers, accounting for an estimated 40% of all COBOL positions.

Insurance: Policy administration, claims processing, underwriting, actuarial calculations, and reinsurance management. Insurance systems are among the most complex COBOL applications, with business rules that reflect decades of regulatory evolution.

Government: Federal and state agencies use COBOL for tax processing (the IRS), Social Security benefit calculations, Medicare and Medicaid claims, unemployment insurance, and defense logistics. Government COBOL systems are often the oldest and most complex.

Healthcare: Hospital billing systems, health insurance claims processing, and pharmacy benefits management. The intersection of healthcare and insurance creates particularly complex COBOL systems.

Retail and Supply Chain: Point-of-sale transaction processing, inventory management, and logistics coordination. Major retailers process billions of transactions annually through COBOL systems.

Transportation: Airline reservation systems, freight management, and railroad logistics. Many of these systems have been running continuously since the 1970s and 1980s.

Telecommunications: Billing systems, customer management, and network provisioning. Telecom COBOL systems handle the billing for hundreds of millions of phone and internet accounts.

The diversity of these industries means that a COBOL developer can move between sectors, bringing their language skills to new domains while continuously expanding their business knowledge.


1.8 Your First COBOL Program: Hello, World

Every programming journey begins with "Hello, World." In COBOL, this simple program introduces several fundamental concepts about the language's structure and syntax.

The Code

Here is the complete Hello World program (see also code/example-01-hello-world.cob):

000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. HELLOWORLD.
000300*---------------------------------------------------------------
000400* A classic Hello World program in COBOL.
000500*---------------------------------------------------------------
000600 ENVIRONMENT DIVISION.
000700 DATA DIVISION.
000800 PROCEDURE DIVISION.
000900 MAIN-PARAGRAPH.
001000     DISPLAY "HELLO, WORLD! WELCOME TO COBOL.".
001100     STOP RUN.

Line-by-Line Walkthrough

Let us examine each line:

Line 000100: IDENTIFICATION DIVISION. Every COBOL program begins with the IDENTIFICATION DIVISION. This is required. The period at the end is mandatory -- COBOL uses periods as sentence terminators, similar to English.

Line 000200: PROGRAM-ID. HELLOWORLD. The PROGRAM-ID paragraph names the program. This is the only required paragraph in the IDENTIFICATION DIVISION. The name HELLOWORLD identifies this program to the compiler and the operating system. By convention, COBOL program names are uppercase and limited to 8 characters on mainframes (though modern compilers allow longer names). HELLOWORLD is 10 characters long: GnuCOBOL accepts it as written, while a mainframe compiler may, depending on its options, use only the first 8 characters.

Lines 000300-000500: Comments. Lines with * in column 7 are comments. COBOL comments extend to the end of the line. They are ignored by the compiler but are essential for documenting your code.

Line 000600: ENVIRONMENT DIVISION. The ENVIRONMENT DIVISION describes the computing environment. In this simple program, it is empty but present to demonstrate the four-division structure.

Line 000700: DATA DIVISION. The DATA DIVISION defines all data items (variables) used by the program. Our Hello World program does not use any variables, so this division is empty.

Line 000800: PROCEDURE DIVISION. The PROCEDURE DIVISION contains the executable code. This is where the program's logic lives.

Line 000900: MAIN-PARAGRAPH. A paragraph is a named block of code. MAIN-PARAGRAPH is a user-chosen name. Paragraph names must begin in Area A (columns 8-11).

Line 001000: DISPLAY "HELLO, WORLD! WELCOME TO COBOL.". The DISPLAY statement writes text to the standard output (typically the terminal or SYSOUT on a mainframe). The text is enclosed in quotation marks. Note that the statement is indented to Area B (columns 12-72).

Line 001100: STOP RUN. The STOP RUN statement terminates the program and returns control to the operating system.

Compiling and Running with GnuCOBOL

GnuCOBOL is a free, open-source COBOL compiler that runs on Windows, Linux, and macOS. It is an excellent tool for learning COBOL without access to a mainframe.

Installing GnuCOBOL:

On Ubuntu/Debian Linux:

sudo apt-get install gnucobol

On macOS with Homebrew:

brew install gnucobol

On Windows, GnuCOBOL can be installed via pre-built packages available from the GnuCOBOL project page or through package managers like Chocolatey.

Compiling the program:

cobc -x -o helloworld example-01-hello-world.cob

The -x flag tells the compiler to produce an executable. The -o flag specifies the output file name.

Running the program:

./helloworld

Expected output:

HELLO, WORLD! WELCOME TO COBOL.

Compiling on a Mainframe

On an IBM mainframe running z/OS, the compilation process is different. You would use JCL to invoke the Enterprise COBOL compiler:

//COMPILE  JOB (ACCT),'COMPILE HELLO',CLASS=A,MSGCLASS=X
//STEP01   EXEC PROC=IGYWCL
//COBOL.SYSIN DD *
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLOWORLD.
       ENVIRONMENT DIVISION.
       DATA DIVISION.
       PROCEDURE DIVISION.
       MAIN-PARAGRAPH.
           DISPLAY "HELLO, WORLD! WELCOME TO COBOL.".
           STOP RUN.
/*
//LKED.SYSLIB DD DSN=CEE.SCEELKED,DISP=SHR

The JCL procedure IGYWCL invokes the COBOL compiler and linkage editor. The compiled program can then be executed as a batch job or loaded into CICS for online execution.


1.9 Understanding the Four Divisions

Every COBOL program is organized into four divisions, each with a specific purpose. This structure is unique to COBOL and is one of the language's most distinctive features. The full structure is demonstrated in code/example-02-program-structure.cob.

IDENTIFICATION DIVISION

The IDENTIFICATION DIVISION is the program's identity card. It is the only division that is absolutely required in every COBOL program.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. PAYROLL-CALC.
       AUTHOR. JANE SMITH.
       INSTALLATION. CORPORATE DATA CENTER.
       DATE-WRITTEN. 2026-02-10.
       DATE-COMPILED.

Required paragraph:

  • PROGRAM-ID: Names the program. This is the only required element.

Optional paragraphs (for documentation):

  • AUTHOR: The programmer's name
  • INSTALLATION: The computing environment
  • DATE-WRITTEN: When the program was created
  • DATE-COMPILED: Automatically filled in by the compiler

Note that AUTHOR, INSTALLATION, DATE-WRITTEN, and DATE-COMPILED were classified as obsolete in COBOL-85 and removed from the standard in COBOL 2002, but they are still supported by most compilers. They remain in wide use in production code.

ENVIRONMENT DIVISION

The ENVIRONMENT DIVISION describes the hardware and software environment in which the program operates. It has two sections:

       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SOURCE-COMPUTER. IBM-MAINFRAME.
       OBJECT-COMPUTER. IBM-MAINFRAME.

       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT EMPLOYEE-FILE ASSIGN TO "EMPLOYEE.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.

CONFIGURATION SECTION: Specifies the computers used for compilation (SOURCE-COMPUTER) and execution (OBJECT-COMPUTER). In modern practice, this section is often omitted or left minimal.

INPUT-OUTPUT SECTION: Contains the FILE-CONTROL paragraph, which maps logical file names used in the program to physical files on the system. The SELECT statement establishes this mapping.

DATA DIVISION

The DATA DIVISION is where all data items (variables, constants, records, and file layouts) are defined. It is often the largest division in a business COBOL program.

       DATA DIVISION.
       FILE SECTION.
       FD  EMPLOYEE-FILE.
       01  EMPLOYEE-RECORD.
           05  EMP-ID             PIC 9(6).
           05  EMP-NAME           PIC X(30).
           05  EMP-SALARY         PIC 9(7)V99.

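       WORKING-STORAGE SECTION.
       01  WS-TOTAL-SALARY        PIC 9(9)V99 VALUE ZEROS.
       01  WS-EMPLOYEE-COUNT      PIC 9(5) VALUE ZEROS.
       01  WS-AVERAGE-SALARY      PIC 9(7)V99 VALUE ZEROS.
      * End-of-file switch. The 88-level condition name END-OF-FILE
      * is set and tested in the PROCEDURE DIVISION.
       01  WS-EOF-FLAG            PIC X VALUE "N".
           88  END-OF-FILE        VALUE "Y".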

FILE SECTION: Defines the record layout for each file declared in the FILE-CONTROL paragraph. The FD (File Description) entry provides file-level information, and the 01-level record description defines the structure of each record.

WORKING-STORAGE SECTION: Defines variables used by the program during execution. These variables persist for the life of the program. This is the section you will use most frequently as a beginning COBOL programmer.

LOCAL-STORAGE SECTION: Similar to WORKING-STORAGE, but variables are allocated and re-initialized each time the program is called. Useful for recursive and reentrant programs.

LINKAGE SECTION: Defines data items that are passed to the program from a calling program. Used for inter-program communication.
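
The difference between these sections is easiest to see when one program calls another. The following is a minimal sketch rather than one of the chapter's code files: the program names CALLDEMO and ADDTEN and all data names are hypothetical, and it assumes a compiler such as GnuCOBOL that accepts more than one program in a single source file. ADDTEN receives the caller's field through its LINKAGE SECTION and keeps one counter in WORKING-STORAGE and one in LOCAL-STORAGE.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CALLDEMO.
      * Hypothetical caller: passes WS-AMOUNT to ADDTEN twice.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-AMOUNT              PIC 9(5) VALUE 100.
       PROCEDURE DIVISION.
       MAIN-PARA.
           CALL "ADDTEN" USING WS-AMOUNT
           CALL "ADDTEN" USING WS-AMOUNT
           DISPLAY "AMOUNT AFTER TWO CALLS: " WS-AMOUNT
           STOP RUN.
       END PROGRAM CALLDEMO.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. ADDTEN.
      * Hypothetical subprogram: adds 10 to the caller's field.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Keeps its value from one call to the next.
       01  WS-CALL-COUNT          PIC 9(3) VALUE ZEROS.
       LOCAL-STORAGE SECTION.
      * Set back to its VALUE clause on every call.
       01  LS-CALL-COUNT          PIC 9(3) VALUE ZEROS.
       LINKAGE SECTION.
      * Describes the caller's data; no storage is allocated here.
       01  LK-AMOUNT              PIC 9(5).
       PROCEDURE DIVISION USING LK-AMOUNT.
       ADD-PARA.
           ADD 1 TO WS-CALL-COUNT
           ADD 1 TO LS-CALL-COUNT
           ADD 10 TO LK-AMOUNT
           DISPLAY "WS COUNT: " WS-CALL-COUNT
                   "  LS COUNT: " LS-CALL-COUNT
           GOBACK.
       END PROGRAM ADDTEN.
```

Across the two calls, the WORKING-STORAGE counter should read 001 and then 002, while the LOCAL-STORAGE counter should read 001 both times. WS-AMOUNT should finish at 120, because the subprogram updates the caller's field directly through LK-AMOUNT.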

The PICTURE clause (PIC) is COBOL's type system. It defines the size and type of each data item:

PIC Symbol   Meaning                    Example
9            Numeric digit              PIC 9(5) = 5-digit number
X            Alphanumeric character     PIC X(20) = 20-character field
A            Alphabetic character       PIC A(10) = 10-letter field
V            Implied decimal point      PIC 9(5)V99 = 5 integer digits, 2 decimal digits
S            Sign (positive/negative)   PIC S9(5) = signed number
Z            Zero suppression           PIC Z(4)9 = suppress leading zeros
$            Currency symbol            PIC $9,999.99
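
To see how these symbols behave, it can help to move the same value into differently pictured fields. The sketch below is illustrative only: the program name PICDEMO and every data name are hypothetical. It stores values in plain numeric fields and then moves them into edited fields for display.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PICDEMO.
      * Hypothetical demo of the PIC symbols listed above.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Storage fields: V marks where the decimal point is assumed.
       01  WS-COUNT               PIC 9(5)    VALUE 42.
       01  WS-PRICE               PIC 9(5)V99 VALUE 1234.50.
      * Edited fields: used only for presentation.
       01  WS-COUNT-EDITED        PIC Z(4)9.
       01  WS-PRICE-EDITED        PIC ZZ,ZZ9.99.
       01  WS-PRICE-DOLLARS       PIC $ZZ,ZZ9.99.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE WS-COUNT TO WS-COUNT-EDITED
           MOVE WS-PRICE TO WS-PRICE-EDITED
           MOVE WS-PRICE TO WS-PRICE-DOLLARS
           DISPLAY "COUNT:   " WS-COUNT-EDITED
           DISPLAY "PRICE:   " WS-PRICE-EDITED
           DISPLAY "DOLLARS: " WS-PRICE-DOLLARS
           STOP RUN.
```

The edited fields should show 42 and 1,234.50 with leading zeros replaced by spaces, and the last line should carry a leading dollar sign. The V in WS-PRICE occupies no storage, so how an unedited field like WS-PRICE appears when displayed directly may vary between compilers. This is one reason production programs usually move values into edited fields before printing them.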

PROCEDURE DIVISION

The PROCEDURE DIVISION contains all the executable logic. This is where computation happens. See code/example-03-first-calculation.cob for a complete arithmetic example.

       PROCEDURE DIVISION.
       MAIN-LOGIC.
           OPEN INPUT EMPLOYEE-FILE
           PERFORM READ-EMPLOYEE
           PERFORM PROCESS-EMPLOYEES
               UNTIL END-OF-FILE
           PERFORM CALCULATE-AVERAGE
           PERFORM DISPLAY-RESULTS
           CLOSE EMPLOYEE-FILE
           STOP RUN.

       READ-EMPLOYEE.
           READ EMPLOYEE-FILE
               AT END SET END-OF-FILE TO TRUE
           END-READ.

       PROCESS-EMPLOYEES.
           ADD 1 TO WS-EMPLOYEE-COUNT
           ADD EMP-SALARY TO WS-TOTAL-SALARY
           PERFORM READ-EMPLOYEE.

       CALCULATE-AVERAGE.
           IF WS-EMPLOYEE-COUNT > ZERO
               DIVIDE WS-TOTAL-SALARY BY WS-EMPLOYEE-COUNT
                   GIVING WS-AVERAGE-SALARY
               END-DIVIDE
           END-IF.

       DISPLAY-RESULTS.
           DISPLAY "TOTAL EMPLOYEES: " WS-EMPLOYEE-COUNT
           DISPLAY "AVERAGE SALARY:  " WS-AVERAGE-SALARY.

The PROCEDURE DIVISION is organized into sections and paragraphs. Paragraphs are named blocks of code (similar to functions or procedures in other languages) that can be invoked with the PERFORM statement.

Key statements you will encounter frequently:

  • MOVE: Assigns a value to a variable
  • ADD, SUBTRACT, MULTIPLY, DIVIDE: Arithmetic operations
  • COMPUTE: Evaluates arithmetic expressions
  • DISPLAY: Outputs text to the screen or log
  • ACCEPT: Reads input from the user or system
  • PERFORM: Calls a paragraph or section (like a function call)
  • IF/ELSE/END-IF: Conditional execution
  • EVALUATE: Multi-branch conditional (like switch/case)
  • READ/WRITE: File input/output
  • OPEN/CLOSE: File management
  • STOP RUN: Terminates the program
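
Several of these statements can be seen working together in a short program. This is a minimal sketch with hypothetical names (STMTDEMO, the WS- fields, and the pay bands are invented for illustration, as is the time-and-a-half overtime rule): it uses PERFORM to call paragraphs, IF/ELSE and COMPUTE to calculate gross pay, EVALUATE to classify the result, and MOVE and DISPLAY to present it.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STMTDEMO.
      * Hypothetical pay calculation; values are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-HOURS               PIC 9(3)    VALUE 45.
       01  WS-RATE                PIC 9(3)V99 VALUE 20.00.
       01  WS-GROSS-PAY           PIC 9(6)V99 VALUE ZEROS.
       01  WS-GROSS-EDITED        PIC ZZZ,ZZ9.99.
       01  WS-PAY-BAND            PIC X(6)    VALUE SPACES.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM CALCULATE-PAY
           PERFORM CLASSIFY-PAY
           MOVE WS-GROSS-PAY TO WS-GROSS-EDITED
           DISPLAY "GROSS PAY: " WS-GROSS-EDITED
           DISPLAY "PAY BAND:  " WS-PAY-BAND
           STOP RUN.

      * Hours beyond 40 are paid at 1.5 times the normal rate.
       CALCULATE-PAY.
           IF WS-HOURS > 40
               COMPUTE WS-GROSS-PAY =
                   (40 * WS-RATE)
                   + ((WS-HOURS - 40) * WS-RATE * 1.5)
           ELSE
               COMPUTE WS-GROSS-PAY = WS-HOURS * WS-RATE
           END-IF.

      * EVALUATE TRUE picks the first WHEN condition that holds.
       CLASSIFY-PAY.
           EVALUATE TRUE
               WHEN WS-GROSS-PAY < 500
                   MOVE "LOW" TO WS-PAY-BAND
               WHEN WS-GROSS-PAY < 1000
                   MOVE "MEDIUM" TO WS-PAY-BAND
               WHEN OTHER
                   MOVE "HIGH" TO WS-PAY-BAND
           END-EVALUATE.
```

With the sample values, 45 hours at 20.00 per hour works out to 40 x 20.00 + 5 x 20.00 x 1.5 = 950.00, which falls in the MEDIUM band. Note that STOP RUN ends MAIN-PARA before control can fall through into the paragraphs that follow it.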

1.10 Fixed-Format vs. Free-Format COBOL

Fixed Format: The Traditional Layout

Traditional COBOL uses a fixed-format source layout that dates back to the era of 80-column punch cards. Every character position in a line has a specific meaning:

Columns  1-6  : Sequence number area (line numbers, ignored by compiler)
Column   7    : Indicator area
                  * = comment line
                  - = continuation of previous line
                  D = debugging line
                  / = comment line with page eject
Columns  8-11 : Area A (division headers, section headers, paragraph
                names, level numbers 01 and 77, FD entries)
Columns 12-72 : Area B (statements, sentences, clauses, continuation
                of Area A entries)
Columns 73-80 : Identification area (ignored by compiler, historically
                used for program identification)

Here is an example showing the column alignment:

000100 IDENTIFICATION DIVISION.                                         HELLO
000200 PROGRAM-ID. HELLOWORLD.                                          HELLO
000300*This is a comment - asterisk in column 7                         HELLO
000400 ENVIRONMENT DIVISION.                                            HELLO
000500 DATA DIVISION.                                                   HELLO
000600 WORKING-STORAGE SECTION.                                         HELLO
000700 01  WS-NAME         PIC X(20) VALUE "STUDENT".                   HELLO
000800 PROCEDURE DIVISION.                                              HELLO
000900 MAIN-PARA.                                                       HELLO
001000     DISPLAY "HELLO " WS-NAME.                                    HELLO
001100     STOP RUN.                                                    HELLO

The fixed format enforces a disciplined structure, but it can feel restrictive to programmers accustomed to modern languages. Common pitfalls for beginners include accidentally placing code in the wrong column area, which can cause subtle compilation errors.
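
The indicator area in column 7 is the part of the layout that most often surprises newcomers, so a small sketch may help. The program name COLDEMO is hypothetical. The example shows a comment line and a debugging line, and it assumes standard behavior in which a D line is treated as a comment unless debugging mode is enabled (for example with WITH DEBUGGING MODE on the SOURCE-COMPUTER paragraph).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COLDEMO.
      * Column 7 holds an asterisk, so this line is a comment.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * The next line has D in column 7: it is compiled only
      * when debugging mode is enabled.
      D    DISPLAY "DEBUGGING LINE"
           DISPLAY "NORMAL LINE"
           STOP RUN.
```

Compiled without debugging mode, only NORMAL LINE should be displayed. MAIN-PARA starts in column 8 (Area A), while the statements start in column 12 (Area B).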

Free Format: The Modern Alternative

COBOL 2002 introduced free-format source, which removes the column restrictions. In free format:

  • Code can start in any column
  • Lines can be up to 255 characters long
  • Comments begin with *> rather than * in column 7
  • No sequence number area or identification area

Here is the same program in free format:

IDENTIFICATION DIVISION.
PROGRAM-ID. HELLOWORLD.
*> This is a comment in free format
ENVIRONMENT DIVISION.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  WS-NAME  PIC X(20) VALUE "STUDENT".
PROCEDURE DIVISION.
MAIN-PARA.
    DISPLAY "HELLO " WS-NAME.
    STOP RUN.

To compile in free format with GnuCOBOL, use the -free flag:

cobc -x -free -o helloworld helloworld.cob

Which Format Should You Learn?

Learn both, but start with fixed format. Here is why:

  1. The vast majority of existing COBOL code is in fixed format. If you are going to maintain or modernize production systems, you will encounter fixed-format code.

  2. Understanding fixed format gives you a deeper appreciation for COBOL's history and the constraints under which the language evolved.

  3. Fixed format enforces a visual discipline that makes code easier to scan and read, particularly for the hierarchical data definitions in the DATA DIVISION.

  4. Once you understand fixed format, switching to free format is trivial. The reverse is not always true.

Throughout this textbook, we will primarily use fixed format in our examples, with occasional free-format examples for comparison.


1.11 COBOL's Future: Modernization, Cloud, and Hybrid Architectures

The Modernization Imperative

Organizations are not abandoning COBOL, but they are modernizing how they use it. The modernization strategies that are gaining traction include:

API Enablement: Wrapping existing COBOL programs in RESTful APIs so that web applications, mobile apps, and microservices can call them. IBM's z/OS Connect and similar tools make it possible to expose a COBOL program as a JSON-based API without modifying the COBOL code itself.

DevOps Integration: Bringing modern development practices -- continuous integration, automated testing, version control with Git, and agile methodologies -- to the mainframe. Tools like IBM Wazi, Topaz for Total Test, and Zowe CLI are bridging the gap between mainframe and distributed development workflows.

Cloud Integration: IBM, AWS, and other cloud providers now offer mainframe-compatible services in the cloud. IBM's z/OS Cloud Broker and AWS Mainframe Modernization service allow organizations to run COBOL workloads in cloud environments or integrate mainframe data with cloud-native applications.

Incremental Migration: Rather than attempting a risky "big bang" replacement, organizations are migrating individual components from COBOL to modern languages while keeping the core COBOL systems running. This approach reduces risk and allows organizations to modernize at a sustainable pace.
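
The API enablement strategy described above works best with programs whose entire interface is a single, well-defined record. The sketch below illustrates that shape under stated assumptions: the program names BALTEST and BALINQ, the field names, and the hard-coded account data are all hypothetical, and the driver program simply stands in for whatever API layer would call the service. The assumption is that such a layer maps each field of the record to a field in a JSON request or response.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BALTEST.
      * Hypothetical driver standing in for an API layer.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-BALANCE-AREA.
           05  WS-ACCOUNT-ID      PIC 9(8)    VALUE 10000001.
           05  WS-BALANCE         PIC 9(7)V99 VALUE ZEROS.
           05  WS-STATUS          PIC X(2)    VALUE SPACES.
       PROCEDURE DIVISION.
       MAIN-PARA.
           CALL "BALINQ" USING WS-BALANCE-AREA
           DISPLAY "STATUS: " WS-STATUS
           STOP RUN.
       END PROGRAM BALTEST.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. BALINQ.
      * Hypothetical balance inquiry. A real program would read
      * the account from a file or database.
       DATA DIVISION.
       LINKAGE SECTION.
      * This record is the whole interface: one input field
      * (account id) and two output fields (balance, status).
       01  LK-BALANCE-AREA.
           05  LK-ACCOUNT-ID      PIC 9(8).
           05  LK-BALANCE         PIC 9(7)V99.
           05  LK-STATUS          PIC X(2).
       PROCEDURE DIVISION USING LK-BALANCE-AREA.
       INQUIRY-PARA.
           IF LK-ACCOUNT-ID = 10000001
               MOVE 1500.75 TO LK-BALANCE
               MOVE "OK" TO LK-STATUS
           ELSE
               MOVE ZEROS TO LK-BALANCE
               MOVE "NF" TO LK-STATUS
           END-IF
           GOBACK.
       END PROGRAM BALINQ.
```

Because BALINQ does no terminal or file I/O of its own and communicates only through LK-BALANCE-AREA, the same logic could in principle be called from a batch job, an online transaction, or an API wrapper without changing the COBOL code.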

Cloud Mainframes

The emergence of cloud-based mainframe services represents a significant shift. Organizations can now:

  • Run COBOL workloads on virtual mainframes in the cloud
  • Use cloud-based development environments for COBOL programming
  • Integrate mainframe data with cloud-native analytics and AI services
  • Scale mainframe capacity on demand without purchasing physical hardware

This does not eliminate the need for COBOL skills -- quite the opposite. Cloud mainframe services require developers who understand both COBOL and cloud technologies, a combination that commands premium compensation.

Hybrid Architectures

The future of enterprise computing is not "mainframe or cloud" but "mainframe and cloud." Hybrid architectures that combine the reliability and throughput of mainframe COBOL systems with the flexibility and innovation velocity of cloud-native technologies are becoming the standard approach.

In a typical hybrid architecture:

  • Core transaction processing remains on the mainframe in COBOL, where it benefits from decades of optimization and proven reliability
  • Customer-facing interfaces (web and mobile) are built with modern frameworks and communicate with the mainframe through APIs
  • Analytics and machine learning run in the cloud, consuming data produced by mainframe COBOL systems
  • New business capabilities are developed in cloud-native languages but integrate with mainframe systems for data access and transaction processing

This hybrid model ensures that COBOL developers are not working in isolation but are part of cross-functional teams that span the entire technology stack.

The Role of AI and Automation

Artificial intelligence is beginning to play a role in COBOL modernization. AI-assisted code analysis tools can help developers understand complex COBOL programs, identify dead code, map data flows, and even suggest translations to modern languages. However, these tools are assistive, not autonomous -- they require skilled COBOL developers to validate their output and make architectural decisions.

AI-powered code generation and completion tools are also being adapted for COBOL, helping new developers write correct COBOL code more quickly. These tools reduce the learning curve without eliminating the need for human understanding of the language and its ecosystem.

What This Means for You

If you are reading this textbook, you are entering the COBOL field at a moment of transformation. The developers who will be most valuable in the coming decades are those who can bridge two worlds: they understand COBOL and the mainframe ecosystem deeply enough to work with existing systems, and they are comfortable with modern technologies like REST APIs, Git, cloud platforms, and agile methodologies.

This hybrid skill set is rare and becoming rarer. Most experienced COBOL developers did not grow up with cloud computing. Most cloud-native developers have never seen a mainframe. The developers who can work in both worlds -- who can debug a COBOL batch abort in the morning and design a REST API in the afternoon -- will command premium compensation and have their pick of opportunities.

This textbook is designed to make you one of those developers.


1.12 Chapter Summary

In this chapter, you have been introduced to COBOL -- a language that has been quietly running the world's economy for over six decades. Here is what we covered:

The History: COBOL was born in 1959 from the CODASYL committee, championed by Grace Hopper and mandated by the Department of Defense. It was designed to be a common, portable, English-like language for business data processing.

The Design Philosophy: COBOL's verbosity is intentional. Its English-like syntax, strong data typing through PICTURE clauses, and self-documenting style were designed to make business programs readable by non-programmers.

The Standards Evolution: From COBOL-60 through COBOL 2014, the language has continuously evolved, adding structured programming (COBOL-85), object orientation (COBOL 2002), and modern data format support (COBOL 2014).

The Mainframe Ecosystem: COBOL operates within a sophisticated technology stack -- z/OS, JCL, CICS, DB2, VSAM -- that has been optimized for reliability and throughput over decades.

The Numbers: By common industry estimates, over 220 billion lines of COBOL code process $3 trillion in daily commerce, handle 95% of ATM transactions, and power the core systems of virtually every major financial institution.

The Skills Gap: The aging COBOL workforce is retiring faster than new developers are being trained, creating a significant supply-demand imbalance that translates into career opportunity.

Your First Program: You wrote, compiled, and ran your first COBOL program, learning about the four-division structure, fixed-format column layout, and the basic mechanics of COBOL compilation.

The Future: COBOL is not disappearing. It is being modernized through API enablement, DevOps integration, cloud deployment, and hybrid architectures. The developers who bridge the COBOL and cloud worlds are among the most valuable in enterprise IT.

In the next chapter, we will set up your development environment and take a deeper dive into the COBOL syntax that will become the foundation of your programming skills.


Looking Ahead

Chapter 2: Setting Up Your Development Environment will guide you through installing GnuCOBOL, configuring your editor, and establishing a productive COBOL development workflow. You will also learn about mainframe emulators and cloud-based mainframe development environments that can give you hands-on experience with the z/OS ecosystem.


"I had a running compiler, and nobody would touch it. They told me computers could only do arithmetic." -- Grace Hopper, on her early work with compilers


Code Examples for This Chapter

The following code files accompany this chapter and are located in the code/ directory:

File                               Description
example-01-hello-world.cob         Classic Hello World program in fixed format
example-02-program-structure.cob   All four divisions with detailed comments
example-03-first-calculation.cob   Basic arithmetic operations demonstration
case-study-code.cob                Bank welcome program (Case Study 01)
exercise-solutions.cob             Solutions for selected exercises