Case Study 1: SecureFirst's Mobile API Layer over CICS

From 3270 Screens to 4,000 JSON Requests Per Second


Background

SecureFirst Retail Bank is a mid-size retail institution with 800,000 customers, a growing mobile banking user base, and a core banking system built on CICS TS 5.6 with DB2. The COBOL programs powering core functions — account inquiry, fund transfer, bill payment, and transaction history — have been in production for 18 years. They process 3270 terminal traffic from 120 branch locations and an ATM network of 450 units.

In Q3 2023, SecureFirst's board approved a mobile-first digital strategy. The requirement: a native iOS/Android banking app with real-time account access, fund transfers, and payment scheduling. The timeline: 9 months to launch.

Carlos Vega, the mobile API architect hired from a fintech startup, had never seen a mainframe. Yuki Nakamura, the DevOps lead who had spent five years bringing modern tooling to SecureFirst's z/OS environment, became his counterpart on the mainframe side. Their collaboration — sometimes contentious, ultimately productive — produced an API layer that became the most reliable component in SecureFirst's entire digital stack.


The Initial Misstep

Carlos's initial architecture proposed a Java microservices layer that would replicate the COBOL business logic and access DB2 directly:

Mobile App → API Gateway → Java Microservices → DB2
                                    (replaces COBOL)

He estimated 6 months for the core four services (balance, transfer, history, payments). His rationale: "We need clean REST APIs with proper error handling, pagination, and HATEOAS links. COBOL programs weren't designed for that."

Yuki's response, delivered during an architecture review that both parties later described as "spirited," was direct: "You're proposing to rewrite 18 years of production-hardened code in 6 months. How many of those 18 years of edge cases have you documented?"

Carlos had documentation for approximately 40% of the business rules. The remaining 60% was embedded in the COBOL programs — conditional logic accumulated over 18 years of production fixes, regulatory changes, and audit findings.

The turning point came when Carlos attempted to replicate the fund transfer validation logic. The COBOL program's VALIDATE-TRANSFER paragraph contained 47 conditions, including:

  • Six different hold types that affect available balance
  • Three regulatory limits based on account type and customer classification
  • A complex fee calculation that depends on transfer amount, source account type, destination type, and day of week
  • Special handling for inter-bank transfers that cross a daily cumulative threshold
  • A retroactive adjustment for transfers that reverse previously posted items

Replicating this logic — correctly, with all edge cases — in Java would take months and carry significant regression risk. Any discrepancy between the Java and COBOL versions would create a customer-facing inconsistency that auditors and regulators would flag.

Carlos conceded: "I spent three weeks trying to replicate one paragraph. There are forty paragraphs in the transfer program."


The Service Enablement Architecture

Yuki proposed the service enablement approach: wrap the existing COBOL programs as REST services. The mobile app calls the same COBOL programs that 3270 terminals call, with CICS handling the JSON transformation.

Phase 1: Balance Inquiry (Proof of Concept)

The team started with balance inquiry — the simplest service, read-only, single DB2 query.

Existing COBOL program: ACCTINQP

  • Interface: COMMAREA (256 bytes)
  • Input: 10-byte account ID, 1-byte request type
  • Output: Account details including balance, available balance, status, last activity date
  • Average execution time: 4ms (including DB2 query)

Step 1: Generate WSBIND from copybook

Yuki ran DFHLS2JS against the existing copybook ACCTINQC:

//LS2JS    EXEC PGM=DFHLS2JS,PARM='PDSLIB=YES'
//STEPLIB  DD DSN=CICSTS56.CICS.SDFHLOAD,DISP=SHR
//SYSPRINT DD SYSOUT=*
//INPUT    DD *
LANG=COBOL
LOGFILE=/u/sfirst/cics/log/acctinq_ls2js.log
PDSLIB=//SFIRST.COBOL.COPYLIB(ACCTINQC)
REQMEM=ACCT-INQ-REQUEST
RESPMEM=ACCT-INQ-RESPONSE
PGMNAME=ACCTINQP
URI=/api/v1/accounts
PGMINT=COMMAREA
MAPPING-LEVEL=3.0
JSON-SCHEMA=/u/sfirst/cics/schemas/acctinq.json
WSBIND=/u/sfirst/cics/wsbind/acctinq.wsbind
FIELD-NAME-MAP=/u/sfirst/cics/maps/acctinq_names.map
/*

The field name map file provided clean JSON names:

ACCT-RSP-ACCOUNT-ID=accountId
ACCT-RSP-ACCOUNT-TYPE=accountType
ACCT-RSP-ACCOUNT-NAME=accountName
ACCT-RSP-BALANCE=balance
ACCT-RSP-AVAILABLE-BAL=availableBalance
ACCT-RSP-LAST-ACTIVITY=lastActivity
ACCT-RSP-STATUS=status
ACCT-RSP-RETURN-CODE=returnCode
ACCT-REQ-ACCOUNT-ID=accountId
ACCT-REQ-REQUEST-TYPE=requestType

Step 2: Define CICS resources

CEDA DEFINE TCPIPSERVICE(SFHTTPS)
     GROUP(SFWEBGRP)
     PORTNUMBER(8443)
     STATUS(OPEN)
     PROTOCOL(HTTP)
     TRANSACTION(CWXN)
     SSL(YES)
     CERTIFICATE(SFIRST-CICS-SERVER)
     AUTHENTICATE(CERTIFICATE)
     MAXPERSIST(120)
     SOCKETCLOSE(00,00,30)
     BACKLOG(256)

CEDA DEFINE PIPELINE(SFJSONP)
     GROUP(SFWEBGRP)
     CONFIGFILE(/u/sfirst/cics/config/json-provider.xml)
     STATUS(ENABLED)
     SHELF(/u/sfirst/cics/shelf/)

CEDA DEFINE WEBSERVICE(SFACCTWS)
     GROUP(SFWEBGRP)
     PIPELINE(SFJSONP)
     WSBIND(/u/sfirst/cics/wsbind/acctinq.wsbind)
     VALIDATION(YES)
     STATE(ENABLED)

CEDA DEFINE URIMAP(SFACCTGQ)
     GROUP(SFWEBGRP)
     USAGE(PIPELINE)
     SCHEME(HTTPS)
     HOST(*)
     PORT(8443)
     PATH(/api/v1/accounts/{accountId})
     TCPIPSERVICE(SFHTTPS)
     PIPELINE(SFJSONP)
     WEBSERVICE(SFACCTWS)
     TRANSACTION(AINQ)
     MEDIATYPE(application/json)

Step 3: Define the web service transaction

CEDA DEFINE TRANSACTION(AINQ)
     GROUP(SFWEBGRP)
     PROGRAM(DFHPIDSH)
     PROFILE(DFHCICST)
     STATUS(ENABLED)
     TASKDATALOC(ANY)

A separate RACF profile for AINQ allowed the team to apply different authorization rules for mobile API access versus 3270 access:

RDEFINE TCICSTRN AINQ UACC(NONE)
PERMIT AINQ CLASS(TCICSTRN) ID(SFAPIUSR) ACCESS(READ)

Step 4: Test

Carlos sent the first test request from his laptop:

GET https://cics-dev.securefirst.internal:8443/api/v1/accounts/1234567890

Response (7ms end-to-end through the development API gateway):

{
  "returnCode": 0,
  "accountId": "1234567890",
  "accountType": "CH",
  "accountName": "CARLOS A VEGA",
  "balance": 5432.10,
  "availableBalance": 5182.10,
  "lastActivity": "2023-09-15",
  "status": "A"
}

Carlos's reaction, as Yuki recalls: "He stared at the screen for about ten seconds and then said, 'That's it? The same program, same data, same logic — just... JSON?' He expected it to be harder."
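
On the consuming side, one detail worth handling early is that the amounts arrive as JSON numbers. A minimal Python sketch, assuming the response body shown above, parses them as `Decimal` rather than binary floating point. The name `held` for the difference between the two balances is an assumption made for illustration.

```python
import json
from decimal import Decimal

# Response body copied from the test request above.
RESPONSE_BODY = """
{
  "returnCode": 0,
  "accountId": "1234567890",
  "accountType": "CH",
  "accountName": "CARLOS A VEGA",
  "balance": 5432.10,
  "availableBalance": 5182.10,
  "lastActivity": "2023-09-15",
  "status": "A"
}
"""

# parse_float=Decimal keeps monetary values exact instead of converting
# them to binary floats.
account = json.loads(RESPONSE_BODY, parse_float=Decimal)

# Difference between ledger balance and available balance.
held = account["balance"] - account["availableBalance"]
print(held)
```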

Phase 2: Fund Transfer

Fund transfer was significantly more complex. The existing program XFERP used a 1,024-byte COMMAREA with nested structures and conditional logic.

Challenge 1: REDEFINES in the copybook

The XFER-DEST-INFO group used REDEFINES for internal versus external transfers:

       05  XFER-DEST-INFO.
           10  XFER-DEST-TYPE      PIC X(01).
           10  XFER-DEST-INTERNAL.
               15  XFER-INT-ACCT   PIC X(10).
           10  XFER-DEST-EXTERNAL
               REDEFINES XFER-DEST-INTERNAL.
               15  XFER-EXT-ROUTING PIC X(09).
               15  XFER-EXT-ACCT   PIC X(12).
               15  XFER-EXT-NAME   PIC X(30).

DFHLS2JS could not map the two overlapping layouts onto a single JSON schema, so Yuki's team created a flat wrapper copybook that gives each alternative its own fields:

       01  XFER-WS-REQUEST.
           05  WS-FROM-ACCOUNT     PIC X(10).
           05  WS-DEST-TYPE        PIC X(01).
           05  WS-INT-ACCT         PIC X(10).
           05  WS-EXT-ROUTING      PIC X(09).
           05  WS-EXT-ACCT         PIC X(12).
           05  WS-EXT-NAME         PIC X(30).
           05  WS-AMOUNT           PIC S9(09)V99 COMP-3.
           05  WS-CURRENCY         PIC X(03).
           05  WS-MEMO             PIC X(50).

And a thin wrapper program XFERWSP that mapped between the flat wrapper copybook and the legacy COMMAREA with REDEFINES:

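       PROCEDURE DIVISION.
      *    FLENGTH is both input and output on GET CONTAINER INTO:
      *    set it to the size of the receiving area first
           MOVE LENGTH OF XFER-WS-REQUEST TO WS-DATA-LEN
           EXEC CICS GET CONTAINER('DFHWS-DATA')
               CHANNEL('DFHWS-CHANNEL')
               INTO(XFER-WS-REQUEST)
               FLENGTH(WS-DATA-LEN)
           END-EXEC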

      *    Map wrapper to legacy COMMAREA
           INITIALIZE XFER-COMMAREA
           MOVE WS-FROM-ACCOUNT TO XFER-FROM-ACCT
           MOVE WS-DEST-TYPE    TO XFER-DEST-TYPE
           EVALUATE XFER-DEST-TYPE
               WHEN 'I'
                   MOVE WS-INT-ACCT TO XFER-INT-ACCT
               WHEN 'E'
                   MOVE WS-EXT-ROUTING TO XFER-EXT-ROUTING
                   MOVE WS-EXT-ACCT    TO XFER-EXT-ACCT
                   MOVE WS-EXT-NAME    TO XFER-EXT-NAME
           END-EVALUATE
           MOVE WS-AMOUNT   TO XFER-AMOUNT
           MOVE WS-CURRENCY TO XFER-CURRENCY
           MOVE WS-MEMO     TO XFER-MEMO

      *    Call the real transfer program
           EXEC CICS LINK
               PROGRAM('XFERP')
               COMMAREA(XFER-COMMAREA)
               LENGTH(LENGTH OF XFER-COMMAREA)
           END-EXEC

      *    Map response back to wrapper
           MOVE XFER-RSP-CODE TO WS-RSP-RETURN-CODE
           MOVE XFER-RSP-REF  TO WS-RSP-REFERENCE
           MOVE XFER-RSP-BAL  TO WS-RSP-NEW-BALANCE
           MOVE XFER-RSP-MSG  TO WS-RSP-MESSAGE

           EXEC CICS PUT CONTAINER('DFHWS-DATA')
               CHANNEL('DFHWS-CHANNEL')
               FROM(XFER-WS-RESPONSE)
               FLENGTH(LENGTH OF XFER-WS-RESPONSE)
           END-EXEC

           EXEC CICS RETURN END-EXEC

This wrapper pattern became the standard for all services where the legacy copybook was incompatible with DFHLS2JS.
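
Why a flat wrapper is needed becomes clearer when the overlay is modelled as raw bytes. The Python sketch below is an off-host illustration under stated assumptions: it uses ASCII instead of EBCDIC, treats the overlay as 51 bytes wide (the size of the external layout), and the helper names are hypothetical.

```python
INT_LEN = 10                 # XFER-INT-ACCT
EXT_LENS = (9, 12, 30)       # XFER-EXT-ROUTING, XFER-EXT-ACCT, XFER-EXT-NAME
OVERLAY_LEN = sum(EXT_LENS)  # 51 bytes, the wider of the two layouts


def pack_dest_info(dest_type, int_acct="", ext_routing="", ext_acct="", ext_name=""):
    """Build XFER-DEST-INFO as bytes: 1 type byte followed by the shared overlay."""
    if dest_type == "I":
        overlay = int_acct.ljust(INT_LEN)
    elif dest_type == "E":
        overlay = (ext_routing.ljust(EXT_LENS[0])
                   + ext_acct.ljust(EXT_LENS[1])
                   + ext_name.ljust(EXT_LENS[2]))
    else:
        raise ValueError("destination type must be I or E")
    return (dest_type + overlay.ljust(OVERLAY_LEN)).encode("ascii")


def read_as_internal(area):
    """Interpret the overlay through the internal layout, ignoring the type byte."""
    return area[1:1 + INT_LEN].decode("ascii")


external = pack_dest_info("E", ext_routing="021000021",
                          ext_acct="000123456789", ext_name="JANE DOE")
internal = pack_dest_info("I", int_acct="1234567890")

print(len(external), len(internal))
print(repr(read_as_internal(internal)))
print(repr(read_as_internal(external)))
```

Both areas are the same length, and reading the external area through the internal layout yields the routing number followed by the first digit of the external account. Only XFER-DEST-TYPE tells a reader which interpretation is valid, and a JSON schema generated from a copybook has no way to express that dependency.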

Challenge 2: Input validation

The 3270 version relied on BMS field attributes for basic input validation (numeric-only fields, required fields). The web service had no such protection. Carlos initially assumed the API gateway's JSON schema validation was sufficient. Yuki disagreed.

They added a validation section to the wrapper program:

       VALIDATE-WS-INPUT.
           IF WS-FROM-ACCOUNT = SPACES OR LOW-VALUES
               MOVE 400 TO WS-HTTP-STATUS
               MOVE 'Source account is required' TO WS-ERROR-MSG
               PERFORM RETURN-ERROR
               GOBACK
           END-IF

           IF WS-AMOUNT NOT > 0
               MOVE 400 TO WS-HTTP-STATUS
               MOVE 'Transfer amount must be positive'
                   TO WS-ERROR-MSG
               PERFORM RETURN-ERROR
               GOBACK
           END-IF

           IF WS-DEST-TYPE NOT = 'I' AND
              WS-DEST-TYPE NOT = 'E'
               MOVE 400 TO WS-HTTP-STATUS
               MOVE 'Destination type must be I or E'
                   TO WS-ERROR-MSG
               PERFORM RETURN-ERROR
               GOBACK
           END-IF

           IF WS-DEST-TYPE = 'E' AND
              WS-EXT-ROUTING = SPACES
               MOVE 400 TO WS-HTTP-STATUS
               MOVE 'Routing number required for external'
                   TO WS-ERROR-MSG
               PERFORM RETURN-ERROR
               GOBACK
           END-IF

This validation rejected 3–4% of transfer requests during the first month of production. These were requests that passed JSON schema validation but still carried unusable values, such as a blank source account or a non-positive amount.
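
For gateway-side contract tests, the same four rules can be mirrored off-host. The Python sketch below is one possible mirror, not the team's code: the function name and the camelCase keys (`fromAccount`, `amount`, `destType`, `extRouting`) are assumptions, since the source does not show the JSON names for the transfer service. It also assumes schema validation has already guaranteed that `amount` is numeric.

```python
from decimal import Decimal


def validate_transfer(req):
    """Return (http_status, message) for the first failed rule, or None if all pass.

    Rules are checked in the same order as VALIDATE-WS-INPUT.
    """
    # SPACES or LOW-VALUES in COBOL: blanks or NUL characters here.
    if not req.get("fromAccount", "").strip(" \x00"):
        return 400, "Source account is required"

    if Decimal(str(req.get("amount", "0"))) <= 0:
        return 400, "Transfer amount must be positive"

    dest_type = req.get("destType", "")
    if dest_type not in ("I", "E"):
        return 400, "Destination type must be I or E"

    if dest_type == "E" and not req.get("extRouting", "").strip():
        return 400, "Routing number required for external"

    return None


ok = validate_transfer({"fromAccount": "1234567890", "amount": "25.00", "destType": "I"})
blank = validate_transfer({"fromAccount": "          ", "amount": "25.00", "destType": "I"})
print(ok, blank)
```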


Performance Results

After deploying all four core services (balance inquiry, fund transfer, transaction history, payment scheduling), Yuki's team conducted a performance test simulating the projected mobile app traffic:

Response Time (95th percentile)

Service                           Target  Measured  Breakdown
Balance Inquiry                   200ms   52ms      Gateway: 8ms, Network: 6ms, CICS: 38ms
                                                    (JSON: 0.4ms, COBOL: 4ms, DB2: 2ms, overhead: 31.6ms)
Fund Transfer                     500ms   187ms     Gateway: 12ms, Network: 6ms, CICS: 169ms
                                                    (JSON: 0.8ms, COBOL: 85ms, DB2: 45ms, ext credit: 15ms, overhead: 23.2ms)
Transaction History (25 records)  300ms   124ms     Gateway: 8ms, Network: 6ms, CICS: 110ms
                                                    (JSON: 1.2ms, COBOL: 12ms, DB2: 78ms, overhead: 18.8ms)
Payment Schedule Create           500ms   203ms     Gateway: 12ms, Network: 6ms, CICS: 185ms
                                                    (JSON: 0.6ms, COBOL: 92ms, DB2: 62ms, overhead: 30.4ms)

The "overhead" category includes CICS task attach, routing, connection handling, and z/OS TCP/IP processing. At 52ms for balance inquiry — well under the 200ms target — the team had significant headroom.
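
The breakdown can be checked with a few lines of arithmetic. The Python sketch below copies the figures from the table above, recomputes each end-to-end total, and reports which CICS component is largest and what share of CICS time the JSON transformation takes. The dictionary layout and key names are illustrative choices.

```python
from decimal import Decimal as D

# service: (gateway ms, network ms, CICS components in ms), copied from the table
BREAKDOWN = {
    "Balance Inquiry": (D("8"), D("6"),
        {"JSON": D("0.4"), "COBOL": D("4"), "DB2": D("2"), "overhead": D("31.6")}),
    "Fund Transfer": (D("12"), D("6"),
        {"JSON": D("0.8"), "COBOL": D("85"), "DB2": D("45"),
         "ext credit": D("15"), "overhead": D("23.2")}),
    "Transaction History": (D("8"), D("6"),
        {"JSON": D("1.2"), "COBOL": D("12"), "DB2": D("78"), "overhead": D("18.8")}),
    "Payment Schedule Create": (D("12"), D("6"),
        {"JSON": D("0.6"), "COBOL": D("92"), "DB2": D("62"), "overhead": D("30.4")}),
}

summary = {}
for service, (gateway, network, cics) in BREAKDOWN.items():
    cics_total = sum(cics.values())
    summary[service] = {
        "end_to_end_ms": gateway + network + cics_total,
        "largest_cics_component": max(cics, key=cics.get),
        "json_share_pct": round(cics["JSON"] / cics_total * 100, 2),
    }

for service, row in summary.items():
    print(service, row)
```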

Throughput

Metric                               Value
Sustained throughput                 4,200 TPS across all services
Peak throughput (10-second burst)    6,800 TPS
CICS task utilization at sustained   68% of MXT (MAXTASKS)
DB2 thread utilization at sustained  55% of MAXDBAT
CPU utilization at sustained         42% of allocated zIIP + GP capacity

Availability (first 6 months of production)

Metric                               Value
Total requests served                3.2 billion
HTTP 200 responses                   99.87%
HTTP 4xx responses (client errors)   0.11%
HTTP 5xx responses (server errors)   0.02%
Unplanned outage                     7 minutes (one DB2 restart)
Availability                         99.997%
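
How outage minutes and error percentages translate into concrete figures can be worked through directly. The Python sketch below assumes "6 months" means 182 days, which the source does not state, and takes the request total and percentages from the table above.

```python
TOTAL_REQUESTS = 3_200_000_000
OUTAGE_MINUTES = 7
DAYS_IN_WINDOW = 182          # assumption: "6 months" taken as 182 days


def availability_pct(outage_minutes, days):
    """Percentage of the window during which the service was available."""
    total_minutes = days * 24 * 60
    return 100 * (1 - outage_minutes / total_minutes)


def requests_at(share_in_hundredths_of_pct, total):
    """Convert a share given in hundredths of a percent into a request count."""
    return total * share_in_hundredths_of_pct // 10_000


availability = availability_pct(OUTAGE_MINUTES, DAYS_IN_WINDOW)
client_errors = requests_at(11, TOTAL_REQUESTS)   # 0.11% HTTP 4xx
server_errors = requests_at(2, TOTAL_REQUESTS)    # 0.02% HTTP 5xx

print(f"{availability:.4f}% available")
print(client_errors, server_errors)
```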

Lessons Learned

1. The wrapper pattern scales

SecureFirst now has 14 REST services in production, all following the same wrapper pattern: thin COBOL wrapper for JSON-compatible input/output, with EXEC CICS LINK to the legacy program. The wrapper adds 2–5ms of overhead, which is negligible. The consistency of the pattern makes it easy to train new team members and troubleshoot issues.

2. JSON transformation overhead is negligible for typical payloads

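The JSON transformation (DFHJSON handler) consistently adds less than 1.5ms for payloads under 5KB. The team initially over-invested in optimizing transformation performance. In practice, COBOL execution, DB2 access, and CICS overhead account for nearly all of the response time: in the measurements above, JSON transformation is about 1% of CICS time or less for every service.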

3. Connection pooling matters

During the first week of production, the API gateway was configured to open a new TCP connection for every request. Response times were 70ms higher due to TLS negotiation overhead. After configuring the API gateway to maintain a pool of 80 persistent connections per CICS TOR, response times dropped to the numbers shown above.
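
The difference between the two gateway configurations can be demonstrated with the Python standard library alone. The sketch below is a local stand-in, not the SecureFirst gateway: it runs a throwaway HTTP server on localhost, uses plain HTTP rather than TLS, and counts how many TCP connections the server sees for five requests. With TLS, each of those connections would also pay a handshake. The class and function names are hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"      # lets the server keep connections open
    seen_sockets = set()               # distinct client (host, port) pairs

    def do_GET(self):
        CountingHandler.seen_sockets.add(self.client_address)
        body = b'{"status": "A"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass                           # keep the demo quiet


def run_requests(port, count, reuse):
    """Send `count` GETs and return how many TCP connections the server saw."""
    CountingHandler.seen_sockets.clear()
    conn = None
    for _ in range(count):
        if conn is None:
            conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", "/api/v1/accounts/1234567890")
        conn.getresponse().read()      # drain the body before the next request
        if not reuse:
            conn.close()
            conn = None
    if conn is not None:
        conn.close()
    return len(CountingHandler.seen_sockets)


server = HTTPServer(("127.0.0.1", 0), CountingHandler)   # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

per_request = run_requests(port, 5, reuse=False)
pooled = run_requests(port, 5, reuse=True)

server.shutdown()
server.server_close()
print(per_request, pooled)
```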

4. Validation in the wrapper saved the team repeatedly

API gateway JSON schema validation catches malformed requests (missing fields, wrong types). It cannot catch values that are well-formed but unusable, such as a blank source account, a zero or negative amount, or an external transfer with no routing number. The COBOL wrapper's validation logic caught 3–4% of transfer requests in the first month. Without it, those requests would have reached the legacy program and produced confusing error responses or, worse, incorrect results.

5. Monitoring requires both layers

SMF 110 records capture CICS-level metrics. Application-level audit logging captures business-level detail. Neither alone is sufficient. The team built a Splunk dashboard that combines both data sources, with a correlation ID (passed as an HTTP header from the API gateway) linking the two.
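
The linking step itself is a keyed join. The Python sketch below shows the idea on in-memory records. It is independent of Splunk, and the field names (`correlation_id`, `cics_ms`, `service`, `return_code`) are hypothetical, since the source does not describe the record layouts.

```python
def join_by_correlation_id(cics_records, audit_records):
    """Pair each CICS-level record with its business-level audit record.

    Returns (joined, unmatched), where unmatched holds CICS records that
    have no audit entry with the same correlation ID.
    """
    audit_by_id = {rec["correlation_id"]: rec for rec in audit_records}
    joined, unmatched = [], []
    for cics in cics_records:
        audit = audit_by_id.get(cics["correlation_id"])
        if audit is None:
            unmatched.append(cics)
        else:
            joined.append({**audit, **cics})
    return joined, unmatched


cics_records = [
    {"correlation_id": "req-001", "transaction": "AINQ", "cics_ms": 38},
    {"correlation_id": "req-002", "transaction": "AINQ", "cics_ms": 41},
]
audit_records = [
    {"correlation_id": "req-001", "service": "balance-inquiry", "return_code": 0},
]

joined, unmatched = join_by_correlation_id(cics_records, audit_records)
print(joined)
print(unmatched)
```

Unmatched records are worth surfacing rather than dropping: a CICS task with no audit entry, or the reverse, is often the first sign that one of the two layers is not logging.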


Carlos Vega's Retrospective

Six months after launch, Carlos wrote in an internal engineering blog post:

"I came to SecureFirst thinking COBOL was the obstacle. I proposed rewriting everything in Java because that's what I knew. What I didn't understand was that the COBOL programs aren't the problem — they're the solution. They contain decades of tested, audited, compliant business logic. My job wasn't to replace that logic. It was to give it a modern front door.

The JSON transformation adds less than a millisecond. The wrapper program adds less than five milliseconds. The business logic — the part that actually matters — runs in the same COBOL program that has been running correctly for 18 years. Our mobile app is faster and more reliable than any microservice architecture I've worked with, because the hardest part — the business logic — was already done.

I still prefer writing Java. But I have deep respect for the COBOL programs running behind those REST endpoints. They're the most reliable code in our stack."


Discussion Questions

  1. At what point would the service enablement approach become impractical? If SecureFirst needed to expose 200 COBOL programs, should they continue with the wrapper pattern or consider z/OS Connect EE?

  2. The wrapper program adds a layer of indirection. Under what circumstances could this layer become a maintenance burden rather than a benefit?

  3. SecureFirst's API gateway rejects 15% of inbound traffic. If that rejection rate increased to 40%, what would you investigate first?

  4. Carlos's initial Java microservices proposal would have created a second code path for the same business logic. Describe three specific risks of maintaining two implementations of the same business rules.

  5. The fund transfer service calls an external credit score API with a 3-second timeout. How would you redesign this if the credit score API's average response time increased from 15ms to 800ms?