In This Chapter
- 20.1 From Request/Reply to Events — Why EDA Matters for Mainframe Systems
- 20.2 MQ Triggers — Starting Programs When Messages Arrive
- 20.3 CICS Event Processing — Capturing Business Events Without Code Changes
- 20.4 Pub/Sub with MQ — Broadcasting Events to Multiple Consumers
- 20.5 Event Schemas and Contracts — Designing Events for COBOL Producers and Consumers
- 20.6 Event-Driven Patterns — Event Sourcing, Sagas, and CQRS for Mainframe Architects
- 20.7 Production Patterns and Anti-Patterns
- 20.8 Project Checkpoint — Event-Driven Notification Layer for HA Banking System
- Conclusion — Events Changed CNB's Architecture. They'll Change Yours.
Chapter 20: Event-Driven Architecture with COBOL — MQ Triggers, CICS Events, and Pub/Sub Patterns
📊 SPACED REVIEW — Before You Begin

This chapter builds directly on Chapter 19's MQ fundamentals and connects to Chapter 13's CICS architecture. Before continuing, make sure you can answer these from memory:

- From Ch 19: What is the difference between temporal and spatial decoupling? What does MQPMO-SYNCPOINT do in a CICS unit of work? What is the purpose of a dead letter queue?
- From Ch 13: What is the role of a TOR vs. an AOR in CICS MRO? How does CICS manage units of work across regions?
If any of these are fuzzy, revisit those chapters. This chapter assumes you have them cold.
20.1 From Request/Reply to Events — Why EDA Matters for Mainframe Systems
Kwame Mensah was three hours into CNB's post-mortem for the November 2023 fraud incident when he finally lost patience.
"We detected the fraudulent wire transfer 14 seconds after it was submitted," he said, reading from the timeline. "Our rules engine correctly flagged it as suspicious in 200 milliseconds. And then it sat in a request/reply queue for 13.8 seconds waiting for the compliance review service to respond, because that service was processing a backlog of legitimate reviews from the morning's ACH batch."
Fourteen seconds. The wire cleared. $2.3 million left CNB's correspondent account and vanished into a web of intermediary banks. The fraud team traced it to three jurisdictions before the trail went cold.
Lisa Tran asked the question that changed CNB's architecture: "Why does the fraud system need to ask compliance to review the transaction? Why can't it just announce that a suspicious transaction was detected, and let every system that cares react simultaneously?"
That question — why ask when you can announce — is the core of event-driven architecture. And it's the reason you're reading this chapter.
The Limitations of Request/Reply
Chapter 19 introduced MQ's request/reply pattern. It's powerful. It's well-understood. And for many use cases — balance inquiries, credit score lookups, synchronous authorization — it's exactly right.
But request/reply has structural limitations that become dangerous at enterprise scale:
Sequential bottlenecks. In request/reply, System A sends a request to System B and waits for a response. If System A also needs to notify System C and System D, it must do so sequentially — or implement complex parallel request logic. At CNB, a single wire transfer approval required sequential calls to fraud screening, compliance review, sanctions checking, and AML (anti-money-laundering) scoring. Four services, four round trips, four points where a slow or down service could block the entire flow.
Tight coupling of flow control. The sender determines who receives the message. If a new consumer needs the data — say, a new regulatory reporting system — you have to modify the sender to add another request/reply pair. At CNB, adding the sanctions screening system to the wire transfer flow required changes to the wire transfer program, the request queue definitions, the reply queue definitions, and the correlation logic. Three weeks of development. Two weeks of testing. For what was conceptually a one-line requirement: "Also send wire transfers to the sanctions system."
Wasted capacity during waits. In a CICS environment, a task that's waiting for a reply is holding task-related storage, potentially holding DB2 threads, and consuming one of your MAXT task slots. If the reply takes 14 seconds (as in CNB's fraud incident), that's 14 seconds of wasted capacity multiplied by however many concurrent requests are in flight.
Fragile error handling. When a reply doesn't come, the requester has to decide: retry? timeout? abend? compensate? Each decision adds complexity. And the decision logic is different for each service. Fraud screening timeout might mean "block the transaction." Notification timeout might mean "proceed anyway." The requester program becomes a tangled nest of timeout-specific error handlers.
The Event-Driven Alternative
Event-driven architecture (EDA) inverts the communication model. Instead of "System A asks System B to do something," it's "System A announces that something happened, and any system that cares reacts."
The fundamental shift: the producer doesn't know or care who consumes the event. It doesn't know how many consumers there are, what they do with the event, or how long they take. It puts the event on a topic (not a queue), and MQ distributes copies to every subscriber.
Here's how CNB's wire transfer flow looks in both models:
Request/Reply Model (Before):
Wire Transfer Program
→ MQPUT to FRAUD.SCREEN.REQUEST (wait for reply...)
← MQGET from FRAUD.SCREEN.REPLY (14 seconds later)
→ MQPUT to COMPLIANCE.REVIEW.REQUEST (wait for reply...)
← MQGET from COMPLIANCE.REVIEW.REPLY (3 seconds later)
→ MQPUT to SANCTIONS.CHECK.REQUEST (wait for reply...)
← MQGET from SANCTIONS.CHECK.REPLY (1 second later)
→ MQPUT to AML.SCORE.REQUEST (wait for reply...)
← MQGET from AML.SCORE.REPLY (2 seconds later)
Total: 20+ seconds, sequential, any failure blocks all
Event-Driven Model (After):
Wire Transfer Program
→ Publish to topic WIRE.TRANSFER.SUBMITTED
(Done. Move on to next transaction.)
Fraud Screening ← subscribes, receives event, processes independently
Compliance Review ← subscribes, receives event, processes independently
Sanctions Check ← subscribes, receives event, processes independently
AML Scoring ← subscribes, receives event, processes independently
Audit Trail ← subscribes, receives event, logs independently
Total: near-zero latency for the sender, parallel processing for all consumers
The wire transfer program doesn't wait. It doesn't know about the fraud system or the compliance system. It simply announces "a wire transfer was submitted" and lets the event infrastructure handle distribution.
⚠️ EDA IS NOT A REPLACEMENT FOR REQUEST/REPLY
Don't make the mistake of converting everything to events. Request/reply is correct when the sender needs a response before proceeding — balance inquiries, real-time authorization, credit score checks. EDA is correct when the sender doesn't need to wait — notifications, audit logging, analytics, downstream processing that happens asynchronously.
At CNB, about 30% of the message flows converted to event-driven. The other 70% stayed request/reply. The skill is knowing which flows belong in which model. The wire transfer submission became event-driven, but the real-time ATM authorization stayed request/reply because the ATM has to wait for an approve/deny response before dispensing cash.
EDA on the Mainframe — Not New, Just Formalized
Here's the thing that surprises distributed-world architects when they first encounter mainframe messaging: event-driven patterns have existed on z/OS since the 1990s. MQ triggers — where a message arriving on a queue automatically starts a program — are a form of event-driven processing. CICS START commands, initiated by one transaction to start another, are event-driven. Even the venerable JES2 internal reader, where one job submits another, is event-driven in the loosest sense.
What's changed is formalization. Modern MQ provides publish/subscribe infrastructure. CICS provides Event Processing (EP) to capture business events without modifying application code. And the patterns have names now — event sourcing, CQRS, sagas — that let us reason about them systematically.
This chapter gives you three concrete tools:

1. MQ Triggers — react to messages arriving on queues (Section 20.2)
2. CICS Event Processing — capture business events from CICS transactions (Section 20.3)
3. MQ Pub/Sub — broadcast events to multiple consumers (Section 20.4)
And three design disciplines:

4. Event Schemas — standardize event formats for COBOL producers and consumers (Section 20.5)
5. Event-Driven Patterns — event sourcing, sagas, and CQRS for mainframe architects (Section 20.6)
6. Production Patterns and Anti-Patterns — what works and what will burn you (Section 20.7)
20.2 MQ Triggers — Starting Programs When Messages Arrive
The Concept
An MQ trigger is the simplest form of event-driven processing on z/OS. The concept: when a message arrives on a queue, MQ automatically starts a program to process it. No polling. No timer-based checks. No operator intervention. Message arrives, program starts, message gets processed.
This is fundamentally different from the model you've been using in Chapter 19, where a long-running program sits in a GET loop waiting for messages. Both models work, but they solve different problems:
| Approach | Best For | Resource Usage | Latency |
|---|---|---|---|
| Long-running GET loop | High-volume queues with continuous flow | Constant — program always running | Low — program is ready |
| MQ trigger | Intermittent or bursty queues | On-demand — program starts only when needed | Higher — program startup cost |
At CNB, the incoming ACH file processing queue receives messages in bursts — thousands of messages arrive when a file lands, then nothing for hours. Running a GET-loop program 24/7 for that queue wastes a CICS task slot (or a batch region) during the idle hours. Triggering a program when messages arrive means resources are consumed only when there's work to do.
How Triggering Works — The Components
MQ triggering involves five components working together:
1. Application Queue This is the queue that receives messages — the queue MQ watches. You configure it with triggering attributes that tell MQ when to fire a trigger.
2. Trigger Monitor A long-running program that watches the initiation queue for trigger messages. On z/OS with CICS, the trigger monitor is the CICS-supplied CKTI transaction — the CICS-MQ adapter's trigger monitor — which runs automatically in the CICS region.
3. Initiation Queue
When triggering conditions are met, MQ writes a special trigger message to this queue. The trigger message contains information about which queue triggered and what program to start. Typically defined as SYSTEM.CICS.INITIATION.QUEUE in a CICS environment.
4. Process Definition An MQ object that specifies what to do when the trigger fires — specifically, which program (or CICS transaction) to start. The process definition contains the application ID (transaction name in CICS, program name in batch).
5. Triggered Program Your COBOL program that processes the messages. It's started by the trigger monitor, opens the application queue, processes the available messages, and terminates when the queue is empty (or after processing a configurable number of messages).
Here's the flow:
Message arrives on APP.QUEUE
|
v
MQ checks trigger conditions
(Is trigtype satisfied? Is triggering enabled?)
|
v (yes)
MQ writes trigger message to INITIATION.QUEUE
|
v
CKTI (trigger monitor) reads trigger message
|
v
CKTI starts the specified CICS transaction
(using the process definition's APPLICID)
|
v
Your COBOL program runs, opens APP.QUEUE,
processes messages, then terminates
Trigger Types
MQ supports three trigger types, and choosing the wrong one is a common source of production problems:
TRIGTYPE(FIRST) — MQ fires the trigger only when the first message arrives on an empty queue. After the first trigger, no more triggers fire until the queue goes back to empty (depth = 0) and another message arrives. This is the standard choice for high-volume queues. Your triggered program should process all available messages before terminating, which resets the queue to empty and re-arms the trigger.
TRIGTYPE(EVERY) — MQ fires a trigger for every single message that arrives. If 10,000 messages arrive in a burst, MQ generates 10,000 trigger messages, and CKTI tries to start 10,000 instances of your transaction. This will overwhelm your CICS region. Use TRIGTYPE(EVERY) only for low-volume queues where each message truly needs its own dedicated transaction instance — for example, a queue that receives one or two high-priority alerts per hour.
TRIGTYPE(DEPTH) — MQ fires the trigger when the queue depth reaches a specified threshold. For example, TRIGDPTH(100) fires when 100 messages have accumulated. Useful for batch-oriented triggered processing where you want to accumulate a workload before starting the processor.
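A hedged MQSC sketch of a depth-triggered queue (the queue and process names here are hypothetical, chosen only to illustrate the attribute combination):

```
DEFINE QLOCAL(CNB.STMT.BATCH) +
       DESCR('Statement requests - accumulate then process') +
       TRIGGER +
       TRIGTYPE(DEPTH) +
       TRIGDPTH(100) +
       INITQ(SYSTEM.CICS.INITIATION.QUEUE) +
       PROCESS(CNB.STMT.PROCESS)
```

One operational wrinkle worth knowing: after a DEPTH trigger fires, MQ turns triggering off on the queue. The triggered program must re-enable it (ALTER QLOCAL ... TRIGGER) when it finishes, or no further triggers will ever fire.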
🔴 PRODUCTION INCIDENT: TRIGTYPE(EVERY) ON A HIGH-VOLUME QUEUE
Rob Calloway tells this story at every CNB architecture review. A developer configured TRIGTYPE(EVERY) on the ACH file processing queue. When the first ACH file landed — 47,000 messages — CKTI attempted to start 47,000 instances of the processing transaction. The CICS region hit MAXT in under a second. Every real user transaction got SOS (Short-on-Storage). The region went into a cascading abend spiral — AKR8 abends everywhere — and had to be recycled. It took 23 minutes to recover. The fix: TRIGTYPE(FIRST). One line. Twenty-three minutes of outage.
Configuring a Triggered Queue
Here's the complete queue definition for a triggered queue:
DEFINE QLOCAL(CNB.ACH.INBOUND) +
DESCR('Inbound ACH transactions') +
PUT(ENABLED) GET(ENABLED) +
DEFPSIST(YES) +
MAXDEPTH(1000000) +
MAXMSGL(32768) +
TRIGGER +
TRIGTYPE(FIRST) +
TRIGDPTH(1) +
INITQ(SYSTEM.CICS.INITIATION.QUEUE) +
PROCESS(CNB.ACH.PROCESS) +
BOTHRESH(3) +
BOQNAME(CNB.ACH.BACKOUT)
The trigger-related attributes:
- TRIGGER — enables triggering on this queue
- TRIGTYPE(FIRST) — fire only on first message when queue was empty
- TRIGDPTH(1) — depth threshold (used with TRIGTYPE(DEPTH))
- INITQ(SYSTEM.CICS.INITIATION.QUEUE) — where to write trigger messages
- PROCESS(CNB.ACH.PROCESS) — the process definition to invoke
And the process definition:
DEFINE PROCESS(CNB.ACH.PROCESS) +
DESCR('ACH inbound processing') +
APPLTYPE(CICS) +
APPLICID(ACHI) +
USERDATA('CNB.ACH.INBOUND')
- APPLTYPE(CICS) — the triggered application is a CICS transaction
- APPLICID(ACHI) — the 4-character CICS transaction ID to start
- USERDATA — passed to the triggered program in the trigger message (typically the queue name)
Writing a Triggered COBOL Program
A triggered program differs from a standard MQ program in one critical way: it must read the trigger message to discover which queue triggered it, rather than having the queue name hardcoded.
The trigger message structure is defined in the CMQTMC2V copybook — MQTMC2, the all-character version of the trigger message, which is the form the trigger monitor passes to CICS programs. It contains:

| Field | Content |
|---|---|
| MQTMC-QNAME | Name of the queue that triggered |
| MQTMC-PROCESSNAME | Name of the process definition |
| MQTMC-TRIGGERDATA | Trigger data from the queue definition (TRIGDATA attribute) |
| MQTMC-APPLTYPE | Application type from the process definition |
| MQTMC-APPLID | Application ID from the process definition |
| MQTMC-USERDATA | The USERDATA from the process definition |
In CICS, CKTI passes the trigger message to your program through the CICS RETRIEVE mechanism. Your program's first action is EXEC CICS RETRIEVE to get the trigger message, which tells it which queue to open.
Here's the skeleton of a triggered COBOL program in CICS:
PROCEDURE DIVISION.
0000-MAIN-CONTROL.
* -----------------------------------------------
* Step 1: Retrieve the trigger message from CKTI
* -----------------------------------------------
EXEC CICS RETRIEVE
INTO(WS-TRIGGER-MSG)
LENGTH(WS-TRIGGER-LEN)
RESP(WS-CICS-RESP)
END-EXEC
IF WS-CICS-RESP NOT = DFHRESP(NORMAL)
PERFORM 9100-RETRIEVE-ERROR
EXEC CICS RETURN END-EXEC
END-IF
* Extract the queue name from the trigger message
MOVE MQTMC-QNAME TO WS-QUEUE-NAME
* -----------------------------------------------
* Step 2: Open the triggered queue
* -----------------------------------------------
MOVE WS-QUEUE-NAME TO MQOD-OBJECTNAME
CALL 'MQOPEN' USING WS-HCONN
WS-MQOD
WS-OPEN-OPTIONS
WS-HOBJ
WS-COMP-CODE
WS-REASON
IF WS-COMP-CODE NOT = MQCC-OK
PERFORM 9200-OPEN-ERROR
EXEC CICS RETURN END-EXEC
END-IF
* -----------------------------------------------
* Step 3: Process all available messages
* -----------------------------------------------
PERFORM 1000-PROCESS-MESSAGES
UNTIL WS-NO-MORE-MSGS = 'Y'
* -----------------------------------------------
* Step 4: Close queue and return
* -----------------------------------------------
CALL 'MQCLOSE' USING WS-HCONN
WS-HOBJ
MQCO-NONE
WS-COMP-CODE
WS-REASON
EXEC CICS RETURN END-EXEC.
1000-PROCESS-MESSAGES.
CALL 'MQGET' USING WS-HCONN
WS-HOBJ
WS-MQMD
WS-MQGMO
WS-BUFFER-LEN
WS-MSG-BUFFER
WS-DATA-LEN
WS-COMP-CODE
WS-REASON
EVALUATE TRUE
WHEN WS-COMP-CODE = MQCC-OK
PERFORM 2000-PROCESS-ONE-MESSAGE
WHEN WS-REASON = MQRC-NO-MSG-AVAILABLE
MOVE 'Y' TO WS-NO-MORE-MSGS
WHEN OTHER
PERFORM 9300-GET-ERROR
MOVE 'Y' TO WS-NO-MORE-MSGS
END-EVALUATE.
The key discipline: process all messages, not just one. With TRIGTYPE(FIRST), the trigger fires only once. Your program must drain the queue (or process up to a configurable maximum) before terminating. If you process one message and exit, the remaining messages sit on the queue untouched until it drains to zero and the next message arrives — which may be hours or days.
💡 TRIGGER RE-ARMING
A TRIGTYPE(FIRST) trigger re-arms when the queue depth returns to zero. This means your triggered program should process messages until MQGET returns 2033 (no message available). At that point, the queue is empty, the trigger is re-armed, and the next arriving message will fire a new trigger.
There's a subtle race condition here: if a message arrives while your program is processing the last message, the queue depth never hits zero, and the trigger never re-arms. Your program processes the message, terminates, and the newly arrived message sits on the queue with no trigger to start a processor. The solution: use MQGMO-WAIT with a short wait interval (say, 5 seconds) on the final MQGET. This gives any in-flight messages time to land before your program gives up and exits.
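The wait-on-final-get discipline looks like this in the MQGMO setup before the GET loop — a minimal sketch, assuming the standard field and constant names from the CMQGMOV and CMQV copybooks; the 5-second interval is illustrative:

```cobol
      * Re-arm-safe GET options: wait briefly before concluding
      * the queue is empty (WAITINTERVAL is in milliseconds)
           COMPUTE MQGMO-OPTIONS = MQGMO-WAIT +
                                   MQGMO-SYNCPOINT +
                                   MQGMO-FAIL-IF-QUIESCING
           MOVE 5000 TO MQGMO-WAITINTERVAL
      * When MQGET now returns MQRC-NO-MSG-AVAILABLE (2033),
      * nothing arrived during the wait - safe to close and exit
```

With these options set, the 1000-PROCESS-MESSAGES loop shown earlier needs no other changes: the 2033 reason code still drives WS-NO-MORE-MSGS, it just arrives five seconds later than an immediate no-wait GET would.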
Batch Triggering
MQ triggers aren't limited to CICS. On distributed platforms, the runmqtrm trigger monitor starts programs when triggers fire; on z/OS, CKTI serves CICS and CSQQTRMN serves IMS, while batch triggering is typically handled by a site-written or vendor-supplied trigger monitor. For z/OS batch, the process definition specifies APPLTYPE(MVS) and the APPLICID contains the program name or JCL procedure to start.
At Federal Benefits Administration, Sandra Chen uses batch triggering for incoming state enrollment files. Each state transmits enrollment updates on its own schedule — some at midnight, some mid-afternoon. Rather than running a scheduler job every 15 minutes to check for files, Sandra configured MQ triggers that start the enrollment processing batch job the moment messages arrive from a state system.
"We cut our enrollment processing latency from 15 minutes average to under 30 seconds," Sandra says. "And we stopped wasting batch initiators on polling jobs that find nothing 95% of the time."
Trigger Monitoring and Operations
CKTI, the CICS trigger monitor, is a CICS-supplied transaction that starts automatically when the CICS region initializes (if configured in the CICS SIT or resource definitions). Operational considerations:
Monitoring CKTI health. If CKTI stops running, no triggers fire. Monitor the initiation queue depth — if it's growing, CKTI is either not running or not keeping up. Set a queue depth alert on SYSTEM.CICS.INITIATION.QUEUE with a threshold appropriate for your environment (at CNB, the alert fires at depth 50).
CKTI and MAXT. Every triggered transaction consumes a CICS task slot. If you have 20 triggered queues all firing simultaneously, you need 20 task slots available. Plan your MAXT allocation to account for triggered transaction demand, especially during burst periods.
Disabling triggers for maintenance. To stop a queue from triggering during maintenance, issue ALTER QLOCAL(queue-name) NOTRIGGER. Don't forget to re-enable it: ALTER QLOCAL(queue-name) TRIGGER. Rob Calloway at CNB adds this to the maintenance checklist: "Item 37: Verify all trigger-enabled queues have TRIGGER set. Every. Single. One."
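A sketch of the kind of monitoring Rob describes, using standard MQSC commands (the threshold shown is illustrative — note that QDEPTHHI is a percentage of MAXDEPTH, not an absolute message count):

```
* Spot-check the initiation queue depth from the console
DISPLAY QSTATUS(SYSTEM.CICS.INITIATION.QUEUE) CURDEPTH

* Or have MQ emit a queue-depth-high event message when the
* queue reaches 10% of its MAXDEPTH
ALTER QLOCAL(SYSTEM.CICS.INITIATION.QUEUE) +
      QDEPTHHI(10) +
      QDPHIEV(ENABLED)
```

The depth-high event arrives as a message on the queue manager's event queue, where your monitoring tooling can pick it up and page someone before the backlog becomes an outage.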
20.3 CICS Event Processing — Capturing Business Events Without Code Changes
The Problem: Instrumenting Legacy Applications
CNB has 4,200 COBOL programs running in CICS. When the architecture team decided to implement event-driven fraud detection, the first question was: how do we make the wire transfer program emit events?
The obvious answer — modify the program to add MQPUT calls — has problems. The wire transfer program is 12,000 lines of COBOL written over 15 years by at least eight different programmers. It's been through four rounds of "just add this one thing." Its paragraph structure is, as Lisa Tran diplomatically puts it, "historically complex." Adding MQ logic means modifying the program, regression testing, and deploying — a three-week cycle for a program that processes $4.7 billion per day.
CICS Event Processing (EP) offers an alternative: capture business events from CICS transactions without modifying the application code. EP sits between the CICS runtime and your application, watching for specified conditions and emitting events when those conditions occur.
How CICS Event Processing Works
CICS EP has three components:
1. Event Binding
An XML configuration that defines:
- What to capture (which CICS command, which program, which data)
- When to capture it (on entry, on exit, on specific conditions)
- What data to include in the event
- Where to send the event (the EP adapter)
2. Capture Specification
Within the event binding, the capture specification defines the exact trigger point. CICS EP can capture events from:
- CICS API commands — EXEC CICS WRITE FILE, EXEC CICS LINK, EXEC CICS START, etc.
- Program entry/exit — when a specific program is invoked or returns
- CICS system events — task start, task end, transaction abend
- Service entry/exit — when a CICS web service or channel-based call occurs
3. EP Adapter
The delivery mechanism for captured events. CICS provides built-in adapters:
- MQ adapter — writes the event to an MQ queue (most common for z/OS integration)
- HTTP adapter — posts the event to an HTTP endpoint (for distributed consumers)
- TS queue adapter — writes to a CICS temporary storage queue (for debugging)
- Custom adapter — your own COBOL program that receives the event and does whatever you want
Configuring an Event Binding
Let's walk through the event binding that CNB created to capture wire transfer submissions without modifying the wire transfer program.
The wire transfer program (WIREXFR1) writes to a VSAM file (WIRETRAN) using EXEC CICS WRITE FILE. CNB configured an event binding that captures every WRITE to that file:
<?xml version="1.0" encoding="UTF-8"?>
<event-binding name="WireTransferSubmitted"
xmlns="http://www.ibm.com/xmlns/prod/cics/eventbinding">
<description>
Capture wire transfer submission events
</description>
<capture-specification>
<event-capture name="WireTransferCapture"
component="WIREXFR1"
capture-point="FILE_WRITE"
current-context="WIRETRAN">
<capture-data>
<data-item name="AccountFrom"
source="CONTAINER"
container-name="WIRE-DATA"
offset="0"
length="16"/>
<data-item name="AccountTo"
source="CONTAINER"
container-name="WIRE-DATA"
offset="16"
length="16"/>
<data-item name="Amount"
source="CONTAINER"
container-name="WIRE-DATA"
offset="32"
length="8"/>
<data-item name="Currency"
source="CONTAINER"
container-name="WIRE-DATA"
offset="40"
length="3"/>
<data-item name="BeneficiaryName"
source="CONTAINER"
container-name="WIRE-DATA"
offset="43"
length="40"/>
</capture-data>
</event-capture>
</capture-specification>
<event-emission>
<adapter-name>MQAdapter</adapter-name>
<adapter-properties>
<property name="queue-manager" value="HAQM01"/>
<property name="queue-name"
value="CNB.EVENTS.WIRE.SUBMITTED"/>
<property name="format" value="JSON"/>
</adapter-properties>
</event-emission>
</event-binding>
The event binding is installed into CICS as a resource — no code compilation, no program modification. CICS intercepts the WRITE FILE command at runtime, extracts the specified data fields, formats them as a JSON event, and puts the event on the MQ queue. The wire transfer program doesn't know it's happening.
Deploying and Managing Event Bindings
Event bindings are deployed through CICS bundles. A bundle is a deployable unit containing one or more event bindings, plus any associated resources. The deployment process:
1. Create the event binding XML (typically using CICS Explorer IDE)
2. Package it in a CICS bundle (a zFS directory with a manifest)
3. DEFINE BUNDLE(WIREEVT) BUNDLEDIR(/cics/bundles/wire-events)
4. INSTALL BUNDLE(WIREEVT)
Once installed, the event binding is active immediately. No CICS restart required. No application redeployment. This is the power of EP: you can add event emission to a 15-year-old program in minutes, not weeks.
Managing event bindings in production:
CEMT SET EVENTBINDING(WireTransferSubmitted) ENABLED
CEMT SET EVENTBINDING(WireTransferSubmitted) DISABLED
CEMT INQUIRE EVENTBINDING(WireTransferSubmitted)
The INQUIRE command shows you how many events have been captured, how many were successfully emitted, and how many failed. Monitor the failure count — a rising failure count means the EP adapter can't deliver events (typically because the target MQ queue is full or the queue manager is down).
What EP Can and Cannot Do
EP can:
- Capture events from CICS API commands without code changes
- Extract data from COMMAREA, channels/containers, or CICS system areas
- Filter events (only capture if specific conditions are met)
- Format events as JSON, XML, or binary
- Deliver to MQ, HTTP, TS queues, or custom adapters
EP cannot:
- Capture events from programs that don't use CICS APIs (pure COBOL logic without EXEC CICS commands)
- Modify the application's behavior — it's read-only observation
- Guarantee delivery in the same unit of work as the application (EP emission is asynchronous by default)
- Replace application-level event emission when you need transactional guarantees between the event and the business operation
⚠️ EP AND TRANSACTIONAL CONSISTENCY
By default, CICS EP emits events asynchronously — the event emission is not part of the application's unit of work. This means: if the wire transfer program writes the VSAM record and then abends before SYNCPOINT, the VSAM write rolls back but the EP event has already been emitted. You now have an event in MQ for a wire transfer that didn't happen.
For most use cases (audit logging, analytics, monitoring), this is acceptable — the consumer can handle phantom events. For fraud detection, it's actually desirable — you want to detect the attempt even if it later fails. But for cases where the event must be transactionally consistent with the business operation, you need the synchronous EP adapter or application-level MQPUT under syncpoint (which means modifying the program).
At CNB, the fraud detection events use asynchronous EP (phantom events are filtered by the fraud engine). The regulatory reporting events use application-level MQPUT under syncpoint because regulators require exact consistency between the transaction and the report.
EP Performance Impact
The overhead of CICS EP is measurable but small: typically 50–200 microseconds per captured event, depending on the amount of data extracted. At CNB's transaction rates, this adds up to roughly 0.3% CPU overhead for the wire transfer region. Kwame considers it "the cheapest insurance we've ever bought."
However, the overhead scales with the number of active event bindings and the complexity of data extraction. If you install 50 event bindings in a single CICS region, each extracting large data payloads, you'll notice. Profile in a test environment before deploying to production.
20.4 Pub/Sub with MQ — Broadcasting Events to Multiple Consumers
From Queues to Topics
Chapter 19 focused on point-to-point queuing: one sender, one queue, one receiver. Pub/sub changes the model to one sender, one topic, many receivers.
The key concept: a topic is a named channel for events. Producers publish messages to topics. Consumers subscribe to topics. MQ handles the distribution: when a message is published to a topic, MQ delivers a copy to every active subscriber.
At CNB, the wire transfer event binding (Section 20.3) puts events on a queue. But the fraud system, the compliance system, the AML system, the audit system, and the customer notification system all need those events. With point-to-point queuing, you'd need five queue definitions and five copies of each event. With pub/sub, you publish once, and MQ delivers five copies.
Topic Structure
MQ topics are organized in a hierarchical tree using forward-slash delimiters, similar to file system paths:
CNB/
EVENTS/
WIRE/
SUBMITTED ← wire transfer submitted
APPROVED ← wire transfer approved
REJECTED ← wire transfer rejected
COMPLETED ← wire transfer completed
ACH/
RECEIVED ← ACH file received
PROCESSED ← ACH batch processed
ACCOUNT/
OPENED ← new account opened
CLOSED ← account closed
BALANCE-ALERT ← balance threshold crossed
Subscribers can subscribe to specific topics or use wildcards:
- CNB/EVENTS/WIRE/SUBMITTED — only wire transfer submissions
- CNB/EVENTS/WIRE/# — all wire transfer events (submitted, approved, rejected, completed)
- CNB/EVENTS/+/SUBMITTED — all "submitted" events regardless of transaction type
- CNB/EVENTS/# — every event in the system
The # wildcard matches zero or more levels. The + wildcard matches exactly one level. These follow the standard MQ topic wildcard syntax.
Defining Topics and Subscriptions
Topic definition (MQSC):
DEFINE TOPIC(CNB.WIRE.SUBMITTED) +
TOPICSTR('CNB/EVENTS/WIRE/SUBMITTED') +
DESCR('Wire transfer submitted events') +
DEFPSIST(YES) +
DURSUB(YES) +
PUB(ENABLED) +
SUB(ENABLED)
Administrative subscription (created by the MQ admin, not the application):
DEFINE SUB(FRAUD.WIRE.SUB) +
TOPICSTR('CNB/EVENTS/WIRE/SUBMITTED') +
DEST(FRAUD.WIRE.INBOUND) +
DESTQMGR(HAQM01) +
DESCR('Fraud system subscription to wire events')
This creates a subscription that delivers copies of published messages to the FRAUD.WIRE.INBOUND local queue. The fraud system's COBOL program then reads from that queue using standard MQGET — it doesn't need to know anything about pub/sub. The subscription is the bridge between the pub/sub world and the point-to-point world.
Durable vs. Non-Durable Subscriptions
This is a critical architectural decision:
Durable subscriptions survive subscriber disconnections and queue manager restarts. If the fraud system goes down for maintenance, messages published while it's down are stored and delivered when it reconnects. Durable subscriptions have a name and persist until explicitly deleted.
Non-durable subscriptions exist only while the subscriber is connected. When the subscriber disconnects, the subscription is removed and published messages are discarded. Useful for monitoring dashboards and development testing, but never for production event processing.
At CNB, every production subscription is durable. "If a message is published, every subscriber gets it," Kwame says. "If they're down when it's published, they get it when they come back up. That's the contract. Non-durable subscriptions have no place in a financial system."
* Durable subscription — survives disconnection
DEFINE SUB(FRAUD.WIRE.SUB) +
TOPICSTR('CNB/EVENTS/WIRE/SUBMITTED') +
DEST(FRAUD.WIRE.INBOUND) +
DURABLE(YES) +
EXPIRY(UNLIMITED)
* Non-durable subscription — for monitoring only
DEFINE SUB(MON.WIRE.SUB) +
TOPICSTR('CNB/EVENTS/WIRE/#') +
DEST(MON.WIRE.TEMP) +
DURABLE(NO)
Publishing from COBOL
Publishing a message to a topic is almost identical to putting a message on a queue. The differences:
- The MQOD specifies a topic name instead of a queue name
- You set MQOD-OBJECTTYPE to MQOT-TOPIC
- You use MQOD-OBJECTSTRING for the topic string (or MQOD-OBJECTNAME for an administered topic object)
* Set up MQOD for topic publish
MOVE MQOT-TOPIC TO MQOD-OBJECTTYPE
MOVE SPACES TO MQOD-OBJECTNAME
MOVE 'CNB/EVENTS/WIRE/SUBMITTED'
TO WS-TOPIC-STRING
* ObjectString needs MQOD version 4 and is an MQCHARV:
* set a pointer to the string and its length
MOVE MQOD-VERSION-4 TO MQOD-VERSION
SET MQOD-OBJECTSTRING-VSPTR
TO ADDRESS OF WS-TOPIC-STRING
MOVE LENGTH OF WS-TOPIC-STRING
TO MQOD-OBJECTSTRING-VSLENGTH
* Open for publish
COMPUTE WS-OPEN-OPTIONS =
MQOO-OUTPUT + MQOO-FAIL-IF-QUIESCING
CALL 'MQOPEN' USING WS-HCONN
WS-MQOD
WS-OPEN-OPTIONS
WS-HOBJ
WS-COMP-CODE
WS-REASON
* Publish the message (identical to a queue MQPUT)
CALL 'MQPUT' USING WS-HCONN
WS-HOBJ
WS-MQMD
WS-MQPMO
WS-MSG-LENGTH
WS-EVENT-MSG
WS-COMP-CODE
WS-REASON
From the COBOL program's perspective, the only difference from a queue PUT is the MQOD setup. The MQPUT call itself is identical. MQ handles the fan-out to all subscribers transparently.
Pub/Sub Performance Characteristics
Publishing to a topic with N subscribers is more expensive than putting to a single queue. MQ must create N copies of the message and deliver each to the subscriber's destination queue. At CNB's scale:
| Metric | Point-to-Point | Pub/Sub (5 subscribers) |
|---|---|---|
| Latency (single message) | 0.2ms | 0.8ms |
| CPU per message | 1x | ~3x (not 5x — MQ optimizes) |
| Storage | 1 copy | 5 copies on disk |
The latency and CPU overhead are acceptable for most enterprise use cases. But if you're publishing 10 million messages per day to a topic with 15 subscribers, do the math: that's 150 million message copies. Plan your disk capacity accordingly.
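The sizing arithmetic is worth writing down once. A back-of-envelope sketch in Python using the chapter's figures; the 600-byte average message size is an assumed value, not from the text:

```python
def fanout_cost(messages_per_day: int, subscribers: int,
                avg_msg_bytes: int) -> dict:
    """Rough pub/sub fan-out sizing: every publish makes one copy
    per subscriber, each persisted to that subscriber's queue."""
    copies = messages_per_day * subscribers
    return {
        "copies_per_day": copies,
        "storage_gb_per_day": copies * avg_msg_bytes / 1e9,
    }

# The chapter's example: 10 million messages/day, 15 subscribers
cost = fanout_cost(10_000_000, 15, 600)
# cost["copies_per_day"] == 150_000_000
```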
💡 ADMINISTRATIVE VS. PROGRAMMATIC SUBSCRIPTIONS
There are two ways to create subscriptions:
Administrative subscriptions are created by the MQ admin using MQSC commands (as shown above). The subscription exists independently of any application. This is the recommended approach for production systems — the subscription is a configuration artifact managed alongside queue definitions.
Programmatic subscriptions are created by the application itself using the MQSUB API call. The application subscribes at startup and receives messages directly. This gives the application more control but means the subscription lifecycle is tied to the application lifecycle.
At CNB, all production subscriptions are administrative. "I want to see every subscription in my MQSC configuration," Rob Calloway says. "If a subscription only exists because some program created it at runtime, it's invisible to operations. And invisible things in production are dangerous things."
20.5 Event Schemas and Contracts — Designing Events for COBOL Producers and Consumers
Why Schemas Matter More in EDA Than in Request/Reply
In a request/reply system, two programs talk to each other. If you change the message format, you coordinate the change between two teams. Painful, but manageable.
In an event-driven system, one producer broadcasts events to potentially dozens of consumers. If you change the event format, every consumer must be updated. And here's the dangerous part: the producer doesn't know who all the consumers are. A consumer team you've never heard of might have subscribed to your topic six months ago. Change the format and their system breaks — silently, because they're reading from a subscription queue and the messages are just garbled.
Event schemas are contracts. They define what a consumer can expect, what a producer promises to deliver, and the rules for how the format can evolve over time.
Designing Event Schemas for COBOL
COBOL events in enterprise systems typically use one of three formats:
1. Fixed-format copybook records (traditional)
The producer and consumer share a COBOL copybook that defines the event layout. This is the most natural format for COBOL-to-COBOL communication.
*================================================================*
* Copybook: WIREVTCP *
* Wire Transfer Event — Shared Event Schema *
* Version: 2.0 *
*================================================================*
01 WIRE-TRANSFER-EVENT.
05 WTE-HEADER.
10 WTE-EVENT-TYPE PIC X(30).
10 WTE-EVENT-VERSION PIC 9(4).
10 WTE-EVENT-ID PIC X(36).
10 WTE-EVENT-TIMESTAMP PIC X(26).
10 WTE-SOURCE-SYSTEM PIC X(8).
10 WTE-CORRELATION-ID PIC X(24).
10 WTE-HEADER-FILLER PIC X(72).
05 WTE-PAYLOAD.
10 WTE-ACCOUNT-FROM PIC X(16).
10 WTE-ACCOUNT-TO PIC X(16).
10 WTE-AMOUNT PIC S9(13)V99 COMP-3.
10 WTE-CURRENCY PIC X(3).
10 WTE-BENEFICIARY PIC X(40).
10 WTE-BANK-CODE PIC X(11).
10 WTE-REFERENCE PIC X(20).
10 WTE-PRIORITY PIC 9(1).
10 WTE-RISK-SCORE PIC 9(3).
10 WTE-ORIGINATOR-IP PIC X(45).
10 WTE-CHANNEL PIC X(10).
10 WTE-PAYLOAD-FILLER PIC X(135).
Note the FILLER fields at the end of both the header and payload. These are not laziness — they're forward-compatibility buffers. When version 3 adds new fields, they go in the FILLER space, and the total record length doesn't change. Consumers running the old copybook still parse correctly — they just ignore the FILLER bytes that are now occupied.
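The mechanics of that forward compatibility can be sketched with a fixed-length record in Python: the old consumer parses by offset, so a new field written into former FILLER space is simply invisible to it. The offsets and field names below are illustrative, not the WIREVTCP layout:

```python
# A v1 consumer's view of a 100-byte record: it knows the first
# 34 bytes and treats the rest as FILLER.
RECORD_LEN = 100

def parse_v1(record: bytes) -> dict:
    assert len(record) == RECORD_LEN        # length never changes
    return {
        "event_type": record[0:30].decode().rstrip(),
        "version":    int(record[30:34]),
        # bytes 34:100 are FILLER to a v1 consumer -- ignored
    }

# A v2 producer fills part of the old FILLER with a new field;
# the v1 consumer still parses the record cleanly.
v2_record = (b"WIRE_TRANSFER_SUBMITTED".ljust(30)
             + b"0002"
             + b"RISK=087".ljust(66))       # new field lives in FILLER
evt = parse_v1(v2_record)
```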
2. JSON (for cross-platform consumers)
When events need to reach Java, Python, or cloud-based consumers, JSON is the lingua franca. COBOL programs can produce JSON using several approaches: IBM's JSON GENERATE statement (Enterprise COBOL V6.2+), CICS JSON transformation, or manual string construction.
{
"eventType": "WIRE_TRANSFER_SUBMITTED",
"eventVersion": 2,
"eventId": "550e8400-e29b-41d4-a716-446655440000",
"timestamp": "2024-11-15T14:32:07.123456Z",
"sourceSystem": "CNB-CORE",
"correlationId": "WF-2024-1115-00047832",
"payload": {
"accountFrom": "0001234567890123",
"accountTo": "0009876543210987",
"amount": 2300000.00,
"currency": "USD",
"beneficiaryName": "GLOBAL TRADING CORP",
"bankCode": "GTCBUS33XXX",
"reference": "INV-2024-88432",
"priority": 1,
"riskScore": 87,
"originatorIP": "10.42.17.203",
"channel": "ONLINE"
}
}
3. XML (for regulated/EDI environments)
Healthcare (EDI 835/837), financial messaging (ISO 20022), and government systems often mandate XML. Pinnacle Health's claims events use XML because their downstream consumers are XML-native EDI processors.
Event Schema Design Rules
After three years of operating event-driven systems, CNB codified these rules. Violate them and you'll find out why they exist — the hard way.
Rule 1: Every event has a header with version, type, ID, and timestamp. The header is the contract the infrastructure enforces. Consumers check the version before parsing. The event ID enables idempotency. The timestamp enables ordering. The type enables routing.
Rule 2: Use semantic versioning for event schemas. Major version changes (v1 → v2) break backward compatibility. Minor version changes (v2.0 → v2.1) add fields but don't change existing ones. The producer MUST maintain backward compatibility within a major version. Consumers MUST tolerate unknown fields within a major version.
Rule 3: Never remove or redefine fields within a major version.
If WTE-AMOUNT means "transfer amount in minor units" in v2.0, it cannot mean "transfer amount in major units" in v2.1. That's a major version change.
Rule 4: Make events self-describing. An event should contain everything a consumer needs to process it. Don't publish an event with just an account number and expect the consumer to look up the rest. That lookup couples the consumer to the producer's database. Include the relevant data in the event payload.
Rule 5: Use FILLER for forward compatibility in COBOL copybooks. Allocate at least 20% of the record length as FILLER. When new fields are needed, consume FILLER space rather than extending the record. This preserves the record length, which matters for queues defined with MAXMSGL.
Rule 6: Include a correlation ID for tracing. Every event should carry a correlation ID that ties it back to the originating business transaction. When the fraud system flags a suspicious wire transfer, the incident response team needs to trace the event back through every system that processed it. The correlation ID is the thread that connects them all.
📊 CNB'S EVENT CATALOG
CNB maintains an event catalog — a registry of every event type, its schema, its version, its producer, and its known consumers. The catalog is a DB2 table, not a wiki page or a spreadsheet. When a producer registers a new event type or a new version, the catalog notifies all known consumers. When a consumer subscribes to an event type, the catalog records the dependency.
As of 2024, CNB's catalog contains 147 event types, produced by 23 systems, consumed by 41 distinct subscriber applications. The catalog has prevented at least a dozen schema-change incidents by surfacing consumer dependencies before changes went to production.
Handling Schema Evolution in COBOL
The practical challenge: your COBOL consumer program uses a COPY statement to include the event copybook. When the event schema version changes, you need to update the copybook and recompile. But you can't update all consumers simultaneously.
The solution is version-tolerant parsing:
* Check event version before parsing
EVALUATE WTE-EVENT-VERSION
WHEN 0001
PERFORM 2100-PARSE-V1-EVENT
WHEN 0002
PERFORM 2200-PARSE-V2-EVENT
WHEN OTHER
* Unknown version — log and skip
* (forward compatible: don't crash on
* future versions you don't understand)
PERFORM 9400-LOG-UNKNOWN-VERSION
END-EVALUATE
The consumer program supports multiple versions simultaneously. When a new version is published, you add a new parsing paragraph without removing the old ones. Consumers still running on the old code path continue to work because the producer maintains backward compatibility within the major version. Consumers that have been updated parse the new fields.
20.6 Event-Driven Patterns — Event Sourcing, Sagas, and CQRS for Mainframe Architects
Event Sourcing on the Mainframe
Event sourcing inverts the traditional data model. Instead of storing the current state of an entity (account balance = $5,000), you store the sequence of events that produced that state (deposit $3,000, deposit $2,500, withdrawal $500 = $5,000).
The appeal is clear: you have a complete audit trail by construction. You can reconstruct the state at any point in time by replaying events up to that timestamp. You can derive new views of the data by replaying events through new projections.
On the mainframe, this pattern is less exotic than it sounds. Consider CICS journaling — CICS has always logged transaction events that can be replayed for recovery. DB2's transaction log is an event store. The VSAM change log is an event stream. The mainframe has been doing event sourcing for decades; it just didn't call it that.
However, implementing application-level event sourcing in COBOL requires careful design:
Traditional Model (current state):
ACCOUNT-MASTER record:
ACCT-NUMBER = 0001234567
ACCT-BALANCE = 5000.00
LAST-UPDATE = 2024-11-15
Event Sourcing Model (event stream):
Event 1: AccountOpened | 2024-01-15 | Balance: 0.00
Event 2: DepositMade | 2024-02-01 | Amount: 3000.00
Event 3: DepositMade | 2024-06-15 | Amount: 2500.00
Event 4: WithdrawalMade | 2024-11-15 | Amount: 500.00
→ Current balance: 5000.00 (derived, not stored)
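The derivation above is a left fold over the event stream. A minimal Python sketch of that fold, with event handlers invented for illustration:

```python
from functools import reduce

# The event stream from the example above
events = [
    {"type": "AccountOpened",  "amount": 0.00},
    {"type": "DepositMade",    "amount": 3000.00},
    {"type": "DepositMade",    "amount": 2500.00},
    {"type": "WithdrawalMade", "amount": 500.00},
]

def apply(balance, evt):
    """Apply one event to the running state."""
    if evt["type"] == "DepositMade":
        return balance + evt["amount"]
    if evt["type"] == "WithdrawalMade":
        return balance - evt["amount"]
    return balance            # AccountOpened and unknown types: no-op

balance = reduce(apply, events, 0.00)   # 5000.00, derived not stored
```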
When event sourcing makes sense on the mainframe:
- Regulatory environments that require immutable audit trails (banking, healthcare)
- Systems where understanding how state was reached matters as much as the current state
- Cross-system reconciliation where event replay can rebuild state after failures
When it doesn't make sense:
- High-volume transaction processing where reading the current state must be sub-millisecond (replaying 10 million events to compute a balance is not viable for real-time authorization)
- Simple CRUD applications where audit logging is sufficient
💡 THE PRACTICAL COMPROMISE: EVENT SOURCING + CQRS
Most mainframe shops that adopt event sourcing use it alongside a traditional read model. Events are the source of truth and are stored immutably. But a projection (materialized view) is maintained for fast reads. This is CQRS — Command Query Responsibility Segregation. Commands (writes) go through the event store. Queries (reads) go to the projection. At CNB, the account event stream is the authoritative record, but a denormalized DB2 table is maintained for sub-millisecond balance lookups. The projection is rebuilt from events during DR recovery.
The Saga Pattern — Distributed Transactions Without Two-Phase Commit
Here's a scenario: a customer initiates a wire transfer that involves three systems on two LPARs and an external gateway. In the request/reply world, you might use a distributed two-phase commit (via RRS) to ensure atomicity. But two-phase commit across systems is expensive, fragile, and sometimes impossible (you can't do 2PC with an external bank's system).
The saga pattern replaces a single distributed transaction with a sequence of local transactions, each with a compensating action. If step 3 fails, you don't roll back the distributed transaction — you execute compensating transactions to undo steps 1 and 2.
Wire Transfer Saga at CNB:
Step 1: Debit source account (local DB2 transaction on LPAR-A)
→ Compensating action: Credit source account
Step 2: Submit to correspondent bank (MQ message to external gateway)
→ Compensating action: Send cancellation message to correspondent
Step 3: Receive confirmation from correspondent (MQ reply)
→ If failure: Execute compensating actions for steps 2 and 1
Step 4: Credit destination account (local DB2 transaction on LPAR-B)
→ Compensating action: Debit destination account
Step 5: Emit completion event
→ No compensating action (event is informational)
In COBOL, the saga coordinator is a program that tracks the state of each step:
01 WS-SAGA-STATE.
05 WS-SAGA-ID PIC X(36).
05 WS-SAGA-STATUS PIC X(10).
88 SAGA-STARTED VALUE 'STARTED'.
88 SAGA-DEBITED VALUE 'DEBITED'.
88 SAGA-SUBMITTED VALUE 'SUBMITTED'.
88 SAGA-CONFIRMED VALUE 'CONFIRMED'.
88 SAGA-COMPLETED VALUE 'COMPLETED'.
88 SAGA-COMPENSATING VALUE 'COMPENSATE'.
88 SAGA-FAILED VALUE 'FAILED'.
05 WS-SAGA-STEP PIC 9(2).
05 WS-SAGA-RETRY-COUNT PIC 9(3).
The saga state is persisted (typically in a DB2 table) so that recovery can resume a failed saga from where it left off. This is critical: if the CICS region crashes mid-saga, the recovery process reads the saga state table, determines which steps completed, and either continues forward or executes compensating actions.
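The orchestration logic reduces to: run the steps in order, and on any failure run the compensations of the completed steps in reverse. A Python sketch with stand-in step functions (the real steps are the DB2 and MQ calls described above; names here are illustrative):

```python
def run_saga(steps):
    """steps: list of (name, action, compensation) tuples.

    Runs actions in order; on failure, executes compensations for
    already-completed steps in reverse order.
    """
    done = []
    for name, action, compensate in steps:
        try:
            action()
            done.append((name, compensate))
        except Exception:
            for _, comp in reversed(done):   # undo in reverse
                comp()
            return "FAILED"
    return "COMPLETED"

def fail():
    raise RuntimeError("correspondent gateway timeout")

log = []
steps = [
    ("debit-source", lambda: log.append("debit"),
                     lambda: log.append("credit-back")),
    ("submit-wire",  fail,                       # step 2 fails...
                     lambda: log.append("cancel-wire")),
]
status = run_saga(steps)   # ...so step 1's compensation runs
```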
Saga orchestration vs. choreography:
Orchestration — a central coordinator program drives the saga, calling each step in sequence. This is the natural model for COBOL programs. The coordinator program is a clear, readable sequence of PERFORM paragraphs.
Choreography — each step triggers the next by publishing an event. No central coordinator. Each service reacts to events independently. This is the distributed-systems purist model, but it's harder to reason about in COBOL because the flow of control is implicit in the event subscriptions rather than explicit in the program code.
At CNB, wire transfer sagas use orchestration. "I want a program I can read top to bottom and understand the entire flow," Kwame says. "With choreography, the flow is scattered across five event bindings and three subscription configurations. Good luck debugging that at 2 AM."
CQRS on the Mainframe
CQRS separates the write path (commands) from the read path (queries) into distinct models. On the mainframe, this maps naturally to existing patterns:
- Write path: CICS online transactions update the DB2 transactional tables (normalized, optimized for write consistency)
- Read path: Batch processes or event-driven projections maintain denormalized read tables (optimized for query performance)
The event-driven version of CQRS uses published events to update the read model:
Online transaction (CICS)
→ Updates DB2 transactional table
→ Publishes event to topic
→ Event consumed by projection program
→ Projection program updates denormalized read table
The read table might be a DB2 summary table, a VSAM file optimized for key access, or even a distributed cache. The point is that the read model is derived from events, not coupled to the write model.
Ahmad Rashidi at Pinnacle Health uses CQRS for claims status inquiries. The claims adjudication engine (write side) processes claims and publishes events. A separate projection program consumes those events and maintains a denormalized "claims status" DB2 table optimized for the status inquiry API. The write-side tables are normalized for adjudication correctness. The read-side table is denormalized for sub-second status lookups.
"The claims adjudication tables are third normal form with 23 joins for a status inquiry," Ahmad says. "The status table is a single denormalized row per claim. Same data, two shapes, two purposes."
20.7 Production Patterns and Anti-Patterns
Pattern: Idempotent Event Consumers
Events can be delivered more than once. MQ guarantees at-least-once delivery, not exactly-once. Network retries, trigger re-fires, and saga retries all produce duplicate events. Your consumer programs must be idempotent — processing the same event twice should produce the same result as processing it once.
For COBOL consumers, idempotency typically means:
* Check if we've already processed this event
EXEC SQL
SELECT EVENT_STATUS
INTO :WS-EVENT-STATUS
FROM CNB_EVENT_LOG
WHERE EVENT_ID = :WTE-EVENT-ID
END-EXEC
EVALUATE SQLCODE
WHEN 0
* Already processed — skip
ADD 1 TO WS-DUPLICATE-COUNT
WHEN +100
* Not found — process and log
PERFORM 3000-PROCESS-EVENT
PERFORM 3100-LOG-EVENT-PROCESSED
WHEN OTHER
PERFORM 9500-DB2-ERROR
END-EVALUATE
The event log table (CNB_EVENT_LOG) stores the event ID of every processed event. Before processing, check the table. If the event ID exists, skip it. If not, process it and insert the event ID — all within the same unit of work.
⚠️ THE IDEMPOTENCY TABLE MUST BE IN THE SAME UOW
The event log insert and the business processing must be in the same DB2 unit of work. If you process the event, commit, and then insert the log entry as a separate commit, a failure between the two commits means the event was processed but not logged — and will be processed again on retry. Same UOW. Always.
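The same-UOW rule can be sketched with sqlite3 standing in for DB2. A primary key on the event ID makes the duplicate check and the insert one atomic operation with the business update; table and column names are invented for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE event_log (event_id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE balances (acct TEXT PRIMARY KEY, amt REAL)")
db.execute("INSERT INTO balances VALUES ('A', 100.0)")
db.commit()

def consume(event_id, acct, amount):
    try:
        with db:   # ONE transaction: log insert + business update
            db.execute("INSERT INTO event_log VALUES (?)", (event_id,))
            db.execute("UPDATE balances SET amt = amt + ? "
                       "WHERE acct = ?", (amount, acct))
    except sqlite3.IntegrityError:
        pass       # duplicate event_id: already processed, skip

consume("evt-1", "A", 50.0)
consume("evt-1", "A", 50.0)   # redelivery -- no double credit
```

Because the log insert and the update share one transaction, a crash between them rolls both back, and the event is safely reprocessed on retry.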
Pattern: Dead Letter Queue Processing for Events
In an event-driven system, the dead letter queue becomes even more critical. A failed event isn't just a failed message — it's a business event that didn't reach one or more consumers. At CNB, a DLQ entry for a wire transfer event means the fraud system might not have screened that transfer.
CNB's DLQ handler for events:
- Read the DLQ message
- Parse the dead letter header (MQDLH) to determine why it failed
- Parse the event header to determine the event type and severity
- For high-severity events (fraud, compliance): page on-call immediately
- For medium-severity events (notifications, analytics): queue for retry with exponential backoff
- For low-severity events (logging, metrics): log and discard
- Insert a record in the DLQ audit table for every message processed
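The severity routing and the retry backoff from those steps can be sketched in Python. The event-type naming convention, the tier sets, and the backoff parameters are assumptions for illustration:

```python
def route_dead_letter(event_type: str) -> str:
    """Map a dead-lettered event to one of the three actions above."""
    HIGH   = {"FRAUD", "COMPLIANCE"}          # page on-call
    MEDIUM = {"NOTIFICATION", "ANALYTICS"}    # retry with backoff
    domain = event_type.split("_")[0]         # assumed naming scheme
    if domain in HIGH:
        return "PAGE_ONCALL"
    if domain in MEDIUM:
        return "RETRY_BACKOFF"
    return "LOG_AND_DISCARD"                  # logging, metrics

def backoff_seconds(attempt: int, base: int = 2, cap: int = 300) -> int:
    """Exponential backoff, capped: 2, 4, 8, ... up to 300 seconds."""
    return min(cap, base ** attempt)
```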
Pattern: Event Ordering and Sequence Numbers
MQ does not guarantee message ordering across pub/sub subscribers. If events A, B, and C are published in order, subscriber 1 might receive A, B, C while subscriber 2 receives A, C, B.
For consumers that require ordering (such as event sourcing projections), include a sequence number in the event header:
05 WTE-SEQUENCE-NUMBER PIC 9(18).
The consumer maintains the last processed sequence number and rejects out-of-order events:
IF WTE-SEQUENCE-NUMBER NOT =
WS-LAST-SEQUENCE + 1
* Out of order — hold for resequencing
PERFORM 4000-HOLD-FOR-RESEQUENCE
ELSE
PERFORM 3000-PROCESS-EVENT
MOVE WTE-SEQUENCE-NUMBER
TO WS-LAST-SEQUENCE
END-IF
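The hold-for-resequence step can be sketched as a small buffer keyed by sequence number: out-of-order events are held, and held events are drained as soon as the gap closes. `make_resequencer` is an illustrative helper, not CNB code:

```python
def make_resequencer(start=0):
    """Deliver events strictly in sequence order, holding gaps."""
    state = {"last": start, "held": {}}

    def accept(seq, event, out):
        if seq == state["last"] + 1:
            out.append(event)
            state["last"] = seq
            # Drain any held events that are now in order
            while state["last"] + 1 in state["held"]:
                state["last"] += 1
                out.append(state["held"].pop(state["last"]))
        else:
            state["held"][seq] = event       # hold for resequence
    return accept

out = []
accept = make_resequencer()
accept(1, "A", out)
accept(3, "C", out)   # held: gap at 2
accept(2, "B", out)   # releases B, then the held C
```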
Pattern: Circuit Breaker for Event Consumers
When a consumer fails repeatedly, it can create a cascade: events pile up on the subscription queue, the queue hits MAXDEPTH, MQ starts routing to the dead letter queue, and the DLQ fills up. The circuit breaker pattern detects consumer failures and stops processing temporarily:
01 WS-CIRCUIT-BREAKER.
05 WS-CB-FAILURE-COUNT PIC 9(5) VALUE 0.
05 WS-CB-THRESHOLD PIC 9(5) VALUE 10.
05 WS-CB-STATE PIC X(6) VALUE 'CLOSED'.
88 CB-CLOSED VALUE 'CLOSED'.
88 CB-OPEN VALUE 'OPEN'.
88 CB-HALF VALUE 'HALF'.
05 WS-CB-OPEN-TIME PIC 9(18) VALUE 0.
05 WS-CB-COOL-SECONDS PIC 9(5) VALUE 300.
When the failure count exceeds the threshold, the circuit opens and the consumer stops processing. After a cooldown period, it enters half-open state and attempts one event. If that succeeds, the circuit closes. If it fails, the circuit opens again.
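The three-state machine just described can be sketched in Python, mirroring the WS-CIRCUIT-BREAKER fields above. The injectable clock is a testing convenience, not part of the pattern:

```python
import time

class CircuitBreaker:
    """CLOSED -> OPEN after `threshold` failures; OPEN -> HALF after
    the cooldown; HALF -> CLOSED on one success, back to OPEN on
    one failure."""

    def __init__(self, threshold=10, cool_seconds=300, clock=time.time):
        self.threshold, self.cool, self.clock = threshold, cool_seconds, clock
        self.failures, self.state, self.opened_at = 0, "CLOSED", 0.0

    def allow(self):
        if self.state == "OPEN":
            if self.clock() - self.opened_at >= self.cool:
                self.state = "HALF"      # probe with one event
                return True
            return False
        return True

    def record(self, success):
        if success:
            self.failures, self.state = 0, "CLOSED"
        else:
            self.failures += 1
            if self.state == "HALF" or self.failures >= self.threshold:
                self.state, self.opened_at = "OPEN", self.clock()

# Demo with a fake clock so the cooldown is observable
t = [0.0]
cb = CircuitBreaker(threshold=2, cool_seconds=300, clock=lambda: t[0])
cb.record(False); cb.record(False)   # threshold reached -> OPEN
blocked = cb.allow()                 # False while cooling down
t[0] = 301.0
probing = cb.allow()                 # HALF: one probe allowed
cb.record(True)                      # probe succeeded -> CLOSED
```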
Anti-Pattern: The God Event
An event that tries to capture everything about a business transaction. A 32KB event with 200 fields that every consumer parses, even though each consumer uses only 5–10 fields. The God Event couples every consumer to every field — and schema changes affect everyone.
Fix: Publish domain-specific events. Instead of one TRANSACTION_COMPLETE event with every field, publish BALANCE_UPDATED, FEE_CHARGED, LIMIT_CHECKED, and AUDIT_LOGGED as separate events. Consumers subscribe only to the events they care about.
Anti-Pattern: Event Sourcing Without Snapshots
Replaying 10 million events to reconstruct account state is not viable at scale. If you implement event sourcing, create periodic snapshots:
Snapshot at Event 9,999,000: Balance = $47,231.50
Event 9,999,001: Deposit $500
Event 9,999,002: Withdrawal $100
Event 9,999,003: Fee charged $2.50
→ Current balance: $47,629.00 (replay from snapshot, not from event 1)
Without snapshots, your reconstruction time grows linearly with the event count. With snapshots, it's bounded by the snapshot interval.
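The bounded-replay claim in code: a Python sketch that replays only the events after the snapshot, using the figures from the example above:

```python
def rebuild(snapshot_balance, snapshot_seq, events):
    """events: list of (seq, delta). Replay only events newer than
    the snapshot; cost is bounded by the snapshot interval."""
    balance = snapshot_balance
    for seq, delta in events:
        if seq > snapshot_seq:
            balance += delta
    return balance

# Snapshot at event 9,999,000, then three tail events
tail = [(9_999_001, 500.00), (9_999_002, -100.00), (9_999_003, -2.50)]
balance = rebuild(47_231.50, 9_999_000, tail)   # 47,629.00
```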
Anti-Pattern: Fire-and-Forget Events for Critical Business Flows
Publishing an event without syncpoint for a critical business flow. If the program abends after the publish but before the DB2 commit, you have an event for a transaction that was rolled back. This is the opposite of the EP transactional consistency issue (Section 20.3) — here the event survives but the transaction doesn't.
Fix: Use MQPMO-SYNCPOINT for all critical events. The event and the DB2 update commit or roll back together.
Anti-Pattern: Subscriber Queue Without Monitoring
A subscription routes events to a queue. If nobody monitors that queue's depth, it fills silently. At CNB, a reporting subscription queue filled to MAXDEPTH over a weekend because the consumer batch job had a JCL error. Monday morning, 1.2 million events were sitting on the dead letter queue. It took the team six hours to reprocess them.
Fix: Every subscription queue gets a depth monitor. Alert at 80% of MAXDEPTH. No exceptions.
Anti-Pattern: Topic Explosion
Creating a unique topic for every possible event variant. CNB/EVENTS/WIRE/SUBMITTED/USD, CNB/EVENTS/WIRE/SUBMITTED/EUR, CNB/EVENTS/WIRE/SUBMITTED/GBP, etc. This makes subscription management a nightmare and defeats the purpose of content-based routing.
Fix: Use a hierarchical topic structure with wildcard subscriptions. Put the currency in the event payload, not the topic name. Let consumers filter by event content, not by subscribing to hundreds of topics.
20.8 Project Checkpoint — Event-Driven Notification Layer for HA Banking System
Progressive Project: HA Banking Transaction Processing System
This checkpoint designs and implements the event-driven layer from the project tracker: "Event-driven triggers — MQ-triggered programs for incoming ACH files, CICS event processing for alerts." It builds on Chapter 19's queue architecture.
See code/project-checkpoint.md for the complete checkpoint specification, including:
- MQ trigger configuration for the ACH inbound processing queue — triggered COBOL program that processes ACH batches when files arrive from Federal Reserve
- CICS event binding for wire transfer fraud detection — EP configuration that captures wire transfer events without modifying the wire transfer program
- Pub/sub topic hierarchy for the HA banking system — topic tree for all event types, subscription definitions for fraud, compliance, audit, and notification consumers
- Fraud detection trigger flow — end-to-end design from wire transfer submission through event capture, fraud scoring, and alert generation
- Balance alert trigger flow — MQ-triggered COBOL program that evaluates balance threshold crossings and publishes notification events
The checkpoint delivers working COBOL code for the triggered ACH processor and the event emission pattern, plus the complete MQ and CICS configuration for the notification layer.
🔗 CROSS-REFERENCE
- From Ch 19: The queue definitions from the Chapter 19 project checkpoint (HAQM01–HAQM04, Queue Sharing Group HABANKQSG) are prerequisites. This checkpoint adds triggers, topics, and subscriptions to that existing infrastructure.
- To Ch 21: The pub/sub topics defined here become the event source for the API gateway's webhook notifications in Chapter 21's checkpoint.
- To Ch 22: The event schemas defined here become the integration contracts for the data feeds designed in Chapter 22.
Conclusion — Events Changed CNB's Architecture. They'll Change Yours.
Eight months after deploying event-driven fraud detection, CNB's mean time to detect a suspicious wire transfer dropped from 14 seconds to 340 milliseconds. The three downstream systems that needed wire transfer data — fraud, compliance, and AML — were processing in parallel instead of sequentially. Adding the audit system as a fourth consumer took 45 minutes: one subscription definition, no code changes.
But the deeper change was cultural. The architecture team stopped asking "who needs to receive this data?" and started asking "what business events does this system produce?" That shift — from push to publish, from request to announce — restructured how CNB thinks about integration.
Lisa Tran puts it simply: "We used to build plumbing. Now we build nervous systems."
The tools are here. MQ triggers let your programs react to messages instead of polling for them. CICS EP lets you capture events from legacy programs without touching the code. Pub/sub lets you broadcast events to any system that cares, without the producer knowing or caring who those systems are. Event schemas keep the contracts honest. And patterns like sagas and idempotent consumers keep the system reliable when — not if — things go wrong.
Chapter 21 takes the next step: exposing these events and services through API gateways, so the systems outside the mainframe can participate too.
📊 SPACED REVIEW — Carry These Forward
Concepts from this chapter will be reviewed in:
- Chapter 21 — API mediation layer consumes events from the pub/sub topics defined here
- Chapter 22 — Data integration patterns build on event-driven feeds
- Chapter 30 — Disaster recovery design must account for in-flight events, incomplete sagas, and subscription queue backlogs
Quiz yourself now (answers are in the quiz file):
1. What is the difference between TRIGTYPE(FIRST) and TRIGTYPE(EVERY)?
2. How does CICS EP capture events without modifying application code?
3. Why must event consumers be idempotent?
4. When should you use orchestration vs. choreography for a saga?