Exercises: From Batch to Real-Time: A Full Migration Project

Exercise 45.1: Design a Message Format

Design the JSON message format for a new real-time event: when a member's HSA balance drops below $100, MedClaim should be notified so they can alert the member.

  1. Define the JSON structure with all necessary fields (member ID, current balance, last transaction, etc.)
  2. Define the COBOL data structure (01-level) that maps to this JSON
  3. Write the JSON GENERATE statement to produce the message
  4. Explain why you included each field (what information does the consumer need?)
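Before committing to a COBOL layout, it can help to prototype the message shape in a scripting language. The sketch below is one possible starting point, not a model answer: every field name (`eventType`, `eventId`, `accountId`, and so on) is a hypothetical choice, and monetary amounts are carried as strings on the assumption that the consumer should not parse them as binary floating point.

```python
import json
from decimal import Decimal

LOW_BALANCE_THRESHOLD = Decimal("100.00")  # $100 limit from the exercise statement


def build_low_balance_event(event_id, member_id, account_id, balance,
                            last_txn_id, last_txn_amount, occurred_at):
    """Return the event as a JSON string, or None if the balance is not low.

    All field names are illustrative assumptions, not a MedClaim standard.
    """
    if balance >= LOW_BALANCE_THRESHOLD:
        return None
    event = {
        "eventType": "HSA_LOW_BALANCE",
        "eventId": event_id,          # lets the consumer detect duplicates
        "occurredAt": occurred_at,    # ISO 8601 timestamp supplied by the caller
        "memberId": member_id,
        "accountId": account_id,
        "currentBalance": str(balance),
        "threshold": str(LOW_BALANCE_THRESHOLD),
        "lastTransaction": {
            "id": last_txn_id,
            "amount": str(last_txn_amount),
        },
    }
    return json.dumps(event, indent=2)
```

Each key in the prototype then becomes a candidate elementary item under your 01-level, which makes step 4 (justifying every field) easier to answer.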

Exercise 45.2: Implement a Dead-Letter Queue Handler

Write a COBOL program (DLQ-HANDLER) that processes messages from the dead-letter queue. The program should:

  1. Read each message from the DLQ
  2. Determine why it failed (parse the MQ dead-letter header for the reason code)
  3. For transient errors (queue full, temporary network issue), requeue the message to the original queue
  4. For permanent errors (poison message, invalid format), write the message to an error file and remove it from the DLQ
  5. Produce a summary report of all DLQ processing

This is a critical operational program: in production, every DLQ message represents a transaction that will be lost unless it is requeued or recorded.
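The triage logic in steps 2 through 5 can be prototyped before it is written in COBOL. The sketch below assumes the dead-letter header has already been parsed into a dictionary with hypothetical keys `reason`, `dest_q`, and `body`. The numeric values are IBM MQ reason codes, but which codes count as transient is a policy assumption made here for illustration, and unclassified codes go to the error file so that nothing is requeued forever.

```python
from collections import Counter

# IBM MQ reason codes (MQRC_*). Treating these as transient or permanent is
# an assumption of this sketch; confirm the policy for your own site.
TRANSIENT_REASONS = {2053: "MQRC_Q_FULL", 2051: "MQRC_PUT_INHIBITED"}
PERMANENT_REASONS = {
    2030: "MQRC_MSG_TOO_BIG_FOR_Q",
    2035: "MQRC_NOT_AUTHORIZED",
    2085: "MQRC_UNKNOWN_OBJECT_NAME",
}


def triage(dlq_messages):
    """Split parsed DLQ messages into requeue and error lists plus a summary."""
    requeue, errors, summary = [], [], Counter()
    for msg in dlq_messages:
        reason = msg["reason"]
        if reason in TRANSIENT_REASONS:
            # Transient: send back to the queue named in the dead-letter header.
            requeue.append((msg["dest_q"], msg["body"]))
            summary["requeued"] += 1
        else:
            # Permanent or unknown: record it in the error file.
            label = PERMANENT_REASONS.get(reason, "UNCLASSIFIED")
            errors.append((reason, label, msg["body"]))
            summary["error_file"] += 1
    return requeue, errors, summary
```

The `summary` counter corresponds to the report in step 5; a production handler would also need a cap on how many times the same message may be requeued.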

Exercise 45.3: Build an Enhanced Reconciliation

The HSA-RECON program in this chapter performs a simple claim-by-claim comparison. Enhance it to also:

  1. Compare the total dollar amounts (sum of all batch payments vs. sum of all real-time payments)
  2. Check for timing discrepancies (claims that appeared in the real-time path on day 1 but in the batch path on day 2)
  3. Produce a trend report showing reconciliation results over the past 30 days
  4. Flag any single-day mismatch count > 0 as a critical alert

How would you handle a claim that the real-time path processes on Friday evening but the batch path processes on Saturday morning (a timing difference at the cutover boundary)?
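One way to reason about the boundary case is to separate true mismatches from timing differences. The sketch below is a minimal illustration under stated assumptions: each path is represented as a dictionary from claim ID to a `(processing date, amount)` pair, and a date difference of up to `max_lag_days` (a hypothetical tuning parameter) is reported as a timing discrepancy rather than a mismatch.

```python
from datetime import date
from decimal import Decimal


def reconcile(batch, realtime, max_lag_days=1):
    """Compare two {claim_id: (date, amount)} maps and classify every claim."""
    report = {
        "matched": [], "timing": [], "date_mismatch": [],
        "amount_mismatch": [], "missing_from_batch": [],
        "missing_from_realtime": [],
    }
    for claim_id in sorted(batch.keys() | realtime.keys()):
        b, r = batch.get(claim_id), realtime.get(claim_id)
        if b is None:
            report["missing_from_batch"].append(claim_id)
        elif r is None:
            report["missing_from_realtime"].append(claim_id)
        elif b[1] != r[1]:
            report["amount_mismatch"].append(claim_id)
        elif b[0] == r[0]:
            report["matched"].append(claim_id)
        elif abs((b[0] - r[0]).days) <= max_lag_days:
            report["timing"].append(claim_id)   # e.g. Friday vs. Saturday
        else:
            report["date_mismatch"].append(claim_id)
    # Dollar totals for each path, summed exactly with Decimal.
    report["batch_total"] = sum((v[1] for v in batch.values()), Decimal("0"))
    report["realtime_total"] = sum((v[1] for v in realtime.values()), Decimal("0"))
    return report
```

Whether a timing discrepancy should count toward the critical alert in item 4 is a design decision the exercise leaves to you.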

Exercise 45.4: Design an Optimistic Locking Strategy

The HSA-EVENTS program uses optimistic locking so that concurrent updates to the same HSA account cannot silently overwrite each other. But what happens when two claims for the same member arrive within milliseconds of each other?

  1. Explain why this scenario produces an optimistic-locking conflict
  2. Design a retry strategy: after a conflict, wait and retry N times before failing
  3. Write the COBOL code for the retry loop (include the DB2 SELECT, UPDATE, and conflict detection)
  4. What is the maximum number of retries you would allow before declaring a permanent failure? Justify your answer.
  5. How would you handle the situation where the retry succeeds but the original message has already been processed (duplicate)?
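The shape of the retry loop can be tried out in a few lines before writing the embedded SQL. The sketch below uses SQLite instead of DB2 and a hypothetical `hsa_account` table with a `version` column; the table, column names, and the `on_read` hook (present only so a competing update can be simulated) are all assumptions made for illustration. The conflict is detected by the UPDATE affecting zero rows, which plays the role that SQLCODE +100 on the UPDATE would play in COBOL.

```python
import sqlite3
import time


class RetryExhausted(Exception):
    """Raised when every attempt hit a version conflict."""


def debit_with_retry(conn, account_id, amount_cents,
                     max_retries=3, base_delay=0.01, on_read=None):
    """Debit an account under optimistic locking; return the attempt number used."""
    for attempt in range(1, max_retries + 1):
        row = conn.execute(
            "SELECT balance_cents, version FROM hsa_account WHERE account_id = ?",
            (account_id,)).fetchone()
        if row is None:
            raise KeyError(account_id)
        balance, version = row
        if on_read is not None:
            on_read()  # hook for simulating a competing update between read and write
        cur = conn.execute(
            "UPDATE hsa_account SET balance_cents = ?, version = version + 1 "
            "WHERE account_id = ? AND version = ?",
            (balance - amount_cents, account_id, version))
        if cur.rowcount == 1:      # version unchanged since our read: success
            conn.commit()
            return attempt
        conn.rollback()            # someone else updated first: back off and re-read
        time.sleep(base_delay * attempt)
    raise RetryExhausted(f"{account_id}: gave up after {max_retries} attempts")
```

Note that each retry re-reads the balance rather than reusing the stale value. The sketch does not address item 5: duplicate detection needs a separate check, such as a processed-message table keyed by message ID.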

Exercise 45.5: Monitoring Dashboard Design

Design a comprehensive monitoring dashboard for the HSA real-time system. For each metric, specify:

  1. What to measure (metric name and unit)
  2. How to measure it (DB2 query, MQ inquiry, or system metric)
  3. Normal range (what values indicate healthy operation)
  4. Warning threshold (investigate soon)
  5. Critical threshold (investigate immediately)

Include at least 8 metrics covering message flow, processing latency, error rates, and system resources.
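The five attributes requested for each metric map naturally onto a small record with a classification rule. The sketch below shows one way to capture that; the metric name, source description, and threshold values in the usage note are invented placeholders, and the rule assumes that higher values are worse, which will not hold for every metric (throughput, for example).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str        # what to measure
    unit: str
    source: str      # how to measure it (DB2 query, MQ inquiry, system metric)
    warning: float   # investigate soon at or above this value
    critical: float  # investigate immediately at or above this value

    def classify(self, value):
        """Return OK, WARNING, or CRITICAL, assuming higher values are worse."""
        if value >= self.critical:
            return "CRITICAL"
        if value >= self.warning:
            return "WARNING"
        return "OK"
```

For example, `Metric("queue_depth", "messages", "MQ inquiry", 500, 2000).classify(750)` evaluates to `"WARNING"`; the numbers are illustrative only, and choosing realistic thresholds is the point of the exercise.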

Exercise 45.6: Rollback Rehearsal

Design a rollback rehearsal plan that the team would execute the weekend before cutover. The rehearsal should:

  1. Simulate a cutover (disable batch, enable real-time)
  2. Process 100 test transactions through the real-time path
  3. Simulate a failure (bring down the MQ queue manager)
  4. Execute the rollback (re-enable batch)
  5. Verify that batch processing recovers all transactions that real-time did not complete
  6. Document the elapsed time for each step

What would you do if the rollback rehearsal revealed a problem that could not be fixed before the planned cutover date?

Exercise 45.7: Cross-Organization Integration Design

The current design uses MQ for asynchronous communication between MedClaim and GlobalBank. An alternative would be synchronous REST APIs, with MedClaim calling GlobalBank's HSA debit API in real time and waiting for the response.

Compare these two approaches:

Criterion         MQ (Async)    REST API (Sync)
Latency           ?             ?
Reliability       ?             ?
Coupling          ?             ?
Error handling    ?             ?
Monitoring        ?             ?

Which approach would you recommend for this specific use case? Under what circumstances would the other approach be better?

Exercise 45.8: Full System Integration Test

Design a complete integration test for the real-time HSA system that covers the following scenarios:

  1. Normal HSA payment (sufficient balance)
  2. NSF payment (insufficient balance)
  3. Inactive HSA account
  4. Duplicate message delivery
  5. MQ queue full (producer side)
  6. DB2 connection failure (consumer side)
  7. Network interruption between LPARs
  8. Message arrives out of order (older claim after newer claim)

For each scenario, describe:

  - How to create the test condition
  - Expected system behavior
  - How to verify the behavior occurred correctly
  - How to clean up the test environment afterward
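When writing expected behavior for scenarios 1, 2, 4, and 8, a tiny in-memory model of the consumer can make the expectations concrete. The sketch below is not the HSA-EVENTS program: the class, its method, and the status strings are hypothetical, and it assumes that duplicates are recognized by message ID and that debits may be applied in any arrival order.

```python
class HsaConsumerModel:
    """Toy model of an idempotent HSA debit consumer (illustrative only)."""

    def __init__(self, balance_cents):
        self.balance_cents = balance_cents
        self.seen_ids = set()   # message IDs already handled
        self.applied = []       # claim sequence numbers, in arrival order

    def handle(self, msg_id, claim_seq, amount_cents):
        if msg_id in self.seen_ids:
            return "DUPLICATE_IGNORED"      # scenario 4
        self.seen_ids.add(msg_id)
        if amount_cents > self.balance_cents:
            return "NSF"                    # scenario 2
        self.balance_cents -= amount_cents  # scenario 1
        self.applied.append(claim_seq)      # arrival order is recorded, not enforced
        return "APPLIED"
```

Under these assumptions an older claim arriving after a newer one (scenario 8) ends with the same balance as in-order delivery, provided neither debit triggers NSF. Your test design should state whether the real system is expected to behave the same way.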