Chapter 1 Exercises: The z/OS Ecosystem

Part A: Conceptual Questions

A1. Explain why z/OS uses separate address spaces for each major subsystem (DB2, CICS, MQ, JES2) rather than running them all within a single address space. Identify at least three architectural benefits of this approach.

A2. Describe the role of the initiator in batch job processing. How does it differ from the JES2 address space, and why are these two functions separated?

A3. Define the following terms and explain how they relate to each other: SVC, PC routine, and cross-memory communication. Under what circumstances would z/OS use each mechanism?

A4. The z/OS dispatcher manages CPU allocation across all tasks in the system. Explain how WLM influences the dispatcher's decisions. Why is this relevant to a COBOL architect designing a system with both batch and online workloads?

A5. Explain the difference between JES2 and JES3. Why has the industry overwhelmingly chosen JES2? Under what (now rare) circumstances might JES3 have been the better choice?

A6. Language Environment (LE) initializes before your first line of COBOL executes and runs termination routines after your last line completes. List four services that LE provides and explain why a runtime environment is necessary — why can't the COBOL program just execute directly on the hardware?

A7. In the context of z/OS architecture, define "address space" and explain how it provides isolation. How does the hardware enforce this isolation?

A8. Describe the subsystem interface (SSI) and its role in z/OS. Give two examples of subsystem communication that uses the SSI and two examples that use other mechanisms (SVCs or PC routines).


Part B: Applied Analysis

B1. Batch Job Trace Analysis

The following SMF data describes a batch job at CNB:

  Metric                     Value
  Job Name                   CNBAR200
  Step Name                  STEP010
  Program                    CNBAR200
  Elapsed Time               47 minutes
  CPU Time                   2 minutes, 14 seconds
  EXCP Count (VSAM input)    342,187
  EXCP Count (DB2)           0 (in-memory)
  DB2 Calls                  1,200,000
  DB2 CPU Time               18 minutes
  WLM Service Class          DISCRETIONARY

a) The job's elapsed time (47 minutes) is much higher than its CPU time (2 min 14 sec). List all possible reasons for this discrepancy based on what you learned about z/OS architecture.

b) The DB2 CPU time (18 minutes) is much higher than the job's own CPU time. Explain how this is possible — where is the DB2 CPU time being consumed?

c) The WLM service class is DISCRETIONARY. What does this mean for the job's performance? What would you recommend changing?
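Before reasoning about causes in B1, it can help to put numbers on the gap. The following is a minimal Python sketch that uses only the figures from the table above; the variable names are illustrative. It deliberately does not attribute the gap to any cause, since that is the point of the exercise.

```python
# Figures taken from the SMF table above; variable names are illustrative only.
elapsed_s = 47 * 60          # elapsed time: 47 minutes
job_cpu_s = 2 * 60 + 14      # CPU time charged to the job step: 2 min 14 s
db2_cpu_s = 18 * 60          # DB2 CPU time as reported: 18 minutes
db2_calls = 1_200_000        # DB2 calls issued by the job

# Elapsed time that the job's own CPU time does not account for.
non_cpu_s = elapsed_s - job_cpu_s
job_cpu_share = job_cpu_s / elapsed_s

# How much larger the reported DB2 CPU time is than the job's own CPU time.
db2_to_job_cpu_ratio = db2_cpu_s / job_cpu_s

# Average DB2 CPU per call, in microseconds.
db2_cpu_per_call_us = db2_cpu_s / db2_calls * 1_000_000

print(f"Elapsed time not covered by job CPU: {non_cpu_s} s ({non_cpu_s / 60:.1f} min)")
print(f"Job CPU as a share of elapsed time:  {job_cpu_share:.2%}")
print(f"DB2 CPU relative to job CPU:         {db2_to_job_cpu_ratio:.1f}x")
print(f"Average DB2 CPU per call:            {db2_cpu_per_call_us:.0f} microseconds")
```

The per-call average is a useful sanity check when you answer part (b): it tells you how much work each DB2 call represents, independent of where that CPU is accounted.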

B2. CICS Transaction Path

Trace the complete path of a CICS transaction that performs the following operations:

  1. Receives a 3270 screen from a terminal
  2. Reads a VSAM KSDS record
  3. Executes two DB2 SELECT statements
  4. Puts a message on an MQ queue
  5. Issues SYNCPOINT
  6. Sends a response to the terminal

For each step, identify: (a) which z/OS address spaces are involved, (b) what communication mechanism is used, and (c) what happens if that step fails.

B3. Sysplex Failure Analysis

CNB's Parallel Sysplex has four LPARs (CNBPROD1-4) with DB2 data sharing. At 14:32, CNBPROD2 experiences a z/OS failure and goes down.

a) Describe, step by step, what happens to: (1) CICS transactions that were in-flight on CNBPROD2, (2) the DB2 data sharing group, (3) batch jobs running on CNBPROD2, and (4) online users whose sessions were connected to CNBPROD2's CICS regions.

b) What role does the coupling facility play during this failure?

c) How long should the recovery take before the remaining three LPARs are fully absorbing CNBPROD2's workload? What factors determine recovery time?

B4. WLM Classification Design

You are designing WLM classification rules for a z/OS LPAR that handles both CICS online transactions and batch jobs. The requirements are:

  • CICS funds transfer transactions (XFER): 95% must complete within 200ms
  • CICS inquiry transactions (INQY): 90% within 500ms
  • Critical batch jobs (job names starting with CNB*): must complete within their scheduled window
  • Report jobs (job names starting with RPT*): run whenever resources are available

Design a WLM classification scheme with appropriate service classes and goals. Justify each classification decision.
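The two CICS requirements above are percentile response-time goals. As a minimal sketch of what such a goal asserts, the Python function below checks a set of observed response times against a percentile target. The function name and the sample data are hypothetical, and this is not how WLM itself evaluates goals internally; it only illustrates the meaning of "95% within 200ms".

```python
def meets_percentile_goal(response_times_ms, percentile, target_ms):
    """Return True if at least `percentile` percent of samples finished within target_ms."""
    if not response_times_ms:
        raise ValueError("at least one sample is required")
    within = sum(1 for t in response_times_ms if t <= target_ms)
    # Integer comparison avoids floating-point rounding at the boundary.
    return within * 100 >= percentile * len(response_times_ms)


# Made-up samples for illustration only (not measurements from any system).
xfer_samples = [120] * 96 + [450] * 4    # 96% of samples at 120 ms
inqy_samples = [300] * 85 + [900] * 15   # 85% of samples at 300 ms

print("XFER meets 95% within 200 ms:", meets_percentile_goal(xfer_samples, 95, 200))
print("INQY meets 90% within 500 ms:", meets_percentile_goal(inqy_samples, 90, 500))
```

Note that a percentile goal says nothing about the slowest transactions: the XFER sample passes even though 4% of its transactions took 450 ms.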

B5. DB2 Thread Analysis

A CICS region at CNB is configured with 50 pool threads and 10 entry threads dedicated to the XFER transaction. During peak hours, the region processes 800 XFER transactions per second.

a) Calculate the average thread reuse rate for XFER entry threads, assuming each transaction holds a thread for 3ms.

b) At what transaction rate would the 10 entry threads become a bottleneck?

c) What symptoms would the application team observe if the entry thread limit were reached? What would users experience?
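The arithmetic in B5 parts (a) and (b) follows from Little's Law (average concurrency = arrival rate × holding time). The Python sketch below is one way to set it up, assuming a steady arrival rate and a constant 3 ms hold time, and interpreting "reuse rate" as uses per thread per second; the function names are illustrative. Real regions see queuing well before the theoretical limit because arrivals are bursty.

```python
def avg_busy_threads(tps, hold_time_s):
    """Little's Law: average number of threads in use at a given rate."""
    return tps * hold_time_s


def uses_per_thread_per_second(tps, threads):
    """Average number of transactions each thread serves per second."""
    return tps / threads


def saturation_rate(threads, hold_time_s):
    """Transaction rate at which every thread is busy 100% of the time."""
    return threads / hold_time_s


ENTRY_THREADS = 10
HOLD_TIME_S = 0.003   # 3 ms per transaction, from the exercise
PEAK_TPS = 800

busy = avg_busy_threads(PEAK_TPS, HOLD_TIME_S)
print(f"Average busy entry threads at peak: {busy:.1f} of {ENTRY_THREADS}")
print(f"Uses per thread per second:         {uses_per_thread_per_second(PEAK_TPS, ENTRY_THREADS):.0f}")
print(f"Theoretical saturation rate:        {saturation_rate(ENTRY_THREADS, HOLD_TIME_S):.0f} tps")
```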

B6. Cross-Memory Overhead Calculation

A COBOL batch program processes 5 million records. For each record, it performs:

  • 1 VSAM READ (average 5 microseconds from buffer)
  • 2 DB2 SELECTs (average 150 microseconds each, including cross-memory overhead)
  • 1 DB2 UPDATE (average 200 microseconds, including cross-memory overhead)
  • 1 VSAM WRITE (average 8 microseconds to buffer)

a) Calculate the total elapsed time just for the data access operations (ignoring COBOL processing logic).

b) What percentage of the total data access time is spent on DB2 cross-memory operations?

c) If you could replace the two DB2 SELECTs with VSAM reads (by denormalizing the data), how much time would you save? Is this trade-off worth it? What are the risks?
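The calculations in B6 can be checked with a short script. The Python sketch below uses the per-operation figures from the exercise and assumes the operations run strictly one after another with no overlap. For part (c) it further assumes that each replacement VSAM read costs the same 5 microseconds as the existing buffered READ, which the exercise does not state. The script covers the arithmetic only; the trade-off and risk questions still need your own analysis.

```python
RECORDS = 5_000_000

# operation -> (count per record, average microseconds each), from the exercise
ops = {
    "VSAM READ": (1, 5),
    "DB2 SELECT": (2, 150),
    "DB2 UPDATE": (1, 200),
    "VSAM WRITE": (1, 8),
}

per_record_us = sum(count * cost for count, cost in ops.values())
db2_per_record_us = sum(
    count * cost for name, (count, cost) in ops.items() if name.startswith("DB2")
)

total_s = RECORDS * per_record_us / 1_000_000
db2_share = db2_per_record_us / per_record_us

# Part (c): assumed cost of a replacement VSAM read (same as the buffered READ).
ASSUMED_VSAM_READ_US = 5
select_count, select_cost = ops["DB2 SELECT"]
saved_per_record_us = select_count * (select_cost - ASSUMED_VSAM_READ_US)
saved_s = RECORDS * saved_per_record_us / 1_000_000

print(f"Data access per record: {per_record_us} microseconds")
print(f"Total data access time: {total_s:.0f} s ({total_s / 60:.2f} min)")
print(f"DB2 share of that time: {db2_share:.1%}")
print(f"Saving if SELECTs become VSAM reads: {saved_s:.0f} s ({saved_s / 60:.2f} min)")
```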


Part C: Architecture Design

C1. Mid-Sized Insurance Company

Design a z/OS environment for a mid-sized insurance company with these characteristics:

  • 20 million claims per month
  • 99.99% availability requirement (four nines)
  • 500 concurrent online users
  • 6-hour batch window
  • Budget constraint: two z16 frames maximum

Specify: number of LPARs, Sysplex topology, coupling facility placement, DB2 configuration, CICS topology, and batch strategy. Justify every decision.
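Two of the C1 requirements translate directly into numbers that should shape your design. The Python sketch below converts the availability target into a yearly downtime budget and the claim volume into a required batch rate. It assumes a 365-day year, a 30-day month, and, as a worst case, that every claim is processed inside the nightly batch window; none of these assumptions come from the exercise.

```python
AVAILABILITY_PCT = 99.99
CLAIMS_PER_MONTH = 20_000_000
BATCH_WINDOW_HOURS = 6

# Assumptions, not stated in the exercise:
DAYS_PER_YEAR = 365
DAYS_PER_MONTH = 30

minutes_per_year = DAYS_PER_YEAR * 24 * 60
downtime_budget_min = minutes_per_year * (100 - AVAILABILITY_PCT) / 100

claims_per_day = CLAIMS_PER_MONTH / DAYS_PER_MONTH
# Worst case: all of a day's claims go through the nightly batch window.
claims_per_second = claims_per_day / (BATCH_WINDOW_HOURS * 3600)

print(f"Downtime budget: {downtime_budget_min:.2f} minutes per year")
print(f"Claims per day:  {claims_per_day:,.0f}")
print(f"Required batch rate: {claims_per_second:.1f} claims per second")
```

A downtime budget this small covers planned and unplanned outages together, which is worth keeping in mind when you justify the number of LPARs and the coupling facility placement.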

C2. Government Agency Modernization

The Federal Benefits Administration runs a single-LPAR z/OS system with IMS DB/DC and DB2 (mixed environment). They need to migrate to a Parallel Sysplex for high availability while maintaining the IMS applications. Design a migration strategy that:

  • Moves to a two-LPAR Sysplex as a first phase
  • Maintains IMS and DB2 coexistence
  • Minimizes risk to the 40-year-old codebase
  • Enables future growth to three or four LPARs

What are the biggest risks in this migration? How would you mitigate them?

C3. API Gateway Architecture

SecureFirst Retail Bank wants to expose 50 COBOL CICS programs as REST APIs for their mobile application. They currently have a single-LPAR z/OS system. Design the z/OS component architecture that connects the mobile app to the COBOL programs. Include:

  • Network path from mobile app to COBOL program
  • z/OS Connect or alternative API exposure technology
  • CICS configuration for API workload
  • Security architecture (authentication, authorization, TLS)
  • Performance considerations (expected: 500 API calls/second)

C4. DR Architecture

Design a disaster recovery architecture for CNB's Parallel Sysplex. Requirements:

  • RPO (Recovery Point Objective): zero data loss
  • RTO (Recovery Time Objective): 2 hours
  • DR site must be at least 50 km from primary site
  • Must support full production workload at DR site

Specify: DR Sysplex topology, data replication technology (GDPS, PPRC, Metro Mirror?), coupling facility configuration at DR, and the failover procedure. What are the limitations of each technology choice?


Part D: Synthesis & Critical Thinking

D1. The Knowledge Transfer Problem

Marcus Whitfield at FBA is retiring in two years. He has 35 years of undocumented knowledge about 15 million lines of COBOL/IMS code. You've been asked to design a knowledge preservation strategy.

a) What categories of knowledge does Marcus possess that would be most difficult to replace? (Think beyond "he knows the code" — what does an architect with 35 years know that doesn't exist in source code or documentation?)

b) How does the z/OS ecosystem architecture complicate the knowledge transfer problem? (Consider: it's not just COBOL knowledge — it's COBOL + JCL + DB2 + IMS + CICS + z/OS system services knowledge, all interrelated.)

c) Propose a practical 18-month knowledge transfer plan. What artifacts should be created? What training should be delivered? What organizational changes are needed?

D2. Mainframe vs. Distributed: The Architecture Debate

A consulting firm has proposed replacing CNB's z/OS Parallel Sysplex with a Kubernetes-based microservices architecture. They claim it will reduce costs and improve agility.

a) Identify five specific z/OS architectural capabilities (from this chapter) that the Kubernetes proposal would need to replicate. For each, assess how difficult that replication would be.

b) Kwame Mensah argues that the coupling facility's hardware-speed lock management cannot be replicated in software on commodity hardware without sacrificing either consistency or performance. Is he right? Explain using the CAP theorem.

c) Under what circumstances might a hybrid architecture — some workloads on z/OS, some on Kubernetes — make sense? What would the integration points look like?

D3. The SYNCPOINT Dilemma

A senior developer at Pinnacle Health has written a claims batch processing program that commits after every single record (EXEC CICS SYNCPOINT after each claim). The program processes 50 million claims per month. Diane Okoye has asked you to analyze the architecture.

a) Calculate the coupling facility overhead of committing 50 million times versus committing every 100 records (500,000 times). Assume each commit generates 3 coupling facility lock requests.

b) What are the risks of committing every 100 records instead of every record? Consider: restart positioning, lock duration, and impact on concurrent transactions.

c) Design a commit strategy that balances performance, recoverability, and concurrency. Justify your interval choice.
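The comparison in D3 part (a) is easy to extend to other intervals when you work on part (c). The Python sketch below takes the exercise's assumption of 3 coupling facility lock requests per commit at face value, regardless of how many records the unit of work covers. That is a simplification, and the function names are illustrative.

```python
def commits_needed(records, interval):
    """Number of commits when committing every `interval` records (rounded up)."""
    return -(-records // interval)   # ceiling division using integers only


def cf_lock_requests(records, interval, requests_per_commit=3):
    """Total coupling facility lock requests under the exercise's assumption."""
    return commits_needed(records, interval) * requests_per_commit


CLAIMS = 50_000_000
baseline = cf_lock_requests(CLAIMS, 1)

# 1 and 100 come from the exercise; 1000 is an extra interval for comparison.
for interval in (1, 100, 1000):
    requests = cf_lock_requests(CLAIMS, interval)
    print(
        f"Commit every {interval:>4} records: "
        f"{commits_needed(CLAIMS, interval):>10,} commits, "
        f"{requests:>11,} CF lock requests "
        f"({baseline - requests:,} fewer than per-record commits)"
    )
```

The request count falls in proportion to the interval, but each longer unit of work holds its locks for more records, which is the tension parts (b) and (c) ask you to resolve.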

D4. Architecture Evolution

Looking at CNB's current 4-LPAR Parallel Sysplex, speculate on how the architecture might evolve over the next 10 years. Consider:

  • IBM's direction with z/OS and the IBM Z platform
  • The trend toward hybrid cloud (IBM Cloud Pak, Wazi)
  • The aging of the COBOL programmer workforce
  • The increasing use of AI and machine learning in financial services

What architectural decisions should Kwame make today to position CNB for these changes? What decisions should he explicitly defer?


Part M: Mixed Practice

Mixed Practice is not applicable for Chapter 1 as this is the first chapter. Starting in Chapter 2, this section will contain interleaved questions from previous chapters to support long-term retention through spaced retrieval practice.


Part E: Research & Extension

E1. IBM Redbook Deep Dive

Locate the IBM Redbook ABCs of z/OS System Programming (SG24-6981 through SG24-6990, a multi-volume series). Read the sections on Parallel Sysplex architecture and coupling facility internals. Write a 500-word summary of one topic from the Redbook that extends what was covered in this chapter. Cite specific Redbook pages.

E2. WLM Hands-On Exploration

If you have access to a z/OS system (even a z/OS dev/test environment like Wazi or ZD&T), explore the WLM configuration:

  • Issue the D WLM,APPLENV=* console command to display application environments
  • Use SDSF to view WLM service class periods
  • Examine the WLM classification rules in the active policy

Document what you find. How does your shop's WLM configuration compare to the principles described in this chapter?

E3. Coupling Facility Technology Research

Research the evolution of coupling facility technology from its introduction in 1994 through the current z16. Topics to investigate:

  • How has coupling facility capacity (measured in "service units") scaled over hardware generations?
  • What is the Internal Coupling Facility (ICF) and how does it differ from an external CF?
  • What are Coupling Facility Resource Manager (CFRM) policies and how do they affect structure placement?

Write a 750-word research summary. This topic becomes critical in Chapter 4 when we discuss coupling facility tuning for DB2 data sharing.