Chapter 13 Key Takeaways

The Threshold Concept

CICS is a transaction manager, not an application server. Its core responsibility is ensuring work completes atomically across multiple resources and regions — not merely running your COBOL programs. Every architectural decision in this chapter flows from this distinction. If CICS were just an application server, you'd optimize for code execution speed. Because it's a transaction manager, you optimize for integrity, recovery, routing, and coordination across an entire topology.

Region Topology (Section 13.2)

TOR/AOR/FOR separation is the foundation. Terminal-Owning Regions own connections, Application-Owning Regions run programs, and File-Owning Regions manage VSAM files. Mixing roles creates coupling that undermines failure isolation, scalability, and security.
Channel isolation prevents cross-channel failures. CNB isolates 3270/ATM, web, and mobile API traffic on separate LPARs, so a problem in one channel cannot degrade another. This is the single most impactful topology decision.
Name your regions for humans, not machines. A naming convention that encodes enterprise, type, LPAR, and instance (e.g., CNBAORA1) saves minutes during incidents — and minutes matter at 500 million transactions per day.
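
To make the naming convention concrete, here is a minimal sketch of decoding a region name into its parts. Only the CNBAORA1 example comes from the text; the field widths (3-character enterprise, 3-character type, 1-character LPAR, 1-character instance) and the names parse_region_name and RegionName are assumptions made for illustration.

```python
from dataclasses import dataclass

# Region roles named in this chapter.
REGION_TYPES = {
    "TOR": "Terminal-Owning Region",
    "AOR": "Application-Owning Region",
    "FOR": "File-Owning Region",
}


@dataclass(frozen=True)
class RegionName:
    enterprise: str
    region_type: str
    lpar: str
    instance: str

    def describe(self) -> str:
        role = REGION_TYPES[self.region_type]
        return f"{self.enterprise} {role}, LPAR {self.lpar}, instance {self.instance}"


def parse_region_name(name: str) -> RegionName:
    """Split an 8-character region name using the ASSUMED 3+3+1+1 layout."""
    if len(name) != 8:
        raise ValueError(f"expected 8 characters, got {len(name)}: {name!r}")
    region_type = name[3:6]
    if region_type not in REGION_TYPES:
        raise ValueError(f"unknown region type {region_type!r} in {name!r}")
    return RegionName(name[0:3], region_type, name[6], name[7])


print(parse_region_name("CNBAORA1").describe())
# CNB Application-Owning Region, LPAR A, instance 1
```

The point of the convention is that an operator can do this decoding by eye during an incident; the parser only shows that the name carries enough information to do so.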

Transaction Routing (Section 13.3)

Static routing is a single point of failure. Any production CICS topology processing more than 100 TPS should use dynamic routing. Static routing cannot handle AOR failures, load imbalances, or workload changes.
CICSPlex SM workload management is the enterprise routing solution. The goal algorithm integrates with z/OS WLM to route transactions to the AOR most likely to meet response-time targets. The result is near-optimal workload distribution with minimal manual tuning.
Transaction affinity is the enemy of workload balancing. Every affinity constrains routing. Eliminate affinities by using external state stores (DB2, shared TS, coupling facility data tables) instead of region-local resources. Where affinities are unavoidable, minimize their duration.
Routing programs must be lightweight. A routing program runs for every routable transaction. Database calls, file I/O, or complex logic in the routing program accumulate into enormous CPU consumption at high TPS rates.
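
The routing points above can be illustrated with a small simulation. This is a sketch only: it is not the CICSPlex SM goal algorithm, which works from z/OS WLM response-time goals. It uses a simple least-utilized heuristic to show why static routing fails when an AOR is down and how an affinity removes the router's freedom to choose. The Aor class, both functions, and the region names other than CNBAORA1 are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Aor:
    name: str
    healthy: bool
    active_tasks: int
    max_tasks: int

    @property
    def utilization(self) -> float:
        return self.active_tasks / self.max_tasks


def route_static(target: str, aors: Dict[str, Aor]) -> Optional[str]:
    """Static routing: the target is fixed, so a failed AOR means no route."""
    aor = aors[target]
    return aor.name if aor.healthy else None


def route_dynamic(aors: Dict[str, Aor], affinity: Optional[str] = None) -> Optional[str]:
    """Pick the least-utilized healthy AOR, unless an affinity pins the choice."""
    if affinity is not None:
        # An affinity leaves exactly one candidate; balancing is impossible.
        pinned = aors[affinity]
        return pinned.name if pinned.healthy else None
    candidates = [a for a in aors.values()
                  if a.healthy and a.active_tasks < a.max_tasks]
    if not candidates:
        return None
    # In-memory comparison only: no database calls or file I/O on this path.
    return min(candidates, key=lambda a: a.utilization).name


aors = {
    "CNBAORA1": Aor("CNBAORA1", healthy=False, active_tasks=0, max_tasks=100),
    "CNBAORA2": Aor("CNBAORA2", healthy=True, active_tasks=80, max_tasks=100),
    "CNBAORB1": Aor("CNBAORB1", healthy=True, active_tasks=30, max_tasks=100),
}

print(route_static("CNBAORA1", aors))             # None
print(route_dynamic(aors))                        # CNBAORB1
print(route_dynamic(aors, affinity="CNBAORA2"))   # CNBAORA2
```

Note that the affinity case returns the busier region even though a less loaded one is available, which is the balancing cost the text describes.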

MRO and ISC (Section 13.4)

MRO for same-LPAR, ISC/IPIC for cross-LPAR. MRO uses cross-memory services (fastest). IPIC uses TCP/IP (required for cross-LPAR). Using ISC where MRO would work wastes performance: CNB measured 5x overhead for ISC relative to MRO.
Function shipping is transparent but not free. Each function-shipped file operation incurs an MRO round-trip. Programs with many remote file operations should use DPL (ship the program, not the data) or migrate data to DB2 with data sharing.
Size MRO/IPIC sessions at 2x observed peak. Under-sized sessions create queuing that degrades response times, while over-sizing wastes only a small amount of storage. Err on the side of excess.
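
The arithmetic behind the last two points can be sketched as follows. The 2x headroom is the rule stated above; the function names and every input figure (37 concurrent sessions, 40 remote file operations, 0.25 ms per round-trip) are hypothetical values chosen for illustration, not measurements from the chapter.

```python
import math


def sessions_needed(observed_peak_concurrent: int, headroom: float = 2.0) -> int:
    """Session count sized at headroom x the observed peak concurrent usage."""
    return math.ceil(observed_peak_concurrent * headroom)


def shipping_overhead_ms(round_trips: int, round_trip_ms: float) -> float:
    """Added elapsed time from region-to-region round-trips in one transaction."""
    return round_trips * round_trip_ms


# Sizing: an observed peak of 37 sessions in use suggests defining 74.
print(sessions_needed(37))                 # 74

# Function shipping: 40 remote file operations cost 40 round-trips.
print(shipping_overhead_ms(40, 0.25))      # 10.0

# DPL: shipping the program instead costs a single round-trip.
print(shipping_overhead_ms(1, 0.25))       # 0.25
```

The comparison shows why the number of remote operations per transaction, rather than the cost of any single one, decides between function shipping and DPL.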

CICSPlex SM (Section 13.5)

CICSPlex SM is essential above 5–6 regions. Managing individual regions through manual CSD updates and custom routing programs does not scale. CPSM provides centralized workload management, resource deployment (BAS), and health monitoring.
Always run paired CMASs for high availability. A single CMAS is a management-layer single point of failure. When the CMAS fails, routing continues on cached data, but health monitoring and deployment capability are lost until recovery.
BAS eliminates CSD management overhead. Define resources once, deploy across region groups. Phased rollouts enable zero-downtime deployments. CNB reduced deployment errors by 85% with BAS.

Sysplex-Aware CICS (Section 13.6)

Shared temporary storage eliminates the most common affinities. Moving inter-transaction TS queues from region-local to coupling facility (LOCATION=SYSPLEX) allows any AOR to continue a pseudo-conversational sequence. The 10–20 microsecond CF access latency is negligible.
DB2 data sharing eliminates the need for FORs for high-volume data. When data is in DB2 with data sharing, every AOR on every LPAR can access it directly. No function shipping, no FOR bottleneck.
Every HA feature adds complexity. Shared TS, named counters, CFDTs, data sharing, dual CMASs — each is valuable, but each is also a component that can fail. Select HA features that address your actual failure scenarios, not every possible failure scenario.

Topology Design (Section 13.7)

Five questions drive every topology decision. What are the channels? What are the failure domains? What are the performance tiers? What data is shared? What must scale independently? Answer these before drawing any topology diagram.
Size MAXTASK with measurement, not theory. Formula: peak TPS × average response time (in seconds) × safety factor. Treat the result as a starting estimate: deploy in a performance test environment and measure actual utilization before going to production.
Design for three horizons. Horizon 1 (0–12 months): handle without changes. Horizon 2 (1–3 years): scale with tuning and additional instances. Horizon 3 (3–5 years): accommodate strategic direction without architectural dead-ends.
Avoid the five anti-patterns. The mega-AOR, the over-split topology, the forgotten FOR, the static routing holdout, and the orphaned CMAS. Each creates operational risk that compounds over time.
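
As a worked instance of the MAXTASK formula from this section, the sketch below computes a starting estimate and then compares it with a measured peak. The first two terms of the formula are Little's law (average concurrency equals arrival rate times time in system). The function names and all input figures are hypothetical; response time is taken in milliseconds here purely to keep the arithmetic exact.

```python
import math


def estimate_maxtask(peak_tps: int, avg_response_ms: int, safety_factor: float) -> int:
    """Starting estimate: peak TPS x average response time x safety factor."""
    concurrent_tasks = peak_tps * avg_response_ms / 1000  # Little's law
    return math.ceil(concurrent_tasks * safety_factor)


def peak_utilization(measured_peak_tasks: int, maxtask: int) -> float:
    """Fraction of the task limit actually reached in a performance test."""
    return measured_peak_tasks / maxtask


# Hypothetical AOR: 800 TPS at peak, 50 ms average response, 1.5x safety factor.
maxtask = estimate_maxtask(800, 50, 1.5)
print(maxtask)                          # 60

# Hypothetical test result: 45 concurrent tasks observed at peak.
print(peak_utilization(45, maxtask))    # 0.75
```

If the measured peak sits close to the limit, or far below it, that is the signal to revisit the estimate before production rather than to trust the formula.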

The Big Picture

A CICS topology is not a diagram — it is a set of decisions about how work flows, how failures propagate, and how the system scales. The architect's job is to make these decisions explicitly, document them, and ensure that every region, every connection, and every routing rule exists for a reason that can be articulated under pressure at 2 AM.