Chapter 23 Exercises: Batch Window Engineering
Section 23.2 — The Batch Window as a Graph
Exercise 1: DAG Construction
Given the following batch job descriptions, construct a DAG (text-based) showing all dependencies:
- JOB-A: Extract daily transactions from CICS journal (no predecessors)
- JOB-B: Extract ATM transactions from switch file (no predecessors)
- JOB-C: Validate transaction records (requires JOB-A)
- JOB-D: Validate ATM records (requires JOB-B)
- JOB-E: Merge validated transactions (requires JOB-C and JOB-D)
- JOB-F: Post transactions to accounts (requires JOB-E)
- JOB-G: Generate regulatory report (requires JOB-E)
- JOB-H: Calculate balances (requires JOB-F)
- JOB-I: Generate GL entries (requires JOB-F and JOB-G)
- JOB-J: Reconciliation (requires JOB-H and JOB-I)
Draw the DAG. Identify all possible paths from start to finish.
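One way to check a hand-drawn DAG is to encode the predecessor lists and enumerate the paths mechanically. The sketch below is a minimal illustration, not a prescribed solution; the names `PREDECESSORS`, `successors`, and `all_paths` are hypothetical helpers introduced here.

```python
# Predecessor lists transcribed from the job descriptions in Exercise 1.
PREDECESSORS = {
    "A": [], "B": [], "C": ["A"], "D": ["B"], "E": ["C", "D"],
    "F": ["E"], "G": ["E"], "H": ["F"], "I": ["F", "G"], "J": ["H", "I"],
}

def successors(preds):
    """Invert the predecessor map into a successor map."""
    succ = {job: [] for job in preds}
    for job, ps in preds.items():
        for p in ps:
            succ[p].append(job)
    return succ

def all_paths(preds):
    """Return every path from a job with no predecessors to a job with no successors."""
    succ = successors(preds)
    paths = []

    def walk(job, path):
        path = path + [job]
        if not succ[job]:          # terminal job: the path is complete
            paths.append(path)
            return
        for nxt in succ[job]:
            walk(nxt, path)

    for start in (j for j, ps in preds.items() if not ps):
        walk(start, [])
    return paths

for path in all_paths(PREDECESSORS):
    print(" -> ".join(path))
```

Recursive enumeration is fine for a ten-job graph; in a production-sized DAG the number of paths can grow quickly, so a longest-path calculation is usually preferred over listing every path.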
Exercise 2: Critical Path Identification
Using the DAG from Exercise 1 with the following durations:
| Job | Duration (min) |
|---|---|
| A | 20 |
| B | 15 |
| C | 30 |
| D | 10 |
| E | 25 |
| F | 40 |
| G | 20 |
| H | 35 |
| I | 15 |
| J | 10 |
a) Calculate the total duration of every path through the DAG. b) Identify the critical path. c) Calculate slack for every non-critical-path job. d) If the batch window is 210 minutes, what is the margin?
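The path totals and slack values can be cross-checked with a forward and backward pass over the DAG. This is a sketch under the assumption that the scheduler starts every job as soon as its predecessors finish and that resources are unlimited; `earliest_finish` and `slack` are hypothetical helper names.

```python
# Durations (minutes) and predecessors transcribed from Exercises 1 and 2.
DURATION = {"A": 20, "B": 15, "C": 30, "D": 10, "E": 25,
            "F": 40, "G": 20, "H": 35, "I": 15, "J": 10}
PREDECESSORS = {
    "A": [], "B": [], "C": ["A"], "D": ["B"], "E": ["C", "D"],
    "F": ["E"], "G": ["E"], "H": ["F"], "I": ["F", "G"], "J": ["H", "I"],
}

def earliest_finish(preds, dur):
    """Forward pass: earliest finish = own duration + latest predecessor finish."""
    ef = {}

    def finish(job):
        if job not in ef:
            ef[job] = dur[job] + max((finish(p) for p in preds[job]), default=0)
        return ef[job]

    for job in preds:
        finish(job)
    return ef

def slack(preds, dur):
    """Backward pass: slack = latest allowable finish - earliest finish."""
    ef = earliest_finish(preds, dur)
    end = max(ef.values())
    lf = {job: end for job in preds}
    # With positive durations, decreasing earliest-finish order visits every
    # job before any of its predecessors (a reverse topological order).
    for job in sorted(preds, key=ef.get, reverse=True):
        for p in preds[job]:
            lf[p] = min(lf[p], lf[job] - dur[job])
    return {job: lf[job] - ef[job] for job in preds}

ef = earliest_finish(PREDECESSORS, DURATION)
job_slack = slack(PREDECESSORS, DURATION)
print("Critical path length:", max(ef.values()))
print("Zero-slack jobs:", [j for j, s in job_slack.items() if s == 0])
print("Margin in a 210-minute window:", 210 - max(ef.values()))
```

Jobs with zero slack form the critical path; any other job can slip by its slack value without moving the end time.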
Exercise 3: Hidden Dependency Detection
You have three jobs with no formal dependencies between them:
- JOB-X: Reads CUST.MASTER (DISP=SHR), writes CUST.EXTRACT (DISP=OLD)
- JOB-Y: Reads CUST.MASTER (DISP=SHR), writes CUST.REPORT (DISP=OLD)
- JOB-Z: Reads CUST.EXTRACT (DISP=SHR), writes CUST.SUMMARY (DISP=OLD)
a) Can JOB-X and JOB-Y run in parallel? Why or why not? b) Can JOB-X and JOB-Z run in parallel? Why or why not? c) What hidden dependency exists that the scheduler doesn't know about? d) How would you fix the hidden dependency?
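Hidden dependencies of this kind can be surfaced by comparing each job's output datasets with every other job's input datasets. The following is a minimal sketch; the `JOBS` structure and `implied_dependencies` function are hypothetical, and a real analysis would extract the dataset names from JCL or SMF data rather than a hand-built dictionary.

```python
# Dataset usage transcribed from Exercise 3.
JOBS = {
    "JOB-X": {"reads": {"CUST.MASTER"}, "writes": {"CUST.EXTRACT"}},
    "JOB-Y": {"reads": {"CUST.MASTER"}, "writes": {"CUST.REPORT"}},
    "JOB-Z": {"reads": {"CUST.EXTRACT"}, "writes": {"CUST.SUMMARY"}},
}

def implied_dependencies(jobs):
    """Return (writer, reader, dataset) triples where one job reads what another writes."""
    found = []
    for writer, w in jobs.items():
        for reader, r in jobs.items():
            if writer == reader:
                continue
            for dataset in sorted(w["writes"] & r["reads"]):
                found.append((writer, reader, dataset))
    return found

for writer, reader, dataset in implied_dependencies(JOBS):
    print(f"{reader} should follow {writer} (shared dataset {dataset})")
```

Each triple is a candidate predecessor relationship to compare against what the scheduler actually has defined.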
Exercise 4: Dependency Cleanup
Examine this dependency graph:
A → B → D → F → H
A → C → E → F → H
A → C → D (redundant? C doesn't produce data D needs)
B → E (redundant? B doesn't produce data E needs)
A → F (redundant? A → B → D → F already exists)
a) Identify which dependencies are structurally redundant. b) For each potentially redundant dependency, explain what you'd verify before removing it. c) What is the critical path before and after cleanup (assume all jobs take 10 minutes)?
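Structural redundancy can be tested mechanically: an edge is transitively redundant if its target is still reachable from its source after the edge is removed. The sketch below applies that test to the graph above; `reachable` and `transitively_redundant` are hypothetical helper names. It says nothing about whether a dependency is needed for data reasons, which still has to be verified by hand.

```python
# Edges transcribed from the dependency graph in Exercise 4.
EDGES = [
    ("A", "B"), ("B", "D"), ("D", "F"), ("F", "H"),
    ("A", "C"), ("C", "E"), ("E", "F"),
    ("C", "D"), ("B", "E"), ("A", "F"),
]

def reachable(edges, src, dst):
    """Depth-first search: is there a directed path from src to dst?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(v for u, v in edges if u == node)
    return False

def transitively_redundant(edges):
    """Edges whose ordering guarantee is already implied by a longer path."""
    return [
        edge for edge in edges
        if reachable([e for e in edges if e != edge], edge[0], edge[1])
    ]

print(transitively_redundant(EDGES))
```

An edge this test does not flag may still be unnecessary (for example, if the predecessor produces nothing the successor uses), but removing it changes the ordering guarantees, so it needs the verification asked for in part b).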
Exercise 5: GDG in the DAG
A job stream uses GDG datasets:
- JOB-1 writes TRANS.DAILY(+1)
- JOB-2 reads TRANS.DAILY(0) — which is the generation written by JOB-1 in the previous night's run
- JOB-3 reads TRANS.DAILY(+1) — the generation JOB-1 just wrote
a) Can JOB-1 and JOB-2 run in parallel? Explain. b) Must JOB-3 wait for JOB-1? Why? c) What happens if JOB-1 fails and the GDG generation isn't cataloged? How does this affect JOB-2 and JOB-3?
Section 23.3 — Throughput Math
Exercise 6: Basic Throughput Calculation
A batch COBOL program processes customer records with the following characteristics:
- Input file: 8 million records, LRECL=200, RECFM=FB
- CPU time per record: 0.02 ms
- I/O time per record: 0.08 ms (sequential read)
- DB2 SQL time per record: 0.40 ms (2 SQL calls per record)
- Commit every 5,000 records
a) Calculate the processing rate in records per second. b) Calculate the estimated elapsed time. c) What percentage of time is spent in each component? d) If you could cut DB2 SQL time by 40%, what would the new elapsed time be?
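The arithmetic can be laid out as a small model. This sketch assumes the three per-record times are additive with no overlap, that 0.40 ms is the total DB2 time per record (covering both SQL calls), and that commit overhead is negligible; none of these assumptions is stated in the exercise, so adjust them if the chapter defines the model differently.

```python
# Figures transcribed from Exercise 6.
RECORDS = 8_000_000
MS_PER_RECORD = {"cpu": 0.02, "io": 0.08, "db2": 0.40}

def records_per_second(ms_per_record):
    """Processing rate when the per-record components simply add up."""
    return 1000.0 / sum(ms_per_record.values())

def elapsed_seconds(records, ms_per_record):
    """Total elapsed time in seconds for the given record count."""
    return records * sum(ms_per_record.values()) / 1000.0

def component_shares(ms_per_record):
    """Fraction of elapsed time attributable to each component."""
    total = sum(ms_per_record.values())
    return {name: ms / total for name, ms in ms_per_record.items()}

print(f"Rate: {records_per_second(MS_PER_RECORD):.0f} records/s")
print(f"Elapsed: {elapsed_seconds(RECORDS, MS_PER_RECORD) / 60:.1f} min")
for name, share in component_shares(MS_PER_RECORD).items():
    print(f"  {name}: {share:.0%}")

# Part d): reduce DB2 time by 40% and recompute.
tuned = dict(MS_PER_RECORD, db2=MS_PER_RECORD["db2"] * 0.6)
print(f"Elapsed after DB2 tuning: {elapsed_seconds(RECORDS, tuned) / 60:.1f} min")
```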
Exercise 7: I/O Bottleneck Analysis
Two batch jobs read the same 50GB sequential file:
Job A (CPU-heavy):
- CPU time: 45 minutes
- I/O wait: 8 minutes
- DB2 time: 12 minutes
- Elapsed: 52 minutes
Job B (I/O-heavy):
- CPU time: 5 minutes
- I/O wait: 40 minutes
- DB2 time: 3 minutes
- Elapsed: 43 minutes
a) Calculate the CPU/elapsed ratio for each job. b) Which job is CPU-bound? Which is I/O-bound? c) For each job, recommend the most effective optimization strategy. d) If Job A and Job B both read the same dataset and must run sequentially, what's the combined time? If you could pipeline them (A reads and passes to B), estimate the combined time.
Exercise 8: Volume Growth Projection
A critical-path job currently processes 5 million records in 40 minutes with a volume elasticity of 0.9.
a) At 3% monthly volume growth, what will the record count be in 6 months? b) What will the elapsed time be in 6 months? c) If the batch window has 60 minutes of margin, when will this single job exhaust the margin? d) If you split this job into 2 parallel streams today, what margin does that buy in months?
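A projection like this is easy to script. The sketch below assumes compound monthly growth and a power-law elasticity model, in which elapsed time scales as the volume ratio raised to the elasticity. That model is an assumption made for illustration; if the chapter defines elasticity as a linear percentage relationship, substitute that formula. All function names are hypothetical.

```python
def projected_volume(v0, monthly_growth, months):
    """Record count after compound monthly growth."""
    return v0 * (1 + monthly_growth) ** months

def projected_elapsed(t0, monthly_growth, months, elasticity):
    """Elapsed time, assuming time scales as (volume ratio) ** elasticity."""
    return t0 * ((1 + monthly_growth) ** months) ** elasticity

def months_until_exceeded(t0, limit, monthly_growth, elasticity, max_months=600):
    """First whole month in which projected elapsed time exceeds the limit."""
    for month in range(max_months + 1):
        if projected_elapsed(t0, monthly_growth, month, elasticity) > limit:
            return month
    return None  # not exceeded within max_months

print(f"Volume in 6 months: {projected_volume(5_000_000, 0.03, 6):,.0f} records")
print(f"Elapsed in 6 months: {projected_elapsed(40, 0.03, 6, 0.9):.1f} min")
# A 40-minute job with 60 minutes of margin can grow to 100 minutes.
print("Margin exhausted in month:", months_until_exceeded(40, 40 + 60, 0.03, 0.9))
```

The margin calculation treats this job as the only thing growing on the critical path, which matches the wording of part c) but is optimistic for a real window.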
Exercise 9: Commit Frequency Analysis
A DB2 batch update program processes 2 million records. Measured performance at different commit frequencies:
| Commit Every N Records | Elapsed Time | Lock Escalations | Deadlocks |
|---|---|---|---|
| 100 | 58 min | 0 | 0 |
| 500 | 42 min | 0 | 1 |
| 1,000 | 38 min | 0 | 3 |
| 5,000 | 35 min | 2 | 8 |
| 10,000 | 41 min | 7 | 15 |
| 50,000 | 65 min | 23 | 42 |
a) What is the optimal commit frequency based on this data? b) Why does elapsed time increase when commits are very frequent (small N)? c) Why does elapsed time increase when commits are very infrequent (large N)? d) What commit frequency would you recommend for production, and why?
Exercise 10: Capacity Formula Application
Given the following batch window parameters:
- Available window: 360 minutes
- Current critical path: 280 minutes
- Required buffer: 30 minutes
- Monthly volume growth: 2%
- Next review date: 9 months out
- Average volume elasticity of critical-path jobs: 0.88
a) Calculate the current window capacity. b) Calculate the required growth margin. c) Is the window safe for 9 months? d) At what monthly growth rate would the window become unsafe?
Section 23.4 — Job Scheduling
Exercise 11: TWS Dependency Definition
Write TWS (OPC) application definitions for the following scenario:
- JOB-EXTRACT runs daily on business days, no predecessors
- JOB-VALIDATE depends on JOB-EXTRACT completing with RC ≤ 4
- JOB-POSTING depends on JOB-VALIDATE, requires 2 DB2 batch threads
- JOB-REPORT depends on JOB-POSTING, must complete by 4:00 AM
- JOB-ARCHIVE depends on JOB-REPORT, runs only on month-end
Define all predecessor, resource, and time constraints.
Exercise 12: Conditional Execution
Design a conditional execution strategy for this scenario:
- JOB-SCAN: Scans transactions for fraud indicators
  - RC=0: No fraud found
  - RC=4: Warnings found (possible fraud, needs review)
  - RC=8: Confirmed fraud patterns detected
  - RC>8: Program error
Based on JOB-SCAN's return code:
- If RC=0: Run JOB-NORMAL-POST
- If RC=4: Run JOB-REVIEW-POST (includes manual review queue)
- If RC=8: Run JOB-FRAUD-HOLD (freezes affected accounts) AND JOB-FRAUD-NOTIFY (alerts compliance team)
- If RC>8: Run JOB-ERROR-NOTIFY, do NOT run any posting jobs
Write the scheduler dependency logic (pseudocode or any scheduler syntax).
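As a starting point for the pseudocode, the routing rules can be expressed as a single function from return code to the set of jobs to release. This is a sketch, not scheduler syntax. The exercise does not say what to do with return codes it does not list (such as RC=2), so the sketch assumes they are treated as errors; that fail-closed choice and the function name `jobs_to_release` are assumptions.

```python
def jobs_to_release(rc):
    """Map JOB-SCAN's return code to the successor jobs that should run."""
    if rc == 0:
        return ["JOB-NORMAL-POST"]
    if rc == 4:
        return ["JOB-REVIEW-POST"]
    if rc == 8:
        return ["JOB-FRAUD-HOLD", "JOB-FRAUD-NOTIFY"]
    # RC > 8 is a program error. Unlisted codes (1-3, 5-7, negatives) are
    # ASSUMED to be errors as well, so no posting job runs on an unknown result.
    return ["JOB-ERROR-NOTIFY"]

for rc in (0, 4, 8, 12):
    print(rc, "->", ", ".join(jobs_to_release(rc)))
```

Keeping the routing in one place makes it easy to confirm that no return code can release a posting job and an error notification at the same time.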
Exercise 13: Cross-System Dependencies
Your shop has two LPARs: SYSA (production batch) and SYSB (production DB2 for reporting).
- SYSA-JOB1: Extracts data and writes to shared dataset on SYSA
- SYSB-JOB2: Reads the shared dataset and loads it into SYSB DB2 reporting tables
- SYSA-JOB3: Runs after SYSB-JOB2 confirms load completion
a) What mechanisms can establish cross-LPAR dependencies? b) What happens if SYSB-JOB2 fails and the scheduler on SYSA isn't notified? c) Design a recovery procedure for this failure mode.
Exercise 14: Resource Contention Modeling
You have 6 batch initiators and 4 DB2 batch threads. The following jobs are all ready to run at the same time:
| Job | Initiators Needed | DB2 Threads Needed | Duration |
|---|---|---|---|
| JOB-A | 1 | 2 | 20 min |
| JOB-B | 1 | 1 | 15 min |
| JOB-C | 1 | 2 | 25 min |
| JOB-D | 1 | 0 | 10 min |
| JOB-E | 1 | 1 | 30 min |
| JOB-F | 1 | 0 | 20 min |
a) Which jobs can run simultaneously without exceeding resource limits? b) What is the minimum elapsed time to run all jobs? c) Create an optimal schedule (Gantt chart style) showing when each job runs. d) What happens if you add 2 more DB2 threads?
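Candidate schedules can be tried out with a small simulation. The sketch below is a greedy list scheduler: whenever a job finishes, it starts every waiting job that fits, in a priority order you supply. It is a way to evaluate one ordering, not a proof of optimality, and `list_schedule` is a hypothetical helper.

```python
import heapq

# (initiators, DB2 threads, minutes) transcribed from the Exercise 14 table.
JOBS = {
    "JOB-A": (1, 2, 20), "JOB-B": (1, 1, 15), "JOB-C": (1, 2, 25),
    "JOB-D": (1, 0, 10), "JOB-E": (1, 1, 30), "JOB-F": (1, 0, 20),
}

def list_schedule(jobs, order, initiators, threads):
    """Return {job: (start, end)} for a greedy schedule in the given priority order."""
    pending = list(order)
    running = []                      # min-heap of (end_time, job)
    free_init, free_thr = initiators, threads
    now = 0
    schedule = {}
    while pending or running:
        # Start every pending job that fits in the currently free resources.
        for name in list(pending):
            need_init, need_thr, minutes = jobs[name]
            if need_init <= free_init and need_thr <= free_thr:
                free_init -= need_init
                free_thr -= need_thr
                schedule[name] = (now, now + minutes)
                heapq.heappush(running, (now + minutes, name))
                pending.remove(name)
        if not running:
            raise ValueError("a pending job can never fit in the available resources")
        # Advance the clock to the next completion and release its resources.
        now, finished = heapq.heappop(running)
        free_init += jobs[finished][0]
        free_thr += jobs[finished][1]
    return schedule

plan = list_schedule(JOBS, sorted(JOBS), initiators=6, threads=4)
for name, (start, end) in sorted(plan.items(), key=lambda kv: kv[1]):
    print(f"{name}: {start:>2} to {end:>2}")
print("Elapsed:", max(end for _, end in plan.values()))
```

Rerunning with a different `order`, or with `threads=6` for part d), shows how sensitive the elapsed time is to priority and to the DB2 thread limit.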
Section 23.5 — Parallel Streams
Exercise 15: Stream Identification
A retail bank processes these end-of-day jobs:
- Checking account posting
- Savings account posting
- CD maturity processing
- Loan payment application
- Credit card settlement
- Combined GL posting
- Combined regulatory reporting
- Customer statement generation
a) Identify which jobs can form independent parallel streams. b) Where must the streams converge? c) Draw the parallel stream architecture. d) If each job takes 30 minutes, compare serial vs. parallel elapsed time.
Exercise 16: DB2 Concurrency Design
Two batch programs must update the CUSTOMER table simultaneously:
- PROG-A: Updates account balance for checking accounts (3M rows)
- PROG-B: Updates account balance for savings accounts (2M rows)
The CUSTOMER table has a single tablespace, page-level locking.
a) Will these programs conflict even though they update different rows? Explain. b) What DB2 changes would minimize contention? c) What application design changes would minimize contention? d) If you partition the tablespace by account type, how does this change the answer?
Exercise 17: Dataset Contention Resolution
Three jobs need the CUST.MASTER VSAM KSDS:
- JOB-UPDATE: Updates customer records (needs exclusive control)
- JOB-REPORT: Reads customer records for a report
- JOB-EXTRACT: Reads specific customer records for regulatory extract
Currently they run serially: UPDATE → REPORT → EXTRACT (total: 90 min).
a) Can REPORT and EXTRACT run simultaneously? Under what conditions? b) Can UPDATE run simultaneously with either? Why or why not? c) Design an alternative using a shadow copy technique to maximize parallelism. d) What is the new elapsed time with your design?
Exercise 18: Job Splitting
A single-threaded COBOL batch job processes 20 million records in 100 minutes. The processing is: read record, look up DB2 reference table, apply business rules, write output record.
a) If you split this into 4 parallel jobs by key range, what is the theoretical elapsed time? b) What additional overhead does splitting introduce? (List at least 4 items.) c) What if the records aren't evenly distributed across key ranges? d) Design a split strategy that handles uneven distribution.
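The effect of skew can be seen with a short calculation: the split job finishes when its largest partition finishes. The sketch below assumes the per-record rate is unchanged by splitting (no DB2 or I/O contention between streams), which is the "theoretical" case in part a). The key-bucket counts are hypothetical numbers invented for illustration, and `split_elapsed` and `balance` are hypothetical helpers.

```python
RATE = 20_000_000 / 100          # records per minute, from the exercise

def split_elapsed(partition_sizes, records_per_minute, overhead_minutes=0.0):
    """Elapsed time of a split job: the slowest partition plus fixed overhead."""
    return max(partition_sizes) / records_per_minute + overhead_minutes

def balance(bucket_counts, streams):
    """Assign key buckets to streams, largest first, always to the lightest stream."""
    loads = [0] * streams
    assignment = [[] for _ in range(streams)]
    for bucket, count in sorted(bucket_counts.items(),
                                key=lambda kv: kv[1], reverse=True):
        lightest = loads.index(min(loads))
        loads[lightest] += count
        assignment[lightest].append(bucket)
    return loads, assignment

print("Even 4-way split:", split_elapsed([5_000_000] * 4, RATE), "min")

# HYPOTHETICAL record counts per leading key digit; not from the exercise.
buckets = {"0": 6_000_000, "1": 4_000_000, "2": 3_000_000,
           "3": 3_000_000, "4": 2_000_000, "5": 2_000_000}
loads, assignment = balance(buckets, streams=4)
print("Balanced loads:", loads)
print("Balanced split:", split_elapsed(loads, RATE), "min")
```

The largest-first heuristic cannot do better than the biggest single bucket, which is why finer-grained buckets (or counts taken from a pre-scan of the input) tend to give a more even split.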
Section 23.6 — Window Compression
Exercise 19: Compression Strategy Selection
Your critical path is 380 minutes and your window is 375 minutes. You're 5 minutes over. For each strategy, estimate the savings and recommend a plan:
a) Dependency cleanup (you suspect 10-15% unnecessary dependencies) b) I/O optimization (critical-path jobs average BUFNO=5, BLKSIZE=4000) c) Job splitting (largest critical-path job: 65 minutes, 15M records) d) zIIP offload (critical path has 120 minutes of DB2 processing)
Rank these by ROI and recommend which to implement first, second, etc.
Exercise 20: Window Extension Decision
The CIO asks you to evaluate extending the batch window by 1 hour (closing online at 10:00 PM instead of 11:00 PM). Prepare a decision brief covering:
a) Technical impact of the extension b) Business impact (affected users, transaction volume during 10-11 PM) c) Alternative technical solutions that don't require extending the window d) If the extension is approved, how long will it remain sufficient given 2.5% monthly growth? e) Your recommendation with justification
Section 23.7 — Failure and Recovery
Exercise 21: Checkpoint Design
Design a checkpoint/restart mechanism for a COBOL program that:
- Reads a 15-million-record sequential input file
- For each record, performs a DB2 lookup and a DB2 update
- Writes accepted records to output file A and rejected records to output file B
- Maintains running totals (total amount processed, error count, record count)
a) What data must be saved in each checkpoint? b) How often should checkpoints be taken? Justify your answer. c) Write the COBOL data division entries for the checkpoint record. d) Write pseudocode for the restart logic. e) What happens to the output files on restart? How do you avoid duplicate records?
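Before writing the COBOL, it can help to prototype the restart logic in a scripting language. The sketch below models the two output files as in-memory lists and omits DB2 entirely; it assumes the DB2 commit is taken at the same point as the checkpoint, so a rollback returns the database to the checkpointed state. The `Checkpoint` fields, the accept/reject rule (negative amounts are rejected), and the `fail_at` parameter are all hypothetical and exist only to demonstrate the mechanics.

```python
from dataclasses import dataclass

@dataclass
class Checkpoint:
    records_read: int = 0     # input position to resume from
    total_amount: int = 0     # running total at the checkpoint
    error_count: int = 0      # rejected-record count at the checkpoint
    accepted_len: int = 0     # length of output A at the checkpoint
    rejected_len: int = 0     # length of output B at the checkpoint

def run(records, accepted, rejected, ckpt, interval, fail_at=None):
    """Process from the checkpointed position; ckpt is updated in place."""
    # Restart step 1: discard output written after the last checkpoint.
    del accepted[ckpt.accepted_len:]
    del rejected[ckpt.rejected_len:]
    # Restart step 2: restore running totals and reposition the input.
    pos, total, errors = ckpt.records_read, ckpt.total_amount, ckpt.error_count
    for amount in records[pos:]:
        if pos == fail_at:
            raise RuntimeError("simulated abend")
        if amount >= 0:
            accepted.append(amount)
            total += amount
        else:
            rejected.append(amount)
            errors += 1
        pos += 1
        if pos % interval == 0:   # take a checkpoint every `interval` records
            ckpt.records_read, ckpt.total_amount, ckpt.error_count = pos, total, errors
            ckpt.accepted_len, ckpt.rejected_len = len(accepted), len(rejected)
    return pos, total, errors

records = [10, -1, 20, 30, -5, 40, 50]
out_a, out_b, ckpt = [], [], Checkpoint()
try:
    run(records, out_a, out_b, ckpt, interval=3, fail_at=5)
except RuntimeError:
    pass                          # the job abended; ckpt holds the last checkpoint
result = run(records, out_a, out_b, ckpt, interval=3)
print(result, out_a, out_b)
```

Truncating the outputs to their checkpointed lengths is what prevents duplicates; with real sequential files the equivalent step is repositioning or rebuilding the output to the checkpointed record count.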
Exercise 22: Recovery Decision Tree
At 4:15 AM, JOB-GL-POST (general ledger posting) fails with a DB2 -904 (resource unavailable). This job is on the critical path. Window ends at 6:00 AM.
Information available:
- JOB-GL-POST: 30-minute job, failed at minute 12 (40% complete)
- DB2 tablespace needs RECOVER utility (estimated 15 minutes)
- Jobs remaining after GL-POST: REGULATORY (25 min), RECONCILE (10 min)
- Checkpoint was taken at minute 10 of JOB-GL-POST
a) Calculate time remaining in the window. b) Option A: Recover tablespace, restart JOB-GL-POST from checkpoint, run remaining jobs. Calculate total time. c) Option B: Recover tablespace, restart JOB-GL-POST from scratch, run remaining jobs. Calculate total time. d) Option C: Skip GL-POST, run REGULATORY with previous day's GL data, defer GL-POST to midday supplemental. Risks? e) Which option do you recommend and why?
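The timing side of the decision can be tabulated quickly. The sketch below assumes that recovery starts immediately at 4:15 AM, that REGULATORY and RECONCILE run one after the other, and that a restart from the minute-10 checkpoint leaves 20 minutes of GL-POST work. These are simplifying assumptions, not facts from the scenario, and the sketch does not address the risk questions in parts d) and e).

```python
# Figures transcribed from Exercise 22 (all times in minutes).
WINDOW_END = 6 * 60               # 6:00 AM
FAILURE_TIME = 4 * 60 + 15        # 4:15 AM
RECOVER = 15                      # RECOVER utility estimate
GL_POST_FULL = 30
CHECKPOINT_AT = 10
DOWNSTREAM = {"REGULATORY": 25, "RECONCILE": 10}

def option_total(gl_post_minutes):
    """Recovery + GL-POST rerun + downstream jobs run serially (ASSUMED)."""
    return RECOVER + gl_post_minutes + sum(DOWNSTREAM.values())

remaining = WINDOW_END - FAILURE_TIME
options = {
    "A (restart from checkpoint)": option_total(GL_POST_FULL - CHECKPOINT_AT),
    "B (restart from scratch)": option_total(GL_POST_FULL),
}
print(f"Time remaining: {remaining} min")
for name, total in options.items():
    print(f"Option {name}: {total} min, leaves {remaining - total} min")
```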
Exercise 23: Failure Mode Analysis
For each failure mode, describe the root cause, immediate impact on the batch window, and recovery strategy:
a) S0C7 abend in a critical-path COBOL program at record 8 million of 12 million b) DB2 -911 (deadlock) occurring repeatedly in a batch program c) B37 abend (out of space) on a critical output dataset d) IEC036I — tape mount timeout for an input tape that can't be found e) Job waiting in the input queue for 45 minutes because all initiators in its class are busy
Exercise 24: Batch Monitoring Design
Design a batch window monitoring system for a shop with 500 jobs in the nightly batch window. Include:
a) How you would define milestones (what criteria determine a milestone?) b) Expected-time calculation methodology (how do you set thresholds?) c) Alert escalation levels (who gets notified when, and how?) d) Dashboard design (what data elements, what visualization?) e) Trend analysis approach (what historical data do you retain, how do you analyze trends?)
Exercise 25: Full Window Engineering
You are the batch architect for a new credit union with the following requirements:
- 500,000 member accounts
- 2 million transactions per day
- 8-hour batch window (10 PM – 6 AM)
- Single LPAR, 4 CPs, 2 zIIPs
- DB2 with 8 batch threads
- 12 batch initiators
Required processing:
1. Transaction extraction (15 min)
2. Fraud scan (25 min)
3. Transaction validation (20 min)
4. Account posting (35 min)
5. Interest calculation (monthly, 45 min)
6. GL posting (20 min)
7. ACH file generation (10 min)
8. Statement generation (daily cycle, 30 min)
9. Regulatory daily report (15 min)
10. Backup (30 min)
a) Design the complete DAG with dependencies. b) Identify the critical path. c) Calculate parallel stream elapsed time. d) Calculate margin and assess capacity for 3% monthly growth over 2 years. e) Design the recovery strategy for the top 3 failure points. f) Write JCL for the job stream (at least the skeleton with COND and dependency comments).