Chapter 4 Exercises: Dataset Management Deep Dive
Part A: Catalog Structure and Resolution (Exercises 1–5)
Exercise 1 — Catalog Trace
Given the following catalog structure:
MASTER CATALOG: SYS1.MCAT.V01
Alias APP → UCAT.APP.PROD
Alias APP.TEST → UCAT.APP.TEST
Alias SYS1 → (master catalog entries)
Trace the catalog search for each of the following dataset references. For each, identify which catalog is searched and whether the dataset will be found (assuming it exists in the correct catalog):
a) APP.PROD.ACCTS.MASTER
b) APP.TEST.CLAIMS.DAILY
c) APP.DEV.UTIL.COPYLIB
d) SYS1.PARMLIB
e) TEMP.WORK.FILE01
Exercise 2 — Catalog Design
Federal Benefits Administration has these application areas: Benefits Processing (BEN), Enrollment (ENR), Reporting (RPT), and Archive (ARC). They have three environments: PROD, UAT, DEV. Currently they use a single user catalog for everything.
Design a user catalog and alias structure that:
- Separates production from non-production
- Isolates the archive datasets (which number in the millions)
- Minimizes catalog contention during the nightly batch window
Justify each design decision.
Exercise 3 — Catalog Recovery Scenario
The user catalog UCAT.CNB.BATCH becomes corrupted at 10 PM, just before the nightly batch window begins. All datasets cataloged under the alias CNB.BATCH are inaccessible.
a) What immediate actions should Rob Calloway take?
b) What information does the VVDS on each volume contain that could help recovery?
c) How would you rebuild the user catalog? Outline the IDCAMS commands.
d) What preventive measures would you recommend to avoid this scenario?
Exercise 4 — Multi-Level Alias Analysis
A shop has these aliases defined:
Alias PAYROLL → UCAT.PAY
Alias PAYROLL.HR → UCAT.HR
For each dataset, determine which user catalog will be searched:
a) PAYROLL.MONTHLY.CHECK
b) PAYROLL.HR.EMPLOYEE.MASTER
c) PAYROLL.HR.BENEFITS.DENTAL
d) PAYROLL.TAX.QUARTERLY
Explain the alias matching algorithm that produces each result.
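The matching behaviour that Exercise 4 asks about can be modelled as a longest-match lookup on the leading qualifiers of the dataset name. The sketch below is a minimal Python model for checking your reasoning, not catalog code: `resolve_catalog`, the `alias_level` parameter, and the `MASTER CATALOG` fallback label are illustrative names, and it assumes the multilevel alias facility is configured to search at least two qualifiers.

```python
def resolve_catalog(dsn, aliases, alias_level=2, master="MASTER CATALOG"):
    """Return the catalog searched for dsn, preferring the longest matching alias.

    aliases     -- dict mapping alias name to user catalog name
    alias_level -- assumed maximum number of leading qualifiers to compare
    master      -- label returned when no alias matches (hypothetical)
    """
    qualifiers = dsn.split(".")
    # Try the longest candidate first, then fall back to shorter ones.
    for depth in range(min(alias_level, len(qualifiers)), 0, -1):
        candidate = ".".join(qualifiers[:depth])
        if candidate in aliases:
            return aliases[candidate]
    return master


aliases = {"PAYROLL": "UCAT.PAY", "PAYROLL.HR": "UCAT.HR"}
for name in ("PAYROLL.MONTHLY.CHECK", "PAYROLL.HR.EMPLOYEE.MASTER"):
    print(name, "->", resolve_catalog(name, aliases))
```

Rerunning the lookup with `alias_level=1` shows how the results change when only the high-level qualifier is compared.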
Exercise 5 — LISTCAT Interpretation
Given this (abbreviated) LISTCAT ALL output for a VSAM KSDS:
CLUSTER --- CNB.PROD.VSAM.ACCTMAST
IN-CAT --- UCAT.CNB.PROD
HISTORY
DATASET-OWNER-----(NULL) CREATION--------2025.120
RELEASE----------------2 EXPIRATION------0000.000
ASSOCIATIONS
DATA---CNB.PROD.VSAM.ACCTMAST.DATA
INDEX--CNB.PROD.VSAM.ACCTMAST.INDEX
SMS CLASSES
STORAGECLASS---SCPRODONL MANAGEMENTCLASS---MCPRODCR
DATACLASS------DCKSDSHI
DATA --- CNB.PROD.VSAM.ACCTMAST.DATA
STATISTICS
REC-TOTAL--------12450382 SPLITS-CI--------847291
SPLITS-CA------------3847 EXTENTS----------- 18
FREESPACE-CI---------- 2 FREESPACE-CA---------- 0
CI-SIZE----------- 4096 CA-SIZE--------- 786432
a) How many records are in this dataset?
b) What is the CI split count, and is it concerning?
c) What does the extent count of 18 tell you?
d) What is the current free space percentage at the CI and CA levels?
e) What specific actions would you recommend based on these statistics?
Part B: SMS Class Design (Exercises 6–10)
Exercise 6 — Storage Class Selection
For each of the following datasets, recommend the appropriate storage class from CNB's defined classes (SCPRODHI, SCPRODONL, SCBATCH, SCTEST, SCARCHIVE). Justify each choice:
a) CICS transaction log ESDS
b) Monthly regulatory compliance report
c) Daily batch extract GDG generations
d) Online customer master KSDS
e) Developer's unit test dataset
f) 7-year SAR (Suspicious Activity Report) archive
Exercise 7 — Management Class Design
Pinnacle Health Insurance needs management classes for:
- Claims processing VSAM (updated daily, critical for operations)
- Provider directory (updated weekly, read-heavy)
- Explanation of Benefits print files (read once, kept 90 days)
- HIPAA audit logs (never deleted, regulatory requirement)
Design a management class for each. Specify: backup frequency, number of backup versions, migration eligibility, and retention policy. Explain your rationale.
Exercise 8 — ACS Routine Logic
Write pseudo-code for an ACS routine (storage class assignment) that implements the following policy:
- Any dataset starting with PROD.CICS gets SCPRODHI
- Any dataset starting with PROD.DB2 gets SCPRODHI
- Any dataset starting with PROD.BATCH.GDG gets SCBATCH
- Any dataset starting with PROD.VSAM gets SCPRODONL
- Any dataset starting with TEST gets SCTEST
- Any dataset starting with ARCHIVE gets SCARCHIVE
- Everything else gets SCSTD
Test your logic against these dataset names:
a) PROD.CICS.TRANLOG.JOURNAL
b) PROD.VSAM.CUSTOMER.MASTER
c) PROD.BATCH.GDG.DAILYEXT
d) TEST.UNIT.PROGRAM.DATA
e) ARCHIVE.2024.REGULATORY.SAR
f) PROD.COPYLIB.PROGRAMS
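Before translating the policy into ACS-style pseudo-code, it can help to model it as an ordered rule table and run the test names through it. The following is a small Python sketch under stated assumptions, not ACS syntax: `RULES`, `DEFAULT_CLASS`, and `assign_storage_class` are illustrative names, and "starting with" is interpreted as matching whole leading qualifiers (so `TESTING.X` would not match `TEST`).

```python
# Ordered (prefix, storage class) pairs taken from the policy in Exercise 8.
RULES = [
    ("PROD.CICS", "SCPRODHI"),
    ("PROD.DB2", "SCPRODHI"),
    ("PROD.BATCH.GDG", "SCBATCH"),
    ("PROD.VSAM", "SCPRODONL"),
    ("TEST", "SCTEST"),
    ("ARCHIVE", "SCARCHIVE"),
]
DEFAULT_CLASS = "SCSTD"


def assign_storage_class(dsn):
    """Return the storage class of the first rule whose prefix matches dsn."""
    for prefix, storage_class in RULES:
        # Assumption: a prefix matches only on qualifier boundaries.
        if dsn == prefix or dsn.startswith(prefix + "."):
            return storage_class
    return DEFAULT_CLASS


for name in ("PROD.CICS.TRANLOG.JOURNAL", "PROD.COPYLIB.PROGRAMS"):
    print(name, "->", assign_storage_class(name))
```

Because none of the listed prefixes overlap, rule order does not affect the result here; it would matter as soon as a more general prefix such as `PROD` were added.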
Exercise 9 — Data Class Impact Analysis
A COBOL program uses this SELECT statement:
SELECT OUTPUT-FILE
    ASSIGN TO OUTFILE
    ORGANIZATION IS SEQUENTIAL
    FILE STATUS IS WS-OUT-STATUS.
And this JCL:
//OUTFILE DD DSN=CNB.BATCH.DAILY.REPORT,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(TRK,(100,20))
The ACS routine assigns data class DCFB80 (RECFM=FB, LRECL=80, BLKSIZE=27920, SPACE=(10,10) CYL).
a) What RECFM, LRECL, BLKSIZE will the dataset actually get?
b) What SPACE will be allocated?
c) If the COBOL program writes records longer than 80 bytes, what happens?
d) How would you fix this JCL to ensure correct allocation?
Exercise 10 — SMS Class Interaction
Explain the interaction between SMS classes in each scenario:
a) Storage class SCPRODHI specifies "continuous availability," but the storage group only has volumes on a single storage subsystem. What's the operational impact?
b) Management class MCPRODCR specifies daily backup, but the dataset's storage class places it on volumes that are excluded from the backup job's scope. What happens?
c) Data class DCKSDSHI specifies CI=8192, but the IDCAMS DEFINE explicitly codes CI=4096. Which wins?
Part C: Dataset Allocation and Performance (Exercises 11–16)
Exercise 11 — BLKSIZE Optimization
Calculate the optimal block size for each of the following file specifications on a 3390 device (track capacity = 56,664 bytes):
a) RECFM=FB, LRECL=100
b) RECFM=FB, LRECL=250
c) RECFM=FB, LRECL=1000
d) RECFM=VB, LRECL=500
For each, show: blocks per track, records per track, and track utilization percentage. Compare with the "naive" BLKSIZE (= LRECL for FB) to quantify the I/O cost of unblocked records.
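For the fixed-block cases, the usual approach is half-track blocking: pick the largest multiple of LRECL that still lets two blocks fit on a track. The Python sketch below illustrates that arithmetic only. It assumes the commonly cited 3390 half-track limit of 27,998 bytes, and the function and key names are illustrative. It does not model the per-block device overhead needed to work out blocks per track for the unblocked (BLKSIZE = LRECL) comparison, nor the BDW/RDW overhead of VB records.

```python
TRACK_CAPACITY = 56_664   # bytes per 3390 track, as given in the exercise
HALF_TRACK_MAX = 27_998   # assumed largest block size that fits twice per track


def half_track_blksize_fb(lrecl):
    """Largest multiple of lrecl that does not exceed the half-track limit."""
    if not 0 < lrecl <= HALF_TRACK_MAX:
        raise ValueError("LRECL must be between 1 and the half-track limit")
    return (HALF_TRACK_MAX // lrecl) * lrecl


def fb_track_profile(lrecl):
    """Summarize half-track blocking for an FB dataset with this LRECL."""
    blksize = half_track_blksize_fb(lrecl)
    blocks_per_track = 2  # follows from choosing a half-track block size
    records_per_track = blocks_per_track * (blksize // lrecl)
    utilization = 100.0 * blocks_per_track * blksize / TRACK_CAPACITY
    return {
        "blksize": blksize,
        "blocks_per_track": blocks_per_track,
        "records_per_track": records_per_track,
        "utilization_pct": round(utilization, 1),
    }


for lrecl in (100, 250, 1000):
    print(lrecl, fb_track_profile(lrecl))
```

As a cross-check, LRECL=200 yields BLKSIZE=27800, which matches the value used in Exercise 12.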
Exercise 12 — SPACE Calculation
A batch job creates a sequential output file with these characteristics:
- 25 million records
- RECFM=FB, LRECL=200, BLKSIZE=27800 (139 records/block)
- Device: 3390
Calculate:
a) Total number of blocks needed
b) Blocks per track (for 3390 with this BLKSIZE)
c) Total tracks needed
d) Total cylinders needed (15 tracks/cylinder)
e) Recommended SPACE parameter (primary and secondary)
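The block, track, and cylinder counts are a chain of ceiling divisions, which is easy to get wrong by rounding too early. The sketch below shows the chain in Python; `space_estimate` is an illustrative name, and the default of two blocks per track is an assumption that holds only if BLKSIZE=27800 is treated as a half-track block on a 3390.

```python
def ceil_div(numerator, denominator):
    """Integer ceiling division without floating-point rounding."""
    return -(-numerator // denominator)


def space_estimate(records, records_per_block, blocks_per_track=2, tracks_per_cyl=15):
    """Return (blocks, tracks, cylinders) needed, rounding up at every step."""
    blocks = ceil_div(records, records_per_block)
    tracks = ceil_div(blocks, blocks_per_track)
    cylinders = ceil_div(tracks, tracks_per_cyl)
    return blocks, tracks, cylinders


blocks, tracks, cylinders = space_estimate(25_000_000, 139)
print(f"blocks={blocks} tracks={tracks} cylinders={cylinders}")
```

The cylinder figure is a raw minimum; the primary/secondary split asked for in part (e) still needs judgment about growth headroom and extent limits.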
Exercise 13 — VSAM CI Sizing
Pinnacle Health's new claims validation KSDS has:
- Key: 30 bytes (claim ID)
- Average record size: 1,800 bytes
- Maximum record size: 4,500 bytes
- Expected record count: 200 million
- Access pattern: 80% random reads (online), 20% sequential scans (batch)
- Insert rate: 2 million new claims per day
Design the VSAM configuration:
a) Recommend CI size for data component (justify with calculations)
b) Recommend CI size for index component
c) Recommend free-space percentages (CI and CA)
d) Calculate approximate DASD space requirement
e) Recommend buffer settings for online (CICS) and batch access
Exercise 14 — Extent Analysis
A VSAM KSDS was defined with CYLINDERS(100,20). After 6 months of production, LISTCAT shows 23 extents.
a) What is the maximum number of extents for a VSAM dataset?
b) What happened to cause 23 extents?
c) What is the performance impact of 23 extents for sequential processing?
d) What is the performance impact for random access?
e) Design a remediation plan.
Exercise 15 — Multi-Volume Design
Federal Benefits Administration needs to create a dataset for their 15-million-line consolidated benefits history. Each record is 2,500 bytes (RECFM=FB). The data must be accessible for random reads via a secondary index.
a) Calculate total dataset size
b) How many 3390-54 volumes (approximately 50 GB each) are needed?
c) Would you use a single multi-volume sequential file, a multi-volume VSAM KSDS, or another approach? Justify.
d) Design the IDCAMS DEFINE for your chosen approach.
Exercise 16 — Performance Comparison
Two COBOL programs read the same 5-million-record VSAM KSDS sequentially. Program A uses no AMP overrides (VSAM defaults). Program B specifies AMP=('BUFND=30,BUFNI=10').
a) How many data buffers does Program A get by default?
b) How does increasing BUFND to 30 improve sequential read performance?
c) Calculate the approximate virtual storage consumed by Program B's buffers if CI size is 4096.
d) At what point do additional data buffers stop helping sequential access?
Part D: GDG and Lifecycle Management (Exercises 17–21)
Exercise 17 — GDG Design SecureFirst Retail Bank processes mobile banking transactions in real-time and batches them into daily extracts for reconciliation. They need:
- Daily extracts: kept for 14 days
- Weekly rollups: kept for 8 weeks
- Monthly summaries: kept for 24 months
- Annual regulatory archives: kept for 10 years
Design the GDG base definitions (including LIMIT, EMPTY/NOEMPTY, SCRATCH/NOSCRATCH) for each level. Write the IDCAMS DEFINE commands.
Exercise 18 — GDG Relative Reference
Given a job with three steps that use the same GDG base BANK.DAILY.TXNS:
//STEP1 EXEC PGM=EXTRACT
//OUTPUT DD DSN=BANK.DAILY.TXNS(+1),DISP=(NEW,CATLG,DELETE),...
//STEP2 EXEC PGM=VALIDATE
//INPUT DD DSN=BANK.DAILY.TXNS(+1),DISP=SHR
//PRIOR DD DSN=BANK.DAILY.TXNS(0),DISP=SHR
//STEP3 EXEC PGM=REPORT
//CURRENT DD DSN=BANK.DAILY.TXNS(+1),DISP=SHR
//PREV1 DD DSN=BANK.DAILY.TXNS(0),DISP=SHR
//PREV2 DD DSN=BANK.DAILY.TXNS(-1),DISP=SHR
If the current latest generation before job submission is G0045V00:
a) What absolute generation does (+1) resolve to?
b) What absolute generation does (0) resolve to in STEP2?
c) What absolute generation does (-1) resolve to in STEP3?
d) If STEP1 abends, what happens to generation (+1)?
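Relative generation numbers are resolved against the generation that was current when the job started, and that mapping is held constant for every step of the job. The Python sketch below models only that arithmetic; `resolve_generation` is an illustrative name, and it assumes every generation carries version V00 and that no wrap past G9999 occurs.

```python
import re


def resolve_generation(latest_at_job_start, relative):
    """Map a relative generation number to a GnnnnV00 name.

    latest_at_job_start -- absolute name of generation (0) when the job began
    relative            -- the relative number coded in the DD, e.g. +1, 0, -1
    """
    match = re.fullmatch(r"G(\d{4})V\d{2}", latest_at_job_start)
    if match is None:
        raise ValueError("expected a name of the form GnnnnVnn")
    number = int(match.group(1)) + relative
    if not 1 <= number <= 9999:
        raise ValueError("generation number out of range for this sketch")
    # Assumption: all generations in this model are version V00.
    return f"G{number:04d}V00"


for rel in (+1, 0, -1):
    print(f"({rel:+d}) ->", resolve_generation("G0045V00", rel))
```

The key point for STEP2 and STEP3 is that the same `latest_at_job_start` value is used for every step, even after STEP1 has cataloged its new generation.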
Exercise 19 — GDG Recovery Scenario
During the nightly batch window, the job that creates CNB.BATCH.DAILY.TXNS(+1) abends after writing 80% of the records. The dataset was created with DISP=(NEW,CATLG,DELETE).
a) What happens to the partial dataset?
b) If the abend disposition is DELETE, is the dataset uncataloged?
c) If you restart the job from the beginning, will the (+1) reference still work correctly?
d) What if the dataset was created with DISP=(NEW,CATLG,CATLG) instead? How does recovery differ?
Exercise 20 — HSM Migration Impact
A batch job references three datasets:
- Dataset A: on primary DASD (never migrated)
- Dataset B: migrated to ML1 (DASD migration pool)
- Dataset C: migrated to ML2 (tape/VTS)
a) What happens when the job tries to allocate Dataset B?
b) What happens when it tries to allocate Dataset C?
c) Estimate the time impact of auto-recall for B and C.
d) Design a pre-batch IDCAMS job step to proactively recall both.
e) How would you prevent critical production datasets from being migrated in the first place?
Exercise 21 — Capacity Planning
CNB's nightly batch creates 6,000 GDG generations and deletes 6,000. Each generation averages 500 CYL. The shop has 200 3390-54 volumes in the batch storage group.
a) What is the daily gross space allocation (in GB, approximately)?
b) What is the net space change if the new generations are roughly the same size as those being deleted?
c) If the batch data grows at 5% per month, how many additional volumes will they need per year?
d) What SMS management class settings could help control space consumption?
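Two pieces of arithmetic underlie parts (a) and (c): converting cylinders to bytes and compounding a monthly growth rate. The Python sketch below shows both under stated assumptions: it uses the raw 3390 track capacity of 56,664 bytes from Exercise 11 and 15 tracks per cylinder, so it overstates usable space slightly, and the function names are illustrative. It deliberately stops short of a volume count, since that depends on how full the existing volumes are, which the exercise does not state.

```python
BYTES_PER_TRACK = 56_664                 # raw 3390 track capacity (Exercise 11)
TRACKS_PER_CYL = 15
BYTES_PER_CYL = BYTES_PER_TRACK * TRACKS_PER_CYL


def gross_daily_bytes(generations, cyl_per_generation):
    """Raw bytes allocated per day for newly created generations."""
    return generations * cyl_per_generation * BYTES_PER_CYL


def compound_growth(monthly_rate, months):
    """Growth multiplier after compounding monthly_rate for the given months."""
    return (1 + monthly_rate) ** months


gross = gross_daily_bytes(6_000, 500)
print(f"gross daily allocation: about {gross / 1e9:,.0f} GB")
print(f"12-month growth factor at 5%/month: {compound_growth(0.05, 12):.3f}")
```

Note that 5% per month compounds to noticeably more than 60% per year, which is the usual trap in part (c).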
Part E: Applied Scenarios (Exercises 22–25)
Exercise 22 — Complete SMS Design
You're the architect for a new insurance claims processing system. The system has:
- 30 million active claims in a VSAM KSDS
- 150,000 new claims per day
- Monthly batch reporting that reads all 30M records
- Regulatory requirement: retain all claims for 7 years
- Three environments: PROD, UAT, DEV
- Two LPARs sharing PROD datasets
Design the complete SMS configuration:
a) Dataset naming convention
b) Catalog and alias structure
c) Storage classes (at least 3)
d) Management classes (at least 3)
e) Data classes (at least 3)
f) ACS routine logic (pseudo-code) for storage class assignment
Exercise 23 — Performance Troubleshooting
A COBOL batch program that processes CNB's daily transaction file has gradually slowed from 45 minutes to 3 hours over the past 6 months. The program reads a VSAM KSDS sequentially, performs lookups against a second KSDS, and writes to a sequential output file.
No code changes have been made. The transaction volume has increased by 10%.
Develop a diagnostic checklist. For each item, explain:
a) What to check
b) How to check it (specific z/OS commands or reports)
c) What the finding would mean
d) What the fix would be
Exercise 24 — Migration Architecture
Sandra Chen at Federal Benefits Administration needs to modernize their dataset architecture. Currently:
- 15,000 datasets with no consistent naming convention
- Datasets scattered across 50 user catalogs with overlapping aliases
- Many datasets have JOBCAT/STEPCAT references in JCL
- Some GDGs have LIMIT(255) with no cleanup
- Block sizes range from 80 to 32760 with no standardization
Design a migration plan that transforms this into a modern SMS-managed environment. Address:
a) Naming convention design
b) Catalog consolidation strategy
c) JCL remediation approach
d) GDG cleanup procedure
e) Block size standardization
f) Risk mitigation and rollback plan
Exercise 25 — HA Dataset Design
Design the dataset architecture for a high-availability banking system that must:
- Process 100 million transactions per day
- Maintain 99.99% availability for online access
- Complete the nightly batch window in 4 hours
- Support failover to a secondary LPAR within 30 seconds
- Retain 7 years of transaction history for regulatory compliance
Produce a complete design document covering:
a) VSAM configurations for the transaction and customer master files
b) GDG strategy for batch processing
c) Multi-volume and striping strategy for the history file
d) Cross-LPAR sharing approach (RLS vs. DPL vs. other)
e) Backup and recovery strategy
f) Capacity growth projections for 3 years
Part M: Spaced Review (Referencing Chapters 1–3)
Exercise M1 — JES and Dataset Allocation (Ch 1)
Explain the sequence of events from the time JES reads your JCL DD statement to the time your COBOL program's OPEN statement completes. Reference specific z/OS components (JES2/JES3, initiator, catalog services, DFSMS, access methods).
Exercise M2 — Virtual Storage and Buffers (Ch 2)
A CICS region manages an LSR buffer pool with 500 data buffers (CI size 4096) and 100 index buffers (CI size 2048) for 150 VSAM files.
a) Calculate the total virtual storage consumed by the buffer pool.
b) Where in the address space does this buffer pool reside?
c) If the CICS region is constrained on virtual storage below the bar, what are your options?
d) How does z/OS 64-bit buffer support change the equation?
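The buffer storage in part (a) is the sum of buffer count times CI size across the subpools. The sketch below is a minimal Python version of that sum; `buffer_pool_bytes` is an illustrative name, and the result covers buffer space only, ignoring control blocks and any other per-pool overhead.

```python
def buffer_pool_bytes(subpools):
    """Total buffer bytes for an iterable of (buffer_count, ci_size) pairs."""
    return sum(count * ci_size for count, ci_size in subpools)


# Data and index subpools as described in Exercise M2.
total = buffer_pool_bytes([(500, 4096), (100, 2048)])
print(f"{total:,} bytes (about {total / 1024 / 1024:.2f} MiB)")
```

The same function applies to Exercise 16(c) by passing the BUFND and BUFNI counts with their respective CI sizes.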
Exercise M3 — JCL and SMS Interaction (Ch 1 + Ch 4)
A COBOL program runs in a job that specifies:
//STEP1 EXEC PGM=MYPROG
//INPUT DD DSN=CNB.PROD.VSAM.ACCTMAST,DISP=SHR
//OUTPUT DD DSN=CNB.BATCH.DAILY.REPORT(+1),
// DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(50,10),RLSE)
Trace the complete allocation sequence for both DD statements, identifying:
a) Which catalog is searched for each
b) What SMS classes are assigned to the new GDG generation
c) How the SPACE parameter interacts with the data class
d) At what point the COBOL program can begin issuing I/O
Exercise M4 — Cross-Chapter Integration
Kwame Mensah says: "Dataset management, virtual storage, and workload management are three legs of a stool. Pull one out and the system falls over."
Using knowledge from Chapters 1–4, explain three specific scenarios where a dataset management decision (Chapter 4) directly impacts virtual storage utilization (Chapter 2). For each scenario, identify the decision, the impact, and the mitigation.
Exercise M5 — Architecture Decision Record
Write an Architecture Decision Record (ADR) for the following decision: "Should CNB migrate their account master file from a traditional VSAM KSDS to DB2?"
Include:
- Context (current state from Chapters 1–4)
- Decision drivers (performance, availability, skills, modernization)
- Considered options (at least 3, including hybrid approaches)
- Decision outcome and consequences
- Risks and mitigations
Reference specific technical details from this chapter (catalog structure, SMS classes, VSAM performance characteristics) in your analysis.