Case Study 2 — Pinnacle Health's Automated Compliance Reporting


Background

Pinnacle Health Systems operates a 1,200-bed hospital network across four campuses, processing health insurance claims, patient records, and regulatory submissions on a z/OS mainframe. Their compliance obligations are extensive:

  • HIPAA — Health Insurance Portability and Accountability Act: privacy and security of protected health information (PHI)
  • CMS Reporting — Centers for Medicare & Medicaid Services: quality measures, cost reports, utilization data
  • State Medicaid — Reporting to three state Medicaid programs
  • Joint Commission — Accreditation compliance data
  • Internal audit — Monthly security reviews, quarterly access audits, annual penetration testing

Diane Kowalski, VP of Information Technology, and Ahmad Rashid, Lead z/OS Systems Programmer, were drowning. Not in data — they had plenty of that — but in the manual labor required to extract, validate, format, and submit compliance reports. Every month, Ahmad's team spent approximately 320 person-hours on compliance reporting. Eighty percent of that time was manual: running queries, reformatting output, cross-referencing datasets, checking for anomalies, and assembling final reports.

The breaking point came during a CMS audit in Q3 2022. The auditor asked, "Can you show me the complete chain of custody for the data in this report — from source system extraction through every transformation to the final submission?" Ahmad could show individual steps but couldn't demonstrate an unbroken, automated, auditable chain. The auditor noted this as a "significant finding."


The Problem in Detail

Ahmad mapped the monthly compliance reporting process and found a tangled web of manual steps:

HIPAA Security Report (Monthly)

  1. Extract RACF access data using IRRDBU00 — manual JCL submission, 45 minutes
  2. Extract SMF type 80 records (RACF events) for the month — manual job, 2 hours
  3. Cross-reference access data against the authorized user list (Excel spreadsheet maintained by Security) — manual comparison, 4 hours
  4. Identify anomalies (unauthorized access attempts, excessive privileges, dormant accounts) — manual review, 6 hours
  5. Format findings into the compliance report template (Word document) — manual, 3 hours
  6. Review cycle with Diane, Security Officer, and Legal — 2 hours
  7. Submit to compliance portal — manual upload, 30 minutes

Total: approximately 18 hours across three people, repeated monthly.

CMS Quality Measures (Quarterly)

  1. Extract clinical data from DB2 — 12 different queries, manual execution, 3 hours
  2. Extract claims data from VSAM files — custom COBOL programs, manual JCL, 2 hours
  3. Cross-reference clinical and claims data — REXX scripts that Ahmad wrote in 2018 and partially documented, 4 hours
  4. Calculate quality measure numerators and denominators — Excel, 8 hours
  5. Validate against CMS specifications — manual comparison against 200-page spec document, 6 hours
  6. Format output as QRDA III XML for CMS submission — semi-automated, 4 hours
  7. Internal review and approval — 3 hours
  8. Electronic submission — manual, 1 hour

Total: approximately 31 hours across four people, repeated quarterly.

The Hidden Risks

Beyond the labor cost, Ahmad identified three critical risks in the manual process:

Inconsistency. The same report produced by different team members sometimes had different results. Not because the data was different, but because the manual steps — particularly the cross-referencing and anomaly detection — involved judgment calls that different people made differently.

Timeliness. Reports were frequently late. Not dramatically late, but consistently 2–3 days behind schedule, eating into the review cycle. The CMS quarterly report had missed its submission deadline once in 2021 (a $50,000 penalty) because the manual process ran long during a particularly complex quarter.

Auditability. The CMS auditor's finding was the wake-up call. When compliance data passes through manual steps — Excel spreadsheets, Word documents, email attachments — the chain of custody is broken. You can't prove that the data in the final report is the same data that was extracted from the source system.


The Automation Design

Ahmad proposed a comprehensive automation framework with three principles:

  1. Source to submission, untouched by human hands. Every data transformation must be automated and logged. Humans review output, not intermediate steps.
  2. Every run is reproducible. Given the same inputs and the same date range, the automation must produce identical output every time.
  3. The audit trail is the report. The automation log itself serves as evidence of compliance — it proves what data was extracted, how it was transformed, and what the results were.

Architecture

Ahmad designed a three-tier automation architecture:

Tier 1: Extraction Layer (REXX + COBOL + JCL PROCs)

Standardized extraction PROCs for each data source:

PINHRACT — RACF data extraction (IRRDBU00 + post-processing)
PINHSMFX — SMF record extraction (IFASMFDP + filtering)
PINHDB2X — DB2 clinical data extraction (COBOL-DB2 programs)
PINHVSMX — VSAM claims data extraction (COBOL programs)

Each PROC included:

  • Pre-extraction validation (source data available, date range correct)
  • Extraction execution
  • Post-extraction record counting and checksumming
  • Audit log entry generation
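The record-counting and checksumming step is what later lets the audit log prove an extract was not altered between extraction and use. The original PROCs are REXX/COBOL and are not shown here, so the following is an illustrative Python sketch with a hypothetical function name:

```python
import hashlib

def post_extraction_check(records):
    """Count records and compute a SHA-256 digest over their bytes.

    The count and digest go into the audit log entry, so any later
    consumer of the extract can verify it is byte-for-byte identical
    to what was extracted.
    """
    count = 0
    digest = hashlib.sha256()
    for rec in records:
        count += 1
        digest.update(rec if isinstance(rec, bytes) else rec.encode("utf-8"))
    return {"records": count, "sha256": digest.hexdigest()}
```

Because the digest is deterministic, re-running the check on the archived extract reproduces the logged value, which is exactly the reproducibility property the design principles call for.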

Tier 2: Transformation Layer (COBOL + REXX)

Standardized transformation programs:

PINHXREF — Cross-reference engine (COBOL program, table-driven)
PINHANOM — Anomaly detection engine (COBOL program, rule-driven)
PINHCALC — Quality measure calculator (COBOL program)
PINHFMT  — Report formatter (REXX exec, template-driven)

The cross-reference engine (PINHXREF) replaced the manual Excel comparison. It read the authorized user list from a DB2 table (maintained by Security through an ISPF application, with change tracking) and compared it against the RACF extract. Every comparison was logged: matched, unmatched-in-source, unmatched-in-reference.
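The three-way classification PINHXREF logs can be sketched as simple set operations. This is not the actual COBOL program, just a minimal Python illustration of the comparison logic, with hypothetical names:

```python
def cross_reference(racf_users, authorized):
    """Classify every user ID the way each comparison is logged:
    matched, present in the RACF extract but not on the authorized
    list (unmatched-in-source), or authorized but absent from the
    RACF extract (unmatched-in-reference)."""
    racf_users, authorized = set(racf_users), set(authorized)
    return {
        "matched": sorted(racf_users & authorized),
        "unmatched_in_source": sorted(racf_users - authorized),
        "unmatched_in_reference": sorted(authorized - racf_users),
    }
```

Logging all three buckets, rather than just mismatches, is what makes every comparison individually auditable.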

The anomaly detection engine (PINHANOM) replaced manual review. It applied 34 rules defined in a DB2 rules table:

Rule ID    Description                                        Severity
HIPAA-001  User with SPECIAL authority not on approved list   Critical
HIPAA-002  User account not accessed in 90+ days              Major
HIPAA-003  Dataset with UACC > NONE containing PHI            Critical
HIPAA-004  More than 3 failed logon attempts in 24 hours      Major
HIPAA-005  Access to PHI dataset outside business hours       Minor
...        (29 additional rules)                              Various

Rules were maintained by the compliance team through an ISPF panel application — no programming required. Adding a new rule was a data entry task, not a development project.
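The table-driven design separates rule definitions (data) from the evaluation engine (code). A minimal Python sketch of that split, using a few of the rules above with hypothetical record fields, might look like this:

```python
# Hypothetical in-memory stand-in for rows read from the DB2 rules table.
RULES = [
    {"id": "HIPAA-001", "severity": "Critical",
     "check": lambda u: u["special"] and not u["approved_special"]},
    {"id": "HIPAA-002", "severity": "Major",
     "check": lambda u: u["days_since_logon"] > 90},
    {"id": "HIPAA-004", "severity": "Major",
     "check": lambda u: u["failed_logons_24h"] > 3},
]

def detect_anomalies(users):
    """Apply every active rule to every user record; each hit becomes
    a finding, and every evaluation could be logged for the audit trail."""
    return [
        {"user": u["id"], "rule": r["id"], "severity": r["severity"]}
        for u in users
        for r in RULES
        if r["check"](u)
    ]
```

Because the engine only iterates over whatever rules exist, adding a rule really is a data change, not a code change — the property the compliance team relied on.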

Tier 3: Assembly and Submission Layer (REXX + OPS/MVS)

The final tier assembled the compliance report from transformation output, generated the formatted submission (PDF for human review, XML/CSV for electronic submission), and managed the approval workflow:

  1. Assembly REXX exec combines all transformation outputs into the final report structure
  2. Report is written to a review dataset and notification sent to approvers
  3. Approvers review via ISPF application and mark "approved" or "rejected with comments"
  4. Upon approval, submission REXX exec transmits to the appropriate portal/system
  5. Confirmation of submission is captured and logged

The Batch Stream

The complete compliance reporting batch stream ran monthly (with additional quarterly and annual runs for specific reports):

PINH-COMPLY-MONTHLY
├── PINHPRE1  — Pre-flight validation
├── PINHRACT  — RACF data extraction
├── PINHSMFX  — SMF data extraction
├── PINHDB2X  — DB2 clinical data extraction (quarterly only)
├── PINHVSMX  — VSAM claims data extraction (quarterly only)
├── PINHXREF  — Cross-reference processing
├── PINHANOM  — Anomaly detection
├── PINHCALC  — Quality measure calculation (quarterly only)
├── PINHFMT   — Report formatting
├── PINHASBL  — Report assembly
├── PINHVAL1  — Output validation
├── PINHNOTF  — Approver notification
└── PINHARCH  — Archive extraction data and audit logs

Self-Healing Elements

Ahmad built self-healing into the compliance batch stream because a late compliance report is a compliance violation:

Pre-flight checks specific to compliance:

  • RACF IRRDBU00 output from the prior day's unload is available
  • SMF data for the full reporting period is on DASD (not migrated)
  • DB2 catalog tables have current statistics
  • Authorized user list was updated within the last 30 days (flag if stale)
  • Previous month's report is archived (for trend comparison)

Recovery for common failures:

  • SMF data partially migrated: automated recall and retry (up to 60-minute wait)
  • DB2 timeout during extraction: retry with larger sort work area allocation
  • Cross-reference mismatch exceeds threshold (>5% unmatched): hold for human review rather than producing a potentially incorrect report
  • Report formatting failure: retry with expanded region size
  • Submission portal unavailable: queue for retry every 30 minutes for up to 24 hours
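The submission-retry pattern (retry every 30 minutes for up to 24 hours, then escalate) is a bounded retry loop. A minimal Python sketch, with hypothetical names and an injectable sleep so the timing is testable:

```python
import time

def submit_with_retry(submit, interval_s=1800, max_wait_s=86400, sleep=time.sleep):
    """Retry a failing submission on a fixed interval until a total
    wait budget is exhausted, then re-raise so a human is escalated.
    Defaults mirror the 30-minute / 24-hour policy described above."""
    waited = 0
    while True:
        try:
            return submit()
        except ConnectionError:
            if waited + interval_s > max_wait_s:
                raise  # retry window exhausted: escalate instead of looping forever
            sleep(interval_s)
            waited += interval_s
```

The key design point is the hard cap: unattended retries must eventually stop and hand the problem to a person, because a silently stuck submission is itself a compliance risk.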

Critical safety rule: If any transformation step produces results that differ from the previous month by more than 20% (configurable threshold), the batch stream halts and escalates. A 25% change in RACF anomalies month-over-month might indicate a real security issue — or it might indicate a data extraction problem. Either way, a human needs to look at it before the report goes out.
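The deviation check itself is a one-line comparison; a hedged Python sketch of the rule (the real implementation is not shown in the case study) makes the halt condition concrete:

```python
def check_deviation(current, previous, threshold=0.20):
    """Return True if the month-over-month change exceeds the
    configurable threshold, in which case the stream halts and
    escalates to a human rather than submitting the report."""
    if previous == 0:
        return current != 0  # any change from a zero baseline is flagged
    return abs(current - previous) / previous > threshold
```

For example, 125 anomalies against last month's 100 is a 25% swing and trips the 20% threshold, while 110 against 100 passes.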


Implementation

Phase 1: HIPAA Security Report (8 weeks)

Ahmad started with the HIPAA monthly report because it was the most frequent and most labor-intensive. Implementation milestones:

Weeks 1–2: Built the extraction PROCs (PINHRACT, PINHSMFX). Ran in parallel with manual extraction for two months to validate accuracy.

Weeks 3–4: Built the cross-reference engine (PINHXREF). The authorized user list migration from Excel to DB2 was the most contentious step — the Security team had maintained that spreadsheet for eight years and didn't want to change their process. Diane intervened, explaining that an Excel spreadsheet with no change tracking was itself a compliance risk.

Weeks 5–6: Built the anomaly detection engine (PINHANOM) with the initial 34 rules. Ahmad worked with the compliance officer to translate their manual review criteria into formal rules. This process revealed that some "criteria" were actually hunches — "I look for things that seem off." Those weren't automatable and became documented exceptions requiring human review.

Weeks 7–8: Built the formatting, assembly, and submission components. End-to-end testing with three months of historical data.

Parallel run: The automated process ran alongside the manual process for two months. Results were compared line-by-line. The automated process found 7 anomalies that the manual process had missed (human reviewers had overlooked them in the volume of data). The manual process found 0 anomalies that the automated process missed.

Phase 2: CMS Quality Measures (12 weeks)

The CMS reporting automation was more complex because it involved clinical data with intricate calculation rules specified by CMS. Ahmad's team:

  • Translated 47 CMS quality measure specifications into COBOL calculation logic
  • Built a regression test suite using two years of historical data with known correct results
  • Implemented the QRDA III XML output format per CMS specification
  • Added threshold checking (the "20% deviation" safety rule) after the first test run produced results that differed from manual calculations — the manual calculations had a formula error that had been producing slightly wrong results for four quarters

Phase 3: Remaining Reports and Self-Healing (8 weeks)

State Medicaid, Joint Commission, and internal audit reports followed the same pattern. Self-healing was added in this phase, after the team had enough production experience to know the common failure modes.


Results

Quantitative Impact

Metric                                     Before               After                  Change
Monthly HIPAA report effort                18 hours             2 hours (review only)  -89%
Quarterly CMS report effort                31 hours             4 hours (review only)  -87%
Total compliance reporting hours/month     320                  48                     -85%
Reports submitted late                     3/year               0                      -100%
Anomalies missed by manual review          7/month (estimated)  0                      -100%
Time from data extraction to report ready  3–5 days             4 hours                ~95% reduction
Audit findings related to data handling    2–3/year             0                      -100%

The Audit Redemption

When the CMS auditor returned in Q1 2023, Ahmad demonstrated the complete automated chain of custody:

  1. Extraction logs showing exactly which data was pulled, from which sources, with record counts and checksums
  2. Transformation logs showing every cross-reference comparison and anomaly rule evaluation
  3. Calculation logs showing the inputs, formulas, and outputs for every quality measure
  4. Assembly logs showing how the final report was constructed
  5. Validation logs showing the automated checks performed on the output
  6. Approval logs showing who reviewed and approved the report, with timestamps
  7. Submission logs showing successful transmission to CMS

The auditor's response: "This is exactly what we're looking for. This is a model implementation."

The "significant finding" from the previous audit was formally closed.

The Unexpected Benefit: Error Detection

The automated anomaly detection found issues the manual process had been missing:

  • 14 dormant user accounts with access to PHI datasets — accounts of former contractors whose access was never revoked
  • 3 datasets with UACC(READ) that contained PHI — a configuration error from a migration two years prior
  • A pattern of after-hours PHI access by a single user that, upon investigation, turned out to be a nurse accessing patient records from home (against policy but not malicious) — a training issue, not a security incident, but one that needed to be documented

None of these were catastrophic. But under HIPAA, any of them could have been a reportable breach if exploited. The automated system caught them; the manual process hadn't.

Cost Analysis

Cost Item                                                     Amount
Ahmad's time (design and implementation, 28 weeks at 60%)     $67,200
COBOL developer (12 weeks, quality measure calculations)      $28,800
OPS/MVS license (incremental — already owned for operations)  $0
DB2 table setup and ISPF panel development                    $8,400
Testing and parallel run                                      $14,000
Total implementation cost                                     $118,400
Annual labor savings (272 hours/month x 12 x $85/hr loaded)   $277,440
Payback period                                                5.1 months

Ongoing Operations

Monthly Cycle

The compliance automation runs on the first business day of each month. The typical cycle:

  • 01:00 AM: Scheduler launches PINH-COMPLY-MONTHLY
  • 01:05 AM: Pre-flight validation completes (RC=0)
  • 01:10–03:30 AM: Extraction, transformation, calculation
  • 03:30–04:00 AM: Formatting and assembly
  • 04:00 AM: Reports staged for review, notification emails sent
  • 08:00–10:00 AM: Compliance officer and Diane review reports via ISPF
  • 10:00 AM: Reports approved
  • 10:05 AM: Automated submission to CMS, state Medicaid portals, internal archive

The entire process, from raw data to submitted reports, takes about 9 hours including human review. The manual process took 3–5 business days.

Rule Maintenance

The compliance team adds 4–6 new anomaly detection rules per year as regulatory requirements evolve. Because rules are data-driven (stored in DB2), adding a rule is a 15-minute task: define the rule criteria in the ISPF application, test with last month's data, and activate. No code changes, no PROC modifications, no JCL updates.

Self-Healing Statistics

In the 18 months since self-healing was activated, the compliance batch stream has:

  • Run 18 times (monthly)
  • Failed 4 times (a 22% failure rate — higher than typical batch because compliance data has more edge cases)
  • Self-healed 3 of those 4 failures (75% automated recovery rate)
  • Required human intervention once (a new CMS data format change that the extraction PROC didn't handle)


Discussion Questions

  1. The anomaly detection engine uses rules stored in a DB2 table. What are the advantages and disadvantages of rule-based anomaly detection versus machine learning approaches for compliance monitoring?

  2. Ahmad's "20% deviation threshold" safety rule halts the batch stream if results differ significantly from the prior month. How would you set this threshold? What are the risks of setting it too high versus too low?

  3. The authorized user list migration from Excel to DB2 encountered organizational resistance. How do you handle situations where automation requires process changes that affected teams don't want to make?

  4. The automated process found anomalies that the manual process missed. Does this mean the previous compliance reports were incorrect? What are the legal and regulatory implications?

  5. Pinnacle Health's compliance automation is entirely z/OS-based. Many healthcare organizations are moving compliance reporting to cloud-based GRC (Governance, Risk, Compliance) platforms. What are the arguments for and against keeping compliance automation on the mainframe?