Chapter 18 Further Reading: Backup, Recovery, and Logging

IBM Official Documentation

DB2 for z/OS

  • DB2 13 for z/OS Administration Guide — Chapter: "Recovering from Failures and Disasters." The definitive reference for z/OS recovery procedures, including detailed RECOVER utility syntax, conditional restart procedures, and recovery scenarios.
  • IBM Documentation: https://www.ibm.com/docs/en/db2-for-zos

  • DB2 13 for z/OS Utility Guide and Reference — Detailed reference for the COPY, RECOVER, QUIESCE, REPORT RECOVERY, REBUILD INDEX, and CHECK DATA utilities. Essential for writing production-grade recovery JCL.
  • IBM Documentation: https://www.ibm.com/docs/en/db2-for-zos

  • DB2 13 for z/OS Installation and Migration Guide — Covers initial log configuration, BSDS setup, and the DSNJU003 (change log inventory) and DSNJU004 (print log map) utilities used to maintain and inspect the BSDS.

  • IBM GDPS Family: An Introduction to Concepts and Capabilities (IBM Redbooks SG24-6374) — Comprehensive guide to GDPS (Geographically Dispersed Parallel Sysplex), IBM's disaster recovery solution for z/OS, including Metro Mirror and Global Mirror configuration with DB2.

DB2 for LUW

  • DB2 11.5 Knowledge Center: Backup and Recovery — Complete reference for BACKUP DATABASE, RESTORE DATABASE, ROLLFORWARD DATABASE, and all related commands.
  • IBM Documentation: https://www.ibm.com/docs/en/db2/11.5

  • DB2 11.5 Knowledge Center: High Availability Disaster Recovery (HADR) — Configuration, monitoring, and failover procedures for HADR.

  • DB2 11.5 Knowledge Center: Database Logging — Detailed explanation of log file configuration parameters, archive logging setup, and log management.

  • DB2 11.5 Knowledge Center: Incremental Backup and Recovery — Specific guidance on incremental and delta backup strategies, including TRACKMOD configuration and automatic incremental restore.

IBM Redbooks

  • DB2 for z/OS: Data Sharing in a Nutshell (SG24-7322) — Covers recovery considerations specific to data sharing environments, including group restart, retained locks, and inter-member recovery.

  • Backup and Recovery for DB2 for LUW (SG24-8249) — A practical guide to designing and implementing backup/recovery strategies on LUW, with worked examples and sizing guidance.

  • DB2 Disaster Recovery Best Practices (REDP-5109) — Cross-platform coverage of disaster recovery architectures, including HADR, GDPS, log shipping, and hybrid approaches.

  • High Availability and Disaster Recovery with DB2 pureScale (SG24-8078) — For environments using pureScale, covers the interaction between pureScale clustering and HADR for maximum availability.

Books

  • Zikopoulos, Baklarz, and Eaton. "DB2 Universal Database V8 Handbook for Windows, UNIX, and Linux." The chapter on backup and recovery provides an approachable introduction to LUW backup concepts. Although the book was written for V8, the fundamental concepts remain valid.

  • Mullins, Craig S. "DB2 Developer's Guide." Comprehensive coverage of DB2 for z/OS, including detailed chapters on logging, recovery, and utility usage. Updated editions cover recent DB2 for z/OS versions.

  • Sloan, Robert and Hernandez, Michael. "DB2 for z/OS and OS/390: Ready for Java." Includes practical recovery scenarios and JCL examples relevant to z/OS DBAs.

  • Chong, Raul F., Liu, Clara, Qi, Sylvia F., and Snow, Dwaine R. "Understanding DB2: Learning Visually with Examples." Provides visual explanations of backup, recovery, and logging concepts on LUW, helpful for visual learners.

White Papers and Technical Articles

  • Mohan, C., Haderle, D., Lindsay, B., Pirahesh, H., and Schwarz, P. "ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging." ACM Transactions on Database Systems, 17(1), 1992. The foundational academic paper on ARIES, the write-ahead-logging recovery algorithm developed at IBM Research, on which the recovery design of DB2 (and of many other database systems) is based.

  • IBM developerWorks: "Best practices for DB2 backup and recovery" — Practical articles covering real-world scenarios, common mistakes, and optimization tips.

  • IBM developerWorks: "Understanding DB2 logging" — Deep dive into log architecture, log buffer management, and log-related performance tuning.

Transaction Processing Theory

  • Gray, Jim and Reuter, Andreas. "Transaction Processing: Concepts and Techniques." The definitive academic text on transaction processing, covering ACID properties, logging, recovery, and concurrency control. Dense but essential for deep understanding of why recovery works the way it does.

Storage and Replication

  • IBM DS8000 Series documentation — For understanding FlashCopy, Metro Mirror, and Global Mirror at the storage level. These technologies underpin the z/OS disaster recovery approaches discussed in this chapter.

  • IBM Spectrum Protect (formerly Tivoli Storage Manager, TSM) documentation — For understanding how archive logs and backup images are managed by enterprise backup infrastructure.

Regulatory Requirements

  • FFIEC IT Examination Handbook: Business Continuity Management — U.S. banking regulators' guidance on business continuity and disaster recovery, including expectations for setting recovery objectives (RPO, RTO) and for testing.

  • Basel Committee on Banking Supervision: Principles for the Sound Management of Operational Risk — International supervisory guidance that drives DR requirements at banks like Meridian National.

Online Resources

  • IBM DB2 Community (https://community.ibm.com/community/user/datamanagement) — Forums where DBAs discuss recovery scenarios, share solutions, and ask questions.

  • IDUG (International DB2 Users Group) (https://www.idug.org) — Annual conferences and regional meetings with sessions on backup/recovery best practices. IDUG proceedings are a goldmine of real-world experience.

  • Planet DB2 (https://planetdb2.com) — Aggregator of DB2 blog posts from practitioners worldwide, frequently covering backup and recovery topics.

Suggested Study Path

  1. Start with the IBM Redbook on Backup and Recovery for your platform (z/OS or LUW)
  2. Read the ARIES paper (Mohan et al., 1992) for the theoretical foundation
  3. Practice every exercise in this chapter on a test system
  4. Study the GDPS or HADR documentation for disaster recovery architecture
  5. Review the FFIEC guidelines if working in financial services
  6. Attend IDUG sessions on backup/recovery when available