Part IV: Administration Fundamentals

This is where you start doing the actual work. Parts I through III gave you the knowledge to understand DB2 and to design databases that make sense. Part IV is where you learn to keep those databases alive, healthy, and secure in production — day after day, month after month, under load, under pressure, and occasionally under fire.

Database administration is not glamorous. Most of what you do as a DBA is invisible to the rest of the organization when things are going well, and it only becomes visible when something breaks. Your backup strategy is irrelevant until you need to recover. Your monitoring is background noise until an alert fires. Your security configuration is an afterthought until the audit. The work in this part is the work of keeping DB2 running so that everyone else can do their jobs, and doing it well requires both systematic discipline and hard technical knowledge.

What This Part Covers

Five chapters. Each one addresses a core competency area that every production DBA must master.

Chapter 17 covers DB2 utilities — the tools that maintain the physical health of your database. REORG, RUNSTATS, COPY, RECOVER, LOAD, UNLOAD, CHECK DATA, CHECK INDEX, CHECK LOB, and REBUILD INDEX on z/OS. REORG, RUNSTATS, BACKUP, RESTORE, LOAD, EXPORT, IMPORT, and db2move on LUW. These are not optional operations. Without regular RUNSTATS, the optimizer makes bad decisions. Without regular REORG, table and index structures degrade. Without regular COPY/BACKUP, you have no recovery capability. We cover each utility's syntax, options, and operational considerations. We cover online versus offline execution. We cover the JCL for z/OS utilities and the CLP commands and stored procedure interfaces for LUW. This chapter is reference-dense by design — you will come back to it repeatedly throughout your career.

Chapter 18 is backup and recovery. This is the chapter that justifies a DBA's salary. We cover the full-image and incremental backup strategies on both platforms, point-in-time recovery, tablespace-level and database-level recovery, and the log-based recovery architecture that makes it all work. On z/OS, we work through the RECOVER utility, the BSDS, log datasets, and the recovery process for individual tablespaces, indexes, and partitions. On LUW, we cover BACKUP DATABASE, RESTORE DATABASE, ROLLFORWARD DATABASE, and the relationship between log archiving and recovery granularity. We also cover the scenarios that keep DBAs awake at night: recovering from media failure, recovering from logical corruption, recovering when the logs themselves are damaged. You will build recovery runbooks for Meridian National Bank that you can adapt for your own environments.

Chapter 19 addresses security — authentication, authorization, and the layered security model DB2 provides. We cover the GRANT and REVOKE system, database-level and object-level privileges, roles, trusted contexts, label-based access control on z/OS, and row and column access control on LUW. We cover audit policies and the DB2 audit facility. We also cover the integration points with external security systems: RACF on z/OS, LDAP and Kerberos on LUW, and the emerging integration patterns with enterprise identity providers. Security is not an afterthought and this chapter does not treat it as one. The regulatory environment that Meridian Bank operates in — with SOX, GLBA, and PCI-DSS compliance requirements — makes database-level security controls mandatory rather than optional.

Chapter 20 is monitoring and diagnostics. How do you know if DB2 is healthy? What do you look at, how often, and what constitutes a problem versus normal operation? We cover the DB2 monitoring interfaces on both platforms: IFCID traces and the statistics, accounting, and performance traces on z/OS; the snapshot monitor, event monitors, and the db2pd tool on LUW. We also cover the newer monitoring table functions and administrative views that IBM has been developing to provide a more unified monitoring interface. We build monitoring dashboards for Meridian Bank that track the key health indicators: buffer pool hit ratios, lock escalations, log consumption rates, tablespace utilization, and long-running queries. You will leave this chapter knowing not just what to monitor but what the numbers mean and what action to take when they go wrong.

Chapter 21 covers the DB2 catalog — the system tables that describe every object in your database. The catalog is simultaneously the most important and least understood component of DB2 for most practitioners. Every DDL statement you execute modifies the catalog. Every query the optimizer compiles reads the catalog. The statistics that RUNSTATS collects are stored in the catalog. Your ability to diagnose problems, understand the optimizer's decisions, and manage your database effectively depends on your ability to query the catalog confidently. We cover SYSIBM tables on z/OS and SYSCAT views on LUW, and we build a library of catalog queries that answer the questions DBAs ask every day: What tables exist in this schema? What indexes are defined on this table? When were statistics last collected? What is the cardinality of this column? How much space is this tablespace consuming?

Why It Matters

Here is the truth about database administration: the technology is learnable, the tools are documented, and the processes are well-established. What separates competent DBAs from the rest is consistency. Running RUNSTATS is not hard. Running RUNSTATS on the right tables at the right frequency with the right options, every single time, without exception, across hundreds of objects in multiple databases — that is what separates an environment that hums along from one that lurches from crisis to crisis.

Backup and recovery is the most important skill in this part, full stop. You can recover from almost any other mistake. A bad REORG can be re-run. A security misconfiguration can be corrected. A missing monitoring alert can be added. But if you cannot recover your data after a failure, nothing else matters. Test your backups. Verify your recovery procedures. Do it regularly. Do it before you need to.

Dual-Platform Coverage

Part IV is where the z/OS and LUW divergence becomes most visible. The utilities have different names, different syntax, different operational models. z/OS utilities run as batch jobs submitted through JCL, managed by the DB2 utility infrastructure, and tracked through the DISPLAY UTILITY command. LUW utilities run as CLP commands, admin stored procedures, or through the IBM Data Server Manager interface.

Every chapter in this part provides parallel coverage. You will see the z/OS JCL and the LUW commands for every operation. The explanations cover both platforms, and where the platforms differ in capability — not just syntax — those differences are called out explicitly. If you work exclusively on one platform, you can focus on that platform's material, but I encourage you to at least read the other platform's coverage. Cross-platform knowledge makes you more valuable and gives you perspective on why certain things work the way they do.

The Meridian Bank Operations Framework

Throughout Part IV, we build an operational framework for Meridian National Bank's DB2 environment. This is not a toy exercise. By the end of this part, you will have:

  • A utility scheduling matrix that specifies REORG and RUNSTATS frequencies for every table category based on volatility and query patterns.
  • A backup and recovery strategy with documented RPO and RTO targets, tested recovery procedures, and escalation protocols.
  • A security model that implements the principle of least privilege across application accounts, DBA accounts, and audit accounts.
  • A monitoring configuration that captures the essential health metrics and alerts on meaningful thresholds rather than arbitrary defaults.
  • A catalog query library that you can adapt to any DB2 environment.

These artifacts are directly usable. Adapt them to your environment and they will serve you well.

How to Approach This Part

Read Chapter 17 first — the utilities chapter — because everything else depends on it. Backup depends on the COPY/BACKUP utility. Monitoring depends on RUNSTATS having been run. The catalog makes more sense once you have seen the utilities that populate it.

Then read Chapters 18 and 19 in order. Backup and recovery is the foundation of operational integrity. Security is the foundation of operational trust.

Chapters 20 and 21 can be read in either order, though I recommend tackling monitoring first because the catalog chapter references monitoring concepts.

Practice in your lab environment from Chapter 4. Run the utilities. Take backups and practice restoring them. Grant and revoke privileges. Query the catalog. This material does not stick through reading alone — you need to build the muscle memory of the commands and the tools.

Chapters in This Part