Chapter 29 Exercises: HADR and Replication
Exercise 29.1: HADR Synchronization Mode Analysis
Objective: Analyze the trade-offs between HADR synchronization modes for different scenarios.
Scenario: Meridian National Bank has three database environments that need HADR protection:
| Database | Purpose | Transaction Rate | Data Sensitivity | Standby Location |
|---|---|---|---|---|
| CORE_BANK | Core banking OLTP | 5,000 TPS | Financial (zero loss required) | Same data center (1 ms latency) |
| WEB_PORTAL | Customer web portal | 2,000 TPS | Session data (some loss acceptable) | Different city (15 ms latency) |
| ANALYTICS | Analytics warehouse | 200 bulk loads/day | Derived data (re-creatable) | Different continent (150 ms latency) |
Tasks:
1. For each database, recommend an HADR synchronization mode. Justify your choice with:
   - RPO analysis (what data loss is acceptable?)
   - Performance impact analysis (what latency will the sync mode add?)
   - Availability behavior (what happens if the standby becomes unreachable?)
2. Calculate the per-transaction commit latency added by each recommended sync mode:
   - SYNC: network_roundtrip + standby_disk_write + ack
   - NEARSYNC: network_roundtrip + ack
   - ASYNC: 0 (no wait)
3. For CORE_BANK with NEARSYNC mode, calculate the probability of data loss. Under what exact conditions would committed data be lost?
4. For WEB_PORTAL with the standby 15 ms away, calculate the throughput impact of using SYNC mode versus ASYNC mode. Assume each transaction performs one commit.
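As a sanity check for tasks 2 and 4, the formulas above can be turned into a quick worked example. Only the 15 ms one-way network latency comes from the scenario; the standby disk write (1.0 ms), ack processing (0.5 ms), and local commit cost (1.0 ms) are illustrative assumptions.

```python
# Per-connection commit latency and serial throughput for WEB_PORTAL.
# Assumed (not from the scenario): standby disk write = 1.0 ms,
# ack processing = 0.5 ms, local commit = 1.0 ms; round trip = 2 x one-way.

ONE_WAY_MS = 15.0              # scenario: standby is 15 ms away
ROUNDTRIP_MS = 2 * ONE_WAY_MS
STANDBY_WRITE_MS = 1.0         # assumed
ACK_MS = 0.5                   # assumed

def commit_latency_ms(mode: str) -> float:
    """Extra commit latency per the formulas in task 2."""
    if mode == "SYNC":
        return ROUNDTRIP_MS + STANDBY_WRITE_MS + ACK_MS
    if mode == "NEARSYNC":
        return ROUNDTRIP_MS + ACK_MS
    if mode == "ASYNC":
        return 0.0
    raise ValueError(mode)

def max_serial_commits_per_sec(added_ms: float, local_commit_ms: float = 1.0) -> float:
    """Upper bound on commits/sec for ONE connection committing serially."""
    return 1000.0 / (local_commit_ms + added_ms)

for mode in ("SYNC", "NEARSYNC", "ASYNC"):
    lat = commit_latency_ms(mode)
    print(f"{mode:9s} +{lat:5.1f} ms -> "
          f"{max_serial_commits_per_sec(lat):7.1f} commits/s per connection")
```

Note that the per-connection ceiling, not total server throughput, is what the sync mode caps: hitting 2,000 TPS under SYNC simply requires proportionally more concurrent connections.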
Exercise 29.2: HADR Configuration and Verification
Objective: Practice configuring HADR from scratch.
Scenario: You have two servers:
- Primary: db2-prod.meridian.bank (IP: 10.10.1.10)
- Standby: db2-stby.meridian.bank (IP: 10.10.1.20)
- Database name: MERIDIAN
- Instance name: db2inst1
- HADR port: 55001
- Sync mode: NEARSYNC
- Peer window: 120 seconds
Tasks:
1. Write the complete sequence of commands to:
   a. Take a backup on the primary
   b. Restore on the standby
   c. Configure HADR parameters on both servers
   d. Start HADR on both servers (correct order!)
   e. Verify HADR is in PEER state
2. Write the command to perform a graceful (planned) takeover to the standby.
3. Write the command to perform a forced takeover when the primary is unreachable.
4. After a forced takeover, the old primary comes back online. Write the commands to:
   a. Reinitialize the old primary as the new standby
   b. Verify the new HADR configuration
5. Write a monitoring query using MON_GET_HADR() that checks:
   - Current HADR role and state
   - Log gap in bytes and time
   - Connection status
   - Primary and standby log positions
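As a starting point for task 5, here is a sketch of the health-check logic that the MON_GET_HADR() output would feed. The row is a hard-coded sample dict; the column names match the monitoring columns the task lists, but the alert thresholds are illustrative assumptions, not product defaults.

```python
# Evaluate one row returned by a query such as:
#   SELECT HADR_ROLE, HADR_STATE, HADR_CONNECT_STATUS, HADR_LOG_GAP,
#          PRIMARY_LOG_POS, STANDBY_LOG_POS
#   FROM TABLE(MON_GET_HADR(NULL))
# The threshold below is an assumption for illustration.

MAX_LOG_GAP_BYTES = 10 * 1024 * 1024  # assumed: alert above a 10 MB gap

def hadr_alerts(row: dict) -> list[str]:
    alerts = []
    if row["HADR_STATE"] != "PEER":
        alerts.append(f"state is {row['HADR_STATE']}, expected PEER")
    if row["HADR_CONNECT_STATUS"] != "CONNECTED":
        alerts.append("primary and standby are not connected")
    if row["HADR_LOG_GAP"] > MAX_LOG_GAP_BYTES:
        alerts.append(f"log gap {row['HADR_LOG_GAP']} bytes exceeds threshold")
    return alerts

sample = {
    "HADR_ROLE": "PRIMARY",
    "HADR_STATE": "PEER",
    "HADR_CONNECT_STATUS": "CONNECTED",
    "HADR_LOG_GAP": 4096,
    "PRIMARY_LOG_POS": 123456789,
    "STANDBY_LOG_POS": 123452693,
}
print(hadr_alerts(sample))  # a healthy PEER row raises no alerts
```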
Exercise 29.3: Automatic Client Reroute Design
Objective: Design an ACR configuration for a multi-tier application.
Scenario: Meridian Bank's online banking application has the following architecture:
Mobile App → API Gateway → Application Server (Java) → DB2 HADR
Web Browser → Web Server → Application Server (Java) → DB2 HADR
Batch Scheduler → Batch Server (Python) → DB2 HADR
The application servers use JDBC Type 4 (JCC) drivers. The batch server uses Python with ibm_db.
Tasks:
1. Write the db2dsdriver.cfg XML configuration that enables:
   - ACR with seamless failover
   - Maximum 3 retry attempts
   - 10-second retry interval
   - 30-second ACR timeout
   - Alternate server pointing to the standby
2. Write the Java JDBC connection URL that enables ACR:
   jdbc:db2://primary:50000/MERIDIAN:...
3. For the Java application, write the exception handling code that detects an ACR reroute (SQL error -30108) and retries the transaction.
4. For the Python batch application, write equivalent error handling code.
5. Design a test plan to verify ACR works correctly. Include:
   - Test case for graceful takeover
   - Test case for sudden primary failure
   - Test case for network partition between primary and standby
   - Expected application behavior for each test case
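For task 4, the core of the answer is a resubmit loop: a reroute surfaces as SQL30108N, which means the connection was re-established on the alternate server and the in-flight transaction was rolled back, so the application must run it again. The `RerouteError` class and the one-shot failing `batch_step` below are stand-ins so the retry logic itself is runnable without a database; real ibm_db code would inspect the driver's error message for the 30108 code instead.

```python
import time

SQLCODE_REROUTED = "-30108"  # SQL30108N: connection rerouted, resubmit the txn

class RerouteError(Exception):
    """Stand-in for the ibm_db exception carrying SQLCODE -30108."""
    sqlcode = SQLCODE_REROUTED

def retry_on_reroute(txn, attempts=3, delay_s=0.0):
    """Run txn(); if the connection was rerouted mid-flight, resubmit it."""
    for attempt in range(1, attempts + 1):
        try:
            return txn()
        except RerouteError:
            if attempt == attempts:
                raise  # give up after the final attempt
            time.sleep(delay_s)  # back off before resubmitting

# Simulated batch step: fails once with a reroute, then succeeds.
calls = {"n": 0}
def batch_step():
    calls["n"] += 1
    if calls["n"] == 1:
        raise RerouteError("connection rerouted to standby")
    return "committed"

print(retry_on_reroute(batch_step))  # -> committed (on the second attempt)
```

The same shape applies to the Java version in task 3, with `SQLException.getErrorCode()` checked against -30108.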
Exercise 29.4: Reads-on-Standby Configuration
Objective: Configure and validate reads-on-standby for reporting workloads.
Scenario: Meridian's analytics team wants to run reports against the HADR standby to offload the primary. The reports include:
| Report | Frequency | Duration | Tables Accessed |
|---|---|---|---|
| Daily transaction summary | Hourly | 5 minutes | TRANSACTIONS, ACCOUNTS |
| Customer portfolio analysis | Daily | 45 minutes | CUSTOMERS, ACCOUNTS, LOANS |
| Branch performance dashboard | Real-time | Continuous | All tables |
| Regulatory compliance extract | Monthly | 2 hours | All tables |
Tasks:
1. Identify which reports are good candidates for reads-on-standby and which should remain on the primary. Justify each decision based on:
   - Isolation level requirements
   - Tolerance for slightly stale data
   - Duration and resource consumption
2. For the reports running on the standby, what isolation levels are available? What impact does this have on the report results?
3. Write the connection configuration for the reporting application to connect to the standby server.
4. If the HADR log gap increases to 60 seconds during the 2-hour regulatory compliance extract, analyze:
   - Why is this happening?
   - What is the impact on the primary?
   - What corrective actions should be taken?
5. Design a monitoring approach that alerts when reads-on-standby are causing the log gap to grow beyond an acceptable threshold.
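One way to approach task 5 is to alert on the gap's growth rate, not just its absolute value, so a long-running standby read is flagged before the gap hits its ceiling. The sketch below works on periodic samples of the replay gap; both thresholds are illustrative assumptions.

```python
# Alert when the standby replay gap, sampled periodically, is too large
# or growing too fast. Both thresholds are assumed for illustration.

GAP_LIMIT_SECONDS = 60        # assumed ceiling for standby staleness
GROWTH_LIMIT_PER_MIN = 5.0    # assumed: alert if gap grows > 5 s per minute

def gap_alert(samples):
    """samples: list of (minutes_since_start, gap_seconds), oldest first."""
    t0, g0 = samples[0]
    t1, g1 = samples[-1]
    growth_per_min = (g1 - g0) / (t1 - t0)
    if g1 > GAP_LIMIT_SECONDS:
        return f"gap {g1}s exceeds {GAP_LIMIT_SECONDS}s limit"
    if growth_per_min > GROWTH_LIMIT_PER_MIN:
        return (f"gap growing at {growth_per_min:.1f}s/min "
                "during long-running standby reads")
    return None

# Steady gap -> no alert; climbing gap during the extract -> early alert.
print(gap_alert([(0, 2), (5, 2), (10, 3)]))
print(gap_alert([(0, 5), (5, 35), (10, 58)]))
```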
Exercise 29.5: pureScale vs. HADR Comparison
Objective: Evaluate pureScale and HADR for a specific workload.
Scenario: A new Meridian Bank application processes real-time payment transactions. Requirements:
- Zero data loss (RPO = 0)
- Sub-second RTO (no perceptible outage)
- 10,000 TPS write throughput
- 50,000 TPS read throughput
- 99.999% availability target
Tasks:
1. Evaluate whether HADR can meet these requirements. For each requirement, explain whether HADR satisfies it and note any limitations.
2. Evaluate whether pureScale can meet these requirements. For each requirement, explain whether pureScale satisfies it.
3. Create a comparison table covering:
   - Infrastructure requirements (servers, storage, networking)
   - Licensing and operational costs (estimate ranges)
   - Operational complexity
   - Failure scenarios and recovery behavior
   - Read/write scalability
4. Make a recommendation with supporting rationale. Under what conditions would you change your recommendation?
5. If the budget allows only two servers (primary + standby), can any combination of technologies meet the requirements? What compromises would be necessary?
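Before evaluating either technology, it helps to quantify what 99.999% actually permits. The arithmetic below is standard; the ~60-second takeover time used for comparison is an illustrative assumption, not a measured HADR figure.

```python
# Size the annual outage budget implied by a 99.999% availability target.

MINUTES_PER_YEAR = 365 * 24 * 60

def downtime_budget_min(availability: float) -> float:
    return MINUTES_PER_YEAR * (1 - availability)

budget = downtime_budget_min(0.99999)
print(f"Annual downtime budget: {budget:.2f} minutes")  # about 5.26 min/year

# Assumed takeover time of ~60 s per failover (illustrative only):
takeover_min = 1.0
print(f"Failovers per year before the budget is spent: {budget / takeover_min:.1f}")
```

Framed this way, the "sub-second RTO" requirement is the discriminating one: a handful of minute-long failovers per year already exhausts five nines.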
Exercise 29.6: Q Replication Design
Objective: Design a Q Replication topology for cross-site data distribution.
Scenario: Meridian Bank operates two data centers:
- DC-East (New York): Primary operations center, 60% of customers
- DC-West (Dallas): Secondary operations center, 40% of customers
The bank wants active-active capability: customers can be served from either data center, and both data centers can process transactions. If one data center fails, the other takes over all traffic.
Tasks:
1. Design a Q Replication topology showing:
   - Source and target databases at each site
   - MQ queue managers and channels
   - Bidirectional replication subscriptions
   - Which tables are replicated
2. For the ACCOUNTS table, design a conflict resolution strategy. Consider:
   - What happens if the same account is updated at both sites simultaneously?
   - How does the bank ensure account balances remain consistent?
   - Should certain operations be restricted to one site (e.g., account closure)?
3. Calculate the end-to-end replication latency assuming:
   - Network latency (NY to Dallas): 35 ms
   - Q Capture processing: 50 ms
   - MQ transport: network latency + 10 ms overhead
   - Q Apply processing: 50 ms
4. Design a monitoring approach for the Q Replication environment including:
   - Latency monitoring
   - Conflict rate monitoring
   - Queue depth monitoring
   - Data consistency verification
5. Write a failover procedure for the scenario where DC-East becomes completely unavailable.
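The task 3 figures chain additively: a change must be captured, shipped over MQ, and applied before it is visible at the other site. All numbers below come from the scenario.

```python
# One-way end-to-end Q Replication latency from the task 3 figures.
NETWORK_MS = 35       # NY <-> Dallas one-way network latency
CAPTURE_MS = 50       # Q Capture processing
MQ_OVERHEAD_MS = 10   # MQ transport overhead on top of the network
APPLY_MS = 50         # Q Apply processing

def replication_latency_ms() -> int:
    mq_transport = NETWORK_MS + MQ_OVERHEAD_MS  # "network latency + 10 ms overhead"
    return CAPTURE_MS + mq_transport + APPLY_MS

print(replication_latency_ms())  # 145 ms from commit at one site to apply at the other
```

That 145 ms window is also the exposure period for the task 2 conflicts: two sites can each commit a change to the same account before either sees the other's update.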
Exercise 29.7: CDC to Kafka Pipeline
Objective: Design a CDC pipeline from DB2 to Apache Kafka.
Scenario: Meridian Bank wants to stream changes from the MERIDIAN database to Kafka for consumption by three downstream systems:
- Fraud detection engine — needs transaction changes within 2 seconds of commit
- Data warehouse loader — needs all table changes, can tolerate 30-second latency
- Customer notification service — needs ACCOUNTS and TRANSACTIONS changes, 5-second latency
Tasks:
1. Design the Kafka topic structure. Consider:
   - One topic per table vs. one topic for all tables
   - Topic naming conventions
   - Partitioning strategy (what key to partition on?)
   - Retention policy
2. For the TRANSACTIONS table, write the expected Kafka message format (JSON) for an INSERT and an UPDATE operation. Include:
   - Operation type (I/U/D)
   - Before-image (for updates and deletes)
   - After-image (for inserts and updates)
   - Transaction timestamp
   - Source metadata (LSN, transaction ID)
3. Calculate the required Kafka cluster sizing, given:
   - 5,000 transactions per second
   - Average message size: 500 bytes
   - 7-day retention
   - Replication factor: 3
4. Design error handling for the CDC pipeline:
   - What happens if Kafka is temporarily unavailable?
   - What happens if the CDC engine crashes and restarts?
   - How do you detect and handle message duplicates in consumers?
5. Write a consumer group design for the three downstream systems, ensuring each system gets the messages it needs without interfering with the others.
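The task 3 sizing is straight multiplication of the given figures. The result is raw retained message volume only; it deliberately ignores compression, index/segment overhead, and the headroom a real cluster would need, so treat it as a lower bound.

```python
# Kafka storage sizing from the task 3 inputs (raw message bytes only).
TPS = 5_000                 # change events per second
MSG_BYTES = 500             # average message size
RETENTION_DAYS = 7
REPLICATION_FACTOR = 3
SECONDS_PER_DAY = 86_400

bytes_per_sec = TPS * MSG_BYTES                               # ingest rate
logical_bytes = bytes_per_sec * SECONDS_PER_DAY * RETENTION_DAYS
physical_bytes = logical_bytes * REPLICATION_FACTOR           # across the cluster

print(f"Ingest: {bytes_per_sec / 1e6:.1f} MB/s")
print(f"Retained (logical): {logical_bytes / 1e12:.2f} TB")
print(f"Retained (with RF=3): {physical_bytes / 1e12:.2f} TB")
```

At roughly 4.5 TB of replicated retention plus overhead and growth headroom, a small multi-broker cluster suffices; partition count, not storage, is more likely to be driven by the fraud engine's 2-second latency budget.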