Chapter 33 Exercises: JDBC, ODBC, and Python Access

DataField.Dev

Chapter 33 Exercises: JDBC, ODBC, and Python Access

These exercises progress from foundational concepts to production-level design challenges. Work through them in order; later exercises build on skills from earlier ones.

Exercise 33.1: Basic JDBC Connection and Query

Objective: Write a complete Java program that connects to DB2 and retrieves data.

Task: Write a Java program that:

Connects to the Meridian Bank DB2 database using a Type 4 JDBC driver
Queries banking.accounts for all accounts belonging to customer 100425
Displays account ID, type, balance, and status for each account
Uses try-with-resources for all JDBC objects (Connection, PreparedStatement, ResultSet)
Uses PreparedStatement with parameter markers (not string concatenation)
Handles SQLException by printing SQLCODE, SQLSTATE, and the error message

Expected output format:

Account ID   Type        Balance        Status
=========== =========== ============== ========
      10001 CHECKING      $12,450.00   ACTIVE
      10002 SAVINGS       $45,230.50   ACTIVE

Exercise 33.2: SQL Injection Analysis

Objective: Understand the security and performance implications of parameterized vs. concatenated SQL.

Task:

Write two versions of a method that retrieves account details by customer ID: - Version A: Uses Statement with string concatenation (vulnerable) - Version B: Uses PreparedStatement with parameter markers (safe)
For each version, answer: - What SQL does DB2 receive if the user provides input 100425? - What SQL does DB2 receive if the user provides input 100425 OR 1=1? - If the method is called 10,000 times with different customer IDs, how many unique SQL statements does DB2 parse? What happens to the package cache? - Which version enables DB2 to reuse the access plan?
Write a test that demonstrates the SQL injection vulnerability in Version A.

Exercise 33.3: Connection Pool Sizing

Objective: Design a connection pool configuration for a specific workload.

Scenario: Meridian Bank's online banking application handles:

2,000 concurrent users during peak hours
Average user think time between requests: 8 seconds
Average DB queries per request: 4
Average DB query duration: 12ms
Application runs on 3 instances (load-balanced)
DB2 server has 16 CPU cores with SSD storage

Tasks:

Calculate the theoretical number of concurrent database connections needed (total, then per instance).
Apply the (cores * 2) + spindle_count formula and compare with your calculation.
Design a complete HikariCP configuration (maximumPoolSize, minimumIdle, connectionTimeout, idleTimeout, maxLifetime, connectionTestQuery, leakDetectionThreshold). Justify each value.
What DB2 monitoring query would you run to verify your pool sizing is correct?
What symptoms would you expect if the pool were set to 500 connections per instance?

Exercise 33.4: Transaction Management — Bill Payment

Objective: Implement proper transaction management for a multi-step financial operation.

Task: Write a Java method processBillPayment(int sourceAccountId, int payeeId, BigDecimal amount) that:

Verifies the source account exists, is ACTIVE, and has sufficient balance
Debits the source account
Inserts a transaction record for the debit
Inserts a payment record in the banking.payments table
Updates the payee's last_payment_date
Commits on success, rolls back entirely on any failure
Handles SQLCODE -911 (deadlock) with retry logic (up to 3 retries, exponential backoff)
Returns a result object indicating success or the specific failure reason

Include all resource cleanup in finally blocks. Test your method mentally with these scenarios: - Successful payment - Insufficient funds (step 1 fails) - Payee not found (step 5 fails) - Deadlock on first attempt, success on retry

Exercise 33.5: Batch Insert Performance Comparison

Objective: Measure and understand the performance difference between individual inserts and batch operations.

Task:

Write a Java method that inserts 10,000 transaction records one at a time (individual executeUpdate() calls with auto-commit on).
Write a second method that inserts the same 10,000 records using addBatch() / executeBatch() with batches of 1,000 and a single commit at the end.
For each method, calculate: - Number of network round-trips to DB2 - Number of commits - Expected relative performance (which is faster and by approximately how much?)
Write the equivalent batch insert in Python using both ibm_db (prepare/execute loop) and ibm_db_dbi (executemany()).

Exercise 33.6: Python ibm_db — Account Analysis Script

Objective: Write a Python script using ibm_db for account analysis.

Task: Write a Python script that:

Connects to DB2 using ibm_db with credentials from environment variables
Accepts a customer ID as a command-line argument
Uses a prepared statement to query all accounts for that customer
Displays: - Account summary (count by type, total balance by type) - Accounts with balances above $100,000 (sorted by balance descending) - Accounts with zero or negative balance (flagged as alerts)
Handles connection errors and SQL errors with descriptive messages including SQLSTATE
Uses try/finally to ensure the connection is always closed

Exercise 33.7: SQLAlchemy ORM — Customer Portfolio

Objective: Define ORM models and write queries using SQLAlchemy.

Task:

Define SQLAlchemy ORM classes for Customer, Account, and Transaction with proper relationships (one customer has many accounts; one account has many transactions).
Write the following queries using the ORM (not raw SQL): - All customers whose total balance across all active accounts exceeds $500,000 - The top 10 accounts by balance, including the customer name - All accounts opened in the last 30 days, with a count by account type - The 5 most recent transactions for a given account ID
Write a function transfer_funds(session, from_id, to_id, amount) that uses ORM objects and with_for_update() for proper locking.

Exercise 33.8: Pandas Data Pipeline

Objective: Build a data analysis pipeline using pandas and DB2.

Task: Write a Python script that:

Extracts the last 6 months of transaction data from banking.transactions using pd.read_sql() with chunksize=50000
Computes monthly statistics: - Total transaction count and amount by month - Average transaction size by type - Month-over-month growth rates
Identifies anomalies: - Accounts with monthly transaction counts exceeding 3 standard deviations above the mean - Individual transactions exceeding $50,000
Creates a summary DataFrame and writes it back to analytics.monthly_txn_analysis using to_sql(if_exists='append')
Uses parameterized queries (not string interpolation) for all date filters

Exercise 33.9: ODBC/CLI Configuration

Objective: Understand ODBC/CLI connection configuration for different environments.

Task:

Write complete odbc.ini and db2cli.ini entries for these scenarios: - DB2 LUW on Linux with SSL/TLS enabled - DB2 for z/OS through DB2 Connect gateway - DB2 on IBM Cloud (Db2 on Cloud service)
For each configuration, identify: - Which authentication method is used - Whether encryption is enabled - The default schema - Any performance-related settings (fetch size, query timeout, etc.)
Write a Python function that accepts a dictionary of connection parameters and builds a valid ibm_db connection string, handling both SSL and non-SSL configurations.

Exercise 33.10: Error Handling Matrix

Objective: Build a comprehensive error handling strategy across all three access methods.

Task: For each of these DB2 error scenarios, write the error handling code in JDBC, ibm_db (Python), and SQLAlchemy:

Scenario	SQLCODE	SQLSTATE
Unique constraint violation	-803	23505
Foreign key violation (parent missing)	-530	23503
Deadlock	-911	40001
Lock timeout	-913	57033
Communication error	-30081	08001

For each scenario and each language: - Show the exact exception/error you catch - Show how to extract the SQLCODE and SQLSTATE - Describe the appropriate application response (retry, fail gracefully, alert, etc.)

Exercise 33.11: NULL Handling Across Languages

Objective: Correctly handle NULL values in all three access methods.

Task: The banking.accounts table has these nullable columns: interest_rate, close_date, notes.

In JDBC, write code that reads all three nullable columns and correctly distinguishes between NULL and zero/empty: - Use rs.wasNull() for the interest_rate (numeric) column - Handle getString() returning null for notes - Handle getDate() returning null for close_date
In Python ibm_db, write equivalent code using fetch_assoc() and checking for None.
In SQLAlchemy, write a query that filters for accounts where interest_rate IS NULL and another that filters for interest_rate = 0. Explain the difference.

Exercise 33.12: Connection Validation and Recovery

Objective: Implement robust connection handling that recovers from network failures.

Task: Write a Java class ResilientDB2Client that:

Uses HikariCP with connectionTestQuery set to VALUES 1
Wraps every database operation in a method that catches SQLException
Detects connection failures (SQLCODE -30081, SQLSTATE starting with "08")
For connection failures: logs the error, waits 5 seconds, and retries up to 3 times
For deadlocks (-911): retries immediately with exponential backoff
For all other errors: fails immediately with a descriptive error

Test your implementation mentally against these failure scenarios: - DB2 instance restart (all existing connections become invalid) - Network blip (one request fails, next succeeds) - Sustained network outage (all retries fail)

Exercise 33.13: Fetch Size Experiment

Objective: Understand the impact of fetch size on query performance.

Task:

Write a Java program that queries banking.transactions for all transactions in 2025 (expect 200,000+ rows). Run the query with fetch sizes of: 1, 10, 100, 500, 1000, 5000.
For each fetch size, calculate: - Number of network round-trips (approximately = total_rows / fetch_size) - Memory used per fetch (approximately = fetch_size * avg_row_size)
Create a table showing the trade-offs:

Fetch Size	Round-Trips	Memory per Fetch	Relative Time
1	200,000	~100 bytes	Slowest
10	?	?	?
...	...	...	...

What fetch size would you recommend for this query? What about a query that returns only 5 rows?

Exercise 33.14: Monitoring Application Connections from DB2

Objective: Write DBA monitoring queries to track application connection behavior.

Task: Write SQL queries using MON_GET_CONNECTION and related table functions to answer:

How many connections are currently active from each application server?
Which connections have been idle (no commits or rollbacks) for more than 30 minutes?
What is the commit-to-rollback ratio for each application? (High rollback rates indicate problems.)
Which connections are consuming the most CPU time?
Which application has the most active units of work?

For each query, explain what the DBA should look for and what action to take if the numbers are abnormal.

Exercise 33.15: Decimal Precision in Pandas

Objective: Correctly handle financial decimal values through a pandas pipeline.

Task:

Read account balances from DB2 into a pandas DataFrame.
Demonstrate the precision issue: show that float(Decimal("0.10")) does not equal exactly 0.1 in float64.
Implement three strategies for preserving decimal precision: - Strategy A: Convert to Python Decimal objects - Strategy B: Store as integer cents (multiply by 100) - Strategy C: Store as string representation
Write each strategy's DataFrame to a Parquet file and read it back. Verify which strategies preserve exact values.
Recommend which strategy to use for Meridian Bank's data pipeline and explain why.

Exercise 33.16: Stored Procedure Call from Python

Objective: Call a DB2 stored procedure with IN/OUT parameters from Python.

Task:

Assume this stored procedure exists: sql CREATE PROCEDURE banking.calculate_interest( IN p_account_id INTEGER, OUT p_interest_amount DECIMAL(15,2), OUT p_return_code INTEGER )
Write Python code using ibm_db to call this procedure, passing an account ID and receiving the two OUT parameters.
Write equivalent code using ibm_db_dbi (DB-API 2.0 interface).
Add error handling for: procedure not found (SQLSTATE 42884), invalid account ID (procedure returns p_return_code = -1).

Exercise 33.17: Security Audit Checklist

Objective: Audit application code for DB2 security best practices.

Task: Review the following code snippets and identify all security issues. For each issue, explain the risk and provide the corrected code.

Snippet A (Java):

String url = "jdbc:db2://db2prod:50000/MBNKPROD";
Connection conn = DriverManager.getConnection(url, "db2admin", "admin123");
String sql = "SELECT * FROM banking.accounts WHERE customer_id = "
           + request.getParameter("id");
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery(sql);

Snippet B (Python):

conn_str = "DATABASE=MBNKPROD;HOSTNAME=db2prod;PORT=50000;" \
           "PROTOCOL=TCPIP;UID=root;PWD=password;"
conn = ibm_db.connect(conn_str, "", "")
sql = f"DELETE FROM banking.transactions WHERE txn_id = {user_input}"
ibm_db.exec_immediate(conn, sql)
ibm_db.commit(conn)

Exercise 33.18: Full Application Access Layer Design

Objective: Design and implement a complete data access layer for Meridian Bank.

Task: Design a data access layer with these components:

Java AccountDAO class: - HikariCP connection pool initialization (justify all pool settings) - getAccount(int accountId) — parameterized single-row query - getActiveAccountsByCustomer(int customerId) — multi-row query with fetch size - createAccount(int customerId, String type, BigDecimal deposit) — insert with generated key - transferFunds(int from, int to, BigDecimal amount) — multi-step transaction with deadlock retry - Proper resource management and error handling throughout
Python analytics.py module: - SQLAlchemy engine with pool configuration - daily_balance_snapshot() — extracts current balances to Parquet - transaction_volume_by_branch(start_date, end_date) — aggregation query - write_summary(df, table_name) — writes DataFrame to DB2
For both components, ensure: - All queries use parameterized statements - All credentials come from environment variables - All resources are cleaned up properly - Error handling distinguishes between retryable and non-retryable errors

Exercise 33.19: Connection Pool Exhaustion Diagnosis

Objective: Diagnose and fix a connection pool exhaustion scenario.

Scenario: The Meridian Bank web application is failing intermittently with SQLTransientConnectionException: Connection is not available, request timed out after 10000ms. HikariCP metrics show:

Total connections:    20 (= maximumPoolSize)
Active connections:   20
Idle connections:     0
Threads waiting:      45

Task:

What does this data tell you about the application's connection usage?
List three possible causes for connection pool exhaustion.
For each cause, describe how you would diagnose it (what logs, metrics, or DB2 queries to check).
Write the Java code pattern that would cause a connection leak.
Write the corrected code that prevents the leak.
What HikariCP setting would help you detect leaks proactively?

Exercise 33.20: Cross-Language Integration Test

Objective: Verify that Java and Python applications can coexist against the same DB2 database.

Task:

Write a Java program that inserts 100 test accounts into banking.accounts.
Write a Python script that reads those same accounts using ibm_db and verifies the data.
Write a second Python script using pandas that reads the accounts into a DataFrame and computes summary statistics.
Ensure all three programs use the same connection parameters (from environment variables).
Document the data type mappings observed: - DB2 DECIMAL(15,2) maps to what Java type? What Python type? What pandas dtype? - DB2 VARCHAR(50) maps to what in each language? - DB2 DATE maps to what in each language?

Exercise 33.21: WITH UR for Read-Only Extracts

Objective: Understand when and how to use uncommitted read isolation for extract workloads.

Task:

Explain what WITH UR (Uncommitted Read) means in terms of locking behavior.
Write a Python extraction query that uses WITH UR to read all accounts without acquiring any locks.
Under what circumstances could WITH UR return incorrect data? Give a specific example.
Why is WITH UR appropriate for Meridian Bank's nightly data pipeline but inappropriate for the account balance check in a fund transfer?
How would you specify uncommitted read isolation in JDBC? In SQLAlchemy?

Exercise 33.22: JDBC Isolation Level Mapping

Objective: Master the mapping between JDBC and DB2 isolation levels.

Task: Complete this mapping table from memory, then verify against the chapter:

JDBC Constant	DB2 Isolation	DB2 Abbreviation	Lock Behavior
`TRANSACTION_READ_UNCOMMITTED`	?	?	?
`TRANSACTION_READ_COMMITTED`	?	?	?
`TRANSACTION_REPEATABLE_READ`	?	?	?
`TRANSACTION_SERIALIZABLE`	?	?	?

Then answer: - Why is the JDBC naming confusing compared to DB2 naming? - What isolation level would you use for Meridian Bank's balance inquiry page? - What isolation level would you use for the fund transfer operation? Why?

Exercise 33.23: Production Readiness Checklist

Objective: Create a checklist for verifying that an application's DB2 access layer is production-ready.

Task: Write a checklist with at least 15 items covering: - Connection management (pool sizing, validation, leak detection, SSL) - SQL practices (parameterized queries, fetch size, batch operations) - Transaction management (isolation level, commit/rollback, timeout) - Error handling (deadlock retry, specific SQLCODE handling, logging) - Security (credentials, least privilege, injection prevention) - Monitoring (what metrics to track, what alerts to set)

For each item, specify: what to check, what the correct configuration looks like, and what the risk is if it is wrong.

Exercise 33.24: Connection String Builder Utility

Objective: Write a reusable utility for building DB2 connection strings.

Task: Write a Python class DB2ConnectionBuilder that:

Accepts connection parameters via a fluent interface: python conn_str = (DB2ConnectionBuilder() .hostname("db2prod.meridianbank.com") .port(50000) .database("MBNKPROD") .user_from_env("DB2_USER") .password_from_env("DB2_PASSWORD") .schema("BANKING") .ssl(keystore="/app/certs/key.kdb", stash="/app/certs/key.sth") .query_timeout(30) .build())
Validates that all required parameters are set before building.
Raises a clear error if environment variables are not set.
Supports both SSL and non-SSL configurations.
Returns a properly formatted ibm_db connection string.

Exercise 33.25: Performance Tuning Investigation

Objective: Diagnose and resolve a slow application query using DBA tools.

Scenario: Meridian Bank's account service is experiencing slow response times. The development team reports that the query SELECT * FROM banking.accounts WHERE customer_id = ? takes 500ms on average.

Task:

What DB2 monitoring queries would you run to investigate? (List at least 3.)
The EXPLAIN output shows a table scan instead of an index lookup. What are two possible causes?
The application is using Statement with string concatenation instead of PreparedStatement. How does this affect the DB2 optimizer's ability to use cached plans?
After switching to PreparedStatement, the average drops to 2ms. Calculate the total CPU savings if this query runs 100,000 times per day.
The developer asks if they should also change the SELECT * to select only the 5 columns they need. Calculate the approximate data transfer reduction if the table has 25 columns averaging 40 bytes each and the needed columns total 100 bytes.