Chapter 33 Key Takeaways

Core Concepts

  1. DB2 supports multiple access methods for different workloads: JDBC for Java applications, ODBC/CLI for native C/C++ applications, and ibm_db/ibm_db_sa for Python. All use dynamic SQL and connect over TCP/IP (DRDA protocol) or shared memory.

  2. JDBC Type 4 is the default choice. Pure Java, no native libraries, no DB2 client installation required. Use Type 2 only when benchmarks show a measurable advantage for local connections. Type 4 is essential for Docker/Kubernetes and cloud deployments.

  3. Connection pooling is mandatory for production. Opening a connection takes 50-200ms. Use HikariCP (Java) or SQLAlchemy's built-in pool (Python). Right-size the pool — more connections is not better. The formula (cores * 2) + spindle_count gives a good starting point.
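
As a quick sanity check, the sizing rule can be computed directly. A minimal sketch; the helper name is ours, not a HikariCP or SQLAlchemy API:

```python
def starting_pool_size(core_count: int, spindle_count: int) -> int:
    """Starting point from the (cores * 2) + spindle_count rule of thumb.

    This is only a starting point: confirm with load tests and watch
    DB2-side thread usage before settling on a number.
    """
    return core_count * 2 + spindle_count

# An 8-core app host backed by a 2-spindle volume:
print(starting_pool_size(8, 2))  # 18
```

For an all-SSD or cloud volume, treat the effective spindle count as small (often 1), which keeps the starting pool modest, consistent with "more connections is not better."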

  4. Always use PreparedStatement (JDBC) or prepared statements (ibm_db). They prevent SQL injection, enable statement caching, and maximize DB2's dynamic statement cache hit rate. Never concatenate user input into SQL strings.
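
A minimal sketch of why concatenation is dangerous, using a hypothetical `accounts` table; the two helpers are illustrative, not driver APIs. The safe shape, constant SQL text plus a separate parameter tuple, is exactly what `PreparedStatement` and `ibm_db.prepare`/`ibm_db.execute` send to DB2:

```python
def unsafe_query(account_id: str) -> str:
    # Anti-pattern: user input concatenated straight into the SQL text.
    return "SELECT balance FROM accounts WHERE id = '" + account_id + "'"

def safe_query(account_id: str):
    # Preferred shape: SQL text is constant, the value travels separately
    # as a bind parameter, so it can never change the statement's meaning.
    return ("SELECT balance FROM accounts WHERE id = ?", (account_id,))

# A classic injection payload rewrites the unsafe statement to match
# every row; in the safe version it stays inert data.
payload = "x' OR '1'='1"
print(unsafe_query(payload))
# SELECT balance FROM accounts WHERE id = 'x' OR '1'='1'
```

Constant SQL text is also what makes the dynamic statement cache effective: every call with the same text reuses one cached access plan.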

  5. Disable auto-commit for multi-statement transactions. Fund transfers, order processing, and any multi-step operation must execute in a single transaction. Use setAutoCommit(false) (JDBC) or ibm_db.autocommit(conn, ibm_db.SQL_AUTOCOMMIT_OFF) (Python), then explicitly commit or rollback.
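
The commit-or-rollback shape looks like this. A sketch with stdlib sqlite3 standing in for a DB2 connection so it runs anywhere; with ibm_db you would first call ibm_db.autocommit(conn, ibm_db.SQL_AUTOCOMMIT_OFF), and the table name here is hypothetical:

```python
import sqlite3

def transfer(conn, from_acct, to_acct, amount):
    # Both updates commit together or not at all.
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                     (amount, from_acct))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                     (amount, to_acct))
        conn.commit()
    except Exception:
        conn.rollback()   # undo any partial work, then let callers decide
        raise

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

transfer(conn, "A", "B", 40)
print(conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall())
# [('A', 60), ('B', 40)]
```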

  6. Batch operations reduce network overhead. JDBC's addBatch()/executeBatch() and pandas' to_sql(chunksize=N) send multiple rows in a single network round-trip. Batch sizes of 500-5,000 are typical.
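
The batching itself is simple to picture: rows are grouped into fixed-size chunks and each chunk is one round-trip. A sketch; the `batches` helper is ours, not a driver API, but it mirrors what addBatch()/executeBatch() and to_sql(chunksize=N) do on the wire:

```python
def batches(rows, size):
    """Yield rows in fixed-size chunks; each chunk would be sent to DB2
    as a single network round-trip instead of one round-trip per row."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

rows = [(i, f"name-{i}") for i in range(2500)]
chunk_sizes = [len(chunk) for chunk in batches(rows, 1000)]
print(chunk_sizes)  # [1000, 1000, 500]
```

2,500 single-row inserts become 3 round-trips, which is where the latency savings come from.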

  7. JDBC isolation levels map to DB2 isolation levels, but the names do not match intuitively: READ_UNCOMMITTED = UR, READ_COMMITTED = CS, REPEATABLE_READ = RS, and SERIALIZABLE = RR. Note the trap: DB2's "Repeatable Read" (RR) is its strictest level and corresponds to JDBC SERIALIZABLE, not to JDBC REPEATABLE_READ. Know the mapping.
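
The mapping in lookup-table form; the dict is illustrative (the keys echo the java.sql.Connection constant names), not a driver constant:

```python
# JDBC isolation constant -> DB2 isolation level
JDBC_TO_DB2 = {
    "TRANSACTION_READ_UNCOMMITTED": "UR",  # Uncommitted Read
    "TRANSACTION_READ_COMMITTED":   "CS",  # Cursor Stability (DB2 default)
    "TRANSACTION_REPEATABLE_READ":  "RS",  # Read Stability
    "TRANSACTION_SERIALIZABLE":     "RR",  # Repeatable Read (strictest)
}

# The trap: JDBC REPEATABLE_READ is *not* DB2 RR.
print(JDBC_TO_DB2["TRANSACTION_REPEATABLE_READ"])  # RS
```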

  8. Python ibm_db provides direct DB2 access with exec_immediate for one-off queries and prepare/execute for parameterized queries. Use fetch_assoc for dictionary-style row access.

  9. SQLAlchemy (ibm_db_sa) brings ORM and expression language to DB2. Define Python classes mapping to DB2 tables, use sessions for transaction management, and leverage the expression language for type-safe query building.

  10. pandas read_sql with chunksize handles large datasets. For tables with millions of rows, chunked reading keeps memory usage bounded. Combined with Parquet output, this enables efficient DB2-to-data-lake pipelines.
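
The bounded-memory pattern, sketched with stdlib sqlite3 and cursor.fetchmany standing in for a DB2 engine so it runs anywhere; with pandas the loop becomes `for chunk in pd.read_sql(sql, engine, chunksize=1000):`, and the table here is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txns (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO txns (amount) VALUES (?)",
                 [(i,) for i in range(10_000)])

def read_in_chunks(conn, sql, size=1000):
    """Yield result-set chunks of at most `size` rows, so only one chunk
    is ever in memory at a time (the idea behind read_sql's chunksize)."""
    cur = conn.cursor()
    cur.execute(sql)
    while True:
        chunk = cur.fetchmany(size)
        if not chunk:
            break
        yield chunk

total = sum(len(c) for c in read_in_chunks(conn, "SELECT * FROM txns", 3000))
print(total)  # 10000
```

In a pipeline, each chunk would be transformed and appended to a Parquet file before the next chunk is fetched, keeping peak memory near one chunk's size.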

Meridian Bank Patterns

  • Online banking API uses Java with JDBC, HikariCP connection pooling (20 connections per pod), and PreparedStatement for all queries.
  • Data pipeline uses Python with ibm_db_sa, chunked pandas extraction, and Parquet output to the data lake. Extraction uses WITH UR to avoid impacting production.
  • Connection pools are right-sized — 20 connections per application instance, not 200. DB2 thread resources are finite and expensive.

Common Pitfalls to Avoid

  • Oversizing connection pools (wastes DB2 resources, increases contention)
  • Using Statement with string concatenation (SQL injection risk, defeats caching)
  • Forgetting to check rs.wasNull() for nullable primitive columns in JDBC
  • Loading entire large tables into pandas without chunking (out of memory)
  • Using to_sql(if_exists='replace') in production (drops and recreates the table)
  • Leaving auto-commit enabled for multi-step transactions
  • Not handling deadlocks (-911) with retry logic
  • Confusing JDBC isolation level names with DB2 isolation level names
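
One way to handle the -911 pitfall above: wrap the whole transaction in a retry loop with backoff. A sketch that assumes the driver surfaces deadlocks as exceptions whose message contains "SQLCODE=-911" (as ibm_db error text typically does); the helper and its parameters are ours:

```python
import random
import time

def run_with_retry(operation, retries=3, base_delay=0.05):
    """Run `operation` (a callable that executes and commits one whole
    transaction), retrying with exponential backoff on SQLCODE -911."""
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception as exc:
            if "SQLCODE=-911" not in str(exc) or attempt == retries:
                raise  # not a deadlock/timeout, or out of retries
            # Back off with jitter, then replay the whole transaction.
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))

# Stand-in operation that deadlocks twice, then succeeds:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise Exception("... SQLCODE=-911, SQLSTATE=40001 ...")
    return "committed"

print(run_with_retry(flaky))  # committed
```

Retry the entire transaction, not the single failed statement: DB2 rolled back the whole unit of work when it chose this transaction as the deadlock victim.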