Case Study 2: Docker Development Environment for a Development Team

DataField.Dev

Background

NovaPay is a fintech startup with 45 engineers building a payment processing platform. Their production database is Db2 11.5 running on bare-metal Linux servers in a co-located data center. The development team has grown rapidly from 8 to 45 people over 18 months, and their development workflow has become a significant bottleneck.

Currently, the team shares three "dev" database instances running on a single Linux server in the office. Developers connect remotely to these shared instances for all development and testing work. The shared environment causes constant friction: schema changes by one developer break another's tests, long-running queries from the data science team lock tables needed by application developers, and the daily "rebuild" script that drops and recreates all tables runs at midnight but sometimes fails, leaving the morning team with an inconsistent database.

Priya Sharma, the engineering manager, has decided to move to a "database-per-developer" model using Docker containers. Each developer will run their own isolated Db2 instance locally, with a standardized setup that mirrors production.

The Challenge

Priya's team faces several interrelated problems.

Problem 1: Environment Consistency

With 45 developers, each running their own instance, configuration drift is inevitable unless the setup is fully automated. Every developer's Db2 must have:

- The same database configuration parameters as production (or a scaled-down version of them)
- The same schema -- all 180 tables, 45 views, and 22 stored procedures
- A consistent sample data set for testing
- The same user permissions model

Problem 2: Resource Constraints

Not every developer has a powerful workstation. Machines range from 8 GB laptops used by front-end developers who occasionally need database access to 64 GB workstations used by backend engineers. The Db2 container must be configurable for different resource envelopes.

Problem 3: Schema Evolution

The team ships new database changes weekly. When a developer pulls the latest code, their local Db2 instance must be brought in sync with the latest schema. This needs to happen reliably and quickly -- developers should not spend 30 minutes each morning rebuilding their database.

Problem 4: CI/CD Integration

The continuous integration pipeline runs 2,400 integration tests that require a live Db2 database. Currently, these tests run sequentially against a shared CI database, taking 45 minutes. The team wants to parallelize test execution using ephemeral Db2 containers.

Solution Design

Priya's team designs a layered Docker-based solution.

Layer 1: Base Db2 Image with Custom Configuration

They create a Dockerfile that extends the official ibmcom/db2 image:

FROM ibmcom/db2:11.5.8.0

# Copy custom configuration and the setup script
COPY db2-config/db2-dbm.cfg /var/custom/
COPY db2-config/db2-db.cfg /var/custom/
COPY db2-config/setup-database.sh /var/custom/

# Copy schema and data scripts
COPY sql/schema/ /var/custom/sql/schema/
COPY sql/data/ /var/custom/sql/data/
COPY sql/procedures/ /var/custom/sql/procedures/

# The setup script runs after Db2 starts, so it must be executable
RUN chmod a+x /var/custom/setup-database.sh

# Instance and database defaults (development-only password)
ENV DBNAME=NOVAPAY
ENV DB2INSTANCE=db2inst1
ENV DB2INST1_PASSWORD=dev_password
ENV LICENSE=accept

EXPOSE 50000

The setup-database.sh script executes in order:

1. Creates the database with the correct codeset and pagesize
2. Applies database configuration parameters (scaled for the container's memory allocation)
3. Creates tablespaces and buffer pools
4. Runs all schema DDL scripts in dependency order
5. Loads sample data
6. Runs RUNSTATS on all tables
7. Writes a "ready" marker file
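Step 4, running DDL scripts in dependency order, amounts to a topological sort. The case study does not say how NovaPay records dependencies between scripts, so the following is only a minimal sketch: it assumes a hand-maintained mapping from each script to the scripts it depends on, and the file names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def order_scripts(dependencies: dict[str, set[str]]) -> list[str]:
    """Return script names so that each script comes after its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    sorter = TopologicalSorter()
    for script in sorted(dependencies):
        # add(node, *predecessors): predecessors must run before node
        sorter.add(script, *sorted(dependencies[script]))
    return list(sorter.static_order())


# Hypothetical DDL scripts and their dependencies (not from the case study).
deps = {
    "merchants.sql": set(),
    "accounts.sql": {"merchants.sql"},
    "transactions.sql": {"accounts.sql", "merchants.sql"},
    "v_daily_totals.sql": {"transactions.sql"},
}

ordered = order_scripts(deps)
print(ordered)
```

A setup script could then feed the resulting list to the Db2 command line one file at a time, stopping at the first failure.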

Layer 2: Docker Compose for Developer Workstations

A docker-compose.yml provides developer-friendly startup:

version: '3.8'
services:
  db2:
    image: novapay/db2-dev:latest
    container_name: novapay-db2
    privileged: true
    ports:
      - "50000:50000"
    environment:
      - DB2INST1_PASSWORD=dev_password
      - LICENSE=accept
      - DBNAME=NOVAPAY
    volumes:
      - db2data:/database
    deploy:
      resources:
        limits:
          memory: 4G
        reservations:
          memory: 2G

volumes:
  db2data:
    driver: local

The team provides three compose profiles:

- minimal (2 GB RAM) -- for front-end developers who need basic connectivity
- standard (4 GB RAM) -- for most backend developers
- full (8 GB RAM) -- for database developers and data engineers

Each profile adjusts buffer pool sizes and configuration parameters proportionally.
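How a developer picks a profile is not described in the case study. As an illustration only, a small helper could suggest the largest profile that fits within a share of the machine's RAM. The 25% share (`max_fraction`) and the function name `pick_profile` are assumptions made for this sketch, not NovaPay's policy.

```python
# Memory limits per profile in GB, as listed in the case study.
PROFILE_MEMORY_GB = {"minimal": 2, "standard": 4, "full": 8}


def pick_profile(host_ram_gb: float, max_fraction: float = 0.25) -> str:
    """Suggest the largest profile whose memory limit fits the RAM budget.

    max_fraction is a hypothetical policy knob: the share of host RAM
    that the Db2 container is allowed to claim.
    """
    budget_gb = host_ram_gb * max_fraction
    fitting = [name for name, gb in PROFILE_MEMORY_GB.items() if gb <= budget_gb]
    if not fitting:
        raise ValueError(f"{host_ram_gb} GB of RAM is too little for any profile")
    # Among the profiles that fit, prefer the one with the most memory.
    return max(fitting, key=PROFILE_MEMORY_GB.get)


for ram in (8, 16, 64):
    print(ram, "GB ->", pick_profile(ram))
```

Under these assumptions an 8 GB laptop lands on minimal, a 16 GB machine on standard, and a 64 GB workstation on full.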

Layer 3: Schema Migration Tooling

The team adopts a migration-based schema management approach. Each schema change is captured as a numbered SQL migration file:

migrations/
  V001__create_core_tables.sql
  V002__add_payment_status_index.sql
  V003__create_audit_triggers.sql
  V004__add_merchant_category_table.sql
  ...
  V087__alter_transaction_add_currency.sql

A migration runner (Flyway, which supports Db2) applies pending migrations when a developer starts their container. A developer who has been on vacation for a week simply runs docker compose up and the migrations bring their schema current in seconds.
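The core idea behind versioned migrations can be sketched in a few lines. This is not Flyway's implementation (Flyway records applied migrations in a schema history table and also validates checksums); it only illustrates how the version number in each file name determines which migrations are still pending. The function name and the `applied_version` parameter are hypothetical.

```python
import re

# Matches names such as V087__alter_transaction_add_currency.sql
MIGRATION_RE = re.compile(r"^V(\d+)__(.+)\.sql$")


def pending_migrations(filenames, applied_version):
    """Return migration files newer than applied_version, in version order."""
    found = []
    for name in filenames:
        match = MIGRATION_RE.match(name)
        if match is None:
            continue  # ignore files that do not follow the naming scheme
        version = int(match.group(1))
        if version > applied_version:
            found.append((version, name))
    return [name for _, name in sorted(found)]


files = [
    "V003__create_audit_triggers.sql",
    "V001__create_core_tables.sql",
    "V004__add_merchant_category_table.sql",
    "V002__add_payment_status_index.sql",
    "README.md",
]
todo = pending_migrations(files, applied_version=2)
print(todo)
```

A database that already has V001 and V002 applied would receive only V003 and V004, in that order, which is why catching up after a week away is quick.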

Layer 4: CI/CD Ephemeral Containers

The CI pipeline uses a pre-built Docker image with the schema already applied (baked in at image build time). Test parallelization works by spinning up multiple Db2 containers:

# CI configuration (simplified)
test_integration:
  parallel: 6
  services:
    db2:
      image: novapay/db2-ci:${SCHEMA_VERSION}
      variables:
        DB2INST1_PASSWORD: ci_password
        LICENSE: accept
  script:
    - wait-for-db2.sh
    - pytest tests/integration/ --shard=${CI_NODE_INDEX}

Six parallel containers each run one-sixth of the integration test suite. Total CI time drops from 45 minutes to 9 minutes.
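An even split of a 45-minute suite across six nodes would ideally take 45 / 6 = 7.5 minutes; the reported 9 minutes presumably includes container startup and uneven shard durations. The case study does not show how tests are assigned to shards, so the sketch below is one simple, deterministic possibility. It assumes the node index is 1-based (as in some CI systems), and the test identifiers are made up.

```python
def shard(test_ids, node_index, node_total):
    """Return the tests assigned to one node.

    Tests are sorted first so that every node computes the same ordering,
    then dealt out round-robin. node_index is assumed to be 1-based.
    """
    if not 1 <= node_index <= node_total:
        raise ValueError("node_index must be between 1 and node_total")
    ordered = sorted(test_ids)
    return ordered[node_index - 1::node_total]


# 2,400 hypothetical test identifiers split across 6 nodes.
all_tests = [f"tests/integration/test_case_{i:04d}" for i in range(2400)]
shards = [shard(all_tests, index, 6) for index in range(1, 7)]
print([len(s) for s in shards])
```

Round-robin assignment balances test counts, not durations. If a few tests dominate the runtime, splitting by recorded timings would balance the shards better.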

Implementation Details

Configuration Scaling

The team discovers that Db2 configuration parameters must be adjusted for container memory limits. Running production parameters (designed for 256 GB RAM servers) in a 4 GB container causes immediate failures. They develop a configuration matrix:

Parameter	Production (256 GB)	Full (8 GB)	Standard (4 GB)	Minimal (2 GB)
SORTHEAP	16384	2048	1024	512
SHEAPTHRES_SHR	2097152	131072	65536	32768
LOCKLIST	8192	2048	1024	512
PCKCACHESZ	16384	4096	2048	1024
BP_DATA (pages)	500000	10000	5000	2500
BP_INDEX (pages)	200000	5000	2500	1000
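One way to keep such a matrix from drifting is to store it as data and generate the configuration commands from it. The sketch below assumes that BP_DATA and BP_INDEX are buffer pool names and that the database is called NOVAPAY, as in the Dockerfile; the function name is hypothetical. The four database configuration parameters shown are expressed in 4 KB pages.

```python
# Values copied from the configuration matrix above.
DB_CFG = {
    "SORTHEAP":       {"full": 2048,   "standard": 1024,  "minimal": 512},
    "SHEAPTHRES_SHR": {"full": 131072, "standard": 65536, "minimal": 32768},
    "LOCKLIST":       {"full": 2048,   "standard": 1024,  "minimal": 512},
    "PCKCACHESZ":     {"full": 4096,   "standard": 2048,  "minimal": 1024},
}
# Assumption: these are the names of the data and index buffer pools.
BUFFERPOOLS = {
    "BP_DATA":  {"full": 10000, "standard": 5000, "minimal": 2500},
    "BP_INDEX": {"full": 5000,  "standard": 2500, "minimal": 1000},
}


def config_commands(profile, dbname="NOVAPAY"):
    """Build the Db2 command-line calls that apply one resource profile."""
    commands = [f"db2 connect to {dbname}"]
    for param, values in DB_CFG.items():
        commands.append(
            f"db2 update db cfg for {dbname} using {param} {values[profile]}"
        )
    for pool, sizes in BUFFERPOOLS.items():
        commands.append(f'db2 "ALTER BUFFERPOOL {pool} SIZE {sizes[profile]}"')
    return commands


for line in config_commands("standard"):
    print(line)
```

Adding a new resource envelope then means adding one column of tested values rather than editing a setup script by hand.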

Startup Time Optimization

Initial container startup takes 8 minutes -- too slow for developer patience. The team optimizes:

1. Pre-creating the database in the Docker image (saves 3 minutes)
2. Using a named Docker volume so the database persists between restarts (subsequent starts take 30 seconds)
3. Running schema migrations only, not full rebuilds, on restart (saves 2 minutes)
4. Parallelizing RUNSTATS execution (saves 1 minute)

Final cold-start time: 2 minutes. Warm restart: 30 seconds.
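Because startup takes anywhere from 30 seconds to 2 minutes, scripts that depend on the database need to wait for it. The contents of the team's wait-for-db2.sh are not shown, so the following is only a sketch of one common approach: polling the Db2 port until it accepts TCP connections. An open port does not guarantee that the database is fully activated; checking for the "ready" marker file written by the setup script would be a stricter test.

```python
import socket
import time


def wait_for_port(host, port, timeout=180.0, interval=2.0):
    """Poll until host:port accepts a TCP connection.

    Returns True once a connection succeeds, or False if the timeout
    (in seconds) expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A developer script might call `wait_for_port("localhost", 50000)` after `docker compose up -d` and abort with an error message if it returns False.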

Data Seeding Strategy

The team maintains three data profiles:

- minimal -- 100 rows per table, sufficient for unit tests
- standard -- 10,000 rows per table with realistic distributions, for development
- load-test -- 1 million rows per table for performance testing

Developers choose their profile with an environment variable: DATA_PROFILE=standard.
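A seeding script might resolve the variable roughly as follows. This is a sketch, not the team's script: the default of standard when the variable is unset and the function name are assumptions, while the row counts and the table count of 180 come from the case study.

```python
import os

# Rows per table for each data profile, as described above.
ROWS_PER_TABLE = {"minimal": 100, "standard": 10_000, "load-test": 1_000_000}
TABLE_COUNT = 180  # number of tables in the NovaPay schema


def rows_for_profile(environ=None):
    """Return rows per table for the profile named in DATA_PROFILE.

    Falls back to "standard" when the variable is unset (an assumption).
    """
    env = os.environ if environ is None else environ
    profile = env.get("DATA_PROFILE", "standard")
    if profile not in ROWS_PER_TABLE:
        valid = ", ".join(sorted(ROWS_PER_TABLE))
        raise ValueError(f"unknown DATA_PROFILE {profile!r}; expected one of: {valid}")
    return ROWS_PER_TABLE[profile]


rows = rows_for_profile({"DATA_PROFILE": "standard"})
print(rows, "rows per table,", rows * TABLE_COUNT, "rows in total")
```

Rejecting unknown values early avoids silently seeding the wrong data set when the variable is misspelled.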

Outcome

After three months with the Docker-based workflow:

- Developer setup time drops from "half a day fighting with shared databases" to "15 minutes, mostly waiting for the image to download the first time"
- Schema-related incidents (broken tests from conflicting changes) drop from 12 per week to zero
- CI pipeline time decreases from 45 minutes to 9 minutes, enabling faster code review cycles
- New developer onboarding (database portion) goes from a 20-page setup guide to docker compose up

Developer satisfaction survey scores for "database development experience" improve from 2.1/5 to 4.3/5.

Lessons Learned

1. The privileged: true flag is required for Db2 Docker containers because Db2 needs to modify kernel parameters (shared memory segments, semaphores). This has security implications that must be understood and accepted for development use. The team documents this clearly and restricts the practice to development environments only.
2. Persistent volumes are essential for developer productivity. Without them, every docker compose down destroys the database, forcing a full rebuild on the next start. Named volumes preserve the database across container restarts, and they survive docker compose down unless the --volumes flag is passed.
3. Configuration parameters must scale with available resources. Production-sized buffer pools in a memory-constrained container cause Db2 to fail on startup or perform worse than the default configuration. Each resource profile needs its own tested parameter set.
4. Migration-based schema management is non-negotiable for team development. Ad-hoc SQL scripts run manually by each developer create subtle inconsistencies that surface as mysterious test failures. A migration framework ensures every developer's database is in exactly the same state.
5. Pre-built CI images dramatically reduce pipeline time. Building the database during each CI run wastes minutes per job that multiply across dozens of daily builds. Baking the schema into the CI image at build time and invalidating only when migrations change is far more efficient.
6. The Docker approach does not replace production DBA expertise. Several developers assumed that "if it works in Docker, it works in production." The team learns to clearly label the Docker environment as a development approximation, not a production replica. Production deployment still requires proper capacity planning, security hardening, and monitoring -- topics covered in later chapters of this book.
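The lesson about invalidating CI images only when migrations change can be made concrete with a content hash. The case study does not say how SCHEMA_VERSION is derived, so this is one hedged possibility: hash the names and contents of the migration files and use a short prefix of the digest as the image tag. The function name and the 12-character length are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def schema_version(migrations_dir) -> str:
    """Derive a short, stable tag from the migration files in a directory."""
    digest = hashlib.sha256()
    # Sort for a stable order; hash names and contents so that renames,
    # edits, and new files all change the result.
    for path in sorted(Path(migrations_dir).glob("V*.sql")):
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()[:12]


# Demonstration with throwaway files; the SQL bodies are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "V001__create_core_tables.sql").write_text("CREATE TABLE t1 (id INT);\n")
    before = schema_version(root)
    unchanged = schema_version(root)
    (root / "V002__add_payment_status_index.sql").write_text("CREATE INDEX i1 ON t1 (id);\n")
    after = schema_version(root)

print(before == unchanged, before != after)
```

A pipeline could compute this value first and rebuild the CI image only when no image with that tag exists yet.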

Discussion Questions

1. The team uses privileged: true for their Db2 containers. What are the security implications? How might you reduce the privilege level while still running Db2 successfully?
2. Why did the team choose migration-based schema management (Flyway) over simply rebuilding the database from scratch each time? What are the trade-offs?
3. The configuration scaling table shows different parameter values for different memory profiles. How would you determine the correct values for a new resource envelope (say, 6 GB)?
4. The CI pipeline runs 6 parallel Db2 containers. What are the limiting factors for parallelism? At what point would adding more containers not reduce total test time?
5. How would this Docker-based approach need to change if the team were working with Db2 for z/OS as their production platform instead of Db2 LUW?