Chapter 2: Key Takeaways

  1. The relational model separates logic from physics. Codd's central insight was that the logical view of data (tables, rows, columns) should be independent of its physical storage. This gives you physical data independence — you can reorganize storage, add indexes, or repartition data without changing any SQL or application code. DB2 embodies this principle through its optimizer, which determines execution plans so you describe what you want, not how to get it.

  2. Tables, rows, and columns map to relations, tuples, and attributes. These formal terms are not academic trivia — they appear in DB2 documentation, error messages, and technical discussions. Know all three vocabularies: formal (relational theory), informal (common usage), and SQL/DB2 terminology.

  3. Every table needs a primary key. A primary key uniquely identifies each row and is the foundation of referential integrity. Use surrogate keys (system-generated integers) as primary keys for stability and performance, and enforce natural key uniqueness (SSN, account number, branch code) with UNIQUE constraints.
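The pattern above can be sketched in DDL. This is a minimal illustration, not the book's exact CUSTOMER definition; column names such as CUST_ID and SSN are assumed for the example.

```sql
-- Sketch: surrogate primary key plus a UNIQUE constraint on the natural key.
-- Column names (CUST_ID, SSN, CUST_NAME) are illustrative.
CREATE TABLE CUSTOMER (
    CUST_ID    INTEGER      NOT NULL
               GENERATED ALWAYS AS IDENTITY,   -- system-generated surrogate key
    SSN        CHAR(9)      NOT NULL,          -- natural key
    CUST_NAME  VARCHAR(60)  NOT NULL,
    PRIMARY KEY (CUST_ID),                     -- stable, compact join target
    UNIQUE (SSN)                               -- natural key stays unique
);
```

The surrogate key never changes even if the natural key (an SSN correction, a renumbered account) does, which is exactly the stability argument made above.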

  4. Foreign keys enforce referential integrity — the most important constraint in a relational database. A foreign key guarantees that every reference from one table to another is valid, preventing orphan rows. At Meridian Bank, this means every account points to a real customer and a real branch. Enforce it in the database, not in the application, because the database is the only layer that applies universally and atomically.
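A sketch of the Meridian Bank wiring described above, with illustrative column names (the book's actual ACCOUNT columns may differ):

```sql
-- Sketch: every ACCOUNT row must reference a real CUSTOMER and a real BRANCH.
CREATE TABLE ACCOUNT (
    ACCT_ID    INTEGER NOT NULL,
    CUST_ID    INTEGER NOT NULL,
    BRANCH_ID  INTEGER NOT NULL,
    PRIMARY KEY (ACCT_ID),
    FOREIGN KEY (CUST_ID)   REFERENCES CUSTOMER (CUST_ID),   -- no orphan accounts
    FOREIGN KEY (BRANCH_ID) REFERENCES BRANCH (BRANCH_ID)    -- no phantom branches
);
```

With these constraints in place, an INSERT naming a nonexistent customer or a DELETE that would strand accounts is rejected by DB2 itself, regardless of which application issued it.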

  5. Choose data types deliberately. Use DECIMAL for money (never DOUBLE or REAL). Use CHAR for fixed-length codes and VARCHAR for variable-length text. Use TIMESTAMP(6) for audit columns. Use SMALLINT with CHECK constraints for boolean flags to maintain cross-platform compatibility between DB2 for z/OS and DB2 for LUW.
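The type choices above can be collected into one illustrative table definition (names here are assumptions, not the book's schema):

```sql
-- Sketch of the recommended type choices; column names are illustrative.
CREATE TABLE ACCOUNT_AUDIT (
    BRANCH_CODE  CHAR(4)        NOT NULL,       -- fixed-length code
    MEMO_TEXT    VARCHAR(200),                  -- variable-length text
    BALANCE      DECIMAL(15,2)  NOT NULL,       -- exact money arithmetic
    IS_ACTIVE    SMALLINT       NOT NULL DEFAULT 1
                 CHECK (IS_ACTIVE IN (0, 1)),   -- portable boolean flag
    CREATED_TS   TIMESTAMP(6)   NOT NULL        -- microsecond audit stamp
);
```

DECIMAL matters for money because binary floating point cannot represent values like 0.10 exactly; repeated DOUBLE arithmetic accumulates rounding error that DECIMAL avoids by design.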

  6. Normalization to 3NF prevents update, insert, and delete anomalies. First Normal Form eliminates multi-valued columns. Second Normal Form eliminates partial dependencies. Third Normal Form eliminates transitive dependencies. Each normal form addresses a specific category of data redundancy and its associated corruption risks. Aim for 3NF as your baseline — it is the standard target for enterprise database design.
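As one small sketch of the 3NF step, consider a transitive dependency: if an unnormalized ACCOUNT carried BRANCH_NAME, that column would depend on BRANCH_ID rather than on the key ACCT_ID. Column names here are illustrative.

```sql
-- Before (violates 3NF): ACCOUNT(ACCT_ID, BRANCH_ID, BRANCH_NAME, ...)
-- BRANCH_NAME depends on BRANCH_ID, not on the key ACCT_ID — a transitive
-- dependency. Fix: move it to its own table.
CREATE TABLE BRANCH (
    BRANCH_ID    INTEGER     NOT NULL PRIMARY KEY,
    BRANCH_NAME  VARCHAR(40) NOT NULL
);
-- ACCOUNT now keeps only the foreign key BRANCH_ID; renaming a branch
-- is a one-row update instead of an update to every account at that branch.
```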

  7. Denormalize only with evidence, never prophylactically. Denormalization trades data integrity risk for query performance. It is justified when you have measured performance data showing a problem, when less invasive solutions (indexes, MQTs, query rewriting) are insufficient, and when you have a reliable mechanism to maintain consistency. Always document what you denormalized, why, and how consistency is maintained.

  8. Relational algebra gives you mental models for SQL. SELECT (restriction) filters rows; PROJECT (projection) filters columns; JOIN combines tables. Understanding these operations helps you reason about query behavior, predict performance, and avoid pitfalls like accidental Cartesian products.
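All three operations appear in a single ordinary query (column names are illustrative):

```sql
-- Restriction, projection, and join in one statement.
SELECT C.CUST_NAME, A.BALANCE                  -- PROJECT: keep only two columns
FROM   CUSTOMER C
JOIN   ACCOUNT  A ON A.CUST_ID = C.CUST_ID     -- JOIN on the key; omitting the
                                               -- ON clause yields a Cartesian product
WHERE  A.BALANCE > 10000;                      -- SELECT (restriction): filter rows
```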

  9. Constraints are not overhead — they are insurance. NOT NULL, UNIQUE, CHECK, and FOREIGN KEY constraints cost microseconds per operation but prevent data corruption that can cost millions to remediate. Enforce business rules in the database layer where they apply universally, not in the application layer where they can be bypassed by bugs, race conditions, or new applications.

  10. Design with the DB2 catalog in mind. DB2 stores all metadata — table definitions, column definitions, constraints, statistics — in its own relational catalog tables. You can query SYSCAT views (LUW) or SYSIBM tables (z/OS) with standard SQL. This is Codd's Rule 4 in action, and it is one of DB2's great strengths for administration and automation.
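For instance, the catalog answers "what tables exist in this schema, and how big does DB2 think they are?" with ordinary SQL (the schema name below is illustrative):

```sql
-- DB2 for LUW: query the SYSCAT.TABLES catalog view with plain SQL.
SELECT TABNAME,                 -- table name
       CARD                     -- row-count statistic gathered by RUNSTATS
FROM   SYSCAT.TABLES
WHERE  TABSCHEMA = 'MERIDIAN';  -- schema name is illustrative
```

On z/OS the equivalent query reads SYSIBM.SYSTABLES, whose columns are named differently (NAME, CREATOR, CARDF) but serve the same purpose.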

  11. The Meridian Bank core schema (BRANCH, CUSTOMER, ACCOUNT) is your working laboratory. These three tables, designed in 3NF with surrogate keys, foreign keys, CHECK constraints, and consistent naming conventions, will be the foundation for every SQL example, performance exercise, and architecture discussion in the chapters ahead. Understand their design deeply — every choice was intentional.

  12. Good data modeling is an investment, not a cost. The NorthStar case study shows that shortcuts in data modeling compound over time into expensive remediation projects. The GPRH case study shows that disciplined normalization enables regulatory compliance, data quality, and even clinical safety. At Meridian Bank — handling millions of dollars daily — the stakes are just as high.