Chapter 13 Key Takeaways: Logical Database Design


Core Principles

  1. Logical design is a contract. The logical data model is an agreement between the business and the technical team about what data means, how it relates, and what rules govern it. It is independent of any database technology.

  2. The cost of skipping design always exceeds the cost of doing it. Design shortcuts save time in week one but create data anomalies by month three, accumulate workarounds by year one, and force costly re-platforming by year two.

  3. Design from business understanding, validate with normalization. Build the ER model top-down from requirements, then apply normalization bottom-up as a quality check. Normalization is a validation technique, not a design method.


Requirements Gathering

  1. Start with nouns (entities), ask about descriptors (attributes), then ask about connections (relationships). Follow this sequence in every stakeholder interview.

  2. Ask about edge cases and time. The happy path reveals the obvious entities. Edge cases reveal optional relationships, status attributes, and lifecycle rules. Time-related questions reveal whether you need history tables and effective dating.

  3. Treat existing schemas as evidence, not truth. Legacy databases contain valid entity knowledge but also accumulated workarounds and obsolete design decisions.


Entities and Attributes

  1. Every entity needs a surrogate primary key. Use system-generated integers. Enforce natural keys (SSN, account number, email) as UNIQUE constraints. Natural keys change; surrogate keys do not.

  2. Decompose composite attributes. Addresses, names, and other multi-part attributes should be broken into individual columns at the logical level. You will always need to query or sort by the components.

  3. Do not store derived attributes in the logical model. Age (derived from date of birth) and total balance (derived from summing accounts) belong in queries, not columns. Storage decisions come in physical design.

  4. Multi-valued attributes require a separate entity. A customer with multiple phone numbers needs a CUSTOMER_PHONE table, never a comma-separated list in a single column.


Relationships and Cardinality

  1. Always read relationships from both directions. "A customer has many accounts" and "An account belongs to one customer" are two views of the same relationship. Both must be validated.

  2. One-to-one relationships are rare. When you find one, ask whether the two entities should be merged. Legitimate reasons to keep them separate include access control, performance isolation, and optional existence.

  3. Many-to-many relationships must be resolved with a junction table. The junction table is a first-class entity with its own meaningful name and relationship-specific attributes.

  4. Identifying relationships propagate the parent key into the child's primary key. Use them for weak entities (LOAN_PAYMENT identified by LOAN_ID + PAYMENT_SEQUENCE). Non-identifying relationships keep the parent key as a regular foreign key.


ER Notation and Diagrams

  1. Use Crow's foot notation. It is the industry standard, compact, and supported by all major modeling tools. The notation matters less than the conversations the diagrams enable.

Supertype/Subtype Design

  1. Use supertype/subtype when entities share common attributes but have type-specific attributes. The discriminator attribute in the supertype determines which subtype table holds the additional data.

  2. Table-per-type is the recommended implementation for most enterprise systems. It avoids NULL-heavy rows (single table), avoids JOIN limitations (table-per-concrete-class), and allows proper NOT NULL constraints on subtype attributes.


Normalization Validation

  1. 1NF: Atomic values, no repeating groups. If you see comma-separated lists or numbered columns (PHONE1, PHONE2, PHONE3), you have a 1NF violation.

  2. 2NF: No partial dependencies on a composite key. Every non-key attribute must depend on the entire primary key.

  3. 3NF: No transitive dependencies. No non-key attribute should depend on another non-key attribute. If BRANCH_NAME depends on BRANCH_ID (which depends on CUSTOMER_ID), BRANCH_NAME does not belong in the CUSTOMER table.

  4. If normalization keeps pointing to the same missing entity, create it. Attributes that refuse to fit in any existing table are telling you that a new entity exists.


Design Patterns and Anti-Patterns

  1. Know the five common patterns: audit trails, effective dating, address polymorphism, configurable attributes, and tree structures. Recognizing them saves design time.

  2. Know the five anti-patterns: EAV abuse, the mega-table, OTLT, over-normalization, and comma-separated lists. Recognizing them prevents design disasters.

  3. The EAV anti-pattern destroys type safety, optimizer intelligence, and referential integrity. Never use it for core business data.

  4. Separate lookup tables per domain, not one lookup table for all domains (OTLT). Each lookup table can have domain-specific attributes and enforce proper foreign keys.


The Meridian Bank Model

  1. The complete model has 30+ entities across seven domains: customer, account, transaction, loan, product/fee, branch/employee, and regulatory/audit.

  2. Key design decisions are documented with rationale. Surrogate keys, ACCOUNT_HOLDER junction table, separate PENDING_TRANSACTION, BIGINT for high-volume tables, effective dating on fee and rate schedules.

  3. The logical model is the foundation for everything in Part III. Chapter 14 (physical design), Chapter 15 (index design), and Chapter 16 (schema evolution) all build on this model.