Glossary
Key terms used throughout Database Fundamentals. Cross-references point to the chapter where a term is introduced or treated most fully.
ACID — Atomicity, Consistency, Isolation, Durability: the four guarantees of a transaction (Ch. 26).
Aggregate function — a function (COUNT, SUM, AVG, MIN, MAX) that collapses many rows into one value; skips NULLs (Ch. 7).
Anti-join — finding rows in A with no matching row in B (NOT EXISTS / LEFT JOIN ... IS NULL / EXCEPT) (Ch. 6, 9, 10).
Atomicity — a transaction happens completely or not at all (Ch. 26).
Autovacuum — the background process that runs VACUUM/ANALYZE automatically (Ch. 28).
B-tree — the default index structure; a balanced, sorted tree giving O(log n) lookups, ranges, and ordering (Ch. 23).
BASE — Basically Available, Soft state, Eventual consistency: the availability-favoring counterpart to ACID in many NoSQL systems (Ch. 33, 35).
BCNF (Boyce-Codd Normal Form) — a stricter 3NF: every determinant is a candidate key (Ch. 19).
Bloat — wasted space from un-reclaimed dead tuples; grows when VACUUM can't keep up (Ch. 28).
Buffer pool (shared_buffers) — the in-memory cache of recently-used pages (Ch. 28).
Candidate key — a minimal set of columns that uniquely identifies a row (Ch. 3).
CAP theorem — during a network partition, a distributed system can guarantee at most two of Consistency, Availability, Partition tolerance (Ch. 35).
Cardinality — (relationship) how many entities relate (1:1, 1:N, M:N) (Ch. 17); (relation) the number of rows (Ch. 3).
CHECK constraint — a rule each row must satisfy (e.g., price >= 0) (Ch. 3, 14).
Closure — a relational operation takes relations in and returns a relation out; enables subqueries, CTEs, views (Ch. 4).
Closure table — a hierarchy model storing all ancestor-descendant pairs for fast subtree reads (Ch. 21).
Columnar storage — storing data by column; fast for aggregations over few columns/many rows; used by analytical databases (Ch. 28, 34).
Composite key — a primary/unique key spanning multiple columns (Ch. 3).
Concurrency control — mechanisms (locks, MVCC) for correct simultaneous access (Ch. 27).
Connection pool — a reusable set of database connections shared across requests (Ch. 27, 29).
Constraint — a declared rule the database enforces (NOT NULL, UNIQUE, CHECK, PK, FK) (Ch. 3, 14).
Correlated subquery — a subquery referencing the outer query; evaluated per outer row (Ch. 9).
CTE (Common Table Expression) — a named temporary result (WITH); recursive CTEs traverse hierarchies (Ch. 11).
Cursor — a database client object that executes statements and holds results (Ch. 29).
Data warehouse — an analytics-optimized database (denormalized, often columnar) fed from OLTP (Ch. 34).
Deadlock — a cycle of transactions each waiting for a lock the other holds; PostgreSQL detects and aborts one (Ch. 27).
Dead tuple — an old row version left by MVCC, reclaimed by VACUUM (Ch. 28).
Denormalization — deliberately re-introducing redundancy for read performance (Ch. 20).
Dimension table — descriptive context (date, product, customer) in a star schema (Ch. 34).
DML / DDL — Data Manipulation Language (INSERT/UPDATE/DELETE) / Data Definition Language (CREATE/ALTER/DROP) (Ch. 13, 14).
Durability — committed data survives crashes (via the WAL) (Ch. 26, 28).
Embedding — a vector representing the meaning of text/image; used in vector search (Ch. 36).
Entity — a thing your system tracks (becomes a table) (Ch. 17).
ER model — Entity-Relationship model: a conceptual data design (Ch. 17).
ETL / ELT — Extract-Transform-Load / Extract-Load-Transform: data pipeline shapes (Ch. 31).
Eventual consistency — replicas converge over time; a read may be briefly stale (Ch. 33, 35).
EXPLAIN — shows the query plan (and, with ANALYZE, actual execution) (Ch. 24).
Fact table — the measurements/events (with measures + dimension keys) in a star schema (Ch. 34).
Fan-out — a one-to-many join multiplying rows, causing over-counting if mis-aggregated (Ch. 6).
Foreign key (FK) — a column whose values must match a primary key elsewhere; enforces referential integrity (Ch. 3).
Functional dependency — A → B: A determines B; the tool of normalization (Ch. 19).
GIN index — an index for composite values: JSONB, arrays, full-text (Ch. 16, 23).
GiST index — an index for ranges, geometric/spatial, nearest-neighbor (Ch. 23, 36).
Grain — what one row of a (joined or fact) table represents; key for correct aggregation (Ch. 6, 7, 34).
Index — a separate sorted structure for fast lookups (Ch. 23).
Isolation level — how much concurrent transactions are insulated (Read Committed, Repeatable Read, Serializable) (Ch. 26).
JSONB — PostgreSQL's binary JSON type; indexable, queryable document storage (Ch. 16).
Junction table — a table resolving a many-to-many relationship (composite key of two FKs) (Ch. 17, 18).
Least privilege — granting each role only the permissions it needs (Ch. 32).
Lost update — a read-modify-write race where one update overwrites another (Ch. 27).
Materialized view — a view whose results are stored (cached) and refreshed (Ch. 15).
MVCC (Multi-Version Concurrency Control) — keeping multiple row versions so readers don't block writers (Ch. 26, 28).
N+1 query problem — accessing a relationship in a loop firing one query per row (Ch. 30).
NoSQL — non-relational databases (document, key-value, column-family, graph) (Ch. 33).
Normalization — structuring tables to eliminate redundancy (1NF–BCNF) (Ch. 19).
NULL — the marker for unknown/absent; obeys three-valued logic; test with IS NULL (Ch. 3).
OLTP / OLAP — transactional (run the app) / analytical (analyze the business) workloads (Ch. 34).
Optimistic concurrency — assume conflicts are rare; check-at-write (version column) and retry (Ch. 27).
ORM — Object-Relational Mapper; maps objects to rows, generating SQL (Ch. 30).
Page — the 8 KB unit of storage and I/O; tables are collections of pages (the heap) (Ch. 28).
Parameterized query — passing values separately from the SQL; prevents SQL injection (Ch. 29, 32).
Partitioning — splitting one table into partitions (range/list/hash) for scale/maintenance (Ch. 25).
Partition pruning — skipping partitions that can't match the query's partition-key filter (Ch. 25).
Pessimistic concurrency — lock the row up front (SELECT ... FOR UPDATE) (Ch. 27).
PITR (Point-In-Time Recovery) — restoring to any moment via base backup + archived WAL (Ch. 28, 38).
Primary key (PK) — the chosen unique, non-null identifier of a row (Ch. 3).
Projection (π) — choosing columns (the SELECT list) (Ch. 4).
Referential integrity — the guarantee that foreign keys reference existing rows (Ch. 3).
Relation / tuple / attribute — table / row / column (the relational model's formal terms) (Ch. 3).
Replication — keeping copies of the database on other servers (primary-replica) (Ch. 35).
Row-Level Security (RLS) — database-enforced per-row access policies (Ch. 32).
Selection (σ) — choosing rows (the WHERE clause) (Ch. 4).
Sharding — splitting data across servers (horizontal partitioning) for scale (Ch. 35).
Slowly Changing Dimension (SCD) — strategies (Type 1/2/3) for dimension changes over time (Ch. 34).
SQL injection — executing attacker-controlled SQL via concatenated input; prevented by parameterization (Ch. 29, 32).
Star schema — a fact table surrounded by dimension tables (dimensional modeling) (Ch. 34).
Staging table — a loose table to bulk-load raw data before transforming into real tables (Ch. 31).
Surrogate key — an artificial, stable, meaningless identifier (vs. a natural key) (Ch. 3).
Three-valued logic — true/false/unknown, arising from NULL comparisons (Ch. 3).
Transaction — an all-or-nothing unit of work (Ch. 26).
Trigger — a function that runs automatically on data changes (Ch. 15).
Vector search — nearest-neighbor similarity over embeddings (Ch. 36).
View — a named, saved query (virtual table) (Ch. 15).
WAL (Write-Ahead Log) — the durable log of changes written before data pages; basis of durability, recovery, replication (Ch. 28).
Window function — computes over related rows without collapsing them (OVER (...)) (Ch. 12).
See also: the Index (term → chapter) and each chapter's key-takeaways.