> Where you are: Part I, Chapter 2 of 40. Chapter 1 was all ideas. This one is all doing. By the end you'll have PostgreSQL running, the mercado practice database loaded, and your first real queries returning rows.
In This Chapter
- From ideas to a running database
- Installing PostgreSQL
- The client/server model (a 60-second mental model)
- Meeting psql
- A graphical client (optional)
- Creating your first database
- Your first table, by hand
- Loading the Mercado practice database
- Your first real queries on Mercado
- Controlling how psql displays results
- Scripting with psql: files, variables, and safety
- Connections, roles, and authentication (a practical primer)
- Running PostgreSQL in the cloud
- Making psql your own: .psqlrc
- Troubleshooting the most common setup problems
- Encoding, collation, and locale: get it right once
- Versions and staying current
- Asking the database about itself
- A word about schemas: the public namespace
- Your first backup and restore (a two-minute preview)
- A psql workflow that scales
- Progressive project: stand up your own database
- The big picture: what you've actually built
- Summary
Chapter 2: Setting Up PostgreSQL, Your First Database, and the Tools You'll Use
Where you are: Part I, Chapter 2 of 40. Chapter 1 was all ideas. This one is all doing. By the end you'll have PostgreSQL running, the
mercadopractice database loaded, and your first real queries returning rows.Learning paths: 💻 📊 🔬 🏗️ — everyone. This is the setup the whole book depends on.
From ideas to a running database
In Chapter 1 we argued, in the abstract, that a database management system solves problems spreadsheets can't. Now you're going to install one of the best DBMSs in the world — PostgreSQL — and start using it. This chapter is a checklist as much as a lesson: install the software, meet the tools, create a database, load the practice data, and run your first queries. Take it slowly, type every command yourself, and don't move on until each step works. Setup friction is the single most common reason people give up on databases; we're going to get you past it cleanly.
There are three things you'll install or meet:
- The PostgreSQL server — the actual DBMS, a program that runs in the background and manages your data.
psql— the official command-line client. Plain, fast, scriptable, and the tool this book uses for nearly every example. Learn it and you can work on any machine, including a bare server over SSH.- A graphical client (optional) — pgAdmin or DBeaver, if you prefer clicking to typing. Useful, but
psqlis what we'll show.
Why this matters. You could use a hosted database and never install anything. But installing PostgreSQL locally — and especially understanding the client/server split — demystifies what a database actually is: a server process you talk to, not a file you open. That mental model pays off in every later chapter, from connections (Ch. 29) to administration (Ch. 38).
Installing PostgreSQL
PostgreSQL runs on macOS, Windows, and Linux. Pick the path for your platform. If you want the least-friction, most-consistent experience — and you don't mind installing Docker — skip to the Docker section; it gives everyone in the book the exact same environment.
Whatever route you choose, you want PostgreSQL 15 or newer (16 or 17 are great too). Everything in this book targets 15+.
macOS
The simplest option is Postgres.app: download it, drag it to Applications, open it, and click "Initialize." You now have a running server. Add its command-line tools to your PATH as the app instructs, so psql works in your terminal.
Or use Homebrew:
brew install postgresql@16
brew services start postgresql@16 # start the server now and on login
Windows
Download the installer from EDB (the standard Windows distribution). Run it and accept the defaults. The installer sets up the server, psql, and pgAdmin, and asks you to choose a password for the postgres superuser — write that password down. It also offers "Stack Builder," which you can skip.
After installing, open "SQL Shell (psql)" from the Start menu, or use PowerShell. You may need to add PostgreSQL's bin directory (e.g. C:\Program Files\PostgreSQL\16\bin) to your PATH so psql works everywhere.
Linux (Debian/Ubuntu)
sudo apt update
sudo apt install postgresql postgresql-client
sudo systemctl enable --now postgresql # start now and on boot
On Fedora/RHEL: sudo dnf install postgresql-server postgresql, then sudo postgresql-setup --initdb and sudo systemctl enable --now postgresql.
On Linux, installation creates a system user named postgres. The quickest way to get a superuser shell is:
sudo -u postgres psql
Docker (the universal option — recommended for consistency)
If you have Docker, you can run PostgreSQL without installing it on your host at all. This is how we recommend following the book if you want zero surprises — and it's also how the instructor lab is set up (see the instructor guide).
docker run --name mercado-pg \
-e POSTGRES_PASSWORD=mercado \
-e POSTGRES_DB=mercado \
-p 5432:5432 \
-d postgres:16
That single command downloads PostgreSQL 16, starts it in the background, creates a database named mercado, and exposes it on the standard port 5432. To open a psql session inside the container:
docker exec -it mercado-pg psql -U postgres -d mercado
To stop and start it later: docker stop mercado-pg / docker start mercado-pg. Your data persists between stops (it lives in the container's volume).
Try this. Whatever route you took, verify the install from a terminal:
bash psql --versionYou should see something likepsql (PostgreSQL) 16.2. If the command isn't found, yourPATHdoesn't include PostgreSQL'sbindirectory — fix that before continuing.
The client/server model (a 60-second mental model)
Before you connect, understand what you're connecting to. PostgreSQL is a server: a long-running process (often called the postmaster and its workers) that owns your data files and listens for connections, by default on TCP port 5432. A client — psql, pgAdmin, or your Python app later — opens a connection, sends SQL, and receives results.
┌──────────────┐ connection (port 5432) ┌────────────────────┐
│ CLIENT │ ───────── SQL ─────────────► │ PostgreSQL SERVER │
│ psql / GUI │ ◄─────── rows ───────────── │ (owns the data) │
│ / your app │ │ database: mercado│
└──────────────┘ └────────────────────┘
This split is why the same database can serve many clients at once (Chapter 1's concurrency), and why in Part V your Python application will connect the same way psql does. The data lives in one place — the server — and everyone talks to it.
To connect, a client needs: a host (where the server is — localhost for your own machine), a port (5432), a database name, and a user/role (with a password if required). psql accepts these as flags or as a connection string:
psql -h localhost -p 5432 -U postgres -d postgres
# or, equivalently:
psql "postgresql://postgres@localhost:5432/postgres"
If you installed locally and omit flags, psql uses sensible defaults (your OS username, local host, port 5432).
Meeting psql
psql is a REPL — a read-evaluate-print loop — for SQL. Once connected, you'll see a prompt like:
psql (16.2)
Type "help" for help.
postgres=#
The prompt tells you which database you're in (postgres) and whether you're a superuser (# for superuser, > for an ordinary user). You can do two kinds of things at this prompt:
- Run SQL — anything ending in a semicolon
;. The semicolon is what tellspsql"this statement is finished; run it." Forgetting it is the #1 beginner stumble:psqljust shows a continuation prompt (postgres-#) and waits. - Run meta-commands —
psql's own shortcuts, which start with a backslash\. These are not SQL; they're conveniences for navigating the database. The most important ones:
| Meta-command | What it does |
|---|---|
\l |
List all databases |
\c mercado |
Connect to the mercado database |
\dt |
List tables in the current database |
\d customers |
Describe the customers table (columns, types, keys, indexes) |
\du |
List roles (users) |
\dn |
List schemas (namespaces) |
\x |
Toggle expanded display (great for wide rows) |
\timing |
Toggle showing how long each query takes |
\i path/file.sql |
Run (include) the SQL in a file |
\e |
Open the last query in your editor |
\? |
Help on meta-commands |
\h SELECT |
Help on the SQL syntax of a command |
\q |
Quit |
Try this. At the prompt, type
\land press Enter to see your databases. Then\?to skim the meta-command list. You don't need to memorize them —\?is always there — but\dt,\d,\c,\i, and\qwill become reflexes within a day.
A graphical client (optional)
If you prefer a visual tool, two excellent free options:
- pgAdmin — the official PostgreSQL GUI; ships with the Windows installer. A tree of servers/databases/tables on the left, a query editor on the right, and visual table browsers.
- DBeaver — a popular cross-database GUI; clean query editor, good ER-diagram viewer, works with PostgreSQL and everything else.
Connect a GUI by giving it the same four things psql needs: host (localhost), port (5432), database, and user/password. We'll show psql throughout because it's universal and copy-pasteable, but everything works identically in a GUI — pick whatever keeps you writing SQL.
Dialect Difference.
psql's backslash meta-commands (\dt,\d, etc.) are specific to PostgreSQL's client. Other databases have their own (MySQL'sSHOW TABLES;, SQL Server'ssp_help). The SQL you write is mostly portable; the client conveniences are not.
Creating your first database
When PostgreSQL is installed it already contains a default database (usually called postgres) that you can connect to. But you'll make your own for the practice data. There are two equivalent ways.
From the shell, using the createdb helper command:
createdb mercado
Or from inside psql, with SQL:
CREATE DATABASE mercado;
(If you used the Docker command above, mercado already exists — POSTGRES_DB=mercado created it for you.)
Now connect to it. From the shell: psql -d mercado. From inside psql: \c mercado. Your prompt changes to show the new database:
mercado=#
You're now "inside" the mercado database. Every table you create and every query you run happens here.
Common mistake. Creating a table while connected to the wrong database. If you
CREATE TABLEwhile your prompt sayspostgres=#, your table lands in thepostgresdatabase, notmercado. Glance at the prompt before you run DDL. (\c mercadoputs you where you mean to be.)
Your first table, by hand
Before loading the full Mercado dataset, let's make one tiny table from scratch — so you see the whole lifecycle of create → insert → query → drop in miniature. Type this at the mercado=# prompt:
CREATE TABLE hello (
id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
word text NOT NULL
);
INSERT INTO hello (word) VALUES ('databases'), ('are'), ('fun');
SELECT * FROM hello;
PostgreSQL responds:
id | word
----+-----------
1 | databases
2 | are
3 | fun
(3 rows)
You just exercised the core loop of all database work: you defined a structure (CREATE TABLE), added data (INSERT), and asked a question (SELECT). The GENERATED ALWAYS AS IDENTITY clause told PostgreSQL to assign the id values automatically — you'll meet it properly in Chapter 14. Now clean up:
DROP TABLE hello;
Gone. That five-line round trip is, in essence, everything the rest of the book elaborates on.
Loading the Mercado practice database
Now the real thing. The book ships three SQL files (in the sql/ folder of the book's materials):
schema.sql— creates Mercado's 13 tables, keys, and constraints. (No data.)seed-sample.sql— loads a small, deterministic dataset (12 customers, 15 products, 15 orders, …). Because it's fixed, the query results printed in this book will exactly match what you see. Use this one to follow along.generate_data.sql— loads a large dataset (~100,000 orders) for the performance chapters (23–25). You'll run this later, when those chapters ask you to.
Load the schema and the sample data. From your shell (in the folder that contains sql/):
psql -d mercado -f sql/schema.sql
psql -d mercado -f sql/seed-sample.sql
Or, from inside a psql session connected to mercado:
\i sql/schema.sql
\i sql/seed-sample.sql
You'll see a stream of CREATE TABLE, INSERT 0 N, and SELECT 1 messages scroll by. That's success.
Verify the load
Trust, but verify. List the tables:
\dt
List of relations
Schema | Name | Type | Owner
--------+-------------+-------+----------
public | addresses | table | postgres
public | categories | table | postgres
public | customers | table | postgres
public | employees | table | postgres
public | inventory | table | postgres
public | order_items | table | postgres
public | orders | table | postgres
public | payments | table | postgres
public | products | table | postgres
public | reviews | table | postgres
public | shipments | table | postgres
public | suppliers | table | postgres
public | warehouses | table | postgres
(13 rows)
Thirteen tables, exactly as designed. Now confirm the data loaded by counting a few:
SELECT
(SELECT count(*) FROM customers) AS customers,
(SELECT count(*) FROM products) AS products,
(SELECT count(*) FROM orders) AS orders,
(SELECT count(*) FROM order_items) AS order_items;
customers | products | orders | order_items
-----------+----------+--------+-------------
12 | 15 | 15 | 26
(1 row)
If you see those numbers, Mercado is loaded and you're ready for the rest of the book. (Don't worry about the exact order_items count to the unit — it's the shape that matters: a dozen customers, fifteen products, fifteen orders, a couple dozen line items.)
Try this. Inspect a table's structure with
\d:sql \d customersYou'll see every column, its type, whether it allowsNULL, the primary key, and any constraints.\d <table>is the fastest way to remember what's in a table you haven't touched in a while — you'll use it constantly.
Your first real queries on Mercado
Let's actually use the data. Type these one at a time:
-- Everyone in the customers table
SELECT * FROM customers;
customer_id | first_name | last_name | email | phone | loyalty_tier | created_at
-------------+------------+-----------+--------------------------+----------+--------------+------------------------
1 | Alice | Nguyen | alice.nguyen@example.com | 555-0101 | gold | 2023-01-15 00:00:00+00
2 | Bob | Martinez | bob.martinez@example.com | 555-0102 | standard | 2023-02-03 00:00:00+00
...
(12 rows)
-- Just the products that cost more than $200, cheapest first
SELECT name, price
FROM products
WHERE price > 200
ORDER BY price;
-- How many orders does each status have?
SELECT status, count(*) AS num_orders
FROM orders
GROUP BY status
ORDER BY num_orders DESC;
You won't fully understand WHERE, ORDER BY, or GROUP BY yet — that's Chapters 5 and 7. The point right now is simpler and more important: you asked the database questions in SQL, and it answered them from organized, persistent data. The abstraction from Chapter 1 is now something running on your own machine.
Try this. Turn on query timing and the expanded display, then re-run a query:
sql \timing \x SELECT * FROM products WHERE sku = 'LAP-001';\xreformats wide rows as a vertical list (much easier to read), and\timingprints how long the query took. Toggle them off by running\xand\timingagain.
Controlling how psql displays results
psql's default output is fine for narrow results, but real tables are wide, and the default grid wraps into an unreadable mess. A handful of formatting controls fix that, and they're worth learning early because you'll stare at query output for the rest of the book.
\x(expanded display) flips each row from a horizontal line into a vertical list ofcolumn | valuepairs. Indispensable for wide tables likeproductsororders. Run\x autoandpsqldecides per result whether to expand based on terminal width — the best of both worlds.\pset null '∅'makesNULLs visible. By default aNULLprints as nothing — indistinguishable from an empty string. Setting a marker (some people use(null)or¤) saves you from a whole class of confusion when you reach Chapter 3's discussion ofNULL.\pset border 2and\pset linestyle unicodedraw nicer table borders.\atoggles "aligned" vs "unaligned" output;\ttoggles the header and row-count footer. Together (\a \t) they produce raw, parseable output — handy when piping results to another tool.\gre-runs the previous query;\gxruns it with expanded display for just this one result without toggling\xglobally.
-- A wide row is far more readable expanded:
\x
SELECT * FROM products WHERE sku = 'LAP-001';
-[ RECORD 1 ]------------------------------
product_id | 1
sku | LAP-001
name | Nimbus 14 Laptop
category_id | 3
supplier_id | 2
price | 1299.00
attributes | {"ram": "16GB", "cpu": "M3"}
is_active | t
created_at | 2023-01-10 00:00:00+00
Try this. Set up readable defaults for this session and re-run a query:
sql \pset null '∅' \x auto SELECT customer_id, first_name, phone FROM customers;Now the customers with no phone show∅instead of a blank you might mistake for an empty string. We'll come back to why that distinction matters in Chapter 3.
Scripting with psql: files, variables, and safety
You already loaded the schema with \i. As your work grows, you'll run SQL from files constantly — migrations, reports, data loads — and a few psql features make that reliable rather than error-prone.
Run a file and stop on the first error. By default, psql plows ahead even if statement 3 of 50 fails, which can leave your database half-built. Always load schema files with ON_ERROR_STOP so a failure halts immediately:
psql -d mercado -v ON_ERROR_STOP=1 -f sql/schema.sql
Run a single statement without entering the REPL with -c:
psql -d mercado -c "SELECT count(*) FROM orders;"
Wrap a whole file in one transaction with -1 (or --single-transaction), so it either fully succeeds or fully rolls back — exactly what you want for a migration:
psql -d mercado -1 -v ON_ERROR_STOP=1 -f migration-001.sql
Use variables for repeatable scripts. psql substitutes :var (and :'var' for a quoted string):
\set tier 'gold'
SELECT first_name, last_name FROM customers WHERE loyalty_tier = :'tier';
Common mistake. Loading a multi-statement script without
ON_ERROR_STOPand assuming success because the prompt came back. If statement 12 failed with a constraint error, the rest may have run anyway, leaving an inconsistent database and a confusing error buried in the scrollback. Make-v ON_ERROR_STOP=1a reflex for any file that matters. (We formalize this discipline in Chapter 22, Schema Migration.)
Connections, roles, and authentication (a practical primer)
In Chapter 1 we said a database is something you connect to. That connection has to be authenticated, and PostgreSQL's model here confuses newcomers, so a quick orientation now saves pain later (the full treatment is Chapter 32, Security, and Chapter 38, Administration).
- Roles are users. In PostgreSQL, "users" and "groups" are unified into one concept: the role. The
postgresrole created at install is a superuser — it can do anything. You'll create less-powerful roles for applications in Part V. - Authentication is configured in
pg_hba.conf(HBA = host-based authentication). This file decides who can connect, from where, to which database, and how they must prove identity (password, trust, certificate). When a local connection "just works" with no password, it's because anpg_hba.confrule allowed it; when you getpassword authentication failed, a rule required a password you didn't supply. - Connection settings have defaults from environment variables. Rather than typing
-h -p -U -devery time, you can setPGHOST,PGPORT,PGUSER, andPGDATABASE. And rather than typing your password, you can store it once in a~/.pgpassfile (chmod 600), whichpsqland your apps will read automatically.
# Set once in your shell profile, then just type `psql`:
export PGHOST=localhost
export PGPORT=5432
export PGUSER=postgres
export PGDATABASE=mercado
# ~/.pgpass (format: host:port:database:user:password ; mode 600)
localhost:5432:mercado:postgres:mercado
Why this matters. Understanding that a connection is
(host, port, database, role, auth)— not a magic incantation — is what lets you debug "it works on my machine but not in production" later. Nearly every connection failure is one of those five pieces being wrong. Keep the list in mind; Part V's application connections use the exact same five.Common mistake (security). Putting a real password in a shell command (
PGPASSWORD=secret psql ...) or committing it to a script. It ends up in your shell history and your version control. Use~/.pgpass(file permissions protect it) or a secrets manager. We treat credential handling properly in Chapter 32.
Running PostgreSQL in the cloud
You installed PostgreSQL locally, which is the best way to learn — you control everything and nothing costs money. But in production, most teams run a managed PostgreSQL service, where a cloud provider handles backups, patching, replication, and failover for you. You should know the landscape exists:
- Amazon RDS / Aurora for PostgreSQL, Google Cloud SQL, Azure Database for PostgreSQL — the big-three managed offerings. You get a PostgreSQL endpoint; they run the server.
- Supabase, Neon, Render, Railway — developer-friendly platforms with generous free tiers, instant provisioning, and (for Neon) "serverless" Postgres that scales to zero.
The crucial point for this book: a managed PostgreSQL is still PostgreSQL. You connect with the same psql, run the same SQL, and everything you learn here applies unchanged. The only differences are operational — you don't manage the server process, and some superuser-only features are restricted. When a chapter says "connect to your database," a cloud endpoint works exactly as well as localhost.
Dialect Difference. Some managed services run PostgreSQL-compatible engines (e.g., Amazon Aurora) or older major versions. Check the version with
SELECT version();and confirm it's 15+ for full compatibility with this book's examples.
Making psql your own: .psqlrc
psql reads a file called .psqlrc in your home directory every time it starts, so you can bake in your preferred settings once. A sensible starter:
-- ~/.psqlrc
\set QUIET 1
\timing on
\x auto
\pset null '∅'
\set COMP_KEYWORD_CASE upper
\set HISTSIZE 5000
\unset QUIET
\echo 'psql ready — type \\? for help, \\q to quit'
This turns on timing, auto-expanded display, a visible NULL marker, upper-cased keyword tab-completion, and a bigger command history — every session, automatically. Small comfort, large cumulative payoff over a book's worth of queries.
Try this. Create the file above (adjust to taste), then start a fresh
psql. You should see your banner, and\timingand theNULLmarker already on. If you ever want a clean session without it, start withpsql -X(the-Xskips.psqlrc).
Troubleshooting the most common setup problems
psql: command not found— PostgreSQL'sbindirectory isn't on yourPATH. Find it (e.g.C:\Program Files\PostgreSQL\16\binon Windows, or the Postgres.app tools path on macOS) and add it.could not connect to server/connection refused— the server isn't running. Start it (brew services start postgresql@16,sudo systemctl start postgresql, ordocker start mercado-pg).password authentication failed for user "postgres"— you typed the wrong password (the one you set during install), or you're connecting as the wrong user. On Linux, trysudo -u postgres psql.database "mercado" does not exist— you skippedcreatedb mercado(or you're connected to the wrong server). Create it, then reconnect.permission denied to create database— your role lacks the privilege. Connect as a superuser (postgres) to create the database, or grant your roleCREATEDB.- The
\i sql/schema.sqlfile isn't found — paths in\iare relative to where you launchedpsql. Use the full path, orcdto the right folder first.
Common mistake. Running a long statement and seeing the prompt change to
mercado-#(note the dash) with nothing happening. That dash means "I'm still waiting for you to finish this statement." You forgot the closing;. Type;and Enter, and it'll run. To abandon a half-typed statement, pressCtrl+C.
Encoding, collation, and locale: get it right once
Two settings, chosen when a database is created, quietly affect every text column you'll ever store, and they're awkward to change later — so it's worth thirty seconds of attention now.
- Encoding is how text is stored as bytes. The right answer in modern systems is always UTF-8, which can represent every character in every language (and emoji). PostgreSQL defaults to UTF-8 in almost all installations; the Docker image does. You can confirm with
\l, which shows each database's encoding. - Collation is the sort order and comparison rules for text — whether
'apple' < 'Banana', how accented characters order, whether comparisons are case-sensitive. Collation is tied to a locale (likeen_US.UTF-8). It determines whatORDER BY nameactually produces and which strings aUNIQUEtext column treats as equal.
-- See your databases, their owners, encodings, and collations:
\l
-- Sort order follows the database collation. To force a specific
-- ordering regardless of the default, you can ask for one per-query:
SELECT name FROM products ORDER BY name COLLATE "C"; -- raw byte order
For this book, the defaults are fine — UTF-8 encoding and your system's locale. The reason to know they exist is that collation surprises are real: a query that sorts "correctly" on your laptop can sort differently on a server with a different locale, and a case-insensitive search is a collation question, not a WHERE trick. When you meet text sorting and searching in Part II, you'll know where the behavior comes from.
Common mistake. Creating a production database with a non-UTF-8 encoding (sometimes a stale default on old systems) and discovering months later that it can't store a customer's name in their own alphabet. Check
\learly; recreate withCREATE DATABASE ... ENCODING 'UTF8'if needed, before there's data to migrate.
Versions and staying current
PostgreSQL ships a new major version roughly once a year (…15, 16, 17…), each adding features and performance, plus regular minor versions (16.1, 16.2…) that are bug-and-security fixes only. A few practical rules:
- Target 15+ for this book. Everything here works on 15, 16, and 17. Check yours with
SELECT version();orpsql --version. - Minor upgrades are safe and important — they're just fixes, fully compatible, and include security patches. Apply them.
- Major upgrades can change behavior and require a planned migration (
pg_upgradeor dump-and-restore). That's an administration topic (Chapter 38), not something to fear, but not something to do absent-mindedly on a live system either.
The version cadence matters because it's how PostgreSQL keeps absorbing the capabilities that, per theme #4, let it replace other databases — JSONB, better partitioning, logical replication, and more have all arrived this way. Reading release notes once a year is one of the cheapest ways to keep your skills current (a habit we revisit in Chapter 40, the database career).
Asking the database about itself
One of the relational model's quiet elegances (Chapter 3 will make this formal) is that PostgreSQL describes itself using the very same tables-and-queries machinery it offers you. The \d meta-commands you've been using are really just pre-packaged queries against this self-description. Knowing how to query it directly is a skill you'll use for the rest of the book — and the rest of your career.
There are two ways in:
information_schema— a standardized, cross-database set of views defined by the SQL standard. Portable, if a little verbose.- The PostgreSQL system catalogs (
pg_catalog, thepg_*tables) — PostgreSQL-specific, more detailed, and what the\dcommands actually read.
-- The portable way: list your tables via the standard information_schema.
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name;
-- List the columns and types of one table:
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'customers'
ORDER BY ordinal_position;
-- The PostgreSQL way: a quick size report on every table, biggest first.
-- (pg_catalog functions are PostgreSQL-specific but enormously handy.)
SELECT relname AS table,
pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r' AND relnamespace = 'public'::regnamespace
ORDER BY pg_total_relation_size(oid) DESC;
You don't need these today, but file them away: when you wonder "what tables reference customers?" or "which of my tables has grown huge?", the answer is a query against the catalog. The database is the most authoritative documentation of itself.
Try this. Run
\dt+(note the+). The plus sign askspsqlfor extra detail — including each table's size on disk. Many\dcommands take a+:\d+ customers,\df+,\dn+. It's the fastest way to get more without writing a catalog query yourself.Theory → Practice. That a database stores its own structure as data you can query is not a gimmick — it's the relational model applied to itself, and it's how tools like pgAdmin, ORMs (Chapter 30), and migration frameworks (Chapter 22) discover your schema. Everything is values in relations, even the description of the relations.
A word about schemas: the public namespace
When you ran \dt you may have noticed every table listed under a Schema column reading public. A schema in PostgreSQL is a namespace inside a database — a folder for tables, in effect — and every database starts with one called public where your tables land by default. You won't organize tables into multiple schemas until later (it's a design and security tool, Chapters 21 and 32), but two facts are worth knowing now so nothing later seems mysterious:
- Fully-qualified names.
customersis shorthand forpublic.customers. You can always write the fullschema.tablename, and you must when two schemas hold tables of the same name. - The
search_path. PostgreSQL finds unqualified names by walking a list of schemas called thesearch_path(by default, roughly"$user", public). It's why typingcustomersfindspublic.customerswithout you saying so.
-- See and set the search path:
SHOW search_path;
-- List tables in a specific schema explicitly:
SELECT table_name FROM information_schema.tables
WHERE table_schema = 'public';
For now, everything lives in public and you can ignore schemas entirely. File the concept away: when an enterprise database greets you with sales.orders and hr.employees, you'll know those are two schemas organizing one database, not two databases — and that the search_path is what decides which one a bare name resolves to.
Dialect Difference. "Schema" is an overloaded word. In PostgreSQL it's a namespace within a database. In MySQL, "schema" is essentially a synonym for "database." In Oracle, a schema is tied to a user account. Same word, three different scopes — a frequent source of cross-database confusion. Appendix J keeps the terms straight.
Your first backup and restore (a two-minute preview)
You'll spend a whole chapter on backups (Chapter 38), but you can — and should — learn the one command that protects your work today. PostgreSQL ships pg_dump, which writes a database's entire contents to a file as SQL (or a compressed archive):
# Back up the whole mercado database to a plain-SQL file:
pg_dump -d mercado -f mercado-backup.sql
# Restore it into a fresh database (the file is just SQL statements):
createdb mercado_restored
psql -d mercado_restored -f mercado-backup.sql
That plain-SQL dump is human-readable — open it and you'll see the CREATE TABLE and INSERT statements that rebuild your database. For larger databases you'd use the custom format (pg_dump -Fc), which is compressed and lets you restore selectively with pg_restore. But the idea is the same, and the lesson from Chapter 1 stands: a backup you have never restored is not a backup. Try the round-trip above once now, so the muscle memory exists before you need it in anger.
Common mistake. Confusing
pg_dump(a client program you run from the shell) with a SQL command you run insidepsql.pg_dumpis its own executable, likepsqlandcreatedb— run it from your terminal, not from themercado=#prompt.
A psql workflow that scales
As you move through the book you'll settle into a rhythm. A workflow that holds up from Chapter 5 through the capstone:
- Keep your SQL in files, not just typed at the prompt. Write a query in your editor, save it as
report.sql, and run it with\i report.sql. When it needs tweaking, edit the file and re-run — far better than retyping a long query. - Use
\eto pop the last statement into your$EDITOR, fix it, and have it run when you save and close. Perfect for iterating on a query that's almost right. - Lean on history and tab-completion. The up-arrow recalls previous statements;
Tabcompletes table and column names. Both save enormous time and prevent typos. - Re-run with
\g/\watch.\grepeats the last query;\watch 5re-runs it every five seconds — a poor-man's dashboard for watching a count change as you load data. - Inspect, don't guess. Reach for
\d tablebefore writing a query against a table you haven't touched recently. Thirty seconds of\dsaves five minutes of column-name guessing.
These habits are small, but compounded over a book's worth of queries — and a career's — they're the difference between fighting the tool and flowing through it. Theme #2 in miniature: you learn SQL by writing it, and a comfortable environment is what keeps you writing.
Progressive project: stand up your own database
You loaded Mercado. Now create the home for your project (the domain you chose in Chapter 1):
createdb myproject # use your own name: library, clinic, etc.
psql -d myproject
You don't have a schema yet — that comes in Part III — but having the empty database ready makes the project feel real. As an optional warm-up, create one table for the most central entity in your domain (e.g., members for a library, patients for a clinic) with two or three columns, insert a couple of rows, SELECT them back, and then DROP the table. The goal isn't the table; it's getting comfortable with the create → insert → query loop in your database. You'll design it properly soon.
The big picture: what you've actually built
Step back from the commands for a moment. You didn't just install software — you stood up the full client/server architecture that underlies essentially every application you use. There is now a server process on your machine that owns a set of data files, enforces rules, and answers questions in SQL; and there are clients (psql, maybe a GUI, soon your own programs) that connect to it over a port and converse in that language. The exact same shape — client, connection, server, database, role — describes a hobby project on your laptop and a banking system serving millions. You've just met it at the smallest, most legible scale.
That's the real deliverable of this chapter. The Mercado data is loaded, yes, but the durable win is the mental model: a database is a service you connect to, not a file you open; the data lives in one authoritative place; and everything — your queries, your app, your reports, your backups — is a client of that one server. Every remaining chapter is, in some sense, a deeper exploration of one part of this picture you've now assembled end to end.
Why this matters. People who skip a real local setup and only ever poke at a hosted database through a web console tend to carry fuzzy notions of what a database is. Having installed the server, started and stopped it, connected as a role, and watched a query travel from client to server and back, you have a concrete model that will make connections (Chapter 29), security (Chapter 32), and administration (Chapter 38) feel like familiar territory rather than abstractions.
Summary
You installed PostgreSQL (locally or via Docker), met psql and its essential meta-commands (\l, \c, \dt, \d, \i, \q), and understood the client/server model that makes a database something you connect to, not a file you open. You created the mercado database, loaded its 13-table schema and the deterministic sample data, and verified the load. Finally, you ran your first real queries — and stood up an empty database for your own progressive project.
You can now: - Install and start PostgreSQL on your platform (including Docker). - Connect with
psqland navigate with meta-commands. - Create a database, create/insert/query/drop a table. - Load the Mercado schema and sample data, and verify it. - Diagnose the most common setup errors.
What's next. In Chapter 3 we return to fundamentals — but now with a live database to explore. You'll learn the relational model properly: what tables, rows, columns, keys, domains, and NULL really are, and the integrity rules that keep Mercado's data honest. The vocabulary you build there is the vocabulary of every chapter after it.
Practice in exercises.md, test yourself with the quiz, see setup decisions in the case studies, review the key takeaways, and go deeper with further reading.