Chapter 2: Setting Up PostgreSQL, Your First Database, and the Tools You'll Use

DataField.Dev

26 min read

> Where you are: Part I, Chapter 2 of 40. Chapter 1 was all ideas. This one is all doing. By the end you'll have PostgreSQL running, the mercado practice database loaded, and your first real queries returning rows.

In This Chapter

From ideas to a running database
Installing PostgreSQL
The client/server model (a 60-second mental model)
Meeting psql
A graphical client (optional)
Creating your first database
Your first table, by hand
Loading the Mercado practice database
Your first real queries on Mercado
Controlling how psql displays results
Scripting with psql: files, variables, and safety
Connections, roles, and authentication (a practical primer)
Running PostgreSQL in the cloud
Making psql your own: .psqlrc
Troubleshooting the most common setup problems
Encoding, collation, and locale: get it right once
Versions and staying current
Asking the database about itself
A word about schemas: the public namespace
Your first backup and restore (a two-minute preview)
A psql workflow that scales
Progressive project: stand up your own database
The big picture: what you've actually built
Summary

Exercises Quiz Case Study 01 Case Study 02 Key Takeaways Further Reading

Chapter 2: Setting Up PostgreSQL, Your First Database, and the Tools You'll Use

Where you are: Part I, Chapter 2 of 40. Chapter 1 was all ideas. This one is all doing. By the end you'll have PostgreSQL running, the mercado practice database loaded, and your first real queries returning rows.

Learning paths: 💻 📊 🔬 🏗️ — everyone. This is the setup the whole book depends on.

From ideas to a running database

In Chapter 1 we argued, in the abstract, that a database management system solves problems spreadsheets can't. Now you're going to install one of the best DBMSs in the world — PostgreSQL — and start using it. This chapter is a checklist as much as a lesson: install the software, meet the tools, create a database, load the practice data, and run your first queries. Take it slowly, type every command yourself, and don't move on until each step works. Setup friction is the single most common reason people give up on databases; we're going to get you past it cleanly.

There are three things you'll install or meet:

The PostgreSQL server — the actual DBMS, a program that runs in the background and manages your data.
psql — the official command-line client. Plain, fast, scriptable, and the tool this book uses for nearly every example. Learn it and you can work on any machine, including a bare server over SSH.
A graphical client (optional) — pgAdmin or DBeaver, if you prefer clicking to typing. Useful, but psql is what we'll show.

Why this matters. You could use a hosted database and never install anything. But installing PostgreSQL locally — and especially understanding the client/server split — demystifies what a database actually is: a server process you talk to, not a file you open. That mental model pays off in every later chapter, from connections (Ch. 29) to administration (Ch. 38).

Installing PostgreSQL

PostgreSQL runs on macOS, Windows, and Linux. Pick the path for your platform. If you want the least-friction, most-consistent experience — and you don't mind installing Docker — skip to the Docker section; it gives everyone in the book the exact same environment.

Whatever route you choose, you want PostgreSQL 15 or newer (16 or 17 are great too). Everything in this book targets 15+.

macOS

The simplest option is Postgres.app: download it, drag it to Applications, open it, and click "Initialize." You now have a running server. Add its command-line tools to your PATH as the app instructs, so psql works in your terminal.

Or use Homebrew:

brew install postgresql@16
brew services start postgresql@16      # start the server now and on login

Windows

Download the installer from EDB (the standard Windows distribution). Run it and accept the defaults. The installer sets up the server, psql, and pgAdmin, and asks you to choose a password for the postgres superuser — write that password down. It also offers "Stack Builder," which you can skip.

After installing, open "SQL Shell (psql)" from the Start menu, or use PowerShell. You may need to add PostgreSQL's bin directory (e.g. C:\Program Files\PostgreSQL\16\bin) to your PATH so psql works everywhere.

Linux (Debian/Ubuntu)

sudo apt update
sudo apt install postgresql postgresql-client
sudo systemctl enable --now postgresql    # start now and on boot

On Fedora/RHEL: sudo dnf install postgresql-server postgresql, then sudo postgresql-setup --initdb and sudo systemctl enable --now postgresql.

On Linux, installation creates a system user named postgres. The quickest way to get a superuser shell is:

sudo -u postgres psql

Docker (the universal option — recommended for consistency)

If you have Docker, you can run PostgreSQL without installing it on your host at all. This is how we recommend following the book if you want zero surprises — and it's also how the instructor lab is set up (see the instructor guide).

docker run --name mercado-pg \
  -e POSTGRES_PASSWORD=mercado \
  -e POSTGRES_DB=mercado \
  -p 5432:5432 \
  -d postgres:16

That single command downloads PostgreSQL 16, starts it in the background, creates a database named mercado, and exposes it on the standard port 5432. To open a psql session inside the container:

docker exec -it mercado-pg psql -U postgres -d mercado

To stop and start it later: docker stop mercado-pg / docker start mercado-pg. Your data persists between stops (it lives in the container's volume).

Try this. Whatever route you took, verify the install from a terminal: bash psql --version You should see something like psql (PostgreSQL) 16.2. If the command isn't found, your PATH doesn't include PostgreSQL's bin directory — fix that before continuing.

The client/server model (a 60-second mental model)

Before you connect, understand what you're connecting to. PostgreSQL is a server: a long-running process (often called the postmaster and its workers) that owns your data files and listens for connections, by default on TCP port 5432. A client — psql, pgAdmin, or your Python app later — opens a connection, sends SQL, and receives results.

   ┌──────────────┐     connection (port 5432)      ┌────────────────────┐
   │  CLIENT      │  ─────────  SQL  ─────────────►  │  PostgreSQL SERVER │
   │  psql / GUI  │  ◄───────  rows  ─────────────   │  (owns the data)   │
   │  / your app  │                                  │   database: mercado│
   └──────────────┘                                  └────────────────────┘

This split is why the same database can serve many clients at once (Chapter 1's concurrency), and why in Part V your Python application will connect the same way psql does. The data lives in one place — the server — and everyone talks to it.

To connect, a client needs: a host (where the server is — localhost for your own machine), a port (5432), a database name, and a user/role (with a password if required). psql accepts these as flags or as a connection string:

psql -h localhost -p 5432 -U postgres -d postgres
# or, equivalently:
psql "postgresql://postgres@localhost:5432/postgres"

If you installed locally and omit flags, psql uses sensible defaults (your OS username, local host, port 5432).

Meeting psql

psql is a REPL — a read-evaluate-print loop — for SQL. Once connected, you'll see a prompt like:

psql (16.2)
Type "help" for help.

postgres=#

The prompt tells you which database you're in (postgres) and whether you're a superuser (# for superuser, > for an ordinary user). You can do two kinds of things at this prompt:

Run SQL — anything ending in a semicolon ;. The semicolon is what tells psql "this statement is finished; run it." Forgetting it is the #1 beginner stumble: psql just shows a continuation prompt (postgres-#) and waits.
Run meta-commands — psql's own shortcuts, which start with a backslash \. These are not SQL; they're conveniences for navigating the database. The most important ones:

Meta-command	What it does
`\l`	List all databases
`\c mercado`	Connect to the `mercado` database
`\dt`	List tables in the current database
`\d customers`	Describe the `customers` table (columns, types, keys, indexes)
`\du`	List roles (users)
`\dn`	List schemas (namespaces)
`\x`	Toggle expanded display (great for wide rows)
`\timing`	Toggle showing how long each query takes
`\i path/file.sql`	Run (include) the SQL in a file
`\e`	Open the last query in your editor
`\?`	Help on meta-commands
`\h SELECT`	Help on the SQL syntax of a command
`\q`	Quit

Try this. At the prompt, type \l and press Enter to see your databases. Then \? to skim the meta-command list. You don't need to memorize them — \? is always there — but \dt, \d, \c, \i, and \q will become reflexes within a day.

A graphical client (optional)

If you prefer a visual tool, two excellent free options:

pgAdmin — the official PostgreSQL GUI; ships with the Windows installer. A tree of servers/databases/tables on the left, a query editor on the right, and visual table browsers.
DBeaver — a popular cross-database GUI; clean query editor, good ER-diagram viewer, works with PostgreSQL and everything else.

Connect a GUI by giving it the same four things psql needs: host (localhost), port (5432), database, and user/password. We'll show psql throughout because it's universal and copy-pasteable, but everything works identically in a GUI — pick whatever keeps you writing SQL.

Dialect Difference. psql's backslash meta-commands (\dt, \d, etc.) are specific to PostgreSQL's client. Other databases have their own (MySQL's SHOW TABLES;, SQL Server's sp_help). The SQL you write is mostly portable; the client conveniences are not.

Creating your first database

When PostgreSQL is installed it already contains a default database (usually called postgres) that you can connect to. But you'll make your own for the practice data. There are two equivalent ways.

From the shell, using the createdb helper command:

createdb mercado

Or from inside psql, with SQL:

CREATE DATABASE mercado;

(If you used the Docker command above, mercado already exists — POSTGRES_DB=mercado created it for you.)

Now connect to it. From the shell: psql -d mercado. From inside psql: \c mercado. Your prompt changes to show the new database:

mercado=#

You're now "inside" the mercado database. Every table you create and every query you run happens here.

Common mistake. Creating a table while connected to the wrong database. If you CREATE TABLE while your prompt says postgres=#, your table lands in the postgres database, not mercado. Glance at the prompt before you run DDL. (\c mercado puts you where you mean to be.)

Your first table, by hand

Before loading the full Mercado dataset, let's make one tiny table from scratch — so you see the whole lifecycle of create → insert → query → drop in miniature. Type this at the mercado=# prompt:

CREATE TABLE hello (
    id    integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    word  text NOT NULL
);

INSERT INTO hello (word) VALUES ('databases'), ('are'), ('fun');

SELECT * FROM hello;

PostgreSQL responds:

 id |   word
----+-----------
  1 | databases
  2 | are
  3 | fun
(3 rows)

You just exercised the core loop of all database work: you defined a structure (CREATE TABLE), added data (INSERT), and asked a question (SELECT). The GENERATED ALWAYS AS IDENTITY clause told PostgreSQL to assign the id values automatically — you'll meet it properly in Chapter 14. Now clean up:

DROP TABLE hello;

Gone. That five-line round trip is, in essence, everything the rest of the book elaborates on.

Loading the Mercado practice database

Now the real thing. The book ships three SQL files (in the sql/ folder of the book's materials):

schema.sql — creates Mercado's 13 tables, keys, and constraints. (No data.)
seed-sample.sql — loads a small, deterministic dataset (12 customers, 15 products, 15 orders, …). Because it's fixed, the query results printed in this book will exactly match what you see. Use this one to follow along.
generate_data.sql — loads a large dataset (~100,000 orders) for the performance chapters (23–25). You'll run this later, when those chapters ask you to.

Load the schema and the sample data. From your shell (in the folder that contains sql/):

psql -d mercado -f sql/schema.sql
psql -d mercado -f sql/seed-sample.sql

Or, from inside a psql session connected to mercado:

\i sql/schema.sql
\i sql/seed-sample.sql

You'll see a stream of CREATE TABLE, INSERT 0 N, and SELECT 1 messages scroll by. That's success.

Verify the load

Trust, but verify. List the tables:

\dt

              List of relations
 Schema |    Name     | Type  |  Owner
--------+-------------+-------+----------
 public | addresses   | table | postgres
 public | categories  | table | postgres
 public | customers   | table | postgres
 public | employees   | table | postgres
 public | inventory   | table | postgres
 public | order_items | table | postgres
 public | orders      | table | postgres
 public | payments    | table | postgres
 public | products    | table | postgres
 public | reviews     | table | postgres
 public | shipments   | table | postgres
 public | suppliers   | table | postgres
 public | warehouses  | table | postgres
(13 rows)

Thirteen tables, exactly as designed. Now confirm the data loaded by counting a few:

SELECT
  (SELECT count(*) FROM customers)   AS customers,
  (SELECT count(*) FROM products)    AS products,
  (SELECT count(*) FROM orders)      AS orders,
  (SELECT count(*) FROM order_items) AS order_items;

 customers | products | orders | order_items
-----------+----------+--------+-------------
        12 |       15 |     15 |          26
(1 row)

If you see those numbers, Mercado is loaded and you're ready for the rest of the book. (Don't worry about the exact order_items count to the unit — it's the shape that matters: a dozen customers, fifteen products, fifteen orders, a couple dozen line items.)

Try this. Inspect a table's structure with \d: sql \d customers You'll see every column, its type, whether it allows NULL, the primary key, and any constraints. \d <table> is the fastest way to remember what's in a table you haven't touched in a while — you'll use it constantly.

Your first real queries on Mercado

Let's actually use the data. Type these one at a time:

-- Everyone in the customers table
SELECT * FROM customers;

 customer_id | first_name | last_name |          email           |  phone   | loyalty_tier |       created_at
-------------+------------+-----------+--------------------------+----------+--------------+------------------------
           1 | Alice      | Nguyen    | alice.nguyen@example.com | 555-0101 | gold         | 2023-01-15 00:00:00+00
           2 | Bob        | Martinez  | bob.martinez@example.com | 555-0102 | standard     | 2023-02-03 00:00:00+00
 ...
(12 rows)

-- Just the products that cost more than $200, cheapest first
SELECT name, price
FROM products
WHERE price > 200
ORDER BY price;

-- How many orders does each status have?
SELECT status, count(*) AS num_orders
FROM orders
GROUP BY status
ORDER BY num_orders DESC;

You won't fully understand WHERE, ORDER BY, or GROUP BY yet — that's Chapters 5 and 7. The point right now is simpler and more important: you asked the database questions in SQL, and it answered them from organized, persistent data. The abstraction from Chapter 1 is now something running on your own machine.

Try this. Turn on query timing and the expanded display, then re-run a query: sql \timing \x SELECT * FROM products WHERE sku = 'LAP-001'; \x reformats wide rows as a vertical list (much easier to read), and \timing prints how long the query took. Toggle them off by running \x and \timing again.

Controlling how psql displays results

psql's default output is fine for narrow results, but real tables are wide, and the default grid wraps into an unreadable mess. A handful of formatting controls fix that, and they're worth learning early because you'll stare at query output for the rest of the book.

\x (expanded display) flips each row from a horizontal line into a vertical list of column | value pairs. Indispensable for wide tables like products or orders. Run \x auto and psql decides per result whether to expand based on terminal width — the best of both worlds.
\pset null '∅' makes NULLs visible. By default a NULL prints as nothing — indistinguishable from an empty string. Setting a marker (some people use (null) or ¤) saves you from a whole class of confusion when you reach Chapter 3's discussion of NULL.
\pset border 2 and \pset linestyle unicode draw nicer table borders.
\a toggles "aligned" vs "unaligned" output; \t toggles the header and row-count footer. Together (\a \t) they produce raw, parseable output — handy when piping results to another tool.
\g re-runs the previous query; \gx runs it with expanded display for just this one result without toggling \x globally.

-- A wide row is far more readable expanded:
\x
SELECT * FROM products WHERE sku = 'LAP-001';

-[ RECORD 1 ]------------------------------
product_id  | 1
sku         | LAP-001
name        | Nimbus 14 Laptop
category_id | 3
supplier_id | 2
price       | 1299.00
attributes  | {"ram": "16GB", "cpu": "M3"}
is_active   | t
created_at  | 2023-01-10 00:00:00+00

Try this. Set up readable defaults for this session and re-run a query: sql \pset null '∅' \x auto SELECT customer_id, first_name, phone FROM customers; Now the customers with no phone show ∅ instead of a blank you might mistake for an empty string. We'll come back to why that distinction matters in Chapter 3.

Scripting with psql: files, variables, and safety

You already loaded the schema with \i. As your work grows, you'll run SQL from files constantly — migrations, reports, data loads — and a few psql features make that reliable rather than error-prone.

Run a file and stop on the first error. By default, psql plows ahead even if statement 3 of 50 fails, which can leave your database half-built. Always load schema files with ON_ERROR_STOP so a failure halts immediately:

psql -d mercado -v ON_ERROR_STOP=1 -f sql/schema.sql

Run a single statement without entering the REPL with -c:

psql -d mercado -c "SELECT count(*) FROM orders;"

Wrap a whole file in one transaction with -1 (or --single-transaction), so it either fully succeeds or fully rolls back — exactly what you want for a migration:

psql -d mercado -1 -v ON_ERROR_STOP=1 -f migration-001.sql

Use variables for repeatable scripts. psql substitutes :var (and :'var' for a quoted string):

\set tier 'gold'
SELECT first_name, last_name FROM customers WHERE loyalty_tier = :'tier';

Common mistake. Loading a multi-statement script without ON_ERROR_STOP and assuming success because the prompt came back. If statement 12 failed with a constraint error, the rest may have run anyway, leaving an inconsistent database and a confusing error buried in the scrollback. Make -v ON_ERROR_STOP=1 a reflex for any file that matters. (We formalize this discipline in Chapter 22, Schema Migration.)

Connections, roles, and authentication (a practical primer)

In Chapter 1 we said a database is something you connect to. That connection has to be authenticated, and PostgreSQL's model here confuses newcomers, so a quick orientation now saves pain later (the full treatment is Chapter 32, Security, and Chapter 38, Administration).

Roles are users. In PostgreSQL, "users" and "groups" are unified into one concept: the role. The postgres role created at install is a superuser — it can do anything. You'll create less-powerful roles for applications in Part V.
Authentication is configured in pg_hba.conf (HBA = host-based authentication). This file decides who can connect, from where, to which database, and how they must prove identity (password, trust, certificate). When a local connection "just works" with no password, it's because an pg_hba.conf rule allowed it; when you get password authentication failed, a rule required a password you didn't supply.
Connection settings have defaults from environment variables. Rather than typing -h -p -U -d every time, you can set PGHOST, PGPORT, PGUSER, and PGDATABASE. And rather than typing your password, you can store it once in a ~/.pgpass file (chmod 600), which psql and your apps will read automatically.

# Set once in your shell profile, then just type `psql`:
export PGHOST=localhost
export PGPORT=5432
export PGUSER=postgres
export PGDATABASE=mercado

# ~/.pgpass  (format: host:port:database:user:password ; mode 600)
localhost:5432:mercado:postgres:mercado

Why this matters. Understanding that a connection is (host, port, database, role, auth) — not a magic incantation — is what lets you debug "it works on my machine but not in production" later. Nearly every connection failure is one of those five pieces being wrong. Keep the list in mind; Part V's application connections use the exact same five.

Common mistake (security). Putting a real password in a shell command (PGPASSWORD=secret psql ...) or committing it to a script. It ends up in your shell history and your version control. Use ~/.pgpass (file permissions protect it) or a secrets manager. We treat credential handling properly in Chapter 32.

Running PostgreSQL in the cloud

You installed PostgreSQL locally, which is the best way to learn — you control everything and nothing costs money. But in production, most teams run a managed PostgreSQL service, where a cloud provider handles backups, patching, replication, and failover for you. You should know the landscape exists:

Amazon RDS / Aurora for PostgreSQL, Google Cloud SQL, Azure Database for PostgreSQL — the big-three managed offerings. You get a PostgreSQL endpoint; they run the server.
Supabase, Neon, Render, Railway — developer-friendly platforms with generous free tiers, instant provisioning, and (for Neon) "serverless" Postgres that scales to zero.

The crucial point for this book: a managed PostgreSQL is still PostgreSQL. You connect with the same psql, run the same SQL, and everything you learn here applies unchanged. The only differences are operational — you don't manage the server process, and some superuser-only features are restricted. When a chapter says "connect to your database," a cloud endpoint works exactly as well as localhost.

Dialect Difference. Some managed services run PostgreSQL-compatible engines (e.g., Amazon Aurora) or older major versions. Check the version with SELECT version(); and confirm it's 15+ for full compatibility with this book's examples.

Making psql your own: .psqlrc

psql reads a file called .psqlrc in your home directory every time it starts, so you can bake in your preferred settings once. A sensible starter:

-- ~/.psqlrc
\set QUIET 1
\timing on
\x auto
\pset null '∅'
\set COMP_KEYWORD_CASE upper
\set HISTSIZE 5000
\unset QUIET
\echo 'psql ready — type \\? for help, \\q to quit'

This turns on timing, auto-expanded display, a visible NULL marker, upper-cased keyword tab-completion, and a bigger command history — every session, automatically. Small comfort, large cumulative payoff over a book's worth of queries.

Try this. Create the file above (adjust to taste), then start a fresh psql. You should see your banner, and \timing and the NULL marker already on. If you ever want a clean session without it, start with psql -X (the -X skips .psqlrc).

Troubleshooting the most common setup problems

psql: command not found — PostgreSQL's bin directory isn't on your PATH. Find it (e.g. C:\Program Files\PostgreSQL\16\bin on Windows, or the Postgres.app tools path on macOS) and add it.
could not connect to server / connection refused — the server isn't running. Start it (brew services start postgresql@16, sudo systemctl start postgresql, or docker start mercado-pg).
password authentication failed for user "postgres" — you typed the wrong password (the one you set during install), or you're connecting as the wrong user. On Linux, try sudo -u postgres psql.
database "mercado" does not exist — you skipped createdb mercado (or you're connected to the wrong server). Create it, then reconnect.
permission denied to create database — your role lacks the privilege. Connect as a superuser (postgres) to create the database, or grant your role CREATEDB.
The \i sql/schema.sql file isn't found — paths in \i are relative to where you launched psql. Use the full path, or cd to the right folder first.

Common mistake. Running a long statement and seeing the prompt change to mercado-# (note the dash) with nothing happening. That dash means "I'm still waiting for you to finish this statement." You forgot the closing ;. Type ; and Enter, and it'll run. To abandon a half-typed statement, press Ctrl+C.

Encoding, collation, and locale: get it right once

Two settings, chosen when a database is created, quietly affect every text column you'll ever store, and they're awkward to change later — so it's worth thirty seconds of attention now.

Encoding is how text is stored as bytes. The right answer in modern systems is always UTF-8, which can represent every character in every language (and emoji). PostgreSQL defaults to UTF-8 in almost all installations; the Docker image does. You can confirm with \l, which shows each database's encoding.
Collation is the sort order and comparison rules for text — whether 'apple' < 'Banana', how accented characters order, whether comparisons are case-sensitive. Collation is tied to a locale (like en_US.UTF-8). It determines what ORDER BY name actually produces and which strings a UNIQUE text column treats as equal.

-- See your databases, their owners, encodings, and collations:
\l

-- Sort order follows the database collation. To force a specific
-- ordering regardless of the default, you can ask for one per-query:
SELECT name FROM products ORDER BY name COLLATE "C";   -- raw byte order

For this book, the defaults are fine — UTF-8 encoding and your system's locale. The reason to know they exist is that collation surprises are real: a query that sorts "correctly" on your laptop can sort differently on a server with a different locale, and a case-insensitive search is a collation question, not a WHERE trick. When you meet text sorting and searching in Part II, you'll know where the behavior comes from.

Common mistake. Creating a production database with a non-UTF-8 encoding (sometimes a stale default on old systems) and discovering months later that it can't store a customer's name in their own alphabet. Check \l early; recreate with CREATE DATABASE ... ENCODING 'UTF8' if needed, before there's data to migrate.

Versions and staying current

PostgreSQL ships a new major version roughly once a year (…15, 16, 17…), each adding features and performance, plus regular minor versions (16.1, 16.2…) that are bug-and-security fixes only. A few practical rules:

Target 15+ for this book. Everything here works on 15, 16, and 17. Check yours with SELECT version(); or psql --version.
Minor upgrades are safe and important — they're just fixes, fully compatible, and include security patches. Apply them.
Major upgrades can change behavior and require a planned migration (pg_upgrade or dump-and-restore). That's an administration topic (Chapter 38), not something to fear, but not something to do absent-mindedly on a live system either.

The version cadence matters because it's how PostgreSQL keeps absorbing the capabilities that, per theme #4, let it replace other databases — JSONB, better partitioning, logical replication, and more have all arrived this way. Reading release notes once a year is one of the cheapest ways to keep your skills current (a habit we revisit in Chapter 40, the database career).

Asking the database about itself

One of the relational model's quiet elegances (Chapter 3 will make this formal) is that PostgreSQL describes itself using the very same tables-and-queries machinery it offers you. The \d meta-commands you've been using are really just pre-packaged queries against this self-description. Knowing how to query it directly is a skill you'll use for the rest of the book — and the rest of your career.

There are two ways in:

information_schema — a standardized, cross-database set of views defined by the SQL standard. Portable, if a little verbose.
The PostgreSQL system catalogs (pg_catalog, the pg_* tables) — PostgreSQL-specific, more detailed, and what the \d commands actually read.

-- The portable way: list your tables via the standard information_schema.
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name;

-- List the columns and types of one table:
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'customers'
ORDER BY ordinal_position;

-- The PostgreSQL way: a quick size report on every table, biggest first.
-- (pg_catalog functions are PostgreSQL-specific but enormously handy.)
SELECT relname AS table,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind = 'r' AND relnamespace = 'public'::regnamespace
ORDER BY pg_total_relation_size(oid) DESC;

You don't need these today, but file them away: when you wonder "what tables reference customers?" or "which of my tables has grown huge?", the answer is a query against the catalog. The database is the most authoritative documentation of itself.

Try this. Run \dt+ (note the +). The plus sign asks psql for extra detail — including each table's size on disk. Many \d commands take a +: \d+ customers, \df+, \dn+. It's the fastest way to get more without writing a catalog query yourself.

Theory → Practice. That a database stores its own structure as data you can query is not a gimmick — it's the relational model applied to itself, and it's how tools like pgAdmin, ORMs (Chapter 30), and migration frameworks (Chapter 22) discover your schema. Everything is values in relations, even the description of the relations.

A word about schemas: the `public` namespace

When you ran \dt you may have noticed every table listed under a Schema column reading public. A schema in PostgreSQL is a namespace inside a database — a folder for tables, in effect — and every database starts with one called public where your tables land by default. You won't organize tables into multiple schemas until later (it's a design and security tool, Chapters 21 and 32), but two facts are worth knowing now so nothing later seems mysterious:

Fully-qualified names. customers is shorthand for public.customers. You can always write the full schema.table name, and you must when two schemas hold tables of the same name.
The search_path. PostgreSQL finds unqualified names by walking a list of schemas called the search_path (by default, roughly "$user", public). It's why typing customers finds public.customers without you saying so.

-- See and set the search path:
SHOW search_path;

-- List tables in a specific schema explicitly:
SELECT table_name FROM information_schema.tables
WHERE table_schema = 'public';

For now, everything lives in public and you can ignore schemas entirely. File the concept away: when an enterprise database greets you with sales.orders and hr.employees, you'll know those are two schemas organizing one database, not two databases — and that the search_path is what decides which one a bare name resolves to.

Dialect Difference. "Schema" is an overloaded word. In PostgreSQL it's a namespace within a database. In MySQL, "schema" is essentially a synonym for "database." In Oracle, a schema is tied to a user account. Same word, three different scopes — a frequent source of cross-database confusion. Appendix J keeps the terms straight.

Your first backup and restore (a two-minute preview)

You'll spend a whole chapter on backups (Chapter 38), but you can — and should — learn the one command that protects your work today. PostgreSQL ships pg_dump, which writes a database's entire contents to a file as SQL (or a compressed archive):

# Back up the whole mercado database to a plain-SQL file:
pg_dump -d mercado -f mercado-backup.sql

# Restore it into a fresh database (the file is just SQL statements):
createdb mercado_restored
psql -d mercado_restored -f mercado-backup.sql

That plain-SQL dump is human-readable — open it and you'll see the CREATE TABLE and INSERT statements that rebuild your database. For larger databases you'd use the custom format (pg_dump -Fc), which is compressed and lets you restore selectively with pg_restore. But the idea is the same, and the lesson from Chapter 1 stands: a backup you have never restored is not a backup. Try the round-trip above once now, so the muscle memory exists before you need it in anger.

Common mistake. Confusing pg_dump (a client program you run from the shell) with a SQL command you run inside psql. pg_dump is its own executable, like psql and createdb — run it from your terminal, not from the mercado=# prompt.

A psql workflow that scales

As you move through the book you'll settle into a rhythm. A workflow that holds up from Chapter 5 through the capstone:

Keep your SQL in files, not just typed at the prompt. Write a query in your editor, save it as report.sql, and run it with \i report.sql. When it needs tweaking, edit the file and re-run — far better than retyping a long query.
Use \e to pop the last statement into your $EDITOR, fix it, and have it run when you save and close. Perfect for iterating on a query that's almost right.
Lean on history and tab-completion. The up-arrow recalls previous statements; Tab completes table and column names. Both save enormous time and prevent typos.
Re-run with \g / \watch. \g repeats the last query; \watch 5 re-runs it every five seconds — a poor-man's dashboard for watching a count change as you load data.
Inspect, don't guess. Reach for \d table before writing a query against a table you haven't touched recently. Thirty seconds of \d saves five minutes of column-name guessing.

These habits are small, but compounded over a book's worth of queries — and a career's — they're the difference between fighting the tool and flowing through it. Theme #2 in miniature: you learn SQL by writing it, and a comfortable environment is what keeps you writing.

Progressive project: stand up your own database

You loaded Mercado. Now create the home for your project (the domain you chose in Chapter 1):

createdb myproject       # use your own name: library, clinic, etc.
psql -d myproject

You don't have a schema yet — that comes in Part III — but having the empty database ready makes the project feel real. As an optional warm-up, create one table for the most central entity in your domain (e.g., members for a library, patients for a clinic) with two or three columns, insert a couple of rows, SELECT them back, and then DROP the table. The goal isn't the table; it's getting comfortable with the create → insert → query loop in your database. You'll design it properly soon.

The big picture: what you've actually built

Step back from the commands for a moment. You didn't just install software — you stood up the full client/server architecture that underlies essentially every application you use. There is now a server process on your machine that owns a set of data files, enforces rules, and answers questions in SQL; and there are clients (psql, maybe a GUI, soon your own programs) that connect to it over a port and converse in that language. The exact same shape — client, connection, server, database, role — describes a hobby project on your laptop and a banking system serving millions. You've just met it at the smallest, most legible scale.

That's the real deliverable of this chapter. The Mercado data is loaded, yes, but the durable win is the mental model: a database is a service you connect to, not a file you open; the data lives in one authoritative place; and everything — your queries, your app, your reports, your backups — is a client of that one server. Every remaining chapter is, in some sense, a deeper exploration of one part of this picture you've now assembled end to end.

Why this matters. People who skip a real local setup and only ever poke at a hosted database through a web console tend to carry fuzzy notions of what a database is. Having installed the server, started and stopped it, connected as a role, and watched a query travel from client to server and back, you have a concrete model that will make connections (Chapter 29), security (Chapter 32), and administration (Chapter 38) feel like familiar territory rather than abstractions.

Summary

You installed PostgreSQL (locally or via Docker), met psql and its essential meta-commands (\l, \c, \dt, \d, \i, \q), and understood the client/server model that makes a database something you connect to, not a file you open. You created the mercado database, loaded its 13-table schema and the deterministic sample data, and verified the load. Finally, you ran your first real queries — and stood up an empty database for your own progressive project.

You can now: - Install and start PostgreSQL on your platform (including Docker). - Connect with psql and navigate with meta-commands. - Create a database, create/insert/query/drop a table. - Load the Mercado schema and sample data, and verify it. - Diagnose the most common setup errors.

What's next. In Chapter 3 we return to fundamentals — but now with a live database to explore. You'll learn the relational model properly: what tables, rows, columns, keys, domains, and NULL really are, and the integrity rules that keep Mercado's data honest. The vocabulary you build there is the vocabulary of every chapter after it.

Practice in exercises.md, test yourself with the quiz, see setup decisions in the case studies, review the key takeaways, and go deeper with further reading.

In This Chapter

Chapter 2: Setting Up PostgreSQL, Your First Database, and the Tools You'll Use

From ideas to a running database

Installing PostgreSQL

macOS

Windows

Linux (Debian/Ubuntu)

Docker (the universal option — recommended for consistency)

The client/server model (a 60-second mental model)

Meeting psql

A graphical client (optional)

Creating your first database

Your first table, by hand

Loading the Mercado practice database

Verify the load

Your first real queries on Mercado

Controlling how psql displays results

Scripting with psql: files, variables, and safety

Connections, roles, and authentication (a practical primer)

Running PostgreSQL in the cloud

Making psql your own: .psqlrc

Troubleshooting the most common setup problems

Encoding, collation, and locale: get it right once

Versions and staying current

Asking the database about itself

A word about schemas: the public namespace

Your first backup and restore (a two-minute preview)

A psql workflow that scales

Progressive project: stand up your own database

The big picture: what you've actually built

Summary

A word about schemas: the `public` namespace