Appendix G — EXPLAIN and Tuning Reference

A compact field guide to reading query plans and tuning (companion to Chapters 23–24, 28, 38).

Running EXPLAIN

EXPLAIN <query>;                       -- the plan + cost estimates (no execution)
EXPLAIN ANALYZE <query>;               -- runs it; estimated AND actual rows/time
EXPLAIN (ANALYZE, BUFFERS) <query>;    -- + cache hits vs disk reads

Read the plan bottom-up (most-indented nodes run first). Each node: Node (cost=startup..total rows=est width=bytes) (actual time=.. rows=actual loops=n)

Scan nodes

Node	Meaning	Good/bad
Seq Scan	read every row	fine for small tables / returning most rows; red flag on a big table returning few rows
Index Scan	walk index → fetch matching rows	great for selective lookups
Index Only Scan	answered from the index alone (covering)	fastest
Bitmap Index + Heap Scan	index → bitmap → fetch in physical order	good for medium selectivity

Join nodes

Node	Best when
Nested Loop	small outer + indexed inner (bad if both large/unindexed)
Hash Join	large, unindexed inputs (needs `work_mem`)
Merge Join	inputs already sorted (or affordable to sort)

Red flags → fixes

In the plan	Likely fix
`Seq Scan` on big table, few rows returned	add an index on the filter/join column (Ch. 23)
huge `Rows Removed by Filter`	index the filtered column
estimated `rows` ≫/≪ actual `rows`	stale statistics → `ANALYZE`
`Sort Method: external merge Disk:`	raise `work_mem`, or add a sorted index
`Nested Loop` over large inputs	index the inner side (→ indexed loop or hash join)
correlated subquery re-running per row	rewrite as join / window function (Ch. 9, 12)

Optimize the bottleneck node (highest actual time), not cheap ones. The loop: diagnose (read plan) → fix → verify (re-read plan).

Indexing cheat-sheet (Ch. 23)

Index columns in WHERE (equality/range), JOIN, ORDER BY/GROUP BY.
Index foreign keys used in joins (the #1 slow-join cause).
Types: B-tree (default), GIN (JSONB/FTS/arrays), GiST (ranges/spatial), BRIN (huge ordered), multi-column (leftmost-prefix), partial (WHERE ...), expression (lower(col)).
Costs: slower writes, storage, maintenance. Don't index small/low-cardinality/rarely-queried columns; drop unused indexes (pg_stat_user_indexes, idx_scan = 0).
Build with CREATE INDEX CONCURRENTLY in production.

Statistics & maintenance

ANALYZE [table]; — refresh planner stats (after bulk loads/mass updates).
VACUUM [table]; — reclaim dead tuples (autovacuum does it; monitor n_dead_tup).
Long transactions block VACUUM → bloat (Ch. 28). idle_in_transaction_session_timeout guards it.

Key configuration settings (Ch. 38)

Setting	Rough guidance
`shared_buffers`	~25% RAM (the buffer pool)
`effective_cache_size`	~50–75% RAM (planner hint)
`work_mem`	modest, per operation (×concurrency)
`maintenance_work_mem`	larger (VACUUM, index builds)
`max_connections`	bounded; pool beyond it (PgBouncer)

Tune iteratively (change → measure); don't blindly copy settings.

Monitoring views (Ch. 38)

pg_stat_activity (live queries) · pg_stat_statements (top queries by total time) · pg_stat_user_tables (n_dead_tup, vacuum times) · pg_stat_user_indexes (idx_scan).

Plan-visualization tools

Paste EXPLAIN ANALYZE output into explain.depesz.com or explain.dalibo.com for an annotated, slow-node-highlighted view.

See also: Chapter 23 (Indexing), Chapter 24 (Query Optimization), Chapter 28 (Internals), Chapter 38 (Administration).