SQL Cheat Sheet for Beginners: Queries, Joins, and Functions
SQL is the universal language of data. Every major database, from PostgreSQL and MySQL to SQLite and SQL Server, speaks it. Whether you are pulling reports, analyzing business data, or building applications, SQL is a skill that pays dividends for your entire career. This cheat sheet covers every essential SQL concept with practical examples you can adapt to your own tables and data.
SELECT Basics
Every SQL query starts with SELECT. It tells the database what columns you want to retrieve.
-- Select specific columns
SELECT first_name, last_name, email
FROM customers;
-- Select all columns
SELECT *
FROM customers;
-- Select with an alias
SELECT first_name AS name, annual_salary / 12 AS monthly_salary
FROM employees;
-- Select distinct (unique) values
SELECT DISTINCT department
FROM employees;
-- Limit the number of rows returned
SELECT *
FROM orders
LIMIT 10;
-- Skip rows with OFFSET (pagination)
SELECT *
FROM products
LIMIT 10 OFFSET 20; -- rows 21-30
Ordering results:
-- Sort ascending (default)
SELECT * FROM employees ORDER BY last_name;
-- Sort descending
SELECT * FROM employees ORDER BY salary DESC;
-- Sort by multiple columns
SELECT * FROM employees ORDER BY department ASC, salary DESC;
-- Sort by column position (1-based)
SELECT first_name, last_name, salary
FROM employees
ORDER BY 3 DESC; -- order by salary
Filtering with WHERE
WHERE filters rows before they are returned.
-- Comparison operators
SELECT * FROM products WHERE price > 50;
SELECT * FROM products WHERE price >= 50;
SELECT * FROM products WHERE price != 50; -- or <>
SELECT * FROM employees WHERE department = 'Engineering';
-- AND / OR
SELECT * FROM employees
WHERE department = 'Sales' AND salary > 60000;
SELECT * FROM employees
WHERE department = 'Sales' OR department = 'Marketing';
-- IN — match against a list
SELECT * FROM employees
WHERE department IN ('Sales', 'Marketing', 'Support');
-- NOT IN
SELECT * FROM employees
WHERE department NOT IN ('Engineering', 'DevOps');
-- BETWEEN — inclusive range
SELECT * FROM orders
WHERE order_date BETWEEN '2025-01-01' AND '2025-12-31';
SELECT * FROM products
WHERE price BETWEEN 10 AND 50;
-- LIKE — pattern matching
SELECT * FROM customers WHERE email LIKE '%@gmail.com';
SELECT * FROM products WHERE name LIKE 'iPhone%'; -- starts with
SELECT * FROM products WHERE name LIKE '%Pro%'; -- contains
SELECT * FROM customers WHERE phone LIKE '555-____'; -- _ matches one character
-- IS NULL / IS NOT NULL
SELECT * FROM customers WHERE phone IS NULL;
SELECT * FROM customers WHERE email IS NOT NULL;
Operator precedence matters. Use parentheses to be explicit:
-- Without parentheses, AND binds tighter than OR — this may not do what you expect
SELECT * FROM employees
WHERE department = 'Sales' OR department = 'Marketing' AND salary > 80000;
-- With parentheses — clear intent
SELECT * FROM employees
WHERE (department = 'Sales' OR department = 'Marketing') AND salary > 80000;
Aggregation and Grouping
Aggregate functions compute a single result from a set of rows.
| Function | Description |
|---|---|
COUNT(*) |
Number of rows |
COUNT(column) |
Number of non-NULL values |
COUNT(DISTINCT col) |
Number of unique non-NULL values |
SUM(column) |
Total |
AVG(column) |
Average |
MIN(column) |
Minimum value |
MAX(column) |
Maximum value |
-- Basic aggregations
SELECT COUNT(*) AS total_orders FROM orders;
SELECT SUM(amount) AS total_revenue FROM orders;
SELECT AVG(salary) AS avg_salary FROM employees;
SELECT MIN(price) AS cheapest, MAX(price) AS most_expensive FROM products;
-- Count distinct values
SELECT COUNT(DISTINCT customer_id) AS unique_customers FROM orders;
GROUP BY splits rows into groups and applies aggregation to each group:
-- Revenue by department
SELECT department, SUM(salary) AS total_payroll
FROM employees
GROUP BY department;
-- Orders per month
SELECT
EXTRACT(YEAR FROM order_date) AS year,
EXTRACT(MONTH FROM order_date) AS month,
COUNT(*) AS order_count,
SUM(amount) AS revenue
FROM orders
GROUP BY EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
ORDER BY year, month;
-- Multiple aggregations
SELECT
department,
COUNT(*) AS headcount,
ROUND(AVG(salary), 2) AS avg_salary,
MIN(salary) AS min_salary,
MAX(salary) AS max_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;
HAVING filters groups after aggregation (WHERE filters rows before):
-- Departments with average salary above 80k
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 80000;
-- Customers who placed more than 5 orders
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5
ORDER BY order_count DESC;
Execution order of a SQL query:
FROM(andJOIN)WHEREGROUP BYHAVINGSELECTDISTINCTORDER BYLIMIT/OFFSET
This is why you cannot use a column alias from SELECT in a WHERE clause — WHERE runs before SELECT.
Joins
Joins combine rows from two or more tables based on a related column.
INNER JOIN — returns only rows with matching keys in both tables:
Table A Table B
+----+ +----+
| 1 | | 1 |
| 2 | ==> | 3 | ==> Result: 1, 3
| 3 | | 5 |
| 4 | +----+
+----+
SELECT o.order_id, o.order_date, c.first_name, c.last_name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id;
LEFT JOIN — all rows from the left table, matching rows from the right (NULL if no match):
-- All customers and their orders (including customers with no orders)
SELECT c.first_name, c.last_name, o.order_id, o.amount
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;
-- Find customers who have never placed an order
SELECT c.first_name, c.last_name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
RIGHT JOIN — all rows from the right table, matching rows from the left:
SELECT o.order_id, c.first_name
FROM customers c
RIGHT JOIN orders o ON c.customer_id = o.customer_id;
FULL OUTER JOIN — all rows from both tables, with NULLs where there is no match:
SELECT c.first_name, o.order_id
FROM customers c
FULL OUTER JOIN orders o ON c.customer_id = o.customer_id;
CROSS JOIN — every row from table A paired with every row from table B (Cartesian product):
-- Generate all combinations of sizes and colors
SELECT s.size_name, c.color_name
FROM sizes s
CROSS JOIN colors c;
Self-join — join a table to itself (useful for hierarchical data):
-- Find each employee and their manager
SELECT
e.first_name AS employee,
m.first_name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;
Joining multiple tables:
SELECT
o.order_id,
c.first_name,
p.product_name,
oi.quantity,
oi.unit_price
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id;
Subqueries
A subquery is a query nested inside another query.
Subquery in WHERE:
-- Employees who earn more than the company average
SELECT first_name, last_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
-- Products that have been ordered at least once
SELECT product_name
FROM products
WHERE product_id IN (SELECT DISTINCT product_id FROM order_items);
-- Customers whose most recent order is before 2025
SELECT first_name, last_name
FROM customers
WHERE customer_id NOT IN (
SELECT customer_id
FROM orders
WHERE order_date >= '2025-01-01'
);
Subquery in FROM (derived table):
-- Average of department averages
SELECT AVG(dept_avg) AS avg_of_averages
FROM (
SELECT department, AVG(salary) AS dept_avg
FROM employees
GROUP BY department
) dept_salaries;
Correlated subquery — references the outer query (runs once per outer row):
-- Employees who earn more than their department average
SELECT first_name, last_name, salary, department
FROM employees e
WHERE salary > (
SELECT AVG(salary)
FROM employees
WHERE department = e.department
);
-- Most recent order for each customer
SELECT *
FROM orders o
WHERE order_date = (
SELECT MAX(order_date)
FROM orders
WHERE customer_id = o.customer_id
);
EXISTS — check whether a subquery returns any rows:
-- Customers who have placed at least one order
SELECT first_name, last_name
FROM customers c
WHERE EXISTS (
SELECT 1
FROM orders o
WHERE o.customer_id = c.customer_id
);
Window Functions
Window functions perform calculations across a set of rows related to the current row, without collapsing rows into groups.
Syntax: function() OVER (PARTITION BY ... ORDER BY ...)
-- Row number within each department, ordered by salary
SELECT
first_name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_in_dept
FROM employees;
-- Top 3 earners in each department
SELECT * FROM (
SELECT
first_name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
FROM employees
) ranked
WHERE rn <= 3;
Ranking functions:
| Function | Behavior for ties |
|---|---|
ROW_NUMBER() |
Unique number for every row (no ties) |
RANK() |
Same rank for ties, skips numbers (1, 2, 2, 4) |
DENSE_RANK() |
Same rank for ties, no gaps (1, 2, 2, 3) |
NTILE(n) |
Divides rows into n roughly equal groups |
SELECT
product_name,
revenue,
RANK() OVER (ORDER BY revenue DESC) AS revenue_rank,
DENSE_RANK() OVER (ORDER BY revenue DESC) AS dense_revenue_rank,
NTILE(4) OVER (ORDER BY revenue DESC) AS quartile
FROM products;
LAG and LEAD — access previous or next row values:
-- Compare each month's revenue to the previous month
SELECT
month,
revenue,
LAG(revenue, 1) OVER (ORDER BY month) AS prev_month,
revenue - LAG(revenue, 1) OVER (ORDER BY month) AS change
FROM monthly_revenue;
-- Next order date for each customer
SELECT
customer_id,
order_date,
LEAD(order_date, 1) OVER (PARTITION BY customer_id ORDER BY order_date) AS next_order
FROM orders;
Running totals and moving averages:
-- Running total of revenue
SELECT
order_date,
amount,
SUM(amount) OVER (ORDER BY order_date) AS running_total
FROM orders;
-- Running total per customer
SELECT
customer_id,
order_date,
amount,
SUM(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS customer_running_total
FROM orders;
-- 3-month moving average
SELECT
month,
revenue,
AVG(revenue) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_3
FROM monthly_revenue;
Data Modification
INSERT:
-- Insert a single row
INSERT INTO customers (first_name, last_name, email)
VALUES ('Alice', 'Smith', 'alice@example.com');
-- Insert multiple rows
INSERT INTO customers (first_name, last_name, email)
VALUES
('Bob', 'Jones', 'bob@example.com'),
('Charlie', 'Lee', 'charlie@example.com'),
('Diana', 'Clark', 'diana@example.com');
-- Insert from a SELECT
INSERT INTO archived_orders
SELECT * FROM orders WHERE order_date < '2024-01-01';
UPDATE:
-- Update specific rows
UPDATE employees
SET salary = salary * 1.10
WHERE department = 'Engineering';
-- Update multiple columns
UPDATE products
SET price = 29.99, updated_at = CURRENT_TIMESTAMP
WHERE product_id = 42;
-- Update with a subquery
UPDATE employees
SET salary = salary * 1.05
WHERE department_id IN (
SELECT department_id FROM departments WHERE name = 'Sales'
);
DELETE:
-- Delete specific rows
DELETE FROM orders WHERE order_date < '2020-01-01';
-- Delete all rows (careful)
DELETE FROM temp_table;
-- TRUNCATE — faster way to delete all rows (resets auto-increment)
TRUNCATE TABLE temp_table;
Always use a WHERE clause with UPDATE and DELETE unless you intentionally want to affect every row in the table.
Table Operations
-- Create a table
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE,
department VARCHAR(50),
salary DECIMAL(10, 2) DEFAULT 0,
hire_date DATE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Create a table with a foreign key
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
order_date DATE NOT NULL,
amount DECIMAL(10, 2)
);
-- Add a column
ALTER TABLE employees ADD COLUMN phone VARCHAR(20);
-- Drop a column
ALTER TABLE employees DROP COLUMN phone;
-- Rename a column
ALTER TABLE employees RENAME COLUMN salary TO annual_salary;
-- Change column type
ALTER TABLE employees ALTER COLUMN salary TYPE DECIMAL(12, 2);
-- Add a constraint
ALTER TABLE employees ADD CONSTRAINT salary_positive CHECK (salary >= 0);
-- Drop a table
DROP TABLE IF EXISTS temp_table;
-- Create an index for faster queries
CREATE INDEX idx_employees_department ON employees(department);
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);
Common Patterns and Recipes
Top-N Per Group
Find the top 3 highest-paid employees in each department:
SELECT * FROM (
SELECT
first_name,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
FROM employees
) ranked
WHERE rn <= 3;
Running Totals
SELECT
order_date,
amount,
SUM(amount) OVER (ORDER BY order_date ROWS UNBOUNDED PRECEDING) AS running_total
FROM orders;
Find Duplicates
SELECT email, COUNT(*) AS count
FROM customers
GROUP BY email
HAVING COUNT(*) > 1;
Year-Over-Year Comparison
SELECT
curr.month,
curr.revenue AS this_year,
prev.revenue AS last_year,
ROUND((curr.revenue - prev.revenue) / prev.revenue * 100, 1) AS yoy_pct
FROM monthly_revenue curr
JOIN monthly_revenue prev
ON curr.month_num = prev.month_num
AND curr.year = prev.year + 1;
Conditional Aggregation with CASE
SELECT
department,
COUNT(*) AS total,
COUNT(CASE WHEN salary > 80000 THEN 1 END) AS high_earners,
COUNT(CASE WHEN salary <= 80000 THEN 1 END) AS others,
ROUND(AVG(CASE WHEN gender = 'F' THEN salary END), 2) AS avg_female_salary,
ROUND(AVG(CASE WHEN gender = 'M' THEN salary END), 2) AS avg_male_salary
FROM employees
GROUP BY department;
Gaps and Islands
Find gaps in a sequence (missing IDs):
SELECT
curr.id + 1 AS gap_start,
next.id - 1 AS gap_end
FROM numbers curr
JOIN numbers next ON next.id = (
SELECT MIN(id) FROM numbers WHERE id > curr.id
)
WHERE next.id - curr.id > 1;
Find consecutive date ranges (islands):
WITH grouped AS (
SELECT
event_date,
event_date - INTERVAL '1 day' * ROW_NUMBER() OVER (ORDER BY event_date) AS grp
FROM events
)
SELECT
MIN(event_date) AS start_date,
MAX(event_date) AS end_date,
COUNT(*) AS consecutive_days
FROM grouped
GROUP BY grp
ORDER BY start_date;
CTEs (Common Table Expressions)
CTEs make complex queries readable by breaking them into named steps:
WITH department_stats AS (
SELECT
department,
AVG(salary) AS avg_salary,
COUNT(*) AS headcount
FROM employees
GROUP BY department
),
high_paying AS (
SELECT * FROM department_stats WHERE avg_salary > 80000
)
SELECT
e.first_name,
e.last_name,
e.salary,
e.department,
d.avg_salary AS dept_avg
FROM employees e
JOIN high_paying d ON e.department = d.department
WHERE e.salary > d.avg_salary
ORDER BY e.salary DESC;
Quick Reference
| Task | Syntax |
|---|---|
| Select all | SELECT * FROM table |
| Filter | WHERE condition |
| Sort | ORDER BY col DESC |
| Limit rows | LIMIT 10 |
| Remove duplicates | SELECT DISTINCT col |
| Count rows | SELECT COUNT(*) |
| Group and aggregate | GROUP BY col |
| Filter groups | HAVING condition |
| Inner join | JOIN t2 ON t1.id = t2.id |
| Left join | LEFT JOIN t2 ON t1.id = t2.id |
| Subquery | WHERE col IN (SELECT ...) |
| Row numbering | ROW_NUMBER() OVER (ORDER BY col) |
| Running total | SUM(col) OVER (ORDER BY col) |
| Previous row | LAG(col) OVER (ORDER BY col) |
| Insert row | INSERT INTO t (cols) VALUES (vals) |
| Update rows | UPDATE t SET col = val WHERE ... |
| Delete rows | DELETE FROM t WHERE ... |
| Create table | CREATE TABLE t (col TYPE, ...) |
| Add index | CREATE INDEX idx ON t(col) |
This cheat sheet gives you the SQL vocabulary to handle the vast majority of data questions you will face. Keep it bookmarked, practice with real data, and you will find that fluent SQL is one of the most versatile skills you can have.
Learn data analysis with Python in our free Python for Business Beginners textbook.