Chapter 20 — Key Takeaways
Discrete Probability: Outcomes, Random Variables, Expectation. One-page review card. Reread before the exam.
The model in four objects
| Object | What it is | Symbol |
|---|---|---|
| Sample space | set of all mutually-exclusive, exhaustive outcomes | $S$ |
| Outcome | one element of $S$ (exactly one occurs) | $s \in S$ |
| Event | a subset of outcomes; occurs when the outcome lies in it | $E \subseteq S$ |
| Probability function | $P(E) = \sum_{s\in E} p(s)$, with $p(s)\ge 0$, $\sum_s p(s)=1$ | $P$ |
Events are sets → use Chapter 8 operations: $E\cup F$ ("or"), $E\cap F$ ("and"), $\overline{E}$ ("not"), $E\cap F=\emptyset$ ("mutually exclusive").
Computing $P(E)$
| Model | Formula | When it applies |
|---|---|---|
| Equally-likely (Laplace) | $P(E)=\dfrac{\lvert E\rvert}{\lvert S\rvert}=\dfrac{\#\text{favorable}}{\#\text{possible}}$ | only if all outcomes equally likely |
| General discrete | $P(E)=\sum_{s\in E} p(s)$ | always |
Laplace turns every probability into two counting problems (Ch. 15–16).
Axioms & their consequences
| Rule | Statement |
|---|---|
| Non-negativity | $P(E)\ge 0$ |
| Normalization | $P(S)=1$ |
| Additivity (disjoint) | $E\cap F=\emptyset \Rightarrow P(E\cup F)=P(E)+P(F)$ |
| Complement | $P(\overline{E})=1-P(E)$ |
| Impossible event | $P(\emptyset)=0$ |
| Monotonicity | $E\subseteq F \Rightarrow P(E)\le P(F)$ |
| Inclusion–exclusion | $P(E\cup F)=P(E)+P(F)-P(E\cap F)$ |
| Union bound | $P\!\left(\bigcup_i E_i\right)\le \sum_i P(E_i)$ |
Complement first: for "at least one…", compute $1-P(\text{none})$. Birthday: $P(\text{share})=1-\dfrac{P(365,n)}{365^n}$ → crosses $1/2$ at $n=23$. Same math = hash collisions (collisions likely at $n\approx\sqrt{m}$).
Random variables
| Term | Definition |
|---|---|
| Random variable | a function $X\colon S\to\mathbb{R}$ (not random, not a variable) |
| Event $\{X=a\}$ | $\{s\in S: X(s)=a\}$; has probability $P(X=a)$ |
| Distribution | the values of $X$ with their probabilities (must sum to 1) |
| Indicator $\mathbf{1}_A$ | $1$ if $s\in A$, else $0$; switches on when $A$ occurs |
Build a distribution: apply $X$ to every outcome, group by value (Counter).
Expected value
$$E[X]=\sum_{s\in S} X(s)\,p(s)=\sum_a a\,P(X=a)\quad(\text{long-run average / balance point})$$
| Fact | Statement |
|---|---|
| Indicator expectation | $E[\mathbf{1}_A]=P(A)$ |
| Linearity | $E[aX+bY]=aE[X]+bE[Y]$, and $E[\sum_i X_i]=\sum_i E[X_i]$ |
| Independence needed? | NO — linearity holds even for dependent variables |
The one move to remember: to find a hard $E[X]$, write $X=\sum_i \mathbf{1}_{A_i}$ (a sum of indicators), then $E[X]=\sum_i P(A_i)$. No joint distribution required.
Linearity worked examples
| Problem | Setup | Answer |
|---|---|---|
| Hat-check (fixed points of random perm.) | $X_i=\mathbf{1}[\text{own hat}]$, $E[X_i]=1/n$ | $E[X]=1$ (any $n$) |
| Heads in $n$ fair flips | $E[X_i]=1/2$ | $E[X]=n/2$ |
| Empty hash buckets ($n$ keys, $m$ buckets) | $P(\text{bucket empty})=(1-\tfrac1m)^n$ | $m(1-\tfrac1m)^n\approx m/e$ at $n=m$ |
Variance & standard deviation
$$\operatorname{Var}(X)=E[(X-\mu)^2]=\underbrace{E[X^2]-(E[X])^2}_{\textbf{use this}},\qquad \sigma=\sqrt{\operatorname{Var}(X)}$$
- Measures spread around the mean; squaring removes signs and punishes large deviations.
- $\operatorname{Var}$ is in squared units; report $\sigma$ (original units).
- $\operatorname{Var}\ge 0$ always; $\operatorname{Var}=0$ iff $X$ is constant.
- Fair die: $E[X]=\tfrac72$, $E[X^2]=\tfrac{91}{6}$, $\operatorname{Var}=\tfrac{35}{12}\approx 2.92$, $\sigma\approx 1.71$.
Simulation vs. proof (theme 4)
| Monte-Carlo simulation | Proof | |
|---|---|---|
| Gives | an estimate $\hat p=\text{hits}/k$ | the exact value |
| Cost / error | cheap; error $\sim 1/\sqrt{k}$ (×100 trials per extra digit) | one derivation, then certain |
| Guarantee | none (seed-dependent) | settles all cases |
| Basis | law of large numbers | axioms, counting, linearity |
Rule: simulate to find/check; prove to settle. "Ran it a million times" ≠ proof.
The probabilistic method
To prove an object with property $Q$ exists: pick a random object and show $$P(\text{random object has } Q) > 0 \;\Rightarrow\; \text{such an object exists.}$$ (Proof by contradiction: if none existed, that probability would be $0$.)
Recipe (two-coloring template): 1. Randomize (color each item by a fair coin). 2. Bound $P(\text{one bad event})$ (one block monochromatic: $2^{1-k}$). 3. Union bound over the $m$ bad events: $P(\text{any bad})\le m\,2^{1-k}$. 4. If that bound $<1$ (here $m<2^{k-1}$), then $P(\text{good})>0$ → a good object exists. $\blacksquare$
Averaging form: if $E[X]=\mu$, some outcome has $X\ge\mu$ and some has $X\le\mu$.
Common pitfalls
| Pitfall | Fix |
|---|---|
| Using $P=\lvert E\rvert/\lvert S\rvert$ on a non-equally-likely space (e.g. $\{$0,1,2 heads$\}$) | Build an equally-likely space first ($\{HH,HT,TH,TT\}$). |
| Computing "at least one" directly | Use the complement: $1-P(\text{none})$. |
| $E[XY]=E[X]E[Y]$ in general | False unless independent (Ch. 21); linearity is for sums, not products. |
| $E[X^2]=(E[X])^2$ | False — the gap is $\operatorname{Var}(X)$. |
| "Simulation said so" = proof | Evidence only; derive the exact value. |
| Forgetting linearity needs no independence | It always holds — exploit it. |
Toolkit additions (dmtoolkit/probability.py)
| Function | Returns | Idea |
|---|---|---|
simulate(trial_fn, k) |
empirical $\hat p$ = fraction of $k$ trials where trial_fn() is True |
law of large numbers (estimate) |
expected_value(rv, space, weight=None) |
exact $E[\text{rv}]$ over a finite space (uniform by default) | $\sum_s X(s)p(s)$ with Fraction (exact) |
(Chapter 21 adds bayes(pa, pb_given_a, pb).) simulate ≈ expected_value of an indicator — the two
sit on opposite sides of the estimate-vs-exact line.
The two ideas to remember
- A random variable is a function; its expectation is a weighted average — and expectation is linear, independence or not. Write hard expectations as sums of indicators.
- Positive probability proves existence. The probabilistic method turns a counting/existence question into one expectation or union-bound calculation.