Chapter 20 — Key Takeaways

DataField.Dev

Chapter 20 — Key Takeaways

Discrete Probability: Outcomes, Random Variables, Expectation. One-page review card. Reread before the exam.

The model in four objects

Object	What it is	Symbol
Sample space	set of all mutually-exclusive, exhaustive outcomes	$S$
Outcome	one element of $S$ (exactly one occurs)	$s \in S$
Event	a subset of outcomes; occurs when the outcome lies in it	$E \subseteq S$
Probability function	$P(E) = \sum_{s\in E} p(s)$, with $p(s)\ge 0$, $\sum_s p(s)=1$	$P$

Events are sets → use Chapter 8 operations: $E\cup F$ ("or"), $E\cap F$ ("and"), $\overline{E}$ ("not"), $E\cap F=\emptyset$ ("mutually exclusive").

Computing $P(E)$

Model	Formula	When it applies
Equally-likely (Laplace)	$P(E)=\dfrac{\lvert E\rvert}{\lvert S\rvert}=\dfrac{\#\text{favorable}}{\#\text{possible}}$	only if all outcomes equally likely
General discrete	$P(E)=\sum_{s\in E} p(s)$	always

Laplace turns every probability into two counting problems (Ch. 15–16).

Axioms & their consequences

Rule	Statement
Non-negativity	$P(E)\ge 0$
Normalization	$P(S)=1$
Additivity (disjoint)	$E\cap F=\emptyset \Rightarrow P(E\cup F)=P(E)+P(F)$
Complement	$P(\overline{E})=1-P(E)$
Impossible event	$P(\emptyset)=0$
Monotonicity	$E\subseteq F \Rightarrow P(E)\le P(F)$
Inclusion–exclusion	$P(E\cup F)=P(E)+P(F)-P(E\cap F)$
Union bound	$P\!\left(\bigcup_i E_i\right)\le \sum_i P(E_i)$

Complement first: for "at least one…", compute $1-P(\text{none})$. Birthday: $P(\text{share})=1-\dfrac{P(365,n)}{365^n}$ → crosses $1/2$ at $n=23$. Same math = hash collisions (collisions likely at $n\approx\sqrt{m}$).

Random variables

Term	Definition
Random variable	a function $X\colon S\to\mathbb{R}$ (not random, not a variable)
Event $\{X=a\}$	$\{s\in S: X(s)=a\}$; has probability $P(X=a)$
Distribution	the values of $X$ with their probabilities (must sum to 1)
Indicator $\mathbf{1}_A$	$1$ if $s\in A$, else $0$; switches on when $A$ occurs

Build a distribution: apply $X$ to every outcome, group by value (Counter).

Expected value

$$E[X]=\sum_{s\in S} X(s)\,p(s)=\sum_a a\,P(X=a)\quad(\text{long-run average / balance point})$$

Fact	Statement
Indicator expectation	$E[\mathbf{1}_A]=P(A)$
Linearity	$E[aX+bY]=aE[X]+bE[Y]$, and $E[\sum_i X_i]=\sum_i E[X_i]$
Independence needed?	NO — linearity holds even for dependent variables

The one move to remember: to find a hard $E[X]$, write $X=\sum_i \mathbf{1}_{A_i}$ (a sum of indicators), then $E[X]=\sum_i P(A_i)$. No joint distribution required.

Linearity worked examples

Problem	Setup	Answer
Hat-check (fixed points of random perm.)	$X_i=\mathbf{1}[\text{own hat}]$, $E[X_i]=1/n$	$E[X]=1$ (any $n$)
Heads in $n$ fair flips	$E[X_i]=1/2$	$E[X]=n/2$
Empty hash buckets ($n$ keys, $m$ buckets)	$P(\text{bucket empty})=(1-\tfrac1m)^n$	$m(1-\tfrac1m)^n\approx m/e$ at $n=m$

Variance & standard deviation

$$\operatorname{Var}(X)=E[(X-\mu)^2]=\underbrace{E[X^2]-(E[X])^2}_{\textbf{use this}},\qquad \sigma=\sqrt{\operatorname{Var}(X)}$$

Measures spread around the mean; squaring removes signs and punishes large deviations.
$\operatorname{Var}$ is in squared units; report $\sigma$ (original units).
$\operatorname{Var}\ge 0$ always; $\operatorname{Var}=0$ iff $X$ is constant.
Fair die: $E[X]=\tfrac72$, $E[X^2]=\tfrac{91}{6}$, $\operatorname{Var}=\tfrac{35}{12}\approx 2.92$, $\sigma\approx 1.71$.

Simulation vs. proof (theme 4)

	Monte-Carlo simulation	Proof
Gives	an estimate $\hat p=\text{hits}/k$	the exact value
Cost / error	cheap; error $\sim 1/\sqrt{k}$ (×100 trials per extra digit)	one derivation, then certain
Guarantee	none (seed-dependent)	settles all cases
Basis	law of large numbers	axioms, counting, linearity

Rule: simulate to find/check; prove to settle. "Ran it a million times" ≠ proof.

The probabilistic method

To prove an object with property $Q$ exists: pick a random object and show $$P(\text{random object has } Q) > 0 \;\Rightarrow\; \text{such an object exists.}$$ (Proof by contradiction: if none existed, that probability would be $0$.)

Recipe (two-coloring template): 1. Randomize (color each item by a fair coin). 2. Bound $P(\text{one bad event})$ (one block monochromatic: $2^{1-k}$). 3. Union bound over the $m$ bad events: $P(\text{any bad})\le m\,2^{1-k}$. 4. If that bound $<1$ (here $m<2^{k-1}$), then $P(\text{good})>0$ → a good object exists. $\blacksquare$

Averaging form: if $E[X]=\mu$, some outcome has $X\ge\mu$ and some has $X\le\mu$.

Common pitfalls

Pitfall	Fix
Using $P=\lvert E\rvert/\lvert S\rvert$ on a non-equally-likely space (e.g. $\{$0,1,2 heads$\}$)	Build an equally-likely space first ($\{HH,HT,TH,TT\}$).
Computing "at least one" directly	Use the complement: $1-P(\text{none})$.
$E[XY]=E[X]E[Y]$ in general	False unless independent (Ch. 21); linearity is for sums, not products.
$E[X^2]=(E[X])^2$	False — the gap is $\operatorname{Var}(X)$.
"Simulation said so" = proof	Evidence only; derive the exact value.
Forgetting linearity needs no independence	It always holds — exploit it.

Toolkit additions (`dmtoolkit/probability.py`)

Function	Returns	Idea
`simulate(trial_fn, k)`	empirical $\hat p$ = fraction of $k$ trials where `trial_fn()` is True	law of large numbers (estimate)
`expected_value(rv, space, weight=None)`	exact $E[\text{rv}]$ over a finite space (uniform by default)	$\sum_s X(s)p(s)$ with `Fraction` (exact)

(Chapter 21 adds bayes(pa, pb_given_a, pb).) simulate ≈ expected_value of an indicator — the two sit on opposite sides of the estimate-vs-exact line.

The two ideas to remember

A random variable is a function; its expectation is a weighted average — and expectation is linear, independence or not. Write hard expectations as sums of indicators.
Positive probability proves existence. The probabilistic method turns a counting/existence question into one expectation or union-bound calculation.

Chapter 20 — Key Takeaways

The model in four objects

Computing $P(E)$

Axioms & their consequences

Random variables

Expected value

Variance & standard deviation

Simulation vs. proof (theme 4)

The probabilistic method

Common pitfalls

Toolkit additions (dmtoolkit/probability.py)

The two ideas to remember

Toolkit additions (`dmtoolkit/probability.py`)