Exercises: Discrete Probability

These build from mechanical warm-ups (count favorable over total) to full proofs, code, modeling, and the probabilistic method. Difficulty: ⭐ foundational, ⭐⭐ intermediate, ⭐⭐⭐ challenging. Worked solutions to the starred-with-a-dagger (†) and odd-numbered problems are in the appendix answers-to-selected.md; build the sample space yourself before you peek. The recurring discipline of the chapter applies to every exercise: choose the sample space deliberately — almost every probability mistake is really a sample-space mistake.


Part A — Warm-ups (⭐)

20.1 † A fair eight-sided die (faces $1$ through $8$) is rolled once. Write the sample space $S$ and its size, and compute $P(\text{the roll is a multiple of } 3)$ using the equally-likely formula.

20.2 For the two-coin experiment $S = \{HH, HT, TH, TT\}$, write each of these as a subset of $S$ and give its probability: (a) "the two flips match," (b) "the first flip is heads," (c) "at most one head."

20.3 † A card is drawn from a standard 52-card deck. Let $E$ = "the card is a face card" (J, Q, K) and $F$ = "the card is a heart." Compute $P(E)$, $P(F)$, $P(E \cap F)$, and then $P(E \cup F)$ using the inclusion–exclusion formula. Confirm your $P(E \cup F)$ by counting $\lvert E \cup F \rvert$ directly.

20.4 A random variable $X$ has distribution $P(X=0) = 0.5$, $P(X=1) = 0.3$, $P(X=2) = 0.2$. Verify the distribution is valid, then compute $E[X]$.

20.5 † Using the complement rule, find $P(\text{at least one six})$ when a fair die is rolled four times. (Hint: the complement is "no sixes at all.")


Part B — Prove it (⭐⭐)

20.6 † (Scaffolded — fill in the missing steps.) Prove the monotonicity consequence of the axioms: if $E \subseteq F$ then $P(E) \le P(F)$. Fill in the blanks: - Because $E \subseteq F$, we can write $F$ as the disjoint union of $E$ and $\underline{\hphantom{xxxx}}$ (the part of $F$ outside $E$). - By the additivity axiom, $P(F) = P(E) + \underline{\hphantom{xxxxxx}}$. - By the non-negativity axiom, that second term satisfies $\underline{\hphantom{xxxx}} \ge 0$. - Therefore $P(F) \ge \underline{\hphantom{xx}}$, which is the claim. $\blacksquare$

20.7 Prove that for any event $E$, $0 \le P(E) \le 1$. (You have $P(E) \ge 0$ from an axiom; get the upper bound from monotonicity and normalization.)

20.8 † Prove the indicator-expectation lemma directly from the definition of expected value: for any event $A$, $E[\mathbf{1}_A] = P(A)$. (State the two values $\mathbf{1}_A$ takes and their probabilities, then apply $E[X] = \sum_a a\,P(X=a)$.)

20.9 Prove the union bound $P(E \cup F) \le P(E) + P(F)$ for two events, starting from the inclusion–exclusion formula proved in §20.2. Which axiom guarantees the term you drop is non-negative?

20.10 † A fair coin is flipped $n$ times. Let $X$ be the number of heads. Using indicator variables $X_k = \mathbf{1}[\text{flip } k \text{ is heads}]$ and linearity of expectation, prove that $E[X] = n/2$. State explicitly where (if anywhere) you used independence of the flips.

20.11 Let $X$ be the number on one roll of a fair $n$-sided die (faces $1$ through $n$). Prove that $E[X] = \frac{n+1}{2}$, using the closed form for $\sum_{i=1}^{n} i$ from Chapter 11.


Part C — Implement it (⭐⭐)

Write Python for each. Do not run it on the grader's machine — hand-trace and include an # Expected output: comment, matching the book's convention. Keep each function to ≤ 30 lines and prefer Fraction when an exact answer is expected.

20.12 † Write a function event_prob(space, predicate) that takes a list space of equally-likely outcomes and a boolean predicate(outcome), and returns $P(E) = \lvert E \rvert / \lvert S \rvert$ as a Fraction. Use it to compute $P(\text{sum} = 7)$ for two dice by building the 36-outcome space.

20.13 Write distribution(space, rv) that returns a dict mapping each value of the random variable rv to its probability (a Fraction), for an equally-likely space. Test it on the two-dice sum and print the probability of $X = 5$ and the total (which must be $1$).

20.14 † Implement expected_value_uniform(space, rv) returning $E[X]$ exactly over an equally-likely space, then variance_uniform(space, rv) returning $\operatorname{Var}(X)$ via the computational formula $E[X^2] - (E[X])^2$. Confirm on one fair die that you get $7/2$ and $35/12$.

20.15 Write a Monte-Carlo function estimate_birthday(n, trials) that simulates $n$ people with random birthdays in $\{1, \dots, 365\}$ and returns the empirical fraction of trials in which at least two people share a birthday. (Seed the generator so your # Expected output: is reproducible, and state which exact value from §20.2 it should hover near for $n = 23$.)


Part D — Find the error (⭐⭐)

Each "argument" below is wrong. State precisely which part fails and why, citing the relevant definition, axiom, or theorem.

20.16 † Claim: "When two fair coins are flipped, the outcomes are one head, two heads, and no heads, so $P(\text{two heads}) = 1/3$." Find the flaw. (Hint: which assumption of the equally-likely formula is violated?)

20.17 Claim: "Let $X$ and $Y$ be the two values shown on a pair of fair dice. Then $E[XY] = E[X]\,E[Y] = 3.5 \times 3.5 = 12.25$, and also $E[X^2] = (E[X])^2 = 12.25$." Find the flaw. (One of these two equations is fine and one is not — say which, and why.)

20.18 † Claim: "In the hat-check problem, the events 'person $i$ gets their own hat' are dependent, so we cannot add their probabilities, and linearity of expectation does not apply. Therefore $E[X]$ requires the full derangement count." Find the flaw. (What exactly does linearity require?)

20.19 Claim: "I want to show a coloring with no monochromatic block exists. I simulated random colorings 10 million times and one of them had no monochromatic block, so I have proved such a coloring exists." Critique this. Is the conclusion (a coloring exists) true? Is the argument a valid proof? Distinguish the two.


Part E — Model it & Conjecture and test (⭐⭐⭐)

20.20 † (Model it.) A load balancer sends each of $n$ incoming requests, independently and uniformly at random, to one of $m$ servers. Translate each of the following into a precise discrete-math object (a sample space, an event, or a random variable on it): (a) "the assignment of requests to servers," (b) "server 1 receives no requests," (c) "the number of requests server 1 receives," (d) "the number of idle servers." You do not need to compute anything — just name each object precisely.

20.21 (Model it, then compute.) Continuing 20.20, let $X$ be the number of idle (empty) servers. Using indicators $X_j = \mathbf{1}[\text{server } j \text{ is idle}]$ and linearity, derive a closed form for $E[X]$. Then specialize to $n = m$ and explain the "$\approx m/e$" rule of thumb from §20.4.

20.22 † (Conjecture and test, then prove.) For a uniformly random permutation of $\{1, \dots, n\}$, let $X$ be the number of fixed points (positions $i$ with $\pi(i) = i$). Write code (brute force over all $n!$ permutations, using itertools.permutations) to compute $E[X]$ for $n = 1, 2, 3, 4, 5$. Conjecture the value of $E[X]$, then prove your conjecture with linearity of expectation. Does the answer depend on $n$? Report what the data showed and what the proof settled.

20.23 (Conjecture and test, then prove or disprove.) Define the random variable $Y$ = number of fixed points as in 20.22, and now compute $E[Y^2]$ by brute force for $n = 1, \dots, 6$. Conjecture a formula for $\operatorname{Var}(Y) = E[Y^2] - (E[Y])^2$ as a function of $n$. (You will find a strikingly simple pattern for $n \ge 2$.) State your conjecture; a full proof is a Deep Dive, but say in words why the variance is not simply the sum of the indicators' variances.

20.24 † (Conjecture and test.) The coupon collector: each cereal box contains one of $m$ equally-likely coupons. Write a Monte-Carlo simulation estimating the expected number of boxes you must buy to collect all $m$ coupons, for $m = 4, 6, 10$. Compare your estimates to the value $m \cdot H_m = m\left(1 + \tfrac12 + \tfrac13 + \cdots + \tfrac1m\right)$ (the harmonic number $H_m$ from Chapter 11). Do your simulated averages track $m H_m$? (You are checking, not proving, the formula.)


Part F — Interleaved & Deep Dive

Mixing techniques from earlier chapters keeps them sharp.

20.25 † (Ch. 16 + 20.) A 5-card poker hand is dealt from a standard 52-card deck. Using the counting tools of Chapter 16 for both numerator and denominator, compute $P(\text{exactly one pair})$ — two cards of one rank and three cards of three other distinct ranks. State which counting rule supplies each factor.

20.26 (Ch. 17 + 20.) You hash $n = 13$ keys uniformly into $m = 12$ buckets. (a) Use the pigeonhole principle (Chapter 17) to argue that at least one collision is guaranteed. (b) Then use the birthday calculation from §20.2 to find the probability of at least one collision when $n = 5$ keys go into $m = 12$ buckets, where collisions are not guaranteed. Contrast "guaranteed" with "probable."

20.27 † (Ch. 8 + 20.) Let $E$ and $F$ be events. Using only set algebra (Chapter 8) and the probability axioms, prove that $P(E) = P(E \cap F) + P(E \cap \overline{F})$. (This "law of total probability" decomposition is the workhorse behind Chapter 21's conditional reasoning.)

20.28 (Ch. 6 + 20.) The general linearity statement $E[X_1 + \cdots + X_n] = E[X_1] + \cdots + E[X_n]$ is proved by induction on $n$ from the two-variable case. Write the induction: state $P(n)$, give the base case, and write the inductive step showing where the two-variable linearity is applied.

20.29 (Ch. 19 + 20 — Fibonacci, optional.) Flip a fair coin repeatedly. Let $a_n$ be the number of length-$n$ head/tail strings that contain no two consecutive heads. Show that $a_n$ satisfies the Fibonacci-style recurrence $a_n = a_{n-1} + a_{n-2}$ (with appropriate initial values), so that $P(\text{no two consecutive heads in } n \text{ flips}) = a_n / 2^n$. Compute this probability for $n = 4$. (You met this recurrence in Chapters 18–19.)

20.30 (Deep Dive — the averaging argument.) Prove the expectation form of the probabilistic method: if a random variable $X$ has $E[X] = \mu$, then there is at least one outcome $s$ with $X(s) \ge \mu$ and at least one outcome with $X(s) \le \mu$. (Hint: if every outcome had $X < \mu$, what would the weighted average $E[X]$ have to satisfy? Contradiction.) This is the cousin of the positive- probability method from §20.6.

20.31 (Deep Dive — the probabilistic method.) Re-derive the two-coloring theorem of §20.6 from scratch for the concrete case $k = 4$: a collection of $m$ blocks, each a 4-element subset of the items. For which values of $m$ does the proof guarantee a coloring with no monochromatic block? Show the union- bound arithmetic explicitly and state the threshold on $m$.


Solutions to † and odd-numbered exercises: see appendices/answers-to-selected.md. For each proof, the rubric rewards: a deliberately chosen sample space, correct use of the axioms or linearity (and an explicit note on whether independence was needed), and — for the probabilistic method — a clear statement of which probability was shown to be positive and why that forces existence.