Chapter 7 Quiz: The Law of Large Numbers — Why Small Samples Lie
Q1. The law of large numbers states that as the number of trials increases, the observed average converges toward:
a) Zero
b) The most recent observed value
c) The true expected value (probability)
d) The median of all past observations
Show Answer
**c) The true expected value (probability)** The law of large numbers guarantees convergence to the true underlying probability (or expected value) as the sample grows. It says nothing about individual outcomes and does not say the average approaches zero or the most recent value.

Q2. A coin lands on heads 15 times out of 20 flips. Which of the following conclusions is best supported by the law of large numbers?
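How often does a fair coin produce a split this lopsided? The exact tail probability can be computed with Python's standard library; a minimal sketch:

```python
from math import comb

# P(15 or more heads in 20 flips of a fair coin): sum the exact
# binomial probabilities for k = 15..20 over all 2**20 sequences.
p_15_plus = sum(comb(20, k) for k in range(15, 21)) / 2**20
print(round(p_15_plus, 4))  # ≈ 0.0207, about 1 in 48
```

Roughly 2%: rare for any one coin, yet routine across many 20-flip experiments, which is why a single short run is weak evidence of bias.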
a) The coin is likely biased toward heads
b) Tails is now "due" to balance things out
c) 20 flips is insufficient to draw a reliable conclusion about bias
d) The coin will land on heads more often in the next 20 flips
Show Answer
**c) 20 flips is insufficient to draw a reliable conclusion about bias** Even a fair coin can produce 15 heads in 20 flips by chance — the probability is not negligible. The law of large numbers tells us that only over many more flips will the true probability reveal itself. Option (a) overstates the evidence; option (b) is the gambler's fallacy; option (d) assumes a hot hand in a random process.

Q3. The key difference between the weak law and strong law of large numbers is:
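Both laws describe the same empirical picture, a running average settling toward the truth; a short fair-coin simulation makes that picture concrete (seed and flip count are arbitrary choices):

```python
import random

random.seed(0)

# Track the running proportion of heads across 100,000 fair-coin flips.
# Both laws say this curve ends up near 0.5; they differ in the mode of
# convergence (in probability vs. almost surely).
heads = 0
running = {}
for i in range(1, 100_001):
    heads += random.random() < 0.5
    if i in (10, 100, 1_000, 10_000, 100_000):
        running[i] = round(heads / i, 4)

print(running)  # early checkpoints wander; later ones hug 0.5
final_prop = running[100_000]
```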
a) The weak law applies only to coins; the strong law applies to all random variables
b) The strong law guarantees convergence with probability 1; the weak law says convergence becomes arbitrarily probable
c) The weak law applies to large samples; the strong law applies to infinite samples only
d) The strong law requires normally distributed data; the weak law does not
Show Answer
**b) The strong law guarantees convergence with probability 1; the weak law says convergence becomes arbitrarily probable** The weak law says that for any desired precision, you can make it highly probable (though not guaranteed) that your sample average is close to the true mean by using a large enough sample. The strong law makes the stronger claim that convergence happens almost surely (probability 1). Neither requires normal distribution (answer d) nor are they limited to specific types of variables (answer a).

Q4. If the population variance of a measure is σ², and you take a sample of size n, the variance of the sample mean is:
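The σ²/n relationship is easy to verify empirically. A sketch using a uniform(0, 1) population, whose variance is 1/12 (trial counts are arbitrary):

```python
import random

random.seed(1)

# Estimate Var(sample mean) by simulation and compare with sigma^2 / n.
def var_of_sample_mean(n, trials=20_000):
    means = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]
    grand = sum(means) / trials
    return sum((m - grand) ** 2 for m in means) / trials

sigma2 = 1 / 12  # variance of uniform(0, 1)
for n in (1, 4, 16):
    print(n, round(var_of_sample_mean(n), 5), round(sigma2 / n, 5))
```

Each printed pair should match closely: quadrupling n cuts the variance of the mean by a factor of four.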
a) σ²
b) σ² × n
c) σ² / n
d) √(σ² / n)
Show Answer
**c) σ² / n** The variance of the sample mean equals the population variance divided by the sample size. This is why larger samples produce more stable (less variable) estimates — the denominator grows, shrinking the variance of the mean. The standard error of the mean (option d, which is √(σ²/n)) is the square root of this quantity.

Q5. Nadia tracks 30 different content variables (posting time, format, caption length, etc.) for correlation with views and finds 2 that are "statistically significant" at the p < 0.05 level. The most likely explanation is:
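Nadia's 30 tests can be simulated directly. Under the null hypothesis every p-value is uniform on [0, 1], so each test has a 5% chance of a spurious hit; a sketch:

```python
import random

random.seed(2)

# Run 10,000 "studies", each testing 30 null variables at p < 0.05,
# and count how many significant-looking results appear by luck alone.
runs = 10_000
sig_counts = []
for _ in range(runs):
    p_values = [random.random() for _ in range(30)]  # null: uniform p
    sig_counts.append(sum(p < 0.05 for p in p_values))

expected = sum(sig_counts) / runs
at_least_two = sum(c >= 2 for c in sig_counts) / runs
print(round(expected, 2))      # near 30 * 0.05 = 1.5
print(round(at_least_two, 2))  # seeing 2+ "hits" is common
```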
a) She has discovered two genuine content levers she should act on immediately
b) Her posting strategy is better than she realized
c) By random chance, about 1-2 of 30 tests will appear significant even if nothing is real
d) The p < 0.05 threshold ensures these findings are reliable
Show Answer
**c) By random chance, about 1-2 of 30 tests will appear significant even if nothing is real** This is the multiple comparisons problem. At p < 0.05, you expect 5% of tests to produce false positives by chance. Testing 30 variables gives an expected 1.5 false positives even when nothing is truly correlated with views. Two "significant" results from 30 tests is entirely consistent with no real effects existing.

Q6. Statistical power is defined as:
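Power becomes concrete with a simulation. The sketch below uses a one-sample z-test with known σ = 1; the true effect (0.3) and the sample sizes are illustrative assumptions:

```python
import random
from math import sqrt

random.seed(6)

# Fraction of simulated studies that detect a true mean of 0.3
# (reject when |sample mean| * sqrt(n) exceeds the 5% cutoff 1.96).
def power(n, true_mean=0.3, trials=5_000):
    hits = 0
    for _ in range(trials):
        mean = sum(random.gauss(true_mean, 1) for _ in range(n)) / n
        hits += abs(mean) * sqrt(n) > 1.96
    return hits / trials

print(power(20))   # small study: misses the real effect most of the time
print(power(200))  # large study: detects it almost always
```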
a) The probability of rejecting the null hypothesis when it is true (false positive rate)
b) The probability of detecting a real effect when one exists
c) The minimum sample size needed for a study
d) The probability that a significant result is a true positive
Show Answer
**b) The probability of detecting a real effect when one exists** Statistical power is the probability of correctly detecting a real effect (i.e., avoiding a false negative). Option (a) describes the Type I error rate (alpha). Option (c) describes a related but different quantity. Option (d) describes the positive predictive value, which depends on both power and the false-positive rate.

Q7. Button et al. (2013) found that median statistical power in neuroscience studies was approximately:
a) 80%
b) 50%
c) 21%
d) 5%
Show Answer
**c) 21%** Button and colleagues found that the typical neuroscience study had approximately 21% power — meaning that even when a real effect existed, the study had roughly a 4-in-5 chance of failing to detect it. This level of underpowering means that positive findings from such studies are likely to be false positives rather than true discoveries.

Q8. The "winner's curse in science" refers to:
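The mechanism behind the term is easy to simulate: give every study the same small true effect, add sampling noise, and "publish" only estimates that clear a threshold. The numbers below (true effect 0.2, noise SD 1, cutoff 1.0) are illustrative assumptions:

```python
import random

random.seed(3)

# Every study estimates the same true effect (0.2) with heavy noise;
# only estimates above a crude "significance" cutoff get published.
true_effect, noise_sd, cutoff = 0.2, 1.0, 1.0
published = []
for _ in range(5_000):
    estimate = random.gauss(true_effect, noise_sd)
    if estimate > cutoff:
        published.append(estimate)

mean_published = sum(published) / len(published)
print(round(mean_published, 2))  # far above the true 0.2
```

The published average overshoots the truth by construction: the filter keeps only the lucky overestimates.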
a) Scientists who win awards tend to produce worse work afterward
b) Published significant results in underpowered studies tend to overestimate true effect sizes
c) The first scientist to publish a finding always overstates its importance
d) Statistical significance is harder to achieve for important findings
Show Answer
**b) Published significant results in underpowered studies tend to overestimate true effect sizes** Because small samples produce noisy estimates, and because only the "significant" results get published (publication bias), the studies that make it into the literature are those that happened to overestimate the effect. When replicated with proper power, the effects shrink — the "winner" of the publication lottery got there partly by luck.

Q9. The hot hand fallacy, as revisited by Miller and Sanjurjo (2018), taught us that:
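The selection bias at the heart of the Miller–Sanjurjo result shows up even in four-flip sequences of a fair coin: averaged over sequences, the proportion of heads immediately following a head sits noticeably below 50%, with no hot or cold hand anywhere. A sketch:

```python
import random

random.seed(4)

# For each short sequence, compute the share of heads among the flips
# that immediately follow a head (skip sequences with no such flips).
def heads_after_heads(seq):
    follow = [seq[i + 1] for i in range(len(seq) - 1) if seq[i] == 1]
    return sum(follow) / len(follow) if follow else None

props = []
for _ in range(100_000):
    p = heads_after_heads([random.randint(0, 1) for _ in range(4)])
    if p is not None:
        props.append(p)

print(round(sum(props) / len(props), 3))  # about 0.405, not 0.5
```

Averaging these per-sequence proportions is precisely the step that builds in the downward bias; correcting for it is what revealed the modest genuine hot hand.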
a) The hot hand does not exist in any form in skill-based sports
b) The original studies confirming the hot hand were also flawed
c) The original studies denying the hot hand contained a mathematical bias; after correction, a modest genuine hot hand was found in skill tasks
d) The hot hand is large and explains most performance streaks in sports
Show Answer
**c) The original studies denying the hot hand contained a mathematical bias; after correction, a modest genuine hot hand was found in skill tasks** Miller and Sanjurjo found that conditioning on streaks within finite sequences introduces a selection bias that makes the hot hand artificially look absent. After correcting for this, they found evidence of a small genuine hot hand in basketball — not large enough to explain dramatic streaks, but real. This does not vindicate the original hot hand studies (option b), nor does it confirm a large effect (option d).

Q10. Pre-registration of a hypothesis means:
a) Publishing your hypothesis after you've seen the results
b) Stating your hypothesis and analysis plan before collecting or viewing the data
c) Getting approval from a journal before running a study
d) Registering your dataset with a public repository
Show Answer
**b) Stating your hypothesis and analysis plan before collecting or viewing the data** Pre-registration is a commitment device that prevents researchers (or analysts) from unconsciously adjusting hypotheses to fit data they've already seen. Post-hoc explanations of data — fitting a story after seeing results — are much more likely to produce false conclusions because the analyst has unconsciously tested many hypotheses and selected the best-fitting one.

Q11. Nadia posts content for 6 weeks and sees Wednesday posts outperforming others by 40%. Which action is most consistent with proper application of the law of large numbers?
a) Immediately restructure her entire content calendar around Wednesdays
b) Treat the finding as a hypothesis and run a controlled test over the next 3 months before changing strategy
c) Dismiss the finding entirely — 6 weeks can never tell you anything
d) Average the 6-week finding with a subjective "gut feeling" to make her decision
Show Answer
**b) Treat the finding as a hypothesis and run a controlled test over the next 3 months before changing strategy** The law of large numbers means that 6 weeks of volatile content data is too small to trust a specific pattern. The proper response is to treat the finding as a hypothesis worth testing systematically, not to immediately act (option a) or dismiss it entirely (option c). Averaging data with gut feelings (option d) is not a statistical method.

Q12. The Open Science Collaboration (2015) replication project found that approximately what fraction of 100 psychology studies replicated as statistically significant?
a) ~90%
b) ~60%
c) ~39%
d) ~10%
Show Answer
**c) ~39%** The Nosek et al. (2015) Open Science Collaboration paper found that only about 39 of 100 attempted replications were statistically significant, compared to 97 of the original studies. This was a striking demonstration of how widespread the replication problem was in psychology, largely attributable to underpowered original studies and publication bias.

Q13. When a sample is very small relative to the variance of what is being measured, patterns in that sample are:
a) Likely to be accurate reflections of the true underlying relationship
b) Dominated by signal rather than noise
c) Dominated by noise rather than signal, making them unreliable
d) More likely to be true positives than true negatives
Show Answer
**c) Dominated by noise rather than signal, making them unreliable** When variance is high relative to sample size, the law of large numbers hasn't had enough trials to average out the noise. Individual extreme observations dominate the estimate, and apparent patterns are largely coincidental rather than reflecting true relationships.

Q14. A fund manager achieves 3 years of above-average returns. Which statement about this track record is most accurate from a law of large numbers perspective?
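A base-rate sketch: give 1,000 managers zero skill, a coin flip against the market each year, and count how many post three straight above-average years (all numbers illustrative):

```python
import random

random.seed(5)

# Managers with no skill at all: each year they beat the market with
# probability 0.5. Count three-year winning streaks from luck alone.
managers = 1_000
lucky = sum(
    all(random.random() < 0.5 for _ in range(3))
    for _ in range(managers)
)
print(lucky)  # near 1000 * 0.5**3 = 125
```

Even with no skill anywhere, roughly one manager in eight shows a perfect three-year record.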
a) Three years is a long track record that provides strong evidence of skill
b) Three years of annual returns gives only 3 data points, which is a small sample in a high-variance environment
c) Market returns are normally distributed, so 3 years of data is statistically sufficient
d) Above-average returns over 3 years can only be explained by skill, not luck
Show Answer
**b) Three years of annual returns gives only 3 data points, which is a small sample in a high-variance environment** Annual returns are a small sample in a context with enormous variance. Many studies suggest you need 20+ years of returns data to distinguish genuine skill from luck in a high-variance investment environment. Three years tells you very little — a large number of managers will beat the market by chance over any 3-year window.

Q15. Which of the following best describes the practical advice the law of large numbers offers for decision-making?
a) Never act until you have perfect certainty from a large dataset
b) Treat small-sample patterns as hypotheses requiring further testing before irreversible action
c) Use your intuition to supplement small samples since data alone is never enough
d) The larger the claimed effect, the more data you need before acting