Chapter 8 Quiz: Regression to the Mean — Why Hot Streaks Cool Down

DataField.Dev

Q1. Francis Galton discovered regression to the mean by studying:

a) Card games and poker outcomes
b) Heights of parents and their adult children
c) Stock market returns across decades
d) Student test scores before and after interventions

**b) Heights of parents and their adult children** In 1886 Galton reported that the children of exceptionally tall or short parents tended to be closer to the population average than their parents were. He called this "regression toward mediocrity," although the mechanism is statistical rather than a biological force.

Q2. The true mathematical mechanism behind regression to the mean is:

a) Nature's tendency to balance extremes over time
b) Skill declining after periods of exceptional performance
c) Extreme observations contain an unusually large luck component that doesn't reliably repeat
d) The law of large numbers averaging out individual performance

**c) Extreme observations contain an unusually large luck component that doesn't reliably repeat** Regression to the mean occurs because observed performance = true ability + random luck. When performance is extreme, it is typically because luck was unusually favorable (or unfavorable) in addition to whatever the true ability level is. On the next observation, luck is likely to be more average, pulling the result back toward the mean. This requires no cosmic balancing force or declining skill.
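
The decomposition above can be checked with a small simulation. This is a minimal sketch, assuming ability and luck are independent standard normal values (an illustrative choice, not something the chapter specifies); all variable names are hypothetical.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

N = 20_000
# Assumed model: observed = true ability + luck, both standard normal.
ability = [random.gauss(0, 1) for _ in range(N)]
luck_1 = [random.gauss(0, 1) for _ in range(N)]
luck_2 = [random.gauss(0, 1) for _ in range(N)]  # fresh, independent luck

obs_1 = [a + l for a, l in zip(ability, luck_1)]
obs_2 = [a + l for a, l in zip(ability, luck_2)]

# Pick the top 5% of first observations (the "extreme" performers).
cutoff = sorted(obs_1)[int(0.95 * N)]
top = [i for i in range(N) if obs_1[i] >= cutoff]

top_obs_1 = mean(obs_1[i] for i in top)
top_luck_1 = mean(luck_1[i] for i in top)
top_ability = mean(ability[i] for i in top)
top_obs_2 = mean(obs_2[i] for i in top)

print(f"top group, first observation:  {top_obs_1:.2f}")
print(f"  of which luck:               {top_luck_1:.2f}")
print(f"  of which ability:            {top_ability:.2f}")
print(f"top group, second observation: {top_obs_2:.2f}")
```

Under these assumptions the top group's first score is inflated by a large positive luck term, while its second score sits close to the group's average ability. Nothing about ability changes between the two observations.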

Q3. If the correlation between first-season and second-season batting averages is r = 0.6, and a player was 2 standard deviations above the mean in their first season, what is the expected distance from the mean in their second season?

a) 2 standard deviations above the mean
b) 1.2 standard deviations above the mean
c) 0.8 standard deviations above the mean
d) 0.6 standard deviations above the mean

**b) 1.2 standard deviations above the mean** The regression-to-the-mean formula: expected position = r × (original position) = 0.6 × 2 = 1.2 standard deviations above the mean. The player is still expected to be above average — exceptional underlying talent persists — but they are likely to regress from their extreme first-season performance. Note that this is an expectation, not a certainty.
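
The arithmetic can be written as a one-line function. This is a sketch that assumes both seasons are expressed in standard deviations from the mean; the function name is hypothetical.

```python
def expected_second_z(r: float, first_z: float) -> float:
    """Expected second-period position (in SDs) given the first-period position."""
    return r * first_z

expected = expected_second_z(r=0.6, first_z=2.0)
regressed_by = 2.0 - expected  # how far the expectation moves back toward the mean

print(f"expected second-season position: {expected:.1f} SD above the mean")  # 1.2
print(f"expected amount of regression:   {regressed_by:.1f} SD")             # 0.8
```

Note that 0.8 (option c) is the expected amount of regression, not the expected position.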

Q4. The Israeli Air Force flight instructor example illustrates that:

a) Praise genuinely reduces pilot performance
b) Criticism is a more effective teaching tool than praise
c) Regression to the mean creates the illusion that criticism improves performance
d) Talented pilots regress to average regardless of feedback

**c) Regression to the mean creates the illusion that criticism improves performance** The instructors praised exceptional maneuvers, which were then followed by worse (regressed) performance. They criticized poor maneuvers, which were then followed by better (regressed) performance. In both cases, regression to the mean was the real cause. The instructors correctly observed the pattern but drew the wrong causal conclusion — that praise hurts and criticism helps.
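
The instructors' pattern appears even when feedback has no effect at all. The sketch below assumes each maneuver score is pure independent noise around a constant skill level, and uses hypothetical thresholds of ±1 for what earns praise or criticism.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Assumed model: constant skill (0) plus independent noise on every maneuver,
# so praise and criticism cannot influence the next score by construction.
scores = [random.gauss(0, 1) for _ in range(50_000)]

after_praise = []     # score change following an unusually good maneuver
after_criticism = []  # score change following an unusually poor maneuver
for prev, nxt in zip(scores, scores[1:]):
    if prev > 1.0:        # hypothetical praise threshold
        after_praise.append(nxt - prev)
    elif prev < -1.0:     # hypothetical criticism threshold
        after_criticism.append(nxt - prev)

avg_after_praise = mean(after_praise)
avg_after_criticism = mean(after_criticism)

print(f"average change after praise:    {avg_after_praise:+.2f}")
print(f"average change after criticism: {avg_after_criticism:+.2f}")
```

Scores drop after praise and rise after criticism, which is the same pattern the instructors saw, produced here by regression alone.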

Q5. When a sports team hires a new coach after a terrible season and then improves the following year, which of the following is the most important alternative explanation to the coach's effectiveness?

a) The players worked harder knowing the organization was watching
b) Regression to the mean: a terrible season contained unusual bad luck that would have improved regardless
c) The new players acquired in the offseason drove the improvement
d) The new coach changed the team culture, which takes one year to show results

**b) Regression to the mean: a terrible season contained unusual bad luck that would have improved regardless** While new acquisitions (option c) and culture changes (option d) may also contribute, the critical comparison for evaluating coaching effectiveness is: what would have happened without the coaching change? Teams that have terrible seasons tend to regress to the mean even without a coaching change, because terrible seasons are partly unlucky. Without a proper control group, you cannot separate the coach's effect from regression.

Q6. Regression to the mean is strongest when:

a) The correlation between first and second observations is high
b) The correlation between first and second observations is low
c) The sample size is large
d) The variance in performance is low

**b) The correlation between first and second observations is low** The regression formula predicts that the expected second performance is r × (first performance's deviation). When r is low (close to 0), regression is nearly complete — extreme first performances are followed by essentially average second performances. When r is high (close to 1), there is little regression — performance is highly stable across observations.
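
To see how the amount of regression depends on r, the sketch below simulates standardized pairs with a chosen correlation and measures how much of an extreme group's deviation survives into the second observation. The construction (second = r × first + scaled noise) and the function name are assumptions made for illustration.

```python
import math
import random
from statistics import mean

random.seed(2)  # fixed seed so the sketch is repeatable

def retained_fraction(r: float, n: int = 40_000) -> float:
    """Share of the top-5% group's first-period deviation that survives
    into the second period, for standardized scores with correlation r."""
    first = [random.gauss(0, 1) for _ in range(n)]
    # Second score = r * first + independent noise, scaled to keep unit variance.
    second = [r * z + math.sqrt(1 - r * r) * random.gauss(0, 1) for z in first]
    cutoff = sorted(first)[int(0.95 * n)]
    top = [i for i in range(n) if first[i] >= cutoff]
    return mean(second[i] for i in top) / mean(first[i] for i in top)

results = {r: retained_fraction(r) for r in (0.1, 0.5, 0.9)}
for r, kept in results.items():
    print(f"r = {r}: about {kept:.0%} of the extreme deviation is retained")
```

The retained share tracks r closely, so the share lost to regression is roughly 1 − r: nearly everything when r is low, very little when r is high.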

Q7. Marcus has three great months of startup revenue. Dr. Yuki warns him not to make irreversible decisions based on this. Her main concern is:

a) The startup might be breaking the law
b) Three months may contain unusual luck that is not representative of the true growth rate
c) School is more important than entrepreneurship for a 17-year-old
d) Investors will not trust three months of data

**b) Three months may contain unusual luck that is not representative of the true growth rate** Dr. Yuki's concern is that three extraordinary months probably contain an unusually large luck component, and regression to the mean predicts that the fourth month is more likely to fall back toward the typical level than to continue the trend. Making an irreversible decision (dropping to part-time school) on the basis of such a small and potentially lucky sample is the trap she is warning against.

Q8. Nadia's video goes viral with 450,000 views. Her next five videos average 3,000 views (her historical norm). The most accurate description of this pattern is:

a) The formula for her viral video failed because she couldn't replicate the luck component
b) The algorithm penalized her for not replicating the viral video's formula
c) She should completely change her content strategy
d) This is expected: the viral video's exceptional performance regressed to the mean because its success partly depended on unrepeatable luck

**d) This is expected: the viral video's exceptional performance regressed to the mean because its success partly depended on unrepeatable luck** Option (a) is partially correct (the luck component couldn't be replicated) but frames it as a failure rather than an expected statistical outcome. Options (b) and (c) assume causation that isn't warranted. The correct framing is that 450,000 views was an extreme observation containing an unusual luck component (algorithmic timing, a viral share chain), and subsequent performance regressed to her true baseline.

Q9. The "illusion of intervention effects" refers to:

a) Placebos that appear to work due to expectation
b) The tendency to credit or blame interventions for changes that would have happened due to regression
c) The belief that coaching always matters more than player talent
d) How organizational culture affects performance measurements

**b) The tendency to credit or blame interventions for changes that would have happened due to regression** When we intervene after extreme performances (bad grade → tutoring; terrible month → strategy change; bad maneuver → criticism), the natural regression of those extremes toward the mean is almost always attributed to the intervention. Without a control group that didn't receive the intervention, we cannot separate regression from genuine intervention effects.

Q10. The best way to distinguish regression to the mean from a genuine change in performance is:

a) Ask the person what changed in their approach
b) Look at performance across more than three observations
c) Compare to a control group that had similar performance without an intervention
d) Wait for a long enough period that regression would have fully occurred

**c) Compare to a control group that had similar performance without an intervention** The key to separating regression from genuine change is to ask: what would have happened without the change? A comparison group matched on performance level that did not receive the intervention answers this question. Self-reported changes (option a) are unreliable; more observations (option b) help but don't identify the cause; waiting (option d) doesn't provide the comparison needed.

Q11. A student scores in the bottom 5% on a standardized test and is then enrolled in an intensive tutoring program. After the program, they score at the 30th percentile. The correct conclusion is:

a) The tutoring program unambiguously improved performance
b) The improvement could be due to regression to the mean and cannot be attributed to tutoring without a control group
c) The student's true ability is at the 30th percentile
d) The program should be expanded to all students

**b) The improvement could be due to regression to the mean and cannot be attributed to tutoring without a control group** Students selected for tutoring because they scored in the bottom 5% are extreme cases. Extreme cases regress to the mean — their true ability is likely not as low as the bottom-5% score suggested. A retest without any tutoring would likely show some improvement just from regression. Without a control group (students who scored in the bottom 5% and received no tutoring), you cannot determine how much of the improvement is regression and how much is the tutoring's effect.
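
A simulated control group shows how regression and a real effect can be separated. This sketch assumes test score = stable ability + test-day noise, and builds in a hypothetical true tutoring effect of 0.3 SD purely for illustration; none of these numbers come from the chapter.

```python
import random
from statistics import mean

random.seed(3)  # fixed seed so the sketch is repeatable

N = 100_000
# Assumed model: test score = stable ability + test-day noise (both standard normal).
ability = [random.gauss(0, 1) for _ in range(N)]
test_1 = [a + random.gauss(0, 1) for a in ability]

# Select the bottom 5% on the first test, then split them at random.
cutoff = sorted(test_1)[int(0.05 * N)]
selected = [i for i in range(N) if test_1[i] <= cutoff]
random.shuffle(selected)
half = len(selected) // 2
tutored, control = selected[:half], selected[half:]

TUTORING_EFFECT = 0.3  # hypothetical true effect, in SD units

def retest(i: int, was_tutored: bool) -> float:
    """Second test: same ability, fresh noise, plus the effect if tutored."""
    boost = TUTORING_EFFECT if was_tutored else 0.0
    return ability[i] + boost + random.gauss(0, 1)

gain_tutored = mean(retest(i, True) - test_1[i] for i in tutored)
gain_control = mean(retest(i, False) - test_1[i] for i in control)
estimated_effect = gain_tutored - gain_control

print(f"gain with tutoring:    {gain_tutored:+.2f}")
print(f"gain without tutoring: {gain_control:+.2f}  (regression alone)")
print(f"estimated effect:      {estimated_effect:+.2f}")
```

Both groups improve substantially. Only the difference between them estimates the tutoring effect, and the untutored group's gain is the part regression would have delivered anyway.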

Q12. Which performance domain exhibits the weakest period-to-period correlation (and thus the strongest regression to the mean)?

a) Chess ratings in grandmasters
b) Early-stage startup monthly revenue
c) Adult height measured across consecutive years
d) Olympic 100m sprint times for elite athletes

**b) Early-stage startup monthly revenue** Early-stage startup revenue is extraordinarily volatile — small customer bases, lumpy deal cycles, and heavy dependence on a few relationships create massive month-to-month variance and weak autocorrelation. Chess ratings (option a) are designed to be stable measures of skill and change slowly. Adult height (option c) is essentially constant. Elite sprint times (option d) are highly consistent for athletes at peak condition.

Q13. Francis Galton's original term for regression to the mean was:

a) Statistical convergence
b) Mean reversion
c) Regression toward mediocrity
d) Ancestral regression

**c) Regression toward mediocrity** Galton called the phenomenon "regression toward mediocrity," reflecting his (ultimately incorrect) interpretation that a hereditary force pulls extreme traits back to average. "Regression to the mean" is now the standard term because it describes the statistical mechanism without implying a directional pull or the value judgment carried by "mediocrity."

Q14. A fund manager significantly outperforms the market for 3 years. According to regression-to-the-mean principles, the most important question to ask is:

a) What was their investment strategy during the 3 years?
b) How much of the outperformance was from luck in a high-variance environment versus genuine skill?
c) How much money did they manage during those 3 years?
d) What is the fund's expense ratio?

**b) How much of the outperformance was from luck in a high-variance environment versus genuine skill?** This is the central question regression to the mean forces us to ask. In a high-variance environment like markets, a lucky 3-year run is consistent with no skill at all. To evaluate whether the manager has genuine skill, you need to either observe them over many more periods or analyze the consistency and mechanism of their returns in ways that distinguish skill from luck.
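
A rough sense of how common a lucky run is can be had from a toy model. The sketch assumes a manager with no skill beats the market in any given year with probability 0.5, independently across years; this is a deliberate simplification (it ignores fees and the size of the outperformance), and the names are hypothetical.

```python
import random

random.seed(4)  # fixed seed so the sketch is repeatable

N_MANAGERS = 10_000
YEARS = 3

def beats_market_every_year() -> bool:
    """One no-skill manager: a fair coin flip each year, for YEARS years."""
    return all(random.random() < 0.5 for _ in range(YEARS))

streaks = sum(beats_market_every_year() for _ in range(N_MANAGERS))
share = streaks / N_MANAGERS

print(f"no-skill managers with a {YEARS}-year winning run: {share:.1%}")
print(f"theoretical share: {0.5 ** YEARS:.1%}")  # 12.5%
```

Under this assumption roughly one in eight skill-free managers posts a three-year winning run, so the run by itself says little about skill.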

Q15. Which statement best captures the practical wisdom of regression to the mean?

a) Never make decisions based on recent performance data
b) Hot streaks are fake and should be ignored
c) Treat extreme performance periods as data points worth noting but not as the new baseline for planning
d) Always wait for performance to decline before acting

**c) Treat extreme performance periods as data points worth noting but not as the new baseline for planning** Regression to the mean does not counsel ignoring data (option a) or dismissing hot streaks (option b) — a genuine breakout performance is informative. It counsels against treating extreme periods as the new normal for irreversible decisions. Option (d) is too passive; the right approach is to gather more data while maintaining current strategy, not to wait for decline.