Chapter 8 Quiz: Hypothesis Testing and Statistical Significance
Instructions: Answer all 25 questions. Each question has one best answer unless otherwise noted. After completing the quiz, check your answers against the answer key at the end.
Question 1
In the context of evaluating a sports bettor's record, the null hypothesis (H0) typically states that:
A) The bettor is skilled and has a genuine edge B) The bettor's win rate is equal to the breakeven rate C) The bettor has no predictive ability and results are due to chance D) The bettor's results are statistically significant
Question 2
A p-value of 0.03 means:
A) There is a 3% chance the null hypothesis is true B) There is a 3% chance the bettor has no skill C) If the null hypothesis were true, there is a 3% probability of observing results as extreme or more extreme than what was observed D) The bettor has a 97% chance of being skilled
Question 3
A Type I error in sports betting hypothesis testing occurs when:
A) You conclude a bettor is skilled when they actually are skilled B) You conclude a bettor is not skilled when they actually are skilled C) You conclude a bettor is skilled when they actually have no edge D) You fail to collect enough data to make a conclusion
Question 4
A bettor has won 270 out of 500 bets (54%). Under the null hypothesis of p = 0.50, what is the approximate standard error of the sample proportion?
A) 0.0100 B) 0.0224 C) 0.0500 D) 0.1000
Question 5
Which of the following sample sizes would typically be sufficient to detect a 53% true win rate as statistically significant (at alpha = 0.05 with 80% power) against a null of 50%?
A) 200 bets B) 500 bets C) 1,000 bets D) 3,000 bets
Question 6
The Bonferroni correction adjusts the significance level by:
A) Multiplying the p-values by the number of tests B) Dividing the significance level by the number of tests C) Taking the square root of the significance level D) Both A and B are equivalent valid descriptions
Question 7
A researcher tests 20 different betting strategies at the 5% significance level. If none of the strategies have a real edge, approximately how many would you expect to appear significant by chance?
A) 0 B) 1 C) 5 D) 10
Question 8
Statistical power is defined as:
A) The probability of rejecting H0 when H0 is true B) The probability of failing to reject H0 when H0 is false C) The probability of rejecting H0 when H0 is false D) The probability of failing to reject H0 when H0 is true
Question 9
A bettor has a 95% confidence interval for their win rate of [0.505, 0.575]. Which of the following is a correct interpretation?
A) There is a 95% probability their true win rate falls in this interval B) If they repeated their betting many times and constructed CIs each time, 95% of those intervals would contain the true win rate C) 95% of their future bets will fall within this range D) They are 95% confident they are a winning bettor
Question 10
The breakeven win rate for bets at -110 odds is approximately:
A) 50.0% B) 51.2% C) 52.4% D) 54.5%
Question 11
In a chi-squared goodness-of-fit test comparing observed versus expected betting outcomes across 6 categories, the degrees of freedom are:
A) 5 B) 6 C) 7 D) 12
Question 12
Which multiple testing correction method controls the False Discovery Rate (FDR) rather than the family-wise error rate (FWER)?
A) Bonferroni correction B) Holm-Bonferroni procedure C) Benjamini-Hochberg procedure D) Sidak correction
Question 13
A bettor checks their p-value after every 50 bets and stops as soon as they achieve p < 0.05. This practice:
A) Is a valid approach to sequential testing B) Inflates the Type I error rate above 0.05 C) Deflates the Type I error rate below 0.05 D) Has no effect on the Type I error rate
Question 14
Which of the following increases the power of a hypothesis test? (Select all that apply)
A) Increasing the sample size B) Increasing the significance level (alpha) C) The true effect size being larger D) All of the above
Question 15
A bettor has a record of 520 wins out of 1000 bets. The z-statistic for testing H0: p = 0.50 is:
A) 0.63 B) 1.26 C) 1.96 D) 2.53
Question 16
The "file drawer problem" in sports betting research refers to:
A) Researchers losing their data B) Studies with non-significant results not being published, creating a biased literature C) Bettors hiding their losing records D) Sportsbooks keeping their algorithms secret
Question 17
For a one-sided test at alpha = 0.05, the critical z-value is approximately:
A) 1.28 B) 1.645 C) 1.96 D) 2.33
Question 18
A Bayesian approach to evaluating a bettor differs from a frequentist approach primarily because:
A) It uses larger sample sizes B) It incorporates prior beliefs about the bettor's skill C) It always gives more significant results D) It does not require any data
Question 19
Two bettors each have a 55% win rate. Bettor A has 200 bets and Bettor B has 2000 bets. Which statement is most accurate?
A) Both provide equally strong evidence of skill B) Bettor A provides stronger evidence because the same win rate with fewer bets is more impressive C) Bettor B provides stronger evidence because the larger sample gives more precise estimation D) Neither provides strong evidence because 55% is barely above 50%
Question 20
When comparing the win rates of two independent groups (e.g., home favorites vs. road favorites), the appropriate test is:
A) Paired t-test B) One-sample z-test for proportions C) Two-sample z-test for proportions D) Chi-squared test for independence
Question 21
The Wilson score interval is preferred over the Wald interval for proportions because:
A) It is always narrower B) It has better coverage properties, especially for small samples or proportions near 0 or 1 C) It is easier to calculate D) It does not require any assumptions
Question 22
If a bettor needs 2,000 bets to demonstrate significance for a 53% true win rate, and they place 5 bets per week, approximately how long will it take?
A) About 2 years B) About 4 years C) About 7.5 years D) About 10 years
Question 23
The effect size in a test of betting skill is best described as:
A) The p-value B) The difference between the observed win rate and the null hypothesis win rate C) The sample size D) The confidence level
Question 24
When a sportsbook identifies a bettor as "sharp," they are essentially:
A) Committing a Type I error B) Rejecting the null hypothesis that the bettor has no edge C) Accepting the null hypothesis D) Performing a Bayesian analysis
Question 25
A researcher finds a p-value of 0.001 for a betting system that yields 50.5% wins over 100,000 games. The best interpretation is:
A) This is strong evidence of a profitable betting system B) This is strong evidence that the win rate differs from 50%, but the practical significance (effect size) is too small to be exploitable after vig C) The test must be incorrect because 50.5% is too close to 50% D) The researcher should use a smaller sample to get a more meaningful result
Answer Key
**Question 1: C** The null hypothesis assumes the bettor has no edge (no predictive ability). Observed results are attributed to chance until evidence suggests otherwise.
**Question 2: C** A p-value is the probability of observing data as extreme or more extreme than what was observed, *given that the null hypothesis is true*. It is NOT the probability that the null hypothesis is true.
**Question 3: C** A Type I error (false positive) occurs when you reject a true null hypothesis: you conclude the bettor is skilled when they actually have no edge.
**Question 4: B** SE = sqrt(p0 * (1 - p0) / n) = sqrt(0.50 * 0.50 / 500) = sqrt(0.0005) = 0.0224.
**Question 5: D** Under a normal approximation, detecting a 53% win rate against a null of 50% at alpha = 0.05 with 80% power requires roughly 1,700 bets for a one-sided test and roughly 2,200 for a two-sided test. Of the options, only 3,000 bets meets that requirement; 1,000 is too few.
**Question 6: D** The Bonferroni correction can equivalently be described as dividing the significance level by the number of tests (alpha_adj = alpha / m) or multiplying each p-value by the number of tests (p_adj = p * m) and comparing to the original alpha. Both formulations yield the same decisions.
**Question 7: B** Expected false positives = 20 * 0.05 = 1. On average, 1 out of 20 tests would appear significant by chance.
**Question 8: C** Power = P(reject H0 | H0 is false) = 1 - beta, where beta is the Type II error rate.
**Question 9: B** The frequentist interpretation of a confidence interval refers to the long-run coverage rate of the procedure, not the probability for a specific interval.
**Question 10: C** At -110 odds, you risk $110 to win $100. Breakeven = 110 / (110 + 100) = 110/210 = 0.5238, approximately 52.4%.
**Question 11: A** Degrees of freedom for a chi-squared goodness-of-fit test = number of categories - 1 = 6 - 1 = 5.
**Question 12: C** The Benjamini-Hochberg procedure controls the False Discovery Rate. Bonferroni and Holm-Bonferroni control the family-wise error rate.
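The arithmetic behind the answers to Questions 4, 5, and 10 can be checked with a short script. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the sample-size formula is the common normal-approximation formula for a one-sample proportion test, which is an assumption here rather than the chapter's stated method. Exact binomial tests or continuity corrections would give somewhat different figures.

```python
from math import ceil, sqrt
from statistics import NormalDist


def standard_error(p0, n):
    """Standard error of a sample proportion under the null value p0."""
    return sqrt(p0 * (1 - p0) / n)


def breakeven_rate(american_odds):
    """Win probability needed to break even at the given American odds."""
    if american_odds < 0:
        risk = abs(american_odds)          # e.g. risk 110 to win 100
        return risk / (risk + 100)
    return 100 / (american_odds + 100)     # e.g. risk 100 to win +150


def required_sample_size(p0, p1, alpha=0.05, power=0.80, two_sided=True):
    """Approximate n needed to detect true rate p1 against null rate p0.

    Assumes the normal approximation to the binomial (an illustrative
    choice, not necessarily the chapter's own calculation).
    """
    z = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = z.inv_cdf(1 - tail)
    z_beta = z.inv_cdf(power)
    numerator = (z_alpha * sqrt(p0 * (1 - p0))
                 + z_beta * sqrt(p1 * (1 - p1))) ** 2
    return ceil(numerator / (p1 - p0) ** 2)


if __name__ == "__main__":
    print(round(standard_error(0.50, 500), 4))       # Question 4
    print(round(breakeven_rate(-110), 4))            # Question 10
    print(required_sample_size(0.50, 0.53))          # Question 5, two-sided
    print(required_sample_size(0.50, 0.53, two_sided=False))
```

Under these assumptions the required sample size comes out near 2,200 bets for a two-sided test and near 1,700 for a one-sided test, so 1,000 bets falls short and 3,000 is comfortably enough.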
**Question 13: B** Optional stopping (checking and stopping at significance) inflates the Type I error rate because you are giving yourself multiple chances to cross the threshold by chance.
**Question 14: D** All three factors increase power: larger samples give more precise estimates, a higher alpha makes it easier to reject, and a larger true effect is easier to detect.
**Question 15: B** z = (p_hat - p0) / SE = (0.52 - 0.50) / sqrt(0.50 * 0.50 / 1000) = 0.02 / 0.01581 = 1.265, approximately 1.26.
**Question 16: B** The file drawer problem refers to publication bias: significant results are published while non-significant results remain "in the file drawer," creating a skewed literature.
**Question 17: B** For a one-sided test at alpha = 0.05, the critical z-value is 1.645. The value 1.96 corresponds to a two-sided test at alpha = 0.05.
**Question 18: B** The key distinction of Bayesian analysis is the incorporation of prior probabilities (prior beliefs about the parameter), which are updated with observed data to produce a posterior distribution.
**Question 19: C** Larger sample sizes produce more precise estimates and more powerful tests. Bettor B's 55% over 2000 bets provides much stronger evidence than the same rate over 200 bets.
**Question 20: C** When comparing proportions from two independent groups, use a two-sample z-test for proportions. A chi-squared test for independence (D) would also be valid and yields equivalent results for 2x2 tables.
**Question 21: B** The Wilson score interval has better coverage properties than the Wald interval, particularly for small samples or when the true proportion is near 0 or 1. The Wald interval can produce impossible values (below 0 or above 1).
**Question 22: C** 2,000 bets / 5 bets per week = 400 weeks = approximately 7.7 years.
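The z-statistic in Question 15, the critical values in Question 17, and the optional-stopping effect in Question 13 can be reproduced with the sketch below. It uses only the Python standard library. The function names, the simulation settings (1,000 simulated bettors, up to 1,000 bets each, a check every 50 bets), and the random seed are all illustrative assumptions, not values taken from the chapter.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_statistic(wins, n, p0=0.50):
    """One-sample z-statistic for a win proportion against null rate p0."""
    p_hat = wins / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def one_sided_p_value(z):
    """Upper-tail p-value: P(Z >= z) under the standard normal."""
    return 1 - STD_NORMAL.cdf(z)


def peeking_false_positive_rate(n_bettors=1000, max_bets=1000,
                                check_every=50, alpha=0.05, seed=0):
    """Share of no-edge bettors declared 'significant' when the p-value
    is checked every `check_every` bets and testing stops at the first
    p < alpha. Every simulated bettor truly wins 50% of the time."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_bettors):
        wins = 0
        for bet in range(1, max_bets + 1):
            wins += rng.random() < 0.5      # coin-flip bet, no real edge
            if bet % check_every == 0:
                if one_sided_p_value(z_statistic(wins, bet)) < alpha:
                    false_positives += 1
                    break                   # bettor stops at "significance"
    return false_positives / n_bettors


if __name__ == "__main__":
    z = z_statistic(520, 1000)
    print(round(z, 3), round(one_sided_p_value(z), 3))   # Question 15
    print(round(STD_NORMAL.inv_cdf(0.95), 3))            # one-sided critical z
    print(round(STD_NORMAL.inv_cdf(0.975), 3))           # two-sided critical z
    print(peeking_false_positive_rate())                 # Question 13
```

A record of 520 wins in 1,000 bets gives a one-sided p-value of about 0.10, so it does not reach significance at alpha = 0.05. In the simulation, the share of no-edge bettors who cross the threshold at some checkpoint should come out well above the nominal 5%, which is the inflation described in the answer to Question 13.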
**Question 23: B** The effect size measures the magnitude of the departure from the null hypothesis -- in this case, how far the true win rate is from the hypothesized value (e.g., 50% or 52.38%).
**Question 24: B** Identifying a bettor as sharp is analogous to rejecting the null hypothesis that they have no edge, based on the evidence of their betting record and patterns.
**Question 25: B** With 100,000 observations, even tiny deviations from 50% can be statistically significant. However, a 50.5% win rate is far below the breakeven threshold needed to overcome the vig, making it statistically significant but practically useless for profitable betting.
End of Chapter 8 Quiz