Chapter 14 Quiz: Self-Assessment

Instructions: Answer each question without looking back at the chapter. After completing all questions, check your answers against the key at the bottom. If you score below 70%, revisit the relevant sections before moving on to Chapter 15.


Multiple Choice

Q1. Overfitting occurs when a model:

a) Is too simple to capture the real patterns in the data
b) Captures noise along with signal and fails to generalize to new data
c) Has too few parameters to fit the training data
d) Performs poorly on both training and test data

Q2. The bias-variance tradeoff states that:

a) Models with high bias always outperform models with high variance
b) Increasing model complexity decreases both bias and variance simultaneously
c) Reducing bias (by adding complexity) tends to increase variance, and vice versa
d) Bias and variance are unrelated properties of a model

Q3. Underfitting occurs when:

a) A model memorizes the training data
b) A model is too complex for the available data
c) A model is too simple to capture the genuine patterns in the data
d) A model performs well on test data but poorly on training data

Q4. The term apophenia refers to:

a) A statistical technique for detecting overfitting
b) The tendency to perceive meaningful patterns in meaningless or random data
c) A form of regularization used in machine learning
d) The failure to recognize patterns that are genuinely present

Q5. Cross-validation helps detect overfitting by:

a) Training the model on all available data to maximize performance
b) Splitting data into portions and testing the model on data it was not trained on
c) Increasing the number of model parameters until training error reaches zero
d) Removing outliers from the dataset

Q6. The replication crisis in science is an example of overfitting because:

a) Scientists deliberately fabricate results
b) Many published findings are based on large, representative samples
c) Studies that produce significant results on small, noisy datasets often fail when tested on new, independent samples
d) The scientific method is inherently flawed and should be replaced

Q7. Researcher degrees of freedom refers to:

a) The number of researchers involved in a study
b) The many undisclosed choices researchers make during data collection and analysis that increase the flexibility to find spurious results
c) The freedom of researchers to choose their own topics
d) The statistical degrees of freedom in a chi-square test

Q8. Occam's razor functions as a regularization technique because it:

a) Eliminates all uncertainty from the model
b) Penalizes complex explanations, reducing the risk that the model is fitting noise
c) Guarantees that the simpler explanation is always correct
d) Increases the number of parameters available to the model

Q9. Which of the following is NOT a form of regularization discussed in the chapter?

a) Scientific peer review
b) Portfolio diversification
c) Increasing model complexity
d) Constitutional constraints on government

Q10. Superstition is described as a form of overfitting because:

a) Superstitious people are less intelligent than non-superstitious people
b) The human brain detects correlations between unrelated events in small samples and treats them as causal patterns
c) Superstitions are based on controlled scientific experiments
d) Superstitious beliefs never contain any element of truth

Q11. The hedge fund Long-Term Capital Management (LTCM) failed because:

a) Its models were too simple to capture market dynamics
b) Its founders lacked the necessary expertise in financial mathematics
c) Its models overfit to the stable market conditions of the mid-1990s and assigned near-zero probability to extreme events
d) It used too little leverage

Q12. Narrative overfitting in history refers to:

a) Writing overly long historical narratives
b) Constructing compelling causal narratives that explain the past perfectly but would fail if applied to different historical events
c) Refusing to offer any interpretation of historical events
d) Using only primary sources in historical research

Q13. Conspiracy thinking is related to overfitting because:

a) Conspiracy theories are always false
b) Conspiracy theorists use elaborate, high-flexibility models to connect coincidental events into patterns, without testing whether those patterns would hold in new data
c) Conspiracy theorists have access to more data than other people
d) Government agencies always tell the truth, making conspiracy theories unnecessary

Q14. The evolutionary explanation for human apophenia is that:

a) Humans evolved in environments where seeing too many patterns had no cost
b) In the ancestral environment, the cost of false negatives (missing real threats) was much higher than the cost of false positives (seeing threats that weren't there)
c) Ancient humans had more data to work with than modern humans
d) Evolution optimized the human brain for scientific accuracy

Q15. In the bias-variance tradeoff, irreducible error refers to:

a) The error caused by using the wrong model
b) The noise in the data that no model can eliminate
c) The error caused by having too few data points
d) The error that can be reduced by adding more parameters

Q16. Out-of-sample testing in machine learning is analogous to:

a) Memorizing answers to practice exams in education
b) Replication of scientific findings by independent researchers
c) Reading the same history book multiple times
d) Backtesting a trading strategy on the same data used to develop it

Q17. The risk of overfitting increases when:

a) The model has few degrees of freedom relative to the amount of data
b) The model has many degrees of freedom relative to the amount of data
c) The data is highly representative and abundant
d) The model is constrained by strong regularization

Q18. Publication bias contributes to overfitting in science because:

a) Journals publish too many negative results
b) Only positive results tend to be published, creating a biased sample of all studies conducted and inflating the apparent reliability of effects
c) Peer review eliminates all false positives
d) Published studies always use large sample sizes

Q19. The chapter argues that science is best understood as:

a) A method for eliminating all uncertainty
b) Systematic regularization -- institutional constraints designed to prevent the human brain's natural overfitting tendency from producing false knowledge
c) A domain where overfitting cannot occur
d) An alternative to pattern recognition

Q20. The chapter's central argument is best summarized as:

a) Overfitting is a problem unique to machine learning that has no relevance to other fields
b) The human brain is perfectly calibrated for pattern recognition and rarely makes errors
c) Overfitting is a universal failure mode of pattern recognition that occurs identically across machine learning, science, history, finance, superstition, and conspiracy thinking, and the bias-variance tradeoff is an inescapable feature of all learning from data
d) Simple models always outperform complex models


Short Answer

Q21. In two to three sentences, explain how the concept of "regularization" unifies Occam's razor, scientific peer review, portfolio diversification, and intellectual humility. What do all four have in common?

Q22. Describe a specific example from your own experience or knowledge (not from the chapter) where overfitting occurred or could have occurred. Identify the model, the training data, and the test data.

Q23. The chapter argues that the bias-variance tradeoff is inescapable. In your own words, explain why it is impossible to build a model that has both zero bias and zero variance. What fundamental feature of learning from finite data creates this constraint?


Answer Key

Q1: b) Captures noise along with signal and fails to generalize to new data. Sections 14.1-14.2 -- Overfitting is memorizing noise rather than learning signal, leading to poor generalization.

Q2: c) Reducing bias (by adding complexity) tends to increase variance, and vice versa. Section 14.8 -- The bias-variance tradeoff is an inescapable constraint: you cannot minimize both simultaneously.
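The tradeoff in Q2 can be made concrete with a small simulation. The sketch below is illustrative only, not from the chapter: the linear signal, noise level, and sample sizes are all assumed. It compares a rigid one-parameter model (predict the mean) against a maximally flexible one (predict the nearest training point); both carry error above the irreducible noise floor, but for opposite reasons.

```python
import random
import statistics

random.seed(0)

def true_f(x):
    return 2.0 * x  # assumed underlying signal for this toy demo

def sample_dataset(n=10, noise=1.0):
    xs = [random.uniform(0, 1) for _ in range(n)]
    ys = [true_f(x) + random.gauss(0, noise) for x in xs]
    return xs, ys

# Model A: always predict the mean of the training ys (rigid -> high bias, low variance).
def fit_mean(xs, ys):
    m = statistics.mean(ys)
    return lambda x: m

# Model B: predict the y of the nearest training x (flexible -> low bias, high variance).
def fit_nearest(xs, ys):
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def avg_test_error(fit, trials=200):
    """Average mean-squared error on fresh data, over many resampled training sets."""
    errs = []
    for _ in range(trials):
        xs, ys = sample_dataset()
        model = fit(xs, ys)
        tx, ty = sample_dataset(n=50)  # data the model never saw
        errs.append(statistics.mean((model(x) - y) ** 2 for x, y in zip(tx, ty)))
    return statistics.mean(errs)

print(avg_test_error(fit_mean))     # error dominated by bias
print(avg_test_error(fit_nearest))  # error dominated by variance
```

Neither model can drive its test error below the noise variance (1.0 here); they simply pay the remaining cost in different currencies, bias or variance.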

Q3: c) A model is too simple to capture the genuine patterns in the data. Section 14.2 -- Underfitting is the opposite of overfitting: too much bias, too little flexibility.

Q4: b) The tendency to perceive meaningful patterns in meaningless or random data. Section 14.5 -- Apophenia is the cognitive foundation of superstition and the human brain's built-in overfitting tendency.

Q5: b) Splitting data into portions and testing the model on data it was not trained on. Section 14.2 -- Cross-validation provides an honest estimate of generalization by testing on held-out data.
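The mechanics of Q5 fit in a few lines. This is a hedged pure-Python sketch of k-fold cross-validation (the dataset values and the mean-predictor "model" are made up for illustration); the essential point is that every point is scored only by a model that never trained on it.

```python
import statistics

def k_fold_splits(data, k):
    """Yield (train, held_out) pairs; each point is held out exactly once."""
    folds = [data[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, held_out

# A toy "model": predict every value as the mean of the training values.
# (A stand-in for any learner; the evaluation logic is what matters.)
data = [3.1, 2.9, 3.0, 3.2, 2.8, 3.1, 9.0, 3.0]  # made-up numbers, one outlier

fold_errors = []
for train, held_out in k_fold_splits(data, k=4):
    prediction = statistics.mean(train)          # "fit" on the training portion
    err = statistics.mean((y - prediction) ** 2 for y in held_out)
    fold_errors.append(err)

cv_estimate = statistics.mean(fold_errors)       # honest generalization estimate
print(round(cv_estimate, 3))
```

Training error on the full dataset would understate this number, because the model gets credit for noise (here, the outlier) it has already seen.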

Q6: c) Studies that produce significant results on small, noisy datasets often fail when tested on new, independent samples. Section 14.3 -- The replication crisis is overfitting at the level of entire scientific fields.

Q7: b) The many undisclosed choices researchers make during data collection and analysis that increase the flexibility to find spurious results. Section 14.3 -- Researcher degrees of freedom are the scientific equivalent of model parameters.

Q8: b) Penalizes complex explanations, reducing the risk that the model is fitting noise. Section 14.9 -- Occam's razor constrains model complexity, which is the definition of regularization.
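One way to see Q8's "razor as regularization" idea in code: score each candidate explanation by fit error plus a penalty per parameter, then keep the cheapest. The sketch below is an assumed toy formalization (the data, the 0.5 penalty weight, and the two candidate models are all made up), not the chapter's own method.

```python
import statistics

# Toy data: roughly linear with noise (made-up numbers).
xs = [0, 1, 2, 3, 4, 5]
ys = [0.1, 1.2, 1.9, 3.2, 3.9, 5.1]

def fit_constant(xs, ys):
    c = statistics.mean(ys)
    return (lambda x: c), 1            # model, number of parameters

def fit_line(xs, ys):
    # ordinary least squares for slope and intercept
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return (lambda x: intercept + slope * x), 2

def penalized_score(model, n_params, lam=0.5):
    mse = statistics.mean((model(x) - y) ** 2 for x, y in zip(xs, ys))
    return mse + lam * n_params        # fit error plus a complexity penalty

candidates = [fit_constant(xs, ys), fit_line(xs, ys)]
best = min(candidates, key=lambda mp: penalized_score(*mp))
print(best[1])  # parameter count of the chosen model
```

With this data the line wins despite its penalty, because the pattern is real; raising the penalty weight far enough flips the choice to the one-parameter model, which is the analogue of applying the razor more aggressively.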

Q9: c) Increasing model complexity. Section 14.9 -- Increasing complexity increases overfitting risk; regularization constrains complexity.

Q10: b) The human brain detects correlations between unrelated events in small samples and treats them as causal patterns. Section 14.5 -- Superstition is the brain overfitting to coincidences, as demonstrated by Skinner's pigeon experiment.

Q11: c) Its models overfit to the stable market conditions of the mid-1990s and assigned near-zero probability to extreme events. Section 14.6 -- LTCM's models were calibrated to a specific regime and failed when conditions changed.

Q12: b) Constructing compelling causal narratives that explain the past perfectly but would fail if applied to different historical events. Section 14.4 -- History is vulnerable because it has a sample size of one and high interpretive flexibility.

Q13: b) Conspiracy theorists use elaborate, high-flexibility models to connect coincidental events into patterns, without testing whether those patterns would hold in new data. Section 14.7 -- Conspiracy thinking uses unlimited degrees of freedom to fit noise as pattern.

Q14: b) In the ancestral environment, the cost of false negatives (missing real threats) was much higher than the cost of false positives (seeing threats that weren't there). Section 14.5 -- Asymmetric error costs led natural selection to calibrate the brain toward pattern over-detection.

Q15: b) The noise in the data that no model can eliminate. Section 14.8 -- Irreducible error is the third component of total error, beyond any model's control.

Q16: b) Replication of scientific findings by independent researchers. Section 14.9 -- Both test a model on data it was never exposed to during development.

Q17: b) The model has many degrees of freedom relative to the amount of data. Section 14.10 -- The ratio of degrees of freedom to data points is the key diagnostic for overfitting risk.
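Q17's ratio can be pushed to its extreme in a few lines. In this assumed toy setup (pure noise around a constant, numbers invented for the demo), a "model" with one parameter per data point achieves perfect training error yet generalizes worse than a single-parameter model.

```python
import random
import statistics

random.seed(1)

# Pure noise around a constant: there is no x-dependent signal to find.
train = [(i, 5.0 + random.gauss(0, 1)) for i in range(200)]
test = [(i, 5.0 + random.gauss(0, 1)) for i in range(200)]

# Flexible model: one parameter per data point -- a lookup table that
# memorizes every training y exactly (200 degrees of freedom, 200 points).
table = dict(train)
flexible = lambda x: table[x]

# Constrained model: a single parameter, the training mean.
mean_y = statistics.mean(y for _, y in train)
constrained = lambda x: mean_y

def mse(model, data):
    return statistics.mean((model(x) - y) ** 2 for x, y in data)

print(mse(flexible, train))    # exactly 0.0: the noise has been memorized
print(mse(flexible, test))     # large: memorized noise does not generalize
print(mse(constrained, test))  # near the irreducible noise level
```

Zero training error here is a symptom, not an achievement: when degrees of freedom match the data count, the fit tells you nothing about the next sample.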

Q18: b) Only positive results tend to be published, creating a biased sample of all studies conducted and inflating the apparent reliability of effects. Section 14.3 -- Publication bias is like evaluating a model only on examples it gets right.

Q19: b) Systematic regularization -- institutional constraints designed to prevent the human brain's natural overfitting tendency from producing false knowledge. Section 14.12 -- Science works not because scientists are immune to apophenia but because the method catches overfitting.

Q20: c) Overfitting is a universal failure mode of pattern recognition that occurs identically across machine learning, science, history, finance, superstition, and conspiracy thinking, and the bias-variance tradeoff is an inescapable feature of all learning from data. Sections 14.8, 14.13 -- The chapter's thesis is that overfitting is universal and the tradeoff is inescapable.

Q21. Sample answer: All four are constraints that reduce the flexibility of a model, theory, or belief system, sacrificing some ability to fit the current data in exchange for better performance on new, unseen data. Occam's razor penalizes explanatory complexity, peer review applies external skepticism to research claims, diversification prevents over-commitment to a single strategy, and humility holds beliefs tentatively pending further evidence. What they share is the structure of regularization: accepting a small increase in bias (missing some real patterns) to achieve a large decrease in variance (avoiding fitting noise).

Q22. Sample answer: A restaurant owner notices that revenue increased on days when she wore her blue jacket and decreased on days when she wore other colors. She concludes that the blue jacket is good for business and wears it every day. The "model" is "blue jacket causes higher revenue," the "training data" is a few weeks of observations, and the "test data" is the next few months. The correlation is almost certainly noise -- caused by day-of-week effects, weather, local events, or random variation -- and wearing the jacket every day will have no effect on revenue. This is overfitting to a tiny, noisy sample.

Q23. Sample answer: Zero bias would require a model complex enough to capture every real pattern in the data-generating process, but such a model would also be flexible enough to capture noise in any finite sample (high variance). Zero variance would require a model rigid enough to produce the same predictions regardless of which specific training sample it sees, but such rigidity would force the model to ignore real patterns that differ from its assumptions (high bias). The fundamental constraint is that any finite dataset is a mixture of signal and noise, and a model cannot distinguish perfectly between the two because it has no access to the true data-generating process -- only to the sample. More flexibility captures more signal but also more noise; less flexibility avoids noise but also misses signal. This is an inherent property of learning from limited data, not a limitation of any specific algorithm.