Chapter 26: Further Reading — Scientific Thinking and Evidence Evaluation
Philosophy of Science
1. Popper, Karl. (1959). The Logic of Scientific Discovery. Routledge. (Originally published 1934 in German)
Popper's foundational statement of the falsifiability criterion. He argues that scientific theories are distinguished not by their verification but by their falsifiability — the logical capacity to be refuted by observation. The book also contains Popper's philosophy of probability and his critique of inductivism. Dense but essential for anyone wanting to understand the conceptual foundations of scientific methodology. Parts I and II are most relevant to this chapter.
2. Kuhn, Thomas S. (1962). The Structure of Scientific Revolutions. University of Chicago Press.
One of the most influential philosophy of science texts of the twentieth century. Kuhn argues that science does not progress through the linear accumulation of knowledge or the straightforward application of Popperian falsification. Instead, science operates within "paradigms" — shared conceptual frameworks that define normal science, generate puzzles, and eventually (after accumulating anomalies) undergo "paradigm shifts." Kuhn's work explains why scientific revolutions (from Copernicus to Einstein) are so sociologically complex and why established scientific communities resist disconfirming evidence. Essential for understanding the human dimensions of scientific practice.
3. Lakatos, Imre. (1978). The Methodology of Scientific Research Programmes. Cambridge University Press.
Lakatos refined both Popper's falsificationism and Kuhn's paradigm concept. His "methodology of scientific research programmes" describes how science operates through networks of theories with a protected core and a flexible protective belt of auxiliary hypotheses. A research programme is progressive when it generates novel predictions that are confirmed; degenerative when it only accommodates anomalies post hoc. This framework helps evaluate whether a research tradition (like nutritional epidemiology) is progressive or degenerative.
The Replication Crisis
4. Open Science Collaboration. (2015). "Estimating the reproducibility of psychological science." Science, 349(6251), aac4716.
The landmark paper that brought the replication crisis to widespread attention. The Open Science Collaboration systematically attempted to replicate 100 published psychology studies, finding that only about 36% of replications produced statistically significant results, with replication effect sizes roughly half the magnitude of the originals. The paper includes detailed analysis of factors that predicted replication success. Essential primary reading for understanding the empirical scope of the crisis.
5. Ioannidis, John P. A. (2005). "Why most published research findings are false." PLOS Medicine, 2(8), e124.
Perhaps the most widely cited paper in the replication crisis literature. Ioannidis demonstrates mathematically that in research contexts characterized by low prior probability, many tests of many hypotheses, flexible designs, and modest sample sizes, the majority of published significant findings are false positives — even before any fraud or intentional manipulation. The paper is a bracing corrective to naive trust in published significant results.
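Ioannidis's core argument can be reduced to a Bayes-rule calculation of the positive predictive value (PPV): the probability that a statistically significant finding is actually true, given a field's prior probability of true hypotheses, typical statistical power, and significance threshold. A minimal sketch of that arithmetic follows; the parameter values are illustrative assumptions, not figures from the paper.

```python
def ppv(prior, power, alpha=0.05):
    """Probability that a statistically significant result is a true positive.

    PPV = (power * prior) / (power * prior + alpha * (1 - prior))
    """
    true_positives = power * prior       # true hypotheses correctly detected
    false_positives = alpha * (1 - prior)  # false hypotheses crossing the threshold
    return true_positives / (true_positives + false_positives)

# Exploratory field: 1-in-20 hypotheses true, underpowered studies.
print(round(ppv(prior=0.05, power=0.20), 2))  # 0.17 -- most "findings" are false
# Confirmatory field: half of hypotheses true, well-powered studies.
print(round(ppv(prior=0.50, power=0.80), 2))  # 0.94
```

Even before any fraud or flexible analysis, a low-prior, low-power field produces mostly false positives; adding bias only worsens the ratio.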
6. Simmons, Joseph P., Nelson, Leif D., and Simonsohn, Uri. (2011). "False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant." Psychological Science, 22(11), 1359-1366.
The paper that coined the term "researcher degrees of freedom." Simmons and colleagues demonstrated empirically that plausible-sounding analytical choices — collecting more data if initial results are non-significant, controlling for different covariates, excluding participants for reasonable-sounding reasons — could inflate the false positive rate from the nominal 5% to over 60%. This paper crystallized the methodological critique of flexible analysis practices.
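The mechanism is easy to demonstrate by simulation. The sketch below is an illustrative Monte Carlo, not the authors' original design: it tests a true null effect four independent ways and reports whichever analysis comes out significant, so the nominal 5% error rate roughly quadruples.

```python
import math
import random

def two_sample_p(a, b):
    """Crude two-sample z-test p-value (adequate for a simulation)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

random.seed(1)
trials, n, hits = 2000, 20, 0
for _ in range(trials):
    # Null world: both groups drawn from the same distribution, but the
    # "researcher" runs four separate analyses on the same question.
    ps = []
    for _ in range(4):
        a = [random.gauss(0, 1) for _ in range(n)]
        b = [random.gauss(0, 1) for _ in range(n)]
        ps.append(two_sample_p(a, b))
    if min(ps) < 0.05:  # report whichever analysis "worked"
        hits += 1
print(hits / trials)  # near 1 - 0.95**4, about 0.19 rather than 0.05
```

Four independent looks already quadruple the error rate; Simmons and colleagues show that realistic combinations of such choices push it past 60%.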
7. Nosek, Brian A., et al. (2022). "Replicability, robustness, and reproducibility in psychological science." Annual Review of Psychology, 73, 719-748.
A comprehensive overview of the replication crisis, its causes, and the reform efforts underway. Nosek is a co-founder of the Center for Open Science and a leading figure in the open science reform movement. This paper provides an authoritative and current assessment of where the field stands, what reforms have been implemented, and what evidence exists for their effectiveness.
Statistics and Research Methods
8. Goldacre, Ben. (2008). Bad Science: Quacks, Hacks and Big Pharma Flacks. Fourth Estate.
Accessible and entertaining introduction to the ways statistical reasoning is misused in medicine, public health journalism, and alternative medicine marketing. Goldacre covers p-values, confidence intervals, the distinction between relative and absolute risk, publication bias, and the peculiarities of the UK supplement and alternative medicine industries. Written for general audiences, it remains one of the best introductions to statistical literacy for non-specialists.
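The relative versus absolute risk distinction Goldacre stresses rewards a worked example. With invented figures, a treatment that halves a rare risk sounds dramatic in relative terms but modest in absolute terms:

```python
# Illustrative numbers only: a rare condition and a treatment that halves it.
baseline_risk = 0.002  # 2 in 1,000 develop the condition untreated
treated_risk = 0.001   # 1 in 1,000 develop it with treatment

relative_risk_reduction = (baseline_risk - treated_risk) / baseline_risk
absolute_risk_reduction = baseline_risk - treated_risk
number_needed_to_treat = 1 / absolute_risk_reduction

print(f"{relative_risk_reduction:.0%}")  # 50% -- the headline figure
print(f"{absolute_risk_reduction:.1%}")  # 0.1% -- the sober figure
print(f"{number_needed_to_treat:.0f}")   # 1000 people treated per case avoided
```

"Cuts your risk in half" and "helps one person in a thousand" describe the same data; health journalism almost always reports the former.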
9. Gelman, Andrew, and Loken, Eric. (2014). "The statistical crisis in science." American Scientist, 102(6), 460-465.
A short, accessible paper in which Andrew Gelman, one of the field's most respected statisticians, and Eric Loken explain "the garden of forking paths" — the ways in which apparently reasonable analytical choices, made in the absence of pre-specified analysis plans, inflate false positive rates. An essential complement to the Simmons et al. (2011) paper above.
10. Wasserstein, Ronald L., and Lazar, Nicole A. (2016). "The ASA statement on p-values: Context, process, and purpose." The American Statistician, 70(2), 129-133.
The American Statistical Association's official guidance on p-values — a remarkable institutional acknowledgment that p-values have been systematically misunderstood and misused. The statement and the accompanying collection of expert perspectives provide an authoritative overview of what p-values do and do not mean, and what alternative approaches deserve consideration. Freely available online.
Evidence-Based Medicine
11. Sackett, David L., Straus, Sharon E., Richardson, W. Scott, Rosenberg, William, and Haynes, R. Brian. (2000). Evidence-Based Medicine: How to Practice and Teach EBM (2nd ed.). Churchill Livingstone.
The foundational text of the evidence-based medicine movement. Covers the principles of evidence hierarchies, critical appraisal of RCTs and observational studies, systematic reviews, and applying evidence to clinical practice. Rigorous but accessible to students without medical training. The evidence-based medicine framework has since been applied far beyond medicine, influencing policy analysis, education research, and public health.
12. Cochrane, Archie. (1972). Effectiveness and Efficiency: Random Reflections on Health Services. Nuffield Provincial Hospitals Trust.
The slim book that launched the evidence-based medicine movement. Cochrane argued, decades ahead of his time, that medical interventions should be subjected to randomized controlled trials, that the medical profession had been providing many ineffective treatments, and that systematic reviews of RCTs should guide clinical practice. The Cochrane Collaboration, which now produces the most rigorously conducted systematic reviews in medicine, is named in his honor.
Science Communication and Media
13. Goldacre, Ben. (2012). Bad Pharma: How Drug Companies Mislead Doctors and Harm Patients. Fourth Estate.
A rigorous and damning analysis of how pharmaceutical companies manipulate clinical trial evidence — through selective publication, trial design choices, ghostwriting, and regulatory maneuvering — to misrepresent their products' efficacy and safety. Goldacre's analysis is scrupulously documented and illustrates how the principles of evidence evaluation can be systematically exploited by commercial interests. Particularly relevant to the discussion of conflicts of interest in Section 26.7.
14. Ioannidis, John P. A. (2018). "The challenge of reforming nutritional epidemiologic research." JAMA, 320(10), 969-970.
Ioannidis's critique of nutritional epidemiology, arguing that the field generates an enormous volume of research that is unreliable due to fundamental methodological limitations. This short piece in a major medical journal sparked a prominent debate about the standards of evidence in nutritional science and is directly relevant to Case Study 26.1.
15. Sumner, Petroc, et al. (2014). "The association between exaggeration in health related science news and academic press releases: Retrospective observational study." BMJ, 349, g7015.
The landmark study demonstrating that exaggeration in health news is primarily driven by exaggeration in institutional press releases rather than by journalists alone. Sumner and colleagues analyzed 462 press releases from UK research universities and matched news articles, finding that 40% of press releases contained claims more extreme than the papers they described. Highly relevant to Case Study 26.1 and to the broader discussion of science journalism.