Further Reading — Chapter 16

Self-Testing: The Most Powerful Learning Strategy Most Students Refuse to Use

This annotated bibliography provides resources for deeper exploration of the concepts introduced in Chapter 16. Sources are organized by tier following this textbook's citation honesty system.


Tier 1 — Verified Sources

These are well-known, widely available works that the authors are confident exist with the details provided.

Books

Brown, P. C., Roediger, H. L., III, & McDaniel, M. A. (2014). Make It Stick: The Science of Successful Learning. Harvard University Press.

Referenced throughout this textbook, Make It Stick remains the most accessible trade-book treatment of retrieval practice and self-testing as learning strategies. Chapters 2 and 3 are particularly relevant to this chapter's topics: Chapter 2 covers the testing effect in depth with compelling real-world examples, and Chapter 3 discusses the paradox of desirable difficulties — why strategies that feel ineffective (like self-testing) produce the best results. The book's central argument — that effortful retrieval is the key to durable learning — provides the foundation for everything in Chapter 16.

Dunlosky, J., & Metcalfe, J. (2009). Metacognition. SAGE Publications.

The most comprehensive academic textbook on metacognition, this volume includes extensive coverage of the relationship between self-testing, metacognitive monitoring, and study regulation. Chapter 6 covers the control function of metacognition — how students use (or fail to use) test results to adjust their studying. Chapter 7 discusses practical applications including self-testing schedules and the relationship between monitoring accuracy and academic performance. More technical than the treatment in this textbook but extremely thorough.

Carey, B. (2014). How We Learn: The Surprising Truth About When, Where, and Why It Happens. Random House.

Benedict Carey's accessible science journalism covers the testing effect, spacing, interleaving, and the pretesting effect in engaging narrative form. Chapter 4 ("Spacing") and Chapter 7 ("The Testing Effect") provide well-written summaries of the research behind self-testing, including the finding that self-testing works even for complex material and for long-term retention. A good companion to this chapter for students who want a more narrative treatment of the same research.

Research Articles and Reviews

Roediger, H. L., III, & Karpicke, J. D. (2006). "Test-enhanced learning: Taking memory tests improves long-term retention." Psychological Science, 17(3), 249-255.

This landmark paper demonstrated that taking a test on studied material produces superior long-term retention compared with spending the same time restudying, even when the extra study time allowed students to re-read the material multiple times. The elegant experimental design showed the testing effect in action: students who were tested once retained more after a week than students who studied the material four times. This paper is the most cited empirical demonstration of the testing effect and provides the core evidence for why self-testing works as a memory-strengthening strategy.

Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., & Willingham, D. T. (2013). "Improving students' learning with effective learning techniques: Promising directions from cognitive and educational psychology." Psychological Science in the Public Interest, 14(1), 4-58.

The most influential meta-review of learning strategies published in the last two decades. Dunlosky and colleagues evaluated ten common study strategies and rated them by evidence quality. Practice testing and distributed practice received the highest ratings ("high utility"), while strategies like rereading and highlighting received the lowest ("low utility"). This paper provides the comprehensive evidence base for this chapter's claim that self-testing is the most effective learning strategy available. At 55 pages, it's substantial, but the executive summary (pages 4-8) and the section on practice testing (pages 35-40) are essential reading.

Kornell, N., Hays, M. J., & Bjork, R. A. (2009). "Unsuccessful retrieval attempts enhance subsequent learning." Journal of Experimental Psychology: Learning, Memory, and Cognition, 35(4), 989-998.

The key paper on the pretesting effect. Kornell, Hays, and Bjork demonstrated that attempting to answer questions before studying the relevant material improved subsequent learning, even when participants got the pretest questions wrong. This finding — that failed retrieval still benefits learning — is counterintuitive and important. It provides the empirical foundation for the chapter's recommendation to test yourself before studying, not just after.

Adesope, O. O., Trevisan, D. A., & Sundararajan, N. (2017). "Rethinking the use of tests: A meta-analysis of practice testing." Review of Educational Research, 87(3), 659-701.

A comprehensive meta-analysis examining 272 independent comparisons across 118 studies of practice testing. The overall effect size was approximately 0.6 (a medium-to-large effect), and the benefit held across age groups, material types, and test formats. This paper provides the strongest quantitative evidence that practice testing is a robust, generalizable learning strategy — not just a laboratory effect. The section on moderating variables is particularly useful for understanding when and how practice testing is most effective.

Karpicke, J. D., & Blunt, J. R. (2011). "Retrieval practice produces more learning than elaborative studying with concept mapping." Science, 331(6018), 772-775.

A striking study published in Science demonstrating that retrieval practice (self-testing) outperformed concept mapping — itself an active, elaborative study strategy — for learning from text passages. Students who practiced retrieval retained 50% more than students who created concept maps, even on questions that required inference and transfer (not just recall). This paper is important because it demonstrates that self-testing isn't just better than passive strategies like rereading — it's better than many active strategies as well.


Tier 2 — Attributed Sources

These are findings and claims attributed to specific researchers or research traditions. The general claims are well established in the literature, but publication details beyond those given here have not been independently verified for this bibliography.

Research by Robert Bjork and Elizabeth Bjork on desirable difficulties and the distinction between storage strength and retrieval strength.

The Bjorks' extensive research program at UCLA provides the theoretical framework for understanding why self-testing works. Their distinction between storage strength (how well-encoded a memory is) and retrieval strength (how easily accessible it is right now) explains why retrieval practice is superior to re-study: re-study boosts retrieval strength temporarily (making material feel familiar) but does little for storage strength. Self-testing, by contrast, boosts storage strength precisely because it requires effortful retrieval. This framework underpins the central paradox discussed throughout this textbook: strategies that feel hard produce the most durable learning.

Research by Nate Kornell on the benefits of pretesting and the role of unsuccessful retrieval in learning.

Kornell's research program at Williams College has explored the conditions under which pretesting is most beneficial, including the timing between pretest and study, the type of material being learned, and the mechanisms by which failed retrieval enhances subsequent encoding. His work suggests that pretesting works partly by activating related knowledge (a "search" that primes the learning system) and partly by creating specific knowledge gaps that the learner is then motivated to fill.

Research by Henry Roediger III and colleagues at the Memory Lab, Washington University in St. Louis, on test-enhanced learning.

The Washington University Memory Lab, led by Roediger, has been the most prolific source of research on the testing effect over the past two decades. Their program of research has systematically explored the boundaries of test-enhanced learning: how many tests are optimal, what kinds of tests produce the biggest benefits, how testing interacts with spacing and feedback, and how the testing effect transfers to different types of assessment. Their consistent finding — that testing produces larger and more durable learning benefits than any other strategy — provides the empirical backbone for this chapter.

The Leitner system, developed by Sebastian Leitner (1972).

Sebastian Leitner, a German science journalist, popularized the spaced-repetition flashcard system that bears his name in his 1972 book So lernt man lernen ("How to Learn to Learn"). The system assigns flashcards to progressively spaced review intervals based on performance — cards answered correctly advance to longer intervals; cards answered incorrectly return to the most frequent interval. While Leitner was not a cognitive scientist and his system predates much of the formal research on spaced retrieval, his method intuitively captured the principles of expanding retrieval practice that researchers would later validate. The Leitner system remains one of the most practical and widely used methods for implementing spaced self-testing.
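The box-promotion logic described above can be sketched in a few lines of code. This is an illustrative simplification, not Leitner's original formulation: the number of boxes and the review schedule (box 1 every session, box 2 every second session, box 3 every fourth) are assumptions chosen to show the expanding-interval idea.

```python
NUM_BOXES = 3
# Assumed schedule: box 1 reviewed every session, box 2 every 2nd,
# box 3 every 4th — progressively longer intervals per the Leitner idea.
REVIEW_EVERY = {1: 1, 2: 2, 3: 4}

def due_boxes(session):
    """Boxes scheduled for review in this session (sessions count from 1)."""
    return [b for b, n in REVIEW_EVERY.items() if session % n == 0]

def review(boxes, session, answer_correctly):
    """Run one session; answer_correctly(card) -> bool simulates the learner."""
    for b in due_boxes(session):
        for card in list(boxes[b]):
            boxes[b].remove(card)
            if answer_correctly(card):
                # Correct: promote to the next (less frequent) box.
                boxes[min(b + 1, NUM_BOXES)].append(card)
            else:
                # Incorrect: back to the most frequently reviewed box.
                boxes[1].append(card)

# Usage: all new cards start in box 1.
boxes = {1: ["cue A", "cue B"], 2: [], 3: []}
review(boxes, session=1, answer_correctly=lambda c: c == "cue A")
# "cue A" is promoted to box 2; "cue B" returns to box 1.
```

The core property — correct answers earn longer gaps, errors earn more frequent review — is what digital tools like Anki automate with finer-grained interval algorithms.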

Research on the benefits of elaborative interrogation in flashcard design.

A body of research, including work by Pressley, McDaniel, and others, has demonstrated that asking "why" and "how" questions about factual material produces deeper encoding and better long-term retention than simply studying the facts. This research provides the basis for the chapter's recommendation to design "elaborative flashcards" that go beyond simple question-answer pairs to include explanations, examples, connections, and applications. The key finding is that the level of processing demanded by the self-test question determines the depth of learning produced.
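As a concrete illustration of the "elaborative flashcard" recommendation, the sketch below models a card that pairs the basic cue–answer with "why" prompts, an example, and a connection to prior knowledge. The structure and field names are my own illustration of the chapter's recommendation, not a format drawn from the research literature.

```python
from dataclasses import dataclass

@dataclass
class ElaborativeCard:
    """A flashcard that goes beyond a simple question-answer pair."""
    question: str
    answer: str
    why_prompt: str = ""   # "Why is this true?" — deeper processing
    example: str = ""      # prompt for a concrete application
    connection: str = ""   # prompt linking the fact to other material

    def prompts(self):
        """All non-empty retrieval prompts this card can generate."""
        extras = [p for p in (self.why_prompt, self.example, self.connection) if p]
        return [self.question] + extras

# Usage: one fact, multiple levels of processing.
card = ElaborativeCard(
    question="What is the testing effect?",
    answer="Retrieval practice produces better long-term retention than restudy.",
    why_prompt="Why does effortful retrieval strengthen memory more than rereading?",
    example="Name one study demonstrating the effect.",
)
```

Each extra prompt forces a deeper level of processing on the same fact, which is the mechanism the elaborative-interrogation research identifies.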


Tier 3 — General Recommendations

These are practical resources that complement the chapter's content but are not cited as primary sources.

Anki (ankiweb.net) — Free, open-source spaced repetition flashcard software.

Anki is the most widely used digital implementation of spaced repetition and the Leitner-style scheduling system discussed in this chapter. It's free on desktop and Android (paid on iOS), supports multimedia cards, and uses a sophisticated algorithm to schedule reviews at optimal intervals. Anki is popular among medical students, language learners, and anyone managing large volumes of material. The software automates the scheduling that the Leitner system does manually, making it practical for decks of hundreds or thousands of cards. Useful as a complement to this chapter's recommendations on building a sustainable self-testing system.

Quizlet (quizlet.com) — Popular flashcard platform with study modes.

Quizlet offers a more user-friendly (if less customizable) flashcard experience than Anki, with built-in study modes including a "Learn" mode that adapts to your performance. It's particularly useful for students who want to share flashcard sets with classmates or use pre-made sets. The main limitation for this chapter's purposes is that Quizlet's default modes often test recognition rather than recall — users should be deliberate about using "Write" mode (which requires typed answers) rather than "Match" mode (which tests recognition).

Oakley, B. (2014). A Mind for Numbers: How to Excel at Math and Science (Even If You Flunked Algebra). TarcherPerigee.

Barbara Oakley's popular book includes an accessible treatment of retrieval practice and self-testing, particularly for STEM subjects. Her "recall" technique — closing the book and trying to remember the key ideas from each page — is essentially the brain dump technique described in this chapter, applied at the page level. The book also discusses the emotional barriers to self-testing (the discomfort of confronting what you don't know) and provides strategies for pushing through them. A good companion read for students in math and science who want practical, encouraging guidance.


For Instructors

McDaniel, M. A., Anderson, J. L., Derbish, M. H., & Morrisette, N. (2007). "Testing the testing effect in the classroom." European Journal of Cognitive Psychology, 19(4-5), 494-513.

This paper examines the testing effect in actual classroom settings (as opposed to laboratory studies), demonstrating that frequent low-stakes quizzing in courses improves exam performance. Relevant for instructors who want to build self-testing opportunities into their course design, not just recommend them as individual study strategies.

Agarwal, P. K., & Bain, P. M. (2019). Powerful Teaching: Unleash the Science of Learning. Jossey-Bass.

A practical guide for K-12 and higher education instructors on implementing retrieval practice and spaced practice in the classroom. Includes specific techniques for incorporating self-testing into daily instruction, homework assignments, and assessments. Written by a cognitive scientist (Agarwal) and a veteran teacher (Bain), it bridges the research-practice gap effectively.