Capstone Project 1: The Learning Intervention Study
Design and Conduct a Mini-Experiment Testing a Learning Strategy
Project Overview
You've spent this entire book learning about what works and what doesn't when it comes to studying, practicing, and building knowledge. You've read the research. You've tried strategies on yourself. Now it's time to do something that most students never get to do: run your own experiment.
This capstone asks you to take one evidence-based learning strategy from this book, design a small-scale intervention study, collect real data from yourself and a handful of peers, analyze what happened, and write it up. You're not trying to publish in a journal. You're not trying to achieve statistical significance with a sample of four people. What you're doing is something more personal and arguably more valuable: you're experiencing firsthand what it means to move from "I read that this works" to "I tested whether this works, and here's what I found."
Along the way, you'll discover why research is harder than it looks, why anecdotes aren't data (but also why data without context isn't wisdom), and why the gap between knowing a strategy and implementing it faithfully is wider than anyone expects.
This project draws most heavily on the Category B (social-behavioral) skills you've developed throughout this book — designing fair comparisons, thinking about confounds, interpreting results with appropriate humility. But it also asks you to apply Category C (practical) skills from your progressive project and Category E (scientific) knowledge about how memory and cognition actually work.
💡 Why This Matters: Every time you read a headline saying "Study shows X improves learning by 30%," someone designed an intervention, recruited participants, controlled for confounds, collected data, and interpreted results. By doing this yourself — even at a tiny scale — you'll never read a learning study the same way again. You'll ask better questions. And you'll have a much deeper appreciation for what the evidence in this book actually means.
Learning Objectives
By completing this capstone project, you will be able to:
- Select and operationalize an evidence-based learning strategy, translating a general principle (e.g., "retrieval practice improves retention") into a specific, testable intervention
- Design a controlled comparison with at least a basic attempt to hold confounding variables constant
- Collect quantitative data (test scores, recall rates, confidence ratings) and qualitative data (participant reflections, difficulty ratings) using structured templates
- Analyze your results using basic descriptive statistics and honest interpretation
- Write a clear, honest research report that distinguishes between what you found and what you can confidently conclude
- Reflect metacognitively on both the content (what did the strategy do?) and the process (what did you learn about doing research?)
Detailed Instructions
Phase 1: Choose Your Strategy (Week 1)
Pick one evidence-based learning strategy from the book to test. The best choices are strategies with a clear mechanism and a straightforward way to measure outcomes. Here are your options, grouped by how well they lend themselves to a small-scale study:
Excellent Choices (Strong contrast, easy to measure)
| Strategy | Compare Against | Measure | Key Chapters |
|---|---|---|---|
| Retrieval practice (self-testing) | Rereading the same material | Quiz score after 2–7 days | Ch 7, 16 |
| Spaced practice (3 sessions over 6 days) | Massed practice (1 long session) | Quiz score after 1 week | Ch 3, 7 |
| Interleaving (mixing problem types) | Blocked practice (one type at a time) | Test on mixed problems | Ch 7, 10 |
| Elaborative interrogation ("Why does this make sense?") | Reading without self-explanation | Comprehension test | Ch 7, 12 |
Good Choices (Measurable, but requires more careful design)
| Strategy | Compare Against | Measure | Key Chapters |
|---|---|---|---|
| Dual coding (words + diagrams) | Text-only study | Recall + comprehension | Ch 9 |
| Delayed JOLs (predict performance after a delay) | Immediate confidence ratings | Calibration accuracy | Ch 13, 15 |
| Pretesting (attempt questions before studying) | Study without pretesting | Post-study quiz | Ch 10 |
| Cornell note-taking | Verbatim transcription | Recall after 48 hours | Ch 20 |
Avoid These (Too complex for a small study)
- Growth mindset interventions (too many confounds, effects are subtle and debated)
- Sleep manipulations (ethical concerns, too many uncontrolled variables)
- Motivation interventions (hard to measure, slow to show effects)
- Anything requiring more than 3 weeks of participant commitment
Your deliverable for Phase 1: A one-paragraph description of what you plan to test, why you chose it, and what you expect to find (your hypothesis). Write your hypothesis before you collect any data. This matters. It's the difference between science and storytelling.
Phase 2: Design Your Study (Week 1–2)
Now you need to turn your general idea into a specific plan. Answer each of these questions in writing:
1. Participants
   - Who are your participants? (Yourself + 2–3 peers minimum; 4–6 is ideal)
   - Are they roughly similar in background knowledge of the topic you'll use?
   - Have they agreed to participate? (Verbal consent is fine — this isn't an IRB submission, but basic ethics still apply. See the ethics note below.)
2. Materials
   - What will participants study? Choose a topic that is:
     - Unfamiliar to all participants (so prior knowledge doesn't swamp your results)
     - Rich enough to generate meaningful test questions
     - Divisible into comparable chunks if you're using a within-subjects design
   - Good material sources: a chapter from a textbook nobody in your group has read, a set of vocabulary words from a language nobody speaks, a collection of art history facts, anatomy terms, historical dates
   - Prepare your study materials in advance. Everyone should study the same content — only the strategy should differ.
3. Procedure
   - Between-subjects design (simpler): Half your participants use Strategy A, half use Strategy B, on the same material
   - Within-subjects design (more powerful with small samples): Each participant uses Strategy A on one set of material and Strategy B on a comparable set, then gets tested on both
   - How long will each study session be? (Keep it identical across conditions — 20–30 minutes works well.)
   - When will you test? (At least 48 hours after studying. Same-day tests don't tell you much about real learning.)
4. Measurement
   - Design your test before anyone studies. This keeps you honest.
   - Include at least 10 questions (more is better for reliability)
   - Mix question types if possible: recall (fill-in-the-blank), recognition (multiple choice), and application (use the concept in a new context)
   - Also collect confidence ratings (1–5) for each answer and a brief written reflection from each participant (see the scoring sketch after this list)
5. Controls
   - What are you holding constant? (Study time, materials, test difficulty, delay between study and test)
   - What can't you control? (Prior knowledge differences, motivation, sleep the night before, etc.)
   - Be honest about these limitations in your write-up. Acknowledging what you can't control is a sign of scientific maturity, not weakness.
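To make the measurement step concrete, here is a minimal scoring sketch in Python. It assumes each answer is logged as a (question type, correct, confidence) record; the names and data are illustrative, not a required format.

```python
from collections import defaultdict

# One record per question: (question_type, answered_correctly, confidence 1-5).
# Illustrative data only; substitute your own test results.
answers = [
    ("recall", True, 4),
    ("recall", False, 5),
    ("recognition", True, 3),
    ("application", False, 2),
]

def score_by_type(answers):
    """Tally correct/total counts per question type."""
    totals = defaultdict(lambda: [0, 0])  # question type -> [correct, total]
    for qtype, correct, _confidence in answers:
        totals[qtype][1] += 1
        if correct:
            totals[qtype][0] += 1
    return dict(totals)

print(score_by_type(answers))
# {'recall': [1, 2], 'recognition': [1, 1], 'application': [0, 1]}
```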
⚠️ Ethics Note: You're asking real people to participate in your study. Even in an informal classroom project, basic ethical principles apply:
- Informed consent: Tell participants what you're doing and why. They should know it's for a class project and that their data won't be shared with their names attached.
- Right to withdraw: Anyone can stop at any time without pressure.
- No deception: Never tell someone you're studying together when you're actually running an experiment on them. Be upfront.
- Respect their time: Keep sessions to the length you promised. Thank them.
- Share your results: After the study, tell participants what you found. If you discovered that one strategy was better, share that knowledge — they deserve to benefit too.
Phase 3: Collect Your Data (Week 2–4)
Run your study according to your plan. Keep a research journal noting:
- Anything that went differently than planned (someone showed up late, the study room was noisy, a participant already knew some of the material)
- Your own observations about participant behavior (did someone in the retrieval practice group seem frustrated? Did someone in the rereading group seem bored?)
- Any modifications you had to make and why
Data Collection Template:
Use this format (or adapt it) for each participant; a sketch for tallying the calibration cells follows the template:
Participant ID: _____ (use initials or numbers, not full names)
Condition: _____ (Strategy A or Strategy B)
Date of study session: _____
Duration of study session: _____ minutes
Date of test: _____
Delay between study and test: _____ days
Test Results:
Recall questions: _____ / _____ correct
Recognition questions: _____ / _____ correct
Application questions: _____ / _____ correct
Total score: _____ / _____ (_____ %)
Confidence Ratings:
Average confidence (1–5): _____
Calibration: _____ questions where confident AND correct
_____ questions where confident BUT incorrect
_____ questions where unconfident BUT correct
_____ questions where unconfident AND incorrect
Participant Reflection (brief):
"How did the study session feel?" _____
"How prepared did you feel for the test?" _____
"Would you use this strategy again? Why/why not?" _____
Phase 4: Analyze Your Results (Week 4–5)
You don't need fancy statistics. Here's what to do:
Step 1: Organize your data. Create a simple table comparing the two conditions.
| Participant | Condition | Total Score (%) | Avg Confidence | Calibration Accuracy |
|---|---|---|---|---|
| P1 | Retrieval Practice | 80% | 3.8 | 7/10 matched |
| P2 | Retrieval Practice | 75% | 3.2 | 6/10 matched |
| P3 | Rereading | 55% | 4.1 | 4/10 matched |
| P4 | Rereading | 60% | 3.9 | 5/10 matched |
Step 2: Calculate basic descriptive statistics (see the sketch below).
- Mean (average) score for each condition
- Range (highest and lowest scores) for each condition
- Mean confidence rating for each condition
- If you have enough data: median and standard deviation
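If you'd rather not do the arithmetic by hand, a few lines of Python's standard library cover everything in Step 2. This sketch uses the illustrative scores from the table above; substitute your own data.

```python
import statistics

# Total scores (%) per condition, taken from the example table above.
scores = {
    "Retrieval Practice": [80, 75],
    "Rereading": [55, 60],
}

for condition, values in scores.items():
    print(condition)
    print("  mean:  ", statistics.mean(values))
    print("  range: ", min(values), "to", max(values))
    print("  median:", statistics.median(values))
    if len(values) >= 2:  # stdev needs at least two data points
        print("  stdev: ", round(statistics.stdev(values), 1))
```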
Step 3: Look for patterns.
- Did the evidence-based strategy group perform better, as predicted?
- Was the difference large or small?
- Did the groups differ in confidence? (This is often the most interesting finding — the rereading group frequently feels more confident despite scoring lower. If you find this pattern, congratulations: you've just replicated one of the most robust findings in learning science.)
- Were there any surprises?
Step 4: Consider alternative explanations.
- Could prior knowledge differences explain the results?
- Could motivation or effort differences explain the results?
- Was your sample too small to draw conclusions? (It almost certainly was — and that's okay. Acknowledging this is part of the exercise.)
📊 A Note About Sample Size: With 3–6 participants, you cannot achieve statistical significance for most effects. That's not the point. The point is to practice the process of empirical investigation and to develop an intuition for what data can and cannot tell you. If your results go in the predicted direction, that's encouraging. If they don't, that's interesting. Neither outcome means your study "failed."
Phase 5: Write It Up (Week 5–6)
Your final write-up should be 2,500–3,500 words and follow this structure:
Write-Up Template
1. Introduction (400–600 words)
   - What strategy did you test and why?
   - What does the research literature say about this strategy? (Cite at least 3 specific findings from this textbook, with chapter references)
   - What was your hypothesis?
2. Method (500–700 words)
   - Participants: Who, how many, what were their relevant characteristics?
   - Materials: What did they study? How did you choose it?
   - Procedure: What exactly happened, step by step? (Write this clearly enough that someone else could replicate your study.)
   - Measures: What did you test and how?
3. Results (400–600 words)
   - Present your data in at least one table
   - Report the key numbers: means, ranges, differences between conditions
   - Include the confidence/calibration data — this is often the most revealing part
   - Describe any patterns you noticed
4. Discussion (500–800 words)
   - Did your results support your hypothesis? Be specific.
   - How do your findings compare to the published research you cited?
   - What are the limitations of your study? (Be thorough here — this section separates strong papers from weak ones.)
   - What surprised you?
   - If you could run this study again with unlimited resources, what would you do differently?
5. Personal Reflection (400–600 words)
   - What did you learn about the research process itself?
   - How has designing and conducting this study changed the way you read research findings?
   - What did this experience teach you about your own learning — beyond what the data showed?
   - How does this connect to the progressive project you've been building throughout the book?
6. References
   - Cite at least 3 chapters from this textbook
   - If you consulted any outside sources, cite those too
Timeline at a Glance
| Week | Phase | Key Deliverable |
|---|---|---|
| 1 | Choose strategy & begin design | Hypothesis paragraph |
| 1–2 | Finalize study design | Complete design document (answers to all 5 design questions) |
| 2–4 | Data collection | Completed data collection templates for all participants |
| 4–5 | Analysis | Data tables, descriptive statistics, pattern analysis |
| 5–6 | Write-up | Final 2,500–3,500 word paper |
Examples of Good vs. Weak Interventions
Understanding the difference between a well-designed study and a poorly designed one will save you significant time. Here are contrasting examples:
Example A: Strong Design
Question: Does retrieval practice produce better retention than rereading for unfamiliar scientific content?
Design: 6 participants, within-subjects. Each person studies two comparable passages about astronomy (one on stellar evolution, one on planetary formation — pre-tested for equal difficulty). For one passage, they reread it three times over 25 minutes. For the other, they read it once, then spend the remaining time doing free recall (writing everything they remember without looking) followed by checking their answers and recalling again. Which passage gets which treatment is counterbalanced (half the participants get retrieval practice on stellar evolution, half get it on planetary formation). Everyone takes the same 20-question test 5 days later — 10 questions on each passage.
Why it works: Same participants, same total study time, comparable materials, counterbalanced assignment, adequate delay before testing, clear measure.
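Counterbalanced assignment like this is easy to get right with a short script. Here is a minimal sketch under Example A's assumptions (six participants, two passages); the participant IDs are placeholders.

```python
import random

participants = ["P1", "P2", "P3", "P4", "P5", "P6"]
random.shuffle(participants)  # randomize before splitting into the two orders

# First half: retrieval practice on stellar evolution, rereading on planetary
# formation. Second half: the reverse assignment.
half = len(participants) // 2
for p in participants[:half]:
    print(p, "-> retrieval: stellar evolution | reread: planetary formation")
for p in participants[half:]:
    print(p, "-> retrieval: planetary formation | reread: stellar evolution")
```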
Example B: Weak Design
Question: Is interleaving better than blocking?
Design: I study math problems using interleaving for two weeks. My friend studies math problems using blocking for two weeks. We compare our grades on the next math exam.
Why it's weak: Different people (prior knowledge confound), possibly different courses or topics, no control for study time, too many uncontrolled variables, a single test score is unreliable, and no way to isolate the effect of the strategy from everything else happening in two weeks of life.
Example C: Decent Design with Honest Limitations
Question: Do delayed judgments of learning (JOLs) improve calibration compared to immediate JOLs?
Design: 4 participants, within-subjects. Each person studies 30 vocabulary terms from a foreign language. After studying the first 15, they immediately rate their confidence (1–5) for each term. After studying the second 15, they wait 24 hours and then rate their confidence. The next day (48 hours after the initial study session for both sets, and 24 hours after the delayed JOLs), everyone takes a recall test on all 30 terms. I compare calibration accuracy (confidence-performance match) for the immediate-JOL set vs. the delayed-JOL set.
Why it's decent: Within-subjects design, clear measure (calibration accuracy), same materials, adequate delay. Honest limitations: The two sets of vocabulary might not be equally difficult. The order effect (first 15 vs. second 15) is confounded with the JOL condition. With 4 participants, individual differences could swamp the effect.
✅ The takeaway: A good study doesn't have to be perfect. It has to be thoughtfully designed with its limitations honestly acknowledged.
Grading Criteria
Your work on this capstone will be evaluated across five dimensions:
| Dimension | Weight | What We're Looking For |
|---|---|---|
| Study Design | 25% | Clear hypothesis, appropriate strategy choice, reasonable controls, ethical treatment of participants, feasible within constraints |
| Data Collection & Analysis | 20% | Complete data for all participants, accurate calculations, appropriate use of descriptive statistics, data presented clearly in tables |
| Integration of Course Concepts | 20% | Meaningful connection to book content, accurate representation of the research literature, correct use of key terms, understanding of why the strategy should work (mechanism) |
| Intellectual Honesty | 20% | Forthright about limitations, doesn't overclaim, distinguishes between patterns and proof, acknowledges alternative explanations, honest about surprises and disappointments |
| Reflection Quality | 15% | Genuine metacognitive reflection on the research process, insight into how conducting research changes one's relationship to reading research, connection to personal learning system |
A note on "negative" results: If the evidence-based strategy does not outperform the comparison in your small study, that is absolutely fine. You will not be penalized for "wrong" results. What matters is how you interpret them. Did you consider why? Was your sample too small? Did implementation fidelity break down? Did confounds overwhelm the signal? Some of the most insightful papers in this capstone come from students whose results didn't go as expected, because those students are forced to think harder about what they found.
Connection to Your Learning Operating System
This capstone connects directly to the Learning Operating System you've been building throughout the book. By designing and running your own study, you're practicing the highest level of metacognition: you're not just monitoring your own learning — you're systematically investigating the mechanisms that drive learning itself.
When you finish this project, add a new section to your Learning Operating System document: "What I Know from My Own Evidence." Record your findings, your confidence in them, and how they've shaped the strategies you've chosen for your system. The best learning systems are updated not just from reading but from testing.
🔗 Cross-References:
- For a refresher on the strategies you might test, see Chapter 7 (The Learning Strategies That Work) and Chapter 10 (Desirable Difficulties)
- For guidance on calibration and confidence measurement, see Chapter 15 (Calibration)
- For research methods background, see Appendix A (Research Methods Primer)
- For templates you can adapt for data collection, see Appendix C (Templates & Worksheets)
- For the rubric that applies to all three capstone projects, see the Capstone Rubric document