Capstone Rubric
Comprehensive Assessment Guide for All Three Capstone Projects
How to Use This Rubric
This rubric covers all three capstone projects for Metacognition and the Science of Learning. It's organized in two sections:
- Common Criteria — four dimensions that apply to all three projects, evaluated the same way regardless of which capstone you chose
- Project-Specific Criteria — dimensions unique to each capstone, reflecting the distinct skills each project is designed to develop
Each dimension is evaluated on a 4-level scale:
| Level | Label | What It Means |
|---|---|---|
| 4 | Exemplary | Exceeds expectations. Demonstrates sophisticated understanding, exceptional execution, and genuine insight. Work at this level would serve as a model for future students. |
| 3 | Proficient | Meets expectations. Demonstrates solid understanding, competent execution, and meaningful reflection. This is the target for a well-prepared student who engages seriously with the project. |
| 2 | Developing | Approaches expectations but with notable gaps. Understanding is partial, execution has significant weaknesses, or reflection remains superficial. Shows effort but needs substantial improvement in key areas. |
| 1 | Beginning | Falls well below expectations. Understanding is minimal or contains major errors, execution is incomplete, or reflection is absent. May indicate insufficient engagement with the project or with the course material. |
💡 A note to students: This rubric is designed to be transparent and formative, not punitive. Read through it before you begin your project — not just when you're finished. The descriptors tell you exactly what we're looking for, which means they also tell you exactly what to aim for as you work. Think of this rubric as a coach, not a judge.
💡 A note to instructors: These rubrics are calibrated for a course that covers all 28 chapters of this textbook. If you're teaching a shorter version of the course, adjust the "Integration of Course Concepts" expectations accordingly. The reflection criteria should remain constant regardless of course length — deep reflection doesn't require having read every chapter.
Part 1: Common Criteria (All Three Capstones)
These four dimensions are evaluated identically across all three projects.
Criterion 1: Research Quality and Integration of Course Concepts
How well does the student understand and apply the learning science content from this book?
Exemplary (4)
- Demonstrates deep understanding of learning science principles, including mechanisms (not just labels)
- Cites specific research findings from at least 4–5 chapters, with accurate representation of what the evidence shows
- Uses key terms correctly and precisely throughout (e.g., distinguishes between retrieval practice and recognition, between spacing and interleaving, between calibration and confidence)
- Acknowledges nuance and complexity in the research (e.g., notes the growth mindset replication debate, distinguishes between desirable and undesirable difficulty, recognizes the difference between lab findings and real-world application)
- Connects multiple concepts from the book into an integrated understanding (e.g., explains how metacognitive monitoring enables better strategy selection, or how the testing effect and the spacing effect combine for compounding benefits)
- Goes beyond the textbook where appropriate, incorporating at least one outside source that adds genuine value
Proficient (3)
- Demonstrates solid understanding of relevant learning science principles
- Cites specific findings from at least 3 chapters, mostly accurate
- Uses key terms correctly in most instances
- Acknowledges some nuance (e.g., mentions limitations of certain studies or strategies)
- Shows connections between concepts from different parts of the book
- Relies primarily on the textbook, which is appropriate — outside sources optional
Developing (2)
- Demonstrates partial understanding; some concepts are correct but others are oversimplified or slightly inaccurate
- Cites findings from 1–2 chapters or cites more chapters but superficially (e.g., "Chapter 7 says retrieval practice works" without explaining why or how)
- Uses some key terms but occasionally misuses them or uses colloquial substitutes
- Little acknowledgment of nuance — presents research findings as more certain or simpler than they actually are
- Concepts from the book appear in isolation rather than as part of an integrated understanding
Beginning (1)
- Demonstrates minimal or inaccurate understanding of learning science
- Few or no specific citations from the textbook
- Key terms are absent, misused, or confused with one another
- No acknowledgment of nuance — may present myths as facts or oversimplify to the point of inaccuracy
- Little evidence that the student has engaged deeply with the course material
Criterion 2: Reflection Depth
How deeply and honestly does the student reflect on their own learning, process, and metacognitive growth?
Exemplary (4)
- Reflection goes beyond surface description to genuine analysis: not just "I learned a lot" but what specifically changed in their understanding and why
- Demonstrates metacognitive awareness about the project process itself — identifies moments where their thinking shifted, where they were surprised, where they struggled
- Connects the project experience to their ongoing Learning Operating System development with specific examples
- Honest about failures, limitations, and disappointments — treats setbacks as data, not just obstacles
- Shows evidence that the student's understanding of learning science deepened through the act of applying it (the protégé effect / generation effect in action)
- Reflects on how the project changed their relationship to reading research, helping others learn, or evaluating claims about learning
Proficient (3)
- Reflection includes both description and analysis
- Identifies at least 2–3 specific insights or moments of shifted understanding
- Connects the project to the Learning Operating System with at least one concrete example
- Acknowledges challenges honestly
- Shows some evidence of deepened understanding through application
Developing (2)
- Reflection is primarily descriptive rather than analytical — summarizes what happened but doesn't fully analyze what it means
- Insights are generic (e.g., "I learned that learning science is really useful") rather than specific
- Connection to Learning Operating System is mentioned but vague
- Challenges are acknowledged but not examined for what they reveal
Beginning (1)
- Reflection is minimal, absent, or purely summary
- No specific insights — reads as an obligation fulfilled rather than a genuine examination of the experience
- No connection to Learning Operating System or broader metacognitive development
- Challenges are either not mentioned or blamed on external factors without self-examination
Criterion 3: Writing and Communication Quality
How clearly, effectively, and appropriately does the student communicate their work?
Exemplary (4)
- Writing is clear, well-organized, and engaging
- Structure supports comprehension: sections flow logically, transitions guide the reader, key points are easy to identify
- Tone is appropriate for the context (academic rigor for research write-ups, accessibility for public-facing guides, warmth and specificity for coaching documentation)
- Data, tables, and evidence are presented clearly and accurately
- Grammar, spelling, and formatting are polished — the work has been revised, not just drafted
- Length is within the target range and every section earns its space (no padding, no rushed sections)
Proficient (3)
- Writing is clear and organized with minor issues
- Structure is logical and easy to follow
- Tone is mostly appropriate
- Data and evidence are presented clearly
- Minor grammar or formatting issues that don't impede understanding
- Length is within or close to the target range
Developing (2)
- Writing is understandable but disorganized or unclear in places
- Structure has gaps — some sections feel disconnected or out of order
- Tone is inconsistent or occasionally inappropriate (e.g., too casual for a research write-up, or too formal for a public guide)
- Data presentation is confusing or incomplete
- Multiple grammar, spelling, or formatting issues
- Length is significantly over or under the target range
Beginning (1)
- Writing is unclear, disorganized, or difficult to follow
- Structure is missing or ineffective
- Tone is inappropriate for the context
- Data is absent, inaccurate, or uninterpretable
- Pervasive grammar, spelling, or formatting issues that impede comprehension
- Length is far outside the target range, suggesting incomplete work
Criterion 4: Ethical Practice and Intellectual Honesty
Does the student demonstrate integrity, respect for others, and honest engagement with evidence?
Exemplary (4)
- Demonstrates thorough ethical awareness throughout the project
- Treats participants, clients, or audience members with genuine respect and care
- Distinguishes carefully between what the evidence supports and what it does not — no overclaiming
- Acknowledges limitations proactively and thoroughly, treating them as important features of the work rather than embarrassing footnotes
- Sources are cited accurately and completely
- If results were unexpected or disappointing, presents them honestly and explores what they might mean rather than hiding or explaining them away
- Privacy and consent are handled appropriately
Proficient (3)
- Demonstrates ethical awareness in most aspects of the project
- Treats others respectfully
- Mostly distinguishes between evidence and speculation
- Acknowledges major limitations
- Sources are cited
- Unexpected results are presented honestly
- Privacy and consent are addressed
Developing (2)
- Some ethical awareness, but gaps in practice (e.g., consent is mentioned but not clearly obtained; limitations are acknowledged but minimized)
- Occasional overclaiming — presenting small or ambiguous findings as more conclusive than warranted
- Sources are partially cited or cited inconsistently
- Unexpected results are acknowledged but not fully examined
Beginning (1)
- Minimal ethical awareness — ethical considerations are absent or treated as an afterthought
- Overclaiming is pervasive — draws strong conclusions from weak evidence
- Sources are not cited or are cited inaccurately
- Negative results are hidden, distorted, or blamed entirely on circumstances
- Privacy or consent may have been handled inappropriately
Part 2: Project-Specific Criteria
Each capstone has two additional dimensions that reflect the unique skills the project is designed to develop.
Capstone 1: The Learning Intervention Study
Criterion 5a: Study Design
How well did the student design a controlled, feasible, and meaningful intervention study?
| Level | Descriptor |
|---|---|
| Exemplary (4) | Hypothesis is clear, specific, and grounded in the research literature. The comparison is fair (same study time, comparable materials, appropriate controls). The design reflects awareness of common confounds and makes reasonable efforts to minimize them. Measurement tools are well-constructed (multiple question types, adequate number of items, confidence ratings included). The study is feasible within the 4–6 week timeframe and was actually conducted as designed (or thoughtful adaptations were documented). |
| Proficient (3) | Hypothesis is stated and reasonable. Comparison is mostly fair with some uncontrolled variables acknowledged. Measurement is adequate (at least 10 questions, at least one confidence measure). Study was conducted with minor deviations from the plan. |
| Developing (2) | Hypothesis is vague or weakly connected to the literature. Comparison has significant fairness issues (e.g., unequal study time between conditions, poorly matched materials). Measurement is limited (too few questions, no confidence measure, or test questions don't align with study material). Significant deviations from the plan without adequate documentation. |
| Beginning (1) | Hypothesis is absent or untestable. Comparison is not controlled in any meaningful way. Measurement is inadequate or absent. Study was not fully conducted or data is incomplete. |
Criterion 6a: Data Collection and Analysis
How completely and accurately did the student collect, organize, and interpret their data?
| Level | Descriptor |
|---|---|
| Exemplary (4) | Data is complete for all participants with no unexplained gaps. Results are organized in clear tables. Descriptive statistics are calculated correctly. Patterns are identified and described accurately. The calibration/confidence data is analyzed and interpreted (often the most interesting finding). Alternative explanations are considered thoughtfully. The student demonstrates clear understanding that a small sample cannot prove anything definitively but can reveal interesting patterns. |
| Proficient (3) | Data is mostly complete. Results are organized. Descriptive statistics are present and mostly correct. Key patterns are identified. Some alternative explanations are considered. Appropriate humility about sample size. |
| Developing (2) | Data has gaps or inconsistencies. Organization is unclear. Statistics may contain errors. Patterns are described superficially or inaccurately. Limited consideration of alternative explanations. May overclaim from small sample or dismiss unexpected findings. |
| Beginning (1) | Data is substantially incomplete. Little or no organization. Statistics are absent or incorrect. Analysis is minimal. No consideration of alternative explanations. Overclaims or shows no understanding of why a small sample limits conclusions. |
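As an illustration of the descriptive statistics and calibration analysis this criterion describes, here is a minimal sketch. The data, condition names, and 0–100 scales below are invented for illustration — your actual structure will depend on how you built your measurement tools.

```python
from statistics import mean

# Invented example data: per-participant test scores and confidence
# ratings (both 0-100) for two study conditions.
retrieval = {"scores": [72, 65, 80, 58], "confidence": [60, 55, 70, 50]}
rereading = {"scores": [61, 59, 66, 55], "confidence": [75, 80, 72, 78]}

for name, cond in [("retrieval practice", retrieval), ("rereading", rereading)]:
    score = mean(cond["scores"])
    conf = mean(cond["confidence"])
    # Calibration gap: mean confidence minus mean score. A positive gap
    # indicates overconfidence; a gap near zero indicates good calibration.
    print(f"{name}: mean score {score:.1f}, mean confidence {conf:.1f}, "
          f"calibration gap {conf - score:+.1f}")
```

In this invented example, the rereading group scores lower but reports much higher confidence (gap +16.0), while the retrieval group is underconfident (gap −10.0) — exactly the kind of calibration pattern the descriptor flags as "often the most interesting finding."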
Capstone 2: The Learning Myths Debunking Guide
Criterion 5b: Science Communication Effectiveness
How well does the public-facing guide translate complex scientific findings into accessible, accurate, and engaging content?
| Level | Descriptor |
|---|---|
| Exemplary (4) | The guide is genuinely accessible to the target audience — a non-expert could read/watch it and understand both what the evidence shows and why the myth is wrong. Scientific accuracy is maintained throughout; nothing is oversimplified to the point of inaccuracy. The chosen format is used skillfully (strong visual design for infographics, compelling narrative for blog posts, engaging pacing for video scripts). The opening hooks the audience immediately. Jargon is translated without being dumbed down — readers learn the real terms and what they mean. The tone strikes the right balance: confident about the evidence, humble about complexity, never condescending toward people who believe the myths. |
| Proficient (3) | The guide is accessible to the target audience with minor comprehension barriers. Accuracy is maintained in most cases. Format is used competently. Opening is engaging. Most jargon is translated. Tone is appropriate. |
| Developing (2) | The guide would be partially accessible to the target audience but includes sections that are confusing, overly technical, or oversimplified. Some accuracy issues. Format is underutilized (e.g., a blog post that reads like a research paper, or infographics that are text-heavy). Tone may be preachy, condescending, or too tentative. |
| Beginning (1) | The guide is not accessible to the target audience — reads like a course assignment rather than a public-facing resource. Significant accuracy issues. Format is poorly executed. Tone is inappropriate (dismissive, smug, or unclear). |
Criterion 6b: Counterargument Handling and Audience Testing
How well does the student anticipate resistance, address counterarguments, and iterate based on real audience feedback?
| Level | Descriptor |
|---|---|
| Exemplary (4) | Each myth is stated fairly and sympathetically — a believer would recognize their own view in the description. The reasons for each myth's persistence are explained psychologically (confirmation bias, identity investment, fluency illusion, authority endorsement). Nuances and grains of truth are acknowledged where they exist. The guide was tested on 3+ members of the target audience, feedback was collected systematically, specific revisions are documented and justified, and the reflective companion shows genuine learning about the challenges of science communication. |
| Proficient (3) | Most myths are stated fairly. Reasons for persistence are mentioned for most myths. Some nuance is acknowledged. Guide was tested on at least 3 audience members, feedback is reported, and at least some revisions are documented. |
| Developing (2) | Myths are sometimes strawmanned or stated dismissively. Reasons for persistence are mentioned for a few myths but not consistently. Limited nuance. Testing was conducted but feedback collection was informal or incomplete, or revisions based on feedback are vague. |
| Beginning (1) | Myths are consistently strawmanned or dismissed without sympathy. No analysis of why myths persist. No nuance. Testing was not conducted, or feedback was not collected, or no revisions were made. |
Capstone 3: Teach Someone Else to Learn
Criterion 5c: Assessment and Plan Design
How thoroughly did the student assess their client and how well did the coaching plan match the assessment findings?
| Level | Descriptor |
|---|---|
| Exemplary (4) | Assessment is thorough, covering strategies, beliefs, metacognitive awareness, and context. The student clearly listened to their client (interview notes show follow-up questions, not just checklist responses). Strategy audit is complete and accurate. Baseline measurement is appropriate and documented. The coaching plan directly addresses the assessment findings — interventions are justified by specific client needs, not by what the student found easiest to teach. Plan is realistic (2–3 focused interventions, not a complete overhaul). Theory of change is articulated: "Because my client struggles with X, I chose strategy Y, which should help because of mechanism Z." |
| Proficient (3) | Assessment covers the main areas. Interview shows genuine engagement. Strategy audit is mostly complete. Baseline is documented. Coaching plan connects to assessment findings. Plan is realistic. Theory of change is present if not fully articulated. |
| Developing (2) | Assessment is superficial or incomplete — some areas are covered but others are skipped. Interview feels formulaic. Strategy audit has gaps. Baseline may be missing or poorly chosen. Coaching plan has a weak connection to the assessment — interventions seem chosen based on the student's preferences rather than the client's needs. Plan may be overly ambitious or unfocused. |
| Beginning (1) | Assessment is minimal or absent. No meaningful baseline. Coaching plan shows little connection to the client's actual situation. Plan is unrealistic, generic, or shows little understanding of how to translate learning science into coaching practice. |
Criterion 6c: Implementation, Adaptation, and Ethical Practice
How effectively did the student implement the coaching plan, adapt to challenges, and handle the relationship ethically?
| Level | Descriptor |
|---|---|
| Exemplary (4) | All 4 weekly sessions were conducted (or reasonable accommodations are documented). Session descriptions are vivid and specific — the reader can picture what happened. The student demonstrates genuine skill in asking rather than telling, in normalizing difficulty, and in modeling metacognition. Adaptations are documented with clear reasoning (e.g., "My client found retrieval practice too frustrating without scaffolding, so in Week 2 I shifted to supported recall with answer-checking"). Ethical principles are respected throughout: consent is documented, confidentiality is maintained, boundaries are honored, the client's autonomy is respected. The coaching relationship ends in clear non-dependence — the client leaves with a plan for independence. |
| Proficient (3) | At least 3 of the 4 sessions were conducted. Descriptions include specific examples. The student shows effort to ask rather than tell. At least one adaptation is documented with reasoning. Ethical principles are mostly followed. Client has a path to independence. |
| Developing (2) | 2–3 sessions were conducted, or sessions are described in generic terms without specific examples. Coaching style leans heavily toward lecturing. Adaptations are made but not justified. Ethical principles are inconsistently followed (e.g., consent is mentioned but not clearly obtained). No clear plan for client independence. |
| Beginning (1) | Fewer than 2 sessions were conducted, or sessions are not documented. No specific examples. No meaningful adaptation. Ethical considerations are absent. The project reads as an obligation completed, not a coaching relationship conducted. |
Scoring Summary
How to Calculate Your Score
Step 1: Identify which capstone project you completed.
Step 2: Score yourself (or be scored) on the 6 applicable criteria:
| Criterion | All Projects? | Capstone 1 | Capstone 2 | Capstone 3 |
|---|---|---|---|---|
| 1. Research Quality & Concept Integration | Yes | Yes | Yes | Yes |
| 2. Reflection Depth | Yes | Yes | Yes | Yes |
| 3. Writing & Communication Quality | Yes | Yes | Yes | Yes |
| 4. Ethical Practice & Intellectual Honesty | Yes | Yes | Yes | Yes |
| 5a. Study Design | — | Yes | — | — |
| 5b. Science Communication Effectiveness | — | — | Yes | — |
| 5c. Assessment & Plan Design | — | — | — | Yes |
| 6a. Data Collection & Analysis | — | Yes | — | — |
| 6b. Counterargument Handling & Audience Testing | — | — | Yes | — |
| 6c. Implementation, Adaptation, & Ethics | — | — | — | Yes |
Step 3: Apply the weights from the individual capstone project descriptions to calculate a weighted score. (Each capstone project document specifies the weight for each dimension.)
Step 4: Convert to your institution's grading scale. As a general guideline:
| Weighted Average | Interpretation |
|---|---|
| 3.5–4.0 | Exceptional work — demonstrates mastery of both the content and the capstone skill |
| 3.0–3.4 | Strong work — meets all major expectations with minor areas for growth |
| 2.5–2.9 | Adequate work — meets some expectations but has significant areas for improvement |
| 2.0–2.4 | Below expectations — fundamental aspects of the project need substantial revision |
| Below 2.0 | Incomplete or fundamentally insufficient — consider revision and resubmission |
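The arithmetic in Steps 2–4 can be sketched in a few lines of Python. The criterion names and weights below are placeholders invented for illustration — use the actual weights specified in your capstone project document.

```python
# Sketch of the Step 3 weighted-average calculation.
# NOTE: the weights below are illustrative placeholders; the real weights
# are specified in each capstone project document.

def weighted_score(scores, weights):
    """Return the weighted average of per-criterion scores (each 1-4)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    total_weight = sum(weights.values())
    return sum(scores[c] * weights[c] for c in scores) / total_weight

# Example: a hypothetical Capstone 1 submission with placeholder weights.
scores = {"research": 3, "reflection": 4, "writing": 3,
          "ethics": 4, "design_5a": 3, "analysis_6a": 2}
weights = {"research": 0.20, "reflection": 0.20, "writing": 0.15,
           "ethics": 0.15, "design_5a": 0.15, "analysis_6a": 0.15}

print(round(weighted_score(scores, weights), 2))  # → 3.2
```

A weighted average of 3.2 would fall in the "Strong work" band of the table above.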
Instructor Notes
Calibration Guidance
To ensure consistent grading across sections and graders, here are some calibration notes:
The most common scoring error is inflating Criterion 2 (Reflection). Students often write lengthy reflections that feel deep but are actually generic. The key differentiator between Proficient (3) and Exemplary (4) is specificity. Does the student name a particular moment where their thinking shifted? Can they articulate what they now understand differently? Or are they saying "I learned so much from this experience" without demonstrating what "so much" actually means?
The second most common error is undervaluing Criterion 4 (Ethical Practice and Intellectual Honesty). Students who acknowledge limitations thoroughly should be rewarded, even if their projects produced less impressive results. A student who runs a flawed study and honestly analyzes why it was flawed is demonstrating more sophisticated thinking than a student who runs a slightly better study and claims strong conclusions.
For Capstone 1 specifically: "Negative" results (the evidence-based strategy did not outperform the comparison) should not be penalized. Evaluate the quality of the design, the honesty of the analysis, and the depth of the reflection — not whether the results matched the hypothesis. Some of the best student papers will have null or unexpected findings accompanied by thoughtful analysis.
For Capstone 2 specifically: Evaluate the final version of the guide (after audience feedback and revision), not the first draft. The draft-feedback-revise cycle is a core part of this project. Students who made substantial revisions based on genuine audience feedback should be evaluated on their revised work.
For Capstone 3 specifically: Do not evaluate based on whether the client showed measurable improvement. Four weeks is not enough time for many interventions to show effects, and client factors (motivation, life circumstances, baseline skill level) are outside the student's control. Evaluate the quality of the coaching process, the responsiveness of the adaptation, the depth of the relationship, and the honesty of the reflection.
Accommodations
- Students who cannot find a client for Capstone 3 (e.g., due to social anxiety, geographic isolation, or disability) may propose an alternative: coaching themselves through a new learning challenge and documenting the self-coaching process using the same framework. This alternative should be approved in advance and held to the same reflective standards.
- Students who cannot test their Capstone 2 guide on a live audience (e.g., due to time constraints or accessibility barriers) may submit the guide with a detailed hypothetical audience analysis and a written description of how they would incorporate feedback. This is acceptable but cannot be scored higher than Proficient (3) on Criterion 6b, since actual audience testing is a core skill the project develops.
- For all three capstones, word count targets are guidelines, not rigid requirements. Quality matters more than quantity. A brilliant 2,200-word paper should be scored higher than a padded 3,500-word paper. However, work substantially below the minimum (e.g., under 2,000 words) likely indicates insufficient depth and should be evaluated accordingly.
Self-Assessment Checklist
Before you submit your capstone, use this checklist to review your own work. This is a metacognitive exercise in itself — you're monitoring the quality of your own output, the same skill we've been building all semester.
For All Capstones
- [ ] I have cited specific findings from at least 3 chapters of this textbook
- [ ] I have used key terms correctly and consistently
- [ ] I have acknowledged limitations and nuances in the research where relevant
- [ ] My reflection includes specific examples, not just general statements
- [ ] My reflection connects to my Learning Operating System
- [ ] I have been honest about what didn't work as well as what did
- [ ] My writing is clear, organized, and within the target word count
- [ ] I have cited all sources
- [ ] I have treated everyone involved (participants, clients, audience) with respect
- [ ] I have not overclaimed — my conclusions match the strength of my evidence
Capstone 1 Additional
- [ ] My hypothesis was written before I collected data
- [ ] My study design includes clear controls and a fair comparison
- [ ] I designed the test before anyone studied (to avoid bias)
- [ ] I have complete data for all participants
- [ ] I included confidence/calibration data, not just test scores
- [ ] I have considered alternative explanations for my results
Capstone 2 Additional
- [ ] I have included at least 5 myths, with at least 3 from Chapter 8
- [ ] Each myth is stated fairly, not strawmanned
- [ ] I have explained why each myth persists
- [ ] I have provided evidence-based alternatives for each myth
- [ ] I tested my guide on at least 3 members of my target audience
- [ ] I revised my guide based on their feedback and documented what I changed
Capstone 3 Additional
- [ ] I conducted a thorough initial assessment with my client
- [ ] I established a baseline measurement before coaching began
- [ ] My coaching plan directly addresses my client's specific needs
- [ ] I conducted at least 3 of the 4 weekly sessions (or documented why I couldn't)
- [ ] I documented at least one adaptation I made to my plan and why
- [ ] I obtained informed consent and maintained confidentiality
- [ ] My client has a plan for continuing independently after coaching ended
🔗 Cross-References:
- For the full project descriptions, see Capstone 1 (The Learning Intervention Study), Capstone 2 (The Learning Myths Debunking Guide), and Capstone 3 (Teach Someone Else to Learn)
- For research methods background relevant to all three projects, see Appendix A (Research Methods Primer)
- For templates you can adapt, see Appendix C (Templates & Worksheets)
- For the core strategies referenced across all capstones, see Chapter 7 (The Learning Strategies That Work)
- For metacognitive monitoring skills that underpin all three projects, see Chapter 13 (Metacognitive Monitoring) and Chapter 15 (Calibration)