Chapter 10: The Evaluation Plan — Proving Your Project Will Produce Measurable Results

DataField.Dev

Prerequisites: Chapters 8, 9, 5

Learning Objectives

Explain why funders increasingly treat the evaluation plan as decisive
Construct a complete logic model linking inputs, activities, outputs, outcomes, and impact
Distinguish process from outcome evaluation and formative from summative
Write SMART objectives with indicators, targets, data sources, and methods
Choose data-collection and analysis methods appropriate to each indicator
Decide between internal and external evaluation and design a credible plan

In This Chapter

10.1 Why Evaluation Is Increasingly Decisive
10.2 The Logic Model, Built in Full
10.3 Process vs. Outcome Evaluation
10.4 SMART Objectives and Indicators
10.5 Data Collection and Analysis
10.6 Who Evaluates, and Formative vs. Summative
10.7 Evaluation for Research: The Analysis Plan
Spaced Review
Chapter Summary
Looking Ahead

There was a time when a grant proposal could promise to "serve the community" and "make a difference," and a funder would write the check on faith. That time is gone. Today, across nearly every sector, funders want to know not just what you will do but what will change because you did it — and how you will know. The evaluation plan is your answer: the section that proves your project will produce measurable results and that you have a credible way to find out whether it did. Once an afterthought tacked on at the end, the evaluation plan has become, for many funders, one of the most decisive sections in the proposal.

This chapter teaches you to build one reviewers trust. We will construct the logic model in full (the connective tissue you met in Chapter 5, now built piece by piece), distinguish the kinds of evaluation and when each applies, write SMART objectives with the indicators and data that make them real, choose methods to collect and analyze the data, and decide who should do the evaluating. By the end you will have a logic model and a set of measurable objectives for your own project — the proof that your promised outcomes are more than hopes.

Research proposals frame evaluation differently — as an analysis plan with endpoints and statistical power rather than a program evaluation with outcomes and indicators — so we treat the research version in its own section (10.7). But the core idea is universal: you must specify, in advance, how you will know whether the project worked.

A reassurance before we start, because evaluation intimidates non-specialists who fear it requires research-methods expertise they lack. It does not require you to be a statistician or a professional evaluator. It requires you to think clearly about a simple sequence of questions: What change am I trying to produce? How would I know if it happened? What would I measure, and how, and when? Could I tell my change apart from what would have happened anyway? An applicant who answers those questions honestly and specifically has written a sound evaluation plan, whatever their methodological background. For more rigorous designs you may bring in an evaluator (Section 10.6), but the core logic — outcomes, indicators, measurement, comparison — is accessible to anyone willing to think it through. This chapter gives you that logic; you supply the clear thinking about your own project.

10.1 Why Evaluation Is Increasingly Decisive

Understanding why evaluation has risen in importance tells you how to write it. Three forces have converged. First, funders are accountable — to boards, to Congress, to donors — for results, and they cannot demonstrate results from projects that did not measure them; a grantee who cannot report outcomes is a grantee who makes the funder look ineffective. Second, the evidence-based-practice movement has raised expectations across health, education, and social services: funders increasingly want to fund what works, which requires measuring whether it does. Third, in an era of scarce resources and rising competition, the ability to demonstrate impact is a competitive advantage — the proposal that can credibly promise to show its results beats the one that merely promises to do good.

The practical upshot: a strong evaluation plan signals that you are a serious, accountable partner who will produce the results-evidence the funder needs, and it makes your outcome promises (Chapters 6–7) credible. A weak or vague evaluation plan ("we will track our progress and report on our success") signals the opposite — that you have not thought about how you will know whether the project worked, which makes a reviewer doubt that it will.

🚪 Threshold Concept: A logic model is the spine that connects need, activities, and impact — and the evaluation plan is how you prove the spine holds. Funders no longer fund activity; they fund measurable change. Your evaluation plan is your promise to show whether that change happened. This reframes the whole section: it is not a bureaucratic requirement to satisfy but the mechanism by which you convert your project's promised outcomes from hopes into commitments you can be held to — which is exactly what makes a funder trust them. The applicant who embraces evaluation as the proof of impact, rather than enduring it as a chore, writes a section that wins.

This shift has a name in the sector: the move from outputs to outcomes, from counting activity to measuring change. It has reshaped grantmaking over the past two decades, and it explains why a section that older proposals could treat lightly now carries real weight. Funders adopted outcome thinking because they grew tired of funding projects that reported lots of activity and no demonstrable difference — and once a funder thinks in outcomes, an output-only evaluation plan reads as a relic, a sign that the applicant has not kept up with how serious funders now operate. Writing a genuine outcome evaluation is, in part, a signal that you are a current, sophisticated partner who speaks the funder's modern language.

🗣️ From the Review Panel: I have watched the evaluation section move from the back of my mind to the front over my years of reviewing. Early on, I skimmed it. Now it is often where I decide between two otherwise-comparable proposals, because it tells me which applicant has actually thought about whether their project will work, not just whether it will happen. When I read "we will measure success through participant satisfaction surveys," I worry — satisfaction is not impact, and the vagueness suggests they haven't thought it through. When I read a clear logic model, SMART objectives with specific indicators and targets, and a credible plan to collect and analyze the data, I trust that this applicant will deliver results I can point to. In a competitive round, that trust is often the deciding factor.

There is also a quieter reason the evaluation plan matters, one that serves you beyond the proposal: the evaluation you design now is the data you'll report later (Chapter 26), and good outcome data becomes the evidence base for your next proposal. A project that measures its outcomes well produces, at its end, exactly the preliminary data or track-record evidence that makes the next grant easier to win (Chapter 9). A project that measures only outputs ends with nothing to show but activity, and has to make its next case from scratch. So a rigorous evaluation is an investment in your funding future, not just a requirement of the present grant — the well-evaluated project compounds into the next one, while the poorly-evaluated project leaves you starting over. Funders, who think in terms of long relationships (Chapter 3), can sense which kind of grantee they're funding.

10.2 The Logic Model, Built in Full

🧩 Productive Struggle: Before reading, try to articulate the difference between an output and an outcome for a project you know. A tutoring program: is "200 students tutored" an output or an outcome? What about "reading scores improved"? Most people sense the distinction but struggle to state it crisply. Hold your attempt against the definitions below — the output/outcome line is the single most important distinction in this chapter, and the place beginners most often blur doing into achieving. If you can reliably sort your project's measures into the two columns, you have most of what the logic model requires.

You met the logic model in Chapter 5 as the connective tissue of the whole proposal. Now we build it in full, because the evaluation plan is where it lives and does its hardest work. A logic model lays out, in sequence, the causal chain of your project:

Inputs → Activities → Outputs → Outcomes → Impact

Inputs are the resources you put in: funding, staff, partners, materials, facilities. (These map to your budget and capacity.)
Activities are what you do with the inputs: the program, services, or research you carry out. (These map to your approach.)
Outputs are the direct, countable products of the activities: number of people served, sessions held, materials distributed, samples analyzed. Outputs measure what you did, not yet what changed.
Outcomes are the changes that result — in knowledge, behavior, condition, or status — usually divided into short-term, intermediate, and long-term. Outcomes measure what changed as a result of your activities. This is what funders most care about.
Impact is the long-term, broad change you contribute to: reduced disease burden, a more skilled workforce, a stronger community. (This maps to your significance — the "why it matters.")

The crucial distinction, and the one beginners most often miss, is outputs versus outcomes. An output is "we held 30 workshops and served 200 people"; an outcome is "participants' knowledge increased, and 60% changed their behavior." Outputs are necessary to report but are not what the funder is buying — a project can have impressive outputs (lots of activity) and zero outcomes (nothing actually changed). Funders fund outcomes; outputs are merely evidence that the activities happened. A logic model that stops at outputs, or blurs outputs into outcomes, signals that the applicant has confused doing with achieving.

📋 Template (logic model):

| Inputs | Activities | Outputs | Outcomes (short → long) | Impact |
|---|---|---|---|---|
| [funding, staff, partners, materials] | [the program/services/research you do] | [counts: served, held, produced] | [changes: knowledge → behavior → condition] | [broad long-term change] |

Build it left to right, then check it right to left: does each impact follow from outcomes you'll produce? Do those outcomes follow from your activities? Are the activities supported by your inputs? Any broken link is a flaw a reviewer will see — and a flaw in your actual project logic, not just its description.

🔍 Why Does This Work?: Why does the logic model persuade and discipline at once? Because it forces the causal logic of your project into the open, where both you and the reviewer can inspect it. Most weak projects have a hidden gap between activities and outcomes — an unstated assumption that "if we do X, good things will happen" — and the logic model exposes it: when you try to fill in the outcomes column, you discover whether your activities plausibly produce the changes you promise. If the chain from activities to outcomes to impact holds up, you have a fundable theory of change; if it has a gap (your activities can't actually produce the outcomes you claim), the logic model shows you before the reviewer finds it. It is, as Chapter 5 said, a coherence X-ray — and the evaluation plan is where you put it to work.

Here is RYCC's logic model, built in full (composite). Inputs: $50,000 grant, two part-time instructors, a coordinator, partner-school space, donated laptops, the curriculum. Activities: twice-weekly after-school coding classes at three schools, 30 weeks, following a sequenced curriculum, plus family-engagement events. Outputs: 90 students enrolled; ~5,400 student-contact-hours delivered; three sites operating; 90 families engaged. Outcomes: short-term, measurable gains in coding proficiency and increased confidence/interest in technology; intermediate, students enrolling in high-school CS courses; long-term, students progressing toward technology education and careers. Impact: a narrowed neighborhood digital-skills gap and expanded youth opportunity (the funder's mission). Now read it right-to-left as a reviewer would: the impact (opportunity) follows from the long-term outcome (progression to tech pathways), which follows from the intermediate outcome (enrolling in HS CS), which follows from the short-term outcome (proficiency gains), which follows from the activities (the classes), which the inputs (funding, staff, space) support. Every link holds, and notice that the outputs (90 served, 5,400 hours) are clearly distinguished from the outcomes (proficiency gains, progression). A reviewer can see at a glance both that the project's logic is sound and that RYCC knows the difference between activity and achievement.
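If it helps to see the chain as a structure rather than a paragraph, the sketch below holds RYCC's model in a small Python class and performs the crudest possible right-to-left check: is any stage left blank? The class name `LogicModel`, the method `empty_stages`, and the one-hour class length are assumptions made for illustration (the text gives only the 5,400 total), and an empty-stage check is no substitute for judging whether each link is plausible.

```python
from dataclasses import dataclass


@dataclass
class LogicModel:
    """One list of entries per stage of the causal chain."""
    inputs: list[str]
    activities: list[str]
    outputs: list[str]
    outcomes: dict[str, list[str]]  # keyed by time horizon
    impact: list[str]

    def empty_stages(self) -> list[str]:
        """Walk the chain right to left and name any stage left blank."""
        all_outcomes = [o for group in self.outcomes.values() for o in group]
        stages = [
            ("impact", self.impact),
            ("outcomes", all_outcomes),
            ("outputs", self.outputs),
            ("activities", self.activities),
            ("inputs", self.inputs),
        ]
        return [name for name, entries in stages if not entries]


# Output arithmetic. HOURS_PER_CLASS is an assumption, not stated in the text.
STUDENTS = 90
CLASSES_PER_WEEK = 2
WEEKS = 30
HOURS_PER_CLASS = 1
contact_hours = STUDENTS * CLASSES_PER_WEEK * WEEKS * HOURS_PER_CLASS

rycc = LogicModel(
    inputs=["$50,000 grant", "two part-time instructors", "coordinator",
            "partner-school space", "donated laptops", "curriculum"],
    activities=["twice-weekly coding classes at three schools, 30 weeks",
                "family-engagement events"],
    outputs=["90 students enrolled", f"{contact_hours:,} student-contact-hours",
             "three sites operating", "90 families engaged"],
    outcomes={
        "short-term": ["coding proficiency gains", "confidence/interest in technology"],
        "intermediate": ["enrollment in high-school CS courses"],
        "long-term": ["progress toward technology education and careers"],
    },
    impact=["narrowed digital-skills gap", "expanded youth opportunity"],
)

print(contact_hours)        # total student-contact-hours under the assumption above
print(rycc.empty_stages())  # an empty list means no stage was left blank
```

Keeping outputs and outcomes in separate fields is the point of the exercise: a measure has to be filed under one or the other, which is the sorting discipline this section asks for.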

10.3 Process vs. Outcome Evaluation

A complete evaluation plan typically includes two kinds of evaluation, answering two different questions:

Process (or implementation) evaluation asks: Did you do what you said you would do? It tracks whether the activities happened as planned, reached the intended population, and were delivered with fidelity and quality. It measures outputs and implementation. Process evaluation matters because an outcome failure could mean either that the program doesn't work or that it was never properly delivered — and you cannot tell which without process data. ("We saw no improvement" means something very different if attendance was 90% than if it was 20%.)
Outcome (or impact) evaluation asks: Did it work? It measures whether the intended changes — the outcomes — actually occurred. This is what funders most want to know, and it is the harder, more important half.

A strong plan includes both: process evaluation to confirm the program was delivered as intended, and outcome evaluation to determine whether it produced the promised changes. Together they let you tell the full story: not just "we did the work and outcomes improved" but "we delivered the program with fidelity to the intended population, so the outcome improvements can plausibly be credited to it." Fidelity data rules out one alternative explanation (that the program was never really delivered); ruling out what would have happened anyway takes a comparison, covered in Section 10.5.

⚠️ Common Pitfall: Measuring only outputs and calling it an outcome evaluation. "We will evaluate the program by tracking the number of participants served and sessions held" is a process measure dressed up as evaluation — it tells you the activities happened but nothing about whether they worked. Reviewers see through this immediately. Every evaluation plan needs genuine outcome measures: the changes in knowledge, behavior, or condition that the project promises. If your plan measures only what you did and not what changed, you have not yet written an evaluation plan.

A worked example shows the two working together. RYCC's process evaluation tracks: were the 90 students enrolled, did the classes run as scheduled at all three sites, was attendance maintained, was the curriculum delivered with fidelity? Its outcome evaluation tracks: did students' coding proficiency increase (pre/post assessment), did interest in tech pathways rise (survey), did students enroll in high-school CS (follow-up records)? Now imagine the outcome data come back disappointing — proficiency gains are small. Without process data, RYCC cannot interpret this: did the program not work, or was it never properly delivered? With process data, the story is clear — if attendance was 85% and the curriculum was delivered with fidelity, a small gain suggests the program itself needs strengthening; if attendance was 30% and two sites started late, the disappointing outcome reflects implementation problems, not the model. Process evaluation is what lets you tell the difference, which is why a reviewer wants both: outcome evaluation to know if it worked, process evaluation to know why.
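The interpretive logic of that example can be written as a tiny decision rule. This is a sketch only: the function name `interpret_small_gain` and the 75% attendance floor are hypothetical choices, not thresholds the chapter prescribes, and a real evaluation would weigh more process indicators than these two.

```python
def interpret_small_gain(attendance_rate: float,
                         delivered_with_fidelity: bool,
                         attendance_floor: float = 0.75) -> str:
    """Read a disappointing outcome in light of process data.

    attendance_floor is an illustrative cutoff for "the program was
    actually delivered"; choose your own and state it in the plan.
    """
    if attendance_rate >= attendance_floor and delivered_with_fidelity:
        # Delivered as designed, yet gains were small: look at the model.
        return "program model needs strengthening"
    # Never properly delivered: the model has not had a fair test.
    return "implementation problem; the model was not fairly tested"


# The two scenarios from the worked example.
print(interpret_small_gain(0.85, True))
print(interpret_small_gain(0.30, False))
```

Without the two process inputs the function has nothing to branch on, which is the argument for collecting process data in the first place.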

🔄 Check Your Understanding: A literacy program reports: "We held 50 tutoring sessions and served 120 children." Is this a process measure or an outcome measure, and what outcome measure should the program add?

Answer
It's a process/output measure — it reports activity (sessions held, children served), not change. The program should add an outcome measure of the change it exists to produce: e.g., "children's reading levels improved by [X] on a standardized assessment from baseline to program end, compared to [target or comparison group]." Outputs show the program happened; outcomes show whether it worked — and funders fund outcomes.

10.4 SMART Objectives and Indicators

Outcomes become measurable through SMART objectives — objectives that are Specific, Measurable, Achievable, Relevant, and Time-bound. A SMART objective turns a vague aspiration ("improve students' coding skills") into a commitment you can be held to ("by the end of the program year, at least 70% of enrolled students will demonstrate a measurable increase in coding proficiency on a pre/post assessment").

Walk through the elements: Specific (coding proficiency, not "skills"), Measurable (a pre/post assessment with a defined increase), Achievable (70%, grounded in your track record — not 100%, which would be implausible), Relevant (coding proficiency is the outcome your project exists to produce), and Time-bound (by the end of the program year). Each SMART objective should correspond to an outcome in your logic model, and together your objectives should cover the outcomes you promised in your aims or executive summary (Chapters 6–7) — coherence again.

For each objective, you need an indicator — the specific, observable measure that tells you whether the objective was met — along with a target (the level that counts as success), a data source (where the measure comes from), and a method (how you'll collect it). "Coding proficiency increase" is the indicator; "70% of students" is the target; "pre/post assessment scores" is the data source; "administered in week 1 and week 30" is the method.

📋 Template (SMART objective + measurement):

- Objective: "By [time], [number/percent] of [population] will [specific, measurable change]."
- Indicator: [the observable measure of the change]
- Target: [the level that counts as success]
- Data source: [where the measure comes from]
- Method/timing: [how and when you collect it]

Write one of these for each outcome in your logic model. The set of them is the operational heart of your evaluation plan.

Here is RYCC's set, mapped to its logic-model outcomes (composite). Short-term: "By program end, ≥70% of enrolled students will demonstrate a measurable increase in coding proficiency on a validated pre/post assessment." (Indicator: proficiency-score change; target: 70%; source: pre/post assessment; method: administered weeks 1 and 30.) Short-term: "By program end, ≥60% of students will report increased interest in pursuing technology coursework." (Indicator: self-reported interest; target: 60%; source: pre/post survey; method: administered weeks 1 and 30.) Intermediate: "Within one year, ≥40% of program completers will enroll in a high-school computer-science course." (Indicator: CS enrollment; target: 40%; source: school records; method: follow-up at the next school year.) Each objective ties to a specific outcome in the logic model, names a concrete indicator and a target grounded in RYCC's track record, and specifies exactly how and when it's measured. A reviewer reading these knows precisely what RYCC is promising to achieve and exactly how RYCC will know whether it did: the difference between a measurable commitment and a hope.
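To see how the first objective would actually be scored at program end, here is a minimal sketch. The `SmartObjective` class and `share_improved` function are hypothetical names, and the ten pre/post scores are invented purely to exercise the code; they are not RYCC data. "Improved" is taken to mean any post score above the pre score, which is itself an assumption a real plan should replace with a defined minimum gain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmartObjective:
    statement: str
    indicator: str
    target_share: float  # share of students who must improve, e.g. 0.70
    data_source: str
    timing: str


def share_improved(pre: list[float], post: list[float]) -> float:
    """Fraction of students whose post score exceeds their pre score."""
    if len(pre) != len(post) or not pre:
        raise ValueError("need one pre and one post score per student")
    improved = sum(1 for before, after in zip(pre, post) if after > before)
    return improved / len(pre)


objective = SmartObjective(
    statement="By program end, at least 70% of enrolled students will show "
              "a measurable increase in coding proficiency.",
    indicator="proficiency-score change",
    target_share=0.70,
    data_source="validated pre/post assessment",
    timing="weeks 1 and 30",
)

# Invented scores for ten students, for illustration only.
pre_scores = [40, 55, 60, 35, 70, 50, 45, 65, 30, 58]
post_scores = [55, 60, 58, 50, 82, 50, 61, 70, 44, 66]

observed = share_improved(pre_scores, post_scores)
target_met = observed >= objective.target_share
print(f"{observed:.0%} improved; target met: {target_met}")
```

Every field of the template becomes something the code needs in order to run, which is a quick test of whether an objective is really measurable.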

🔄 Check Your Understanding: Convert this vague objective into a SMART one (invent plausible specifics): "The program will help participants find jobs."

Answer
Something like: "Within six months of program completion, at least 50% of participants will be employed in a job paying at least [wage]." Specific (employed at a wage threshold), Measurable (employment rate via follow-up survey/records), Achievable (50%, grounded in track record or evidence base), Relevant (employment is the outcome), Time-bound (within six months). Indicator: employment rate; target: 50%; data source: follow-up survey or employment records; method: contact participants at 6 months post-completion.

Setting the target (the "70%," the "40%") is its own small art, and reviewers notice when it's done thoughtlessly. A target should be ambitious but defensible: high enough to matter, grounded in evidence that you can reach it. Where does the number come from? Ideally from your own track record ("at our existing site, 75% of students show proficiency gains, so 70% across three new sites is realistic") or from the evidence base of comparable programs. A target with no justification ("90% will achieve the outcome") invites the reviewer's skepticism (why 90%, and what makes you think you'll hit it?), while a target that's too modest ("10% will show improvement") suggests the program barely works. The strongest targets are explicitly justified: "Based on our four-year track record of [X]% and the literature on comparable programs, we project [target]." This turns a number that could look arbitrary into a defensible projection, and it signals that you understand your own program well enough to predict its results, which is exactly the command a funder wants to see. Avoid, too, the opposite trap of sandbagging, that is, setting targets so low you're sure to exceed them: sophisticated reviewers recognize it, and you'll have to report against these targets later (Chapter 26), where wildly-exceeded modest targets look like you didn't understand your own program.

📊 From the Field: A few indicator traps to avoid. Unmeasurable indicators: "participants will feel more empowered" — empowerment is real but you must define how you'll measure it (a validated scale, specific behaviors), or it's not an indicator. Vanity indicators: measures that look good but don't reflect the outcome you care about — "social-media followers" for a program meant to change behavior, or "satisfaction" as a stand-in for "it worked" (people can enjoy a program that changes nothing). Indicators you can't actually collect: a beautiful outcome measure that requires data you have no way to obtain (long-term follow-up of participants you'll lose contact with). Too many indicators: a dozen measures that no one will realistically track, signaling you haven't prioritized. The discipline mirrors the "earn its place" rule (Chapter 5): each indicator should genuinely measure a promised outcome, be feasibly collectable, and be one you'll actually use. A few real, measurable, collectable indicators beat a long list of aspirational ones — the same lesson the needs section taught about statistics, applied to measurement.

10.5 Data Collection and Analysis

An objective with an indicator is still only a promise until you specify how you will actually collect and analyze the data. This is where many evaluation plans go soft — they name outcomes but never say how they'll be measured, leaving a reviewer to doubt the measurement is real.

For each indicator, specify the data-collection method (survey, assessment, observation, administrative records, interviews, focus groups, biological measures), who collects it, when (baseline, midpoint, end, follow-up), and from whom. Match the method to the indicator: a knowledge change calls for a pre/post test; a behavior change calls for observation or self-report (with its known limitations); a condition change (health, employment) calls for clinical measures or records. Then specify the analysis: how you'll turn the data into a finding (compare pre to post, compare to a target or comparison group, test for significance where appropriate).

✅ Best Practice: Wherever possible, include a comparison — a baseline (pre/post), a target, or a comparison group — because change is only meaningful against a reference. "80% of participants scored proficient at the end" means little without knowing where they started or what would have happened anyway. "Proficiency rose from 20% at baseline to 80% at end" tells a story; a comparison group ("versus 30% in a comparison group") tells a stronger one. The strength of your evaluation design — from simple pre/post up to randomized comparison — should match the funder's expectations and your resources, but some basis for comparison is what separates a real outcome measure from an uninterpretable number.
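The arithmetic behind those comparisons is simple enough to sketch. The 20%, 80%, and 30% figures come from the example above; the comparison group's baseline (`COMPARISON_BASELINE`) is not given in the text and is assumed here to equal the program group's 20%. Subtracting one group's change from the other's is one common approach, often called difference-in-differences, shown as an illustration rather than a recommended design.

```python
def percentage_point_change(baseline_pct: int, end_pct: int) -> int:
    """Change in percentage points between two measurements."""
    return end_pct - baseline_pct


# Program group: figures from the example in the text.
program_change = percentage_point_change(20, 80)

# Comparison group: the 30% end figure is from the text;
# the baseline is an ASSUMPTION made for this sketch.
COMPARISON_BASELINE = 20
comparison_change = percentage_point_change(COMPARISON_BASELINE, 30)

# Gain beyond what the comparison group achieved anyway.
net_gain = program_change - comparison_change

print(f"pre/post change: {program_change} points")
print(f"comparison-group change: {comparison_change} points")
print(f"gain beyond comparison: {net_gain} points")
```

The end-of-program figure alone supports none of these statements; each line of output requires a reference point to compute.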

📊 From the Field: Funders increasingly want evaluation data they can aggregate and report, which is why government programs often specify required performance measures you must use (Chapter 7) and why foundations increasingly ask for outcomes in a particular form. Read the announcement: if the funder names the measures or the framework they expect, use them. Aligning your indicators to the funder's reporting needs does double duty — it strengthens your evaluation plan and signals that you understand what the funder is accountable for, the same "speak the funder's metric language" lesson from Chapter 7.

It helps to lay the measurement out as a table, which is also a clean way to present it in the proposal:

Outcome	Indicator	Target	Data source	Method / timing
Coding proficiency rises	pre/post score change	≥70% improve	validated assessment	weeks 1 & 30
Interest in tech rises	self-reported interest	≥60%	survey	weeks 1 & 30
Progression to HS CS	CS course enrollment	≥40%	school records	next school year
Program delivered (process)	sessions held; attendance	90% of planned	attendance logs	weekly

A table like this does a great deal of persuasive work at a glance: it shows the reviewer that every promised outcome has a real, specific measurement plan, that the targets are concrete and grounded, and that RYCC has thought through how and when each piece of data is collected — not just what it hopes to achieve. Presenting the evaluation as a clear matrix rather than a paragraph of prose makes it scannable for a tired reviewer (Chapter 2) and signals operational seriousness. When a funder requires specific performance measures, add a column mapping your indicators to theirs, showing explicitly that you'll report what they need.
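A matrix like this can also be kept as structured data and checked for gaps before submission. The sketch below is one possible layout: the field names, the `missing_fields` helper, and the deliberately vague final row are all hypothetical, while the four complete rows copy the table above.

```python
REQUIRED_FIELDS = ("outcome", "indicator", "target", "data_source", "method_timing")

evaluation_matrix = [
    {"outcome": "Coding proficiency rises", "indicator": "pre/post score change",
     "target": ">=70% improve", "data_source": "validated assessment",
     "method_timing": "weeks 1 & 30"},
    {"outcome": "Interest in tech rises", "indicator": "self-reported interest",
     "target": ">=60%", "data_source": "survey",
     "method_timing": "weeks 1 & 30"},
    {"outcome": "Progression to HS CS", "indicator": "CS course enrollment",
     "target": ">=40%", "data_source": "school records",
     "method_timing": "next school year"},
    {"outcome": "Program delivered (process)", "indicator": "sessions held; attendance",
     "target": "90% of planned", "data_source": "attendance logs",
     "method_timing": "weekly"},
]


def missing_fields(row: dict[str, str]) -> list[str]:
    """List the required columns a row leaves blank or omits."""
    return [name for name in REQUIRED_FIELDS if not row.get(name, "").strip()]


# A hypothetical vague row: an outcome is named but nothing says how it is measured.
vague_row = {"outcome": "Program success"}

for row in evaluation_matrix + [vague_row]:
    gaps = missing_fields(row)
    status = "complete" if not gaps else "missing " + ", ".join(gaps)
    print(f"{row['outcome']}: {status}")
```

If a funder requires its own performance measures, adding a field such as `funder_measure` to `REQUIRED_FIELDS` (a hypothetical name) would enforce the extra mapping column described above.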

🔄 Check Your Understanding: A proposal states: "We will measure the program's success and report our outcomes to the funder." Name three things missing that a reviewer needs, and why each matters.

Answer
Missing: (1) what specifically will be measured — the indicators for each promised outcome (without them, "success" is undefined); (2) how and when — the data sources, methods, and timing (without them, the measurement may not be real or feasible); (3) a comparison basis — baseline, target, or comparison group (without it, the numbers are uninterpretable). The vagueness signals the applicant hasn't thought through how they'll actually know whether the project worked — the exact doubt a strong evaluation plan removes.

10.6 Who Evaluates, and Formative vs. Summative

Two more design choices shape the evaluation plan.

Formative vs. summative. Formative evaluation happens during the project and feeds back to improve it — early data that lets you course-correct. Summative evaluation happens at the end and judges overall success. A strong plan often includes both: formative evaluation to show you'll learn and adapt as you go (reassuring to funders, who like grantees that improve), and summative evaluation to determine final impact. Mentioning formative evaluation signals maturity — that you see evaluation as a tool for doing the work better, not just a final report card.

A concrete example: RYCC's formative evaluation reviews attendance and early proficiency data at the midpoint of the program year. If one site shows low attendance, RYCC can intervene — adjusting scheduling, strengthening family engagement — while there is still time to improve outcomes, rather than discovering the problem only in the summative end-of-year report when it's too late to act. Building this in tells a funder two valuable things: that RYCC will catch and fix problems during the grant (protecting the funder's investment), and that RYCC treats the grant period as a chance to learn and improve, not just to execute a fixed plan. Funders increasingly value this adaptive posture — sometimes called "developmental" or "learning" evaluation in its fuller forms — because real-world projects rarely run exactly as planned, and a grantee who monitors and adjusts is a safer bet than one who only finds out at the end whether things worked.

Internal vs. external evaluators. Internal evaluation is conducted by your own staff; external evaluation by an independent third party. External evaluation costs more (it's a budget line, Chapter 11) but carries more credibility, because the results aren't graded by the people with a stake in them. Larger grants and more rigorous funders often expect or require an external evaluator; smaller projects may reasonably use internal evaluation with credible methods. Choose based on the funder's expectations, the stakes, and your budget — and if you use an external evaluator, naming a qualified one (with a letter, Chapter 13) strengthens the plan considerably.

How do you decide? Weigh three things. Funder expectation: read the announcement — some programs require an independent evaluator outright, and ignoring that requirement is a compliance failure. Stakes: the larger the grant and the more the funder's reputation rides on the results, the more independence matters. Budget: a credible external evaluation is a real cost (often a meaningful percentage of the budget), and a tiny grant cannot bear a five-figure evaluation contract — for such projects, rigorous internal evaluation with honest methods is both expected and appropriate. The mistake to avoid is at the extremes: proposing a casual internal survey for a high-stakes, large grant (under-rigorous for the stakes), or loading a small community grant with an expensive external evaluation that crowds out the program itself (over-engineered for the scale). When you do engage an external evaluator, involve them early — ideally during proposal development, so the evaluation design is sound from the start and their letter of commitment reflects a real plan, not a name borrowed for credibility. A thoughtfully matched evaluation approach signals judgment; a mismatched one signals that you didn't think about the question at all.

🪞 Learning Check-In: Notice if you're tempted to treat the evaluation plan as paperwork — the boring section you'll fill in to satisfy the funder. That attitude produces the vague, output-only plans reviewers distrust. The reframe that makes the difference: evaluation is how you will find out whether your project actually works — which is something you should genuinely want to know, for the sake of the people you're trying to help, not just the funder. An applicant who is sincerely curious whether their program works writes a rigorous evaluation; one who sees it as a hoop writes a vague one. Bring the curiosity, and the rigor follows.

🗣️ From the Review Panel: When a proposal names a qualified external evaluator with a clear, independent plan, my confidence rises noticeably — not because internal evaluation is always inadequate, but because I know the results won't be graded by the people who want the program to succeed. The same applies in research with a pre-registered analysis plan or an independent data-safety board: independence is credibility. That said, I don't penalize a small project for using rigorous internal evaluation it can afford; what I penalize is a mismatch — a high-stakes, large grant proposing to evaluate its own success with a casual internal survey. Match the rigor of your evaluation to the stakes of your project and the expectations of the funder. An evaluation that's too weak for the stakes reads as an applicant hoping not to be measured.

A budget note that the next chapters will pick up: every choice in this section has a cost. An external evaluator is a budget line (often a meaningful one). Data collection takes staff time, instruments, incentives for participants, and sometimes specialized software. A rigorous comparison-group design costs more than a simple pre/post. These costs are not optional add-ons to apologize for — they are the price of the credible results the funder wants, and a budget that funds a serious evaluation signals that you take outcomes seriously. When you reach Chapter 11, your evaluation plan becomes specific budget lines; make sure they're there, because an evaluation plan the budget doesn't fund is not a real plan (the coherence principle again).

10.7 Evaluation for Research: The Analysis Plan

Research proposals frame "evaluation" differently, but the underlying demand — specify in advance how you'll know whether it worked — is the same. For research, the equivalent of the evaluation plan is the analysis plan, woven into the approach (Chapter 9) and sometimes its own section. Its elements:

Endpoints / outcome measures: the specific, pre-specified variables that define success or answer the question (the research analogue of indicators).
Statistical analysis plan: how the data will be analyzed to test the hypothesis — the specific tests, models, and comparisons.
Power analysis / sample size justification: evidence that the study is large enough to detect the effect if it exists. An underpowered study — too small to find the effect it's looking for — is a fatal flaw reviewers actively hunt for, because it wastes money on a study that cannot answer its own question.
Handling of confounds, missing data, and multiple comparisons: the rigor elements (Chapter 9) that show the analysis is sound.
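
The last element, multiple comparisons, can be illustrated with a short Python sketch. The Holm step-down procedure shown here is one common correction among several, offered as an assumption rather than the method this chapter prescribes. The function name `holm_reject`, the endpoint labels, and the p-values are all made up for illustration.

```python
def holm_reject(p_values: dict[str, float], alpha: float = 0.05) -> dict[str, bool]:
    """Return, per endpoint, whether its null hypothesis is rejected under Holm.

    The smallest p-value is compared against alpha / m, the next against
    alpha / (m - 1), and so on; testing stops at the first failure.
    """
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)
    decisions = {name: False for name in p_values}
    for rank, (name, p) in enumerate(ordered):
        if p <= alpha / (m - rank):
            decisions[name] = True
        else:
            break  # every larger p-value is also retained
    return decisions


# Illustrative p-values only; not results from any real study.
example = {"aim1_adherence": 0.012, "aim2_hba1c": 0.030, "secondary_measure": 0.200}
decisions = holm_reject(example)
print(decisions)
```

Taken alone, 0.030 would pass an uncorrected 0.05 threshold, but after the correction only the first endpoint is declared significant. Naming the correction in advance is what tells the reviewer the finding was not selected after the fact.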

Hernandez's analysis plan illustrates (composite). Her primary endpoint for Aim 1 is medication adherence at 12 months (pre-specified measure); for Aim 2, change in HbA1c. Her analysis: an intention-to-treat comparison between intervention and usual-care groups using [appropriate model], adjusting for baseline. Her power analysis: "a sample of [N] provides 80% power to detect a [effect size] difference in adherence at α=0.05, based on the variability observed in our pilot." Her handling of missing data (pre-specified method) and multiple comparisons (how she'll guard against false positives across aims) shows rigor. Notice that this is the same discipline as RYCC's evaluation table, just in statistical clothes: pre-specified measures, a defined method to turn data into a conclusion, and justification that the design can actually detect what it's looking for. The power analysis is the research analogue of a defensible target: it answers the reviewer's "will this study even be able to find the effect?" before they can raise it. An underpowered design (no power justification, or a sample too small) is the research equivalent of a vague, unmeasurable outcome.
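
To make the power statement concrete, here is a minimal Python sketch of one common sample-size calculation: the normal-approximation formula for comparing two proportions (for example, the share of adherent participants in each arm). The function name `n_per_group`, the adherence rates of 50% and 65%, and the 15% attrition allowance are illustrative assumptions, not values from Hernandez's plan. A real proposal would use pilot estimates and a method matched to its actual model.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p_control: float, p_treatment: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants needed per arm to detect p_control vs. p_treatment.

    Normal-approximation formula for a two-sided test of two independent
    proportions with equal group sizes.
    """
    if p_control == p_treatment:
        raise ValueError("effect size is zero; no finite sample can detect it")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for target power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical planning values (assumptions for illustration only).
needed = n_per_group(p_control=0.50, p_treatment=0.65)
enrolled = ceil(needed / (1 - 0.15))  # enroll extra to absorb 15% attrition
print(needed, enrolled)
```

Because the required sample scales with the inverse square of the effect, halving the detectable difference roughly quadruples the number of participants. That is why reviewers expect the effect size to be justified from pilot data or prior literature rather than chosen for convenience.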

The discipline is the same as for the program evaluation plan: pre-specify your measures and your analysis, justify that they can actually detect what you're looking for, and don't leave the reviewer wondering how you'll turn data into a conclusion. A research proposal with vague analysis ("data will be analyzed using appropriate statistical methods") fails for the same reason a program plan with vague outcomes fails: it suggests the applicant hasn't thought through how they'll know whether the project worked.

🔗 Connection: Whether you call it an evaluation plan or an analysis plan, this section is bound to the rest of the proposal by the coherence principle (Chapter 5): it must measure exactly the outcomes your aims promised (Chapters 6–7), test whether your approach (Chapter 9) produced them, and require resources your budget funds (Chapters 11–12, where an external evaluator or data-collection costs appear). The logic model is the thread tying need → activities → outcomes → impact across all of these. Build it well here and the whole proposal tightens.

Step back and notice what the evaluation plan has accomplished within the proposal's argument. The needs section established that the problem matters; the aims promised specific outcomes; the approach showed you can do the work; and now the evaluation plan closes the loop by specifying exactly how you'll know whether the promised outcomes were achieved. It is the proposal holding itself accountable — the applicant saying, in effect, "don't take my word that this will work; here is precisely how we'll measure whether it did, and what success looks like." That self-imposed accountability is profoundly reassuring to a funder, because it signals an applicant who is more interested in actually helping than in merely getting funded. The vague applicant promises good things will happen; the rigorous applicant promises to find out whether they did, and to report honestly either way. In a world where funders have been burned by projects that produced activity but no demonstrable change, the applicant who builds genuine measurement into their plan stands out — and increasingly, wins.

📐 Project Checkpoint — Build your logic model and SMART objectives: For your project, (1) construct a complete logic model (inputs → activities → outputs → outcomes → impact), and check it right-to-left for broken links. (2) For each outcome, write a SMART objective with its indicator, target, data source, and method/timing. (3) Include both process (did you deliver it?) and outcome (did it work?) measures. (4) Decide internal vs. external evaluation and note any evaluator. (5) For research, draft the analysis plan (endpoints, statistical plan, power/sample-size justification). (6) Check coherence: do your objectives measure exactly the outcomes your aims/summary promised? Save it in your "My Proposal" document; your budget (next chapters) will fund whatever the evaluation requires.
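
Steps (2) and (6) of the checkpoint lend themselves to a mechanical check. The Python sketch below assumes a hypothetical `Objective` record and `coherence_gaps` helper (neither comes from the chapter). It flags promised outcomes with no objective, objectives nobody promised, and objectives missing an indicator, target, data source, or method. The outcome names and targets are placeholders.

```python
from dataclasses import dataclass

REQUIRED_FIELDS = ("indicator", "target", "data_source", "method_timing")


@dataclass
class Objective:
    """One row of an evaluation table (hypothetical structure)."""
    outcome: str
    indicator: str = ""
    target: str = ""
    data_source: str = ""
    method_timing: str = ""

    def missing_fields(self) -> list[str]:
        # A field counts as missing if it is empty or only whitespace.
        return [f for f in REQUIRED_FIELDS if not getattr(self, f).strip()]


def coherence_gaps(promised: set[str], objectives: list[Objective]) -> dict[str, list[str]]:
    """Compare outcomes promised in the aims with outcomes actually measured."""
    measured = {o.outcome for o in objectives}
    return {
        "promised_but_unmeasured": sorted(promised - measured),
        "measured_but_unpromised": sorted(measured - promised),
        "incomplete_objectives": sorted(
            o.outcome for o in objectives if o.missing_fields()
        ),
    }


# Placeholder content, not drawn from any proposal in this book.
promised = {"school attendance", "reading level"}
objectives = [
    Objective("school attendance", "attendance rate", "90% by month 12",
              "district records", "quarterly data pull"),
    Objective("self-confidence", "survey score", "",
              "participant survey", "pre/post"),
]
gaps = coherence_gaps(promised, objectives)
print(gaps)
```

An empty list under every key is the goal. Anything else points to a promise the evaluation does not measure, a measure the aims never promised, or an objective that is not yet fully specified.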

Spaced Review

Retrieve these from earlier chapters without looking back.

(From Chapter 5) You met the logic model as "connective tissue." How does this chapter use it, and what does each link map to in the proposal?
(From Chapter 9) The approach proved you can do the work; how does the evaluation plan relate to it, and what does process evaluation add?
(From Chapters 6–7) Your aims/summary promised outcomes. What must the evaluation plan do with respect to those promises (coherence)?

Answers
1. The logic model (inputs → activities → outputs → outcomes → impact) is the spine; its links map to budget/capacity (inputs), approach (activities), near-term evaluation targets (outputs), promised outcomes (outcomes), and significance (impact). The evaluation plan builds it in full and proves it holds.
2. The approach showed you can do the activities; the evaluation shows whether they work. Process evaluation adds confirmation that the activities were delivered as planned and to the intended population, which is needed to interpret outcome results (a null outcome could mean the program failed or was never properly delivered).
3. The evaluation must measure exactly the outcomes the aims/summary promised: each promised outcome should have a SMART objective, indicator, and method. Promising an outcome you don't measure, or measuring one you didn't promise, breaks coherence.

Chapter Summary

Key Takeaways

Funders fund measurable change, not activity (threshold concept). The evaluation plan is your promise to show whether the change happened, and it is increasingly decisive — often the tiebreaker between comparable proposals.
The logic model (inputs → activities → outputs → outcomes → impact) is the spine. Build it in full and check it right-to-left for broken links. The key distinction is outputs (what you did) vs. outcomes (what changed) — funders buy outcomes.
Include both process evaluation (did you deliver it, with fidelity, to the right population?) and outcome evaluation (did it work?). Output-only "evaluation" is a process measure in disguise.
Write SMART objectives (Specific, Measurable, Achievable, Relevant, Time-bound), each with an indicator, target, data source, and method, covering the outcomes your aims/summary promised.
Specify data collection and analysis concretely, and include a comparison (baseline, target, or comparison group) so the change is interpretable. Align indicators to any funder-required measures.
Combine formative and summative evaluation, and choose internal vs. external evaluation to fit the stakes, funder expectations, and budget. An external evaluator adds credibility (and a budget line).
For research, the analysis plan plays the same role: pre-specified endpoints, a statistical plan, and a power/sample-size justification (an underpowered study is a fatal flaw).

Action Items

Build and check your logic model.
Write a SMART objective (with indicator/target/source/method) for each outcome.
Include process and outcome measures, and a comparison basis.
Decide internal vs. external evaluation; draft the research analysis plan if applicable.

Common Mistakes to Avoid

Measuring only outputs and calling it outcome evaluation.
Vague objectives ("track our progress and report success") or vague analysis.
A logic model with a broken activities→outcomes link.
An evaluation that doesn't measure the outcomes you promised, or (research) an underpowered design.

Decision Framework: Is your evaluation plan ready?

Ask: (1) Does your logic model hold right-to-left? (2) Do you measure outcomes, not just outputs? (3) Does each promised outcome have a SMART objective with indicator/target/source/method? (4) Is there a comparison basis? (5) Have you chosen who evaluates and (research) justified power? Any "no" is your next revision.

Looking Ahead

You have proven the project matters (Chapter 8), that you can do it (Chapter 9), and that it will produce measurable results (Chapter 10). Now you must translate the whole plan into dollars. Chapter 11: The Budget teaches budgeting as strategy, not arithmetic: personnel and effort, direct versus indirect costs, the categories reviewers expect, multi-year budgets with escalation, and the templates for NIH, NSF, foundation, and government formats. Your logic model's inputs become budget lines, your activities become costs, and your evaluation — including any external evaluator and data-collection costs — becomes a number. The budget is your plan, told in dollars, and like every section before it, it must cohere with all the others: every activity in your approach and every measure in your evaluation has to be funded, or the proposal contradicts itself.

Continue to the Exercises, the Quiz, and the two Case Studies. The Key Takeaways card is your quick-review anchor.

Next: Chapter 11 — The Budget: The Numbers That Tell Your Story in Dollar Signs.