Appendix E: Templates and Worksheets

These templates provide structured frameworks for the core tasks in statistical analysis. Photocopy them, print them, or recreate them in your notebook. They are designed to slow you down just enough to think carefully before, during, and after each analysis.

E.1 Hypothesis Test Template

Use this template for ANY hypothesis test — proportions, means, chi-square, ANOVA, nonparametric, or regression coefficients. Fill in every field before drawing a conclusion.

HYPOTHESIS TEST WORKSHEET

1. Research Question (in plain English):

2. Hypotheses:

H0: _______ (null — status quo, no effect, no difference)
Ha: _______ (alternative — the claim you're testing)
Test direction: [ ] Two-tailed [ ] Left-tailed [ ] Right-tailed

3. Significance Level:

alpha = ___ (set BEFORE looking at results)
Justification for this alpha: _______

4. Conditions / Assumptions Check:

Condition	Met?	Evidence
Random sample or random assignment	[ ] Yes [ ] No [ ] Unclear	______
Independence (10% condition or separate groups)	[ ] Yes [ ] No [ ] N/A	______
Sample size / normality condition	[ ] Yes [ ] No	______
Equal variances (if applicable)	[ ] Yes [ ] No [ ] N/A	______

If conditions are not met, what should you do? _________

5. Test Information:

Test name: _____
Test statistic formula: _____
Observed test statistic value: _____
Degrees of freedom (if applicable): _____
p-value: _____

6. Decision:

[ ] Reject H0 (p-value <= alpha)
[ ] Fail to reject H0 (p-value > alpha)

7. Conclusion (in context — use the words of the research question):

8. Effect Size and Practical Significance:

Effect size measure: ___ Value: _____
Is the effect practically meaningful? ___________
95% CI for the parameter: ( __ , __ )

9. Limitations and Caveats:
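
Worked example (illustrative only): the sketch below shows one way fields 5 and 6 of the worksheet above might be filled in for a two-tailed one-proportion z-test. The data (58 successes in 100 trials), the null value p0 = 0.5, and the function name one_proportion_z_test are all assumptions invented for this illustration. It uses only the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-tailed one-proportion z-test using the normal approximation."""
    p_hat = successes / n
    # The standard error is computed under H0, so it uses p0 rather than p_hat.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Two-tailed p-value: chance of a statistic at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    decision = "Reject H0" if p_value <= alpha else "Fail to reject H0"
    return {"p_hat": p_hat, "z": z, "p_value": p_value, "decision": decision}


# Hypothetical data: 58 successes in n = 100 trials, testing H0: p = 0.5.
result = one_proportion_z_test(successes=58, n=100, p0=0.5, alpha=0.05)
for field, value in result.items():
    print(f"{field}: {value}")
```

With these made-up numbers the sample size condition holds (n x p0 and n x (1 - p0) are both 50), the test statistic is z = 1.6, and the p-value is about 0.11, so the decision at alpha = 0.05 is to fail to reject H0.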

E.2 Confidence Interval Template

CONFIDENCE INTERVAL WORKSHEET

1. Parameter of Interest (in plain English):

2. Parameter Symbol: ____

3. Point Estimate:

Symbol: ____
Value: ____

4. Conditions Check:

Condition	Met?	Evidence
Random sample	[ ] Yes [ ] No [ ] Unclear	______
Independence (10% condition)	[ ] Yes [ ] No	______
Normality / sample size condition	[ ] Yes [ ] No	______

5. Confidence Level: ___ %

6. CI Formula:

Formula: point estimate +/- (critical value) x (standard error)
Standard error formula: _____
Standard error value: _____
Critical value (z or t): _____
Degrees of freedom (if t): _____
Margin of error: _____

7. Confidence Interval:

( __ , __ )

8. Interpretation (fill in the blanks):

"We are ___% confident that the true _____ is between _____ and _____."

9. What this does NOT mean (check your understanding):

[ ] It does NOT mean there is a _____% probability the parameter is in this interval.
[ ] It means that if we repeated the sampling process many times, approximately _____% of the resulting intervals would contain the true parameter.

10. Practical Interpretation:

Is the interval narrow enough to be useful? _____
What decisions can you make based on this range? _________
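
Worked example (illustrative only): the sketch below applies the field 6 formula (point estimate +/- critical value x standard error) to a proportion, then checks the repeated-sampling interpretation from field 9 by simulation. The sample (120 successes out of 300), the "true" proportion of 0.40, the random seed, and the function name proportion_ci are assumptions chosen for the illustration. The large-sample z-interval is used here only because it needs nothing beyond the Python standard library.

```python
import random
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample z-interval for a proportion: estimate +/- z* x SE."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    se = sqrt(p_hat * (1 - p_hat) / n)                       # standard error
    margin = z_star * se                                     # margin of error
    return p_hat - margin, p_hat + margin


# Hypothetical sample: 120 successes out of 300.
low, high = proportion_ci(120, 300)
print(f"95% CI: ({low:.3f}, {high:.3f})")

# Repeated-sampling check: draw many samples from a population whose true
# proportion is known, and count how often the interval captures it.
random.seed(1)
true_p, n, reps = 0.40, 300, 2000
hits = 0
for _ in range(reps):
    successes = sum(random.random() < true_p for _ in range(n))
    lo, hi = proportion_ci(successes, n)
    hits += lo <= true_p <= hi
coverage = hits / reps
print(f"Share of intervals containing the true proportion: {coverage:.3f}")
```

The share printed at the end should land near 95%, which is what the confidence level describes: a property of the procedure over many samples, not a probability statement about any single interval.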

E.3 Study Design Evaluation Checklist

Use this checklist to evaluate ANY study — whether you're reading it in a news article, a journal paper, or designing your own.

STUDY DESIGN EVALUATION

Study Title / Source: _____________

Date of Evaluation: _____

A. Basic Classification

[ ] Observational study
[ ] Experiment (randomized)
[ ] Natural experiment / quasi-experiment
[ ] Survey

B. Sampling

How were participants selected? ____________
Sampling method: [ ] Simple random [ ] Stratified [ ] Cluster [ ] Convenience [ ] Other
Sample size: n = _____
Is the sample representative of the target population? [ ] Yes [ ] No [ ] Unclear
Potential sampling biases:
[ ] Selection bias
[ ] Nonresponse bias
[ ] Survivorship bias
[ ] Volunteer/self-selection bias
[ ] Other: ___

C. Variables

Explanatory variable(s): _____________
Response variable(s): _________
Potential confounding variables: ____________
Were confounders controlled for? [ ] Yes (how?) [ ] No

D. Experimental Design (if applicable)

Was there random assignment to groups? [ ] Yes [ ] No
Was there a control group? [ ] Yes [ ] No
Was blinding used? [ ] Single-blind [ ] Double-blind [ ] No
Was a placebo used? [ ] Yes [ ] No [ ] N/A

E. Causal Claims

Does the study claim a causal relationship? [ ] Yes [ ] No
Is a causal claim justified? [ ] Yes [ ] No
Reasoning: __________

F. Ethical Considerations

Was informed consent obtained? [ ] Yes [ ] No [ ] Unclear [ ] N/A
Was IRB approval mentioned? [ ] Yes [ ] No [ ] N/A
Are there privacy concerns? [ ] Yes [ ] No
Could the findings be used to harm the study population? [ ] Yes [ ] No

G. Overall Assessment

Strengths of the study: ____________
Weaknesses: _________
Confidence in the conclusions (1-5): _____
What additional information would strengthen the study? _______

E.4 Data Cleaning Log Template

Every data cleaning decision changes the story your data tells. Document every decision for reproducibility and transparency.

DATA CLEANING LOG

Dataset: _____ Date: ___

Raw dataset dimensions: _ rows x _ columns

Step	Action	Columns Affected	Rows Changed	Justification	Decision Made By
1
2
3
4
5
6
7
8
9
10

Common actions to log:

- Removed duplicate rows
- Dropped rows with missing values in column(s) ___
- Imputed missing values in ___ using ___ method
- Recoded variable ___ (original values -> new values)
- Created new variable ___ from ___
- Removed outliers in ___ (criteria: ___)
- Fixed inconsistent entries in ___ (e.g., "CA" and "California")
- Changed data type of ___ from ___ to ___
- Filtered to subset where ___

Final dataset dimensions: _ rows x _ columns

Rows removed (total): ___ ( _____% of original)

Sensitivity check: Would different cleaning decisions change the main conclusions?

[ ] Yes (describe how: _________)
[ ] No
[ ] Not yet checked
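
Worked example (illustrative only): the sketch below shows one way the log could be kept in code, so that every step records the same columns as the table above. The tiny dataset, the CleaningStep class, the record helper, and the default "analyst" label are all hypothetical, and the data is a plain list of dictionaries so that only the Python standard library is needed.

```python
from dataclasses import dataclass


@dataclass
class CleaningStep:
    step: int
    action: str
    columns: str
    rows_changed: int
    justification: str
    decided_by: str


# Hypothetical raw data: a list of dicts standing in for a table.
raw = [
    {"id": 1, "state": "CA", "age": 34},
    {"id": 2, "state": "California", "age": 29},
    {"id": 2, "state": "California", "age": 29},  # exact duplicate
    {"id": 3, "state": "NY", "age": None},        # missing age
]
log = []


def record(action, columns, rows_changed, justification, decided_by="analyst"):
    """Append one numbered entry to the cleaning log."""
    log.append(CleaningStep(len(log) + 1, action, columns, rows_changed,
                            justification, decided_by))


# Step 1: remove exact duplicate rows, keeping the first occurrence.
seen, data = set(), []
for row in raw:
    key = tuple(row.items())
    if key not in seen:
        seen.add(key)
        data.append(dict(row))  # copy so the raw data stays untouched
record("Removed duplicate rows", "all", len(raw) - len(data), "Exact duplicates")

# Step 2: drop rows with a missing age.
before = len(data)
data = [row for row in data if row["age"] is not None]
record("Dropped rows with missing values", "age", before - len(data),
       "Age is required for the analysis")

# Step 3: fix inconsistent entries ("California" -> "CA").
changed = 0
for row in data:
    if row["state"] == "California":
        row["state"] = "CA"
        changed += 1
record("Fixed inconsistent entries", "state", changed, "Standardize state codes")

rows_removed = len(raw) - len(data)
pct_removed = 100 * rows_removed / len(raw)
for s in log:
    print(s.step, s.action, s.columns, s.rows_changed, s.justification,
          s.decided_by, sep=" | ")
print(f"Rows removed (total): {rows_removed} ({pct_removed:.1f}% of original)")
```

Recording the row count before and after each step makes the "Rows Changed" column and the final percentage removed fall out of the code, so they do not have to be reconstructed from memory.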

E.5 Statistical Analysis Report Template

Use this structure for the Data Detective Portfolio and any formal statistical report.

STATISTICAL ANALYSIS REPORT

Title: ___________

Author(s): _____ Date: _____

1. Introduction (1/2 to 1 page)

What question are you investigating?
Why does this question matter?
What dataset are you using and where did it come from?
What is the scope of your analysis? (What are you including/excluding?)

2. Data Description (1/2 to 1 page)

Source and collection method
Sample size (n)
Key variables: name, type (categorical/numerical), and brief description
Data dictionary (table format)

3. Data Cleaning and Preparation (1/2 page + cleaning log)

Summary of cleaning steps (attach full cleaning log as appendix)
Missing data: how much, what patterns, how handled
Any variables created or recoded
Final dataset dimensions

4. Exploratory Data Analysis (1-2 pages)

Visualizations: histograms, box plots, bar charts, scatterplots
Summary statistics: center, spread, shape
Notable patterns, outliers, or unexpected findings
Each figure should have a title, axis labels, and a one-sentence interpretation

5. Statistical Analysis (2-3 pages)

State each hypothesis test formally (H0, Ha, alpha)
Show conditions checks
Report test statistics, p-values, and confidence intervals
Report effect sizes
For regression: report the model equation, R-squared, residual diagnostics
Interpret every result in context
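
Worked example (illustrative only): for a comparison of two group means, one common effect size is Cohen's d, the difference in means divided by the pooled standard deviation. The sketch below computes it with the Python standard library. The scores and the function name cohens_d are hypothetical, and other effect size measures may suit other tests better.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores for two groups.
treatment = [78, 85, 90, 72, 88, 81]
control = [70, 75, 80, 68, 77, 74]
d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

Reporting d alongside the p-value tells the reader how large the difference is in standard deviation units, not only whether it is distinguishable from zero.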

6. Discussion and Conclusions (1 page)

What did you find? (Summary of key results)
What do the results mean in practical terms?
What are the limitations of your analysis?
What can you NOT conclude? (Correlation vs. causation, generalizability)
What would you do differently with more time or data?

7. Ethical Considerations (1/2 page)

Who collected this data and why?
Whose voices are included/excluded?
Could your analysis be misused? How?
What biases might affect your conclusions?

8. References

Cite the dataset source
Cite any external references used

E.6 Presentation Planning Worksheet

For presenting statistical findings to a non-technical audience.

PRESENTATION PLANNING WORKSHEET

Topic: ____________

Audience: ____ Time Limit: _

Audience Analysis

What does my audience already know about statistics? ____
What do they care about? ___________
What decision will they make based on my presentation? ________
What is the ONE thing I want them to remember? ________

Structure

Opening Hook (30 seconds to 1 minute):

Context / Why This Matters (1-2 minutes):

Key Finding 1:

Result: _____________
Visual: _____________
Plain-language explanation: _______

Key Finding 2:

Result: _____________
Visual: _____________
Plain-language explanation: _______

Key Finding 3 (if applicable):

Result: _____________
Visual: _____________
Plain-language explanation: _______

Limitations and Caveats (1 minute):

Recommendation / Call to Action:

Visualization Checklist

For each graph or table in the presentation:

[ ] Title is clear and descriptive
[ ] Axes are labeled with units
[ ] Font is large enough to read from the back of the room
[ ] Colors are colorblind-friendly
[ ] No 3D effects or chartjunk
[ ] The main message is obvious within 5 seconds
[ ] Source is cited

E.7 Ethical Review Checklist

Use this checklist BEFORE collecting data, during analysis, and before publishing results.

ETHICAL REVIEW CHECKLIST

Project: _____ Date: ___

Before Data Collection

[ ] Is this research covered by an IRB protocol (if applicable)?
[ ] Have participants given informed consent?
[ ] Is participation voluntary? Can participants withdraw?
[ ] Have you explained how the data will be used, stored, and shared?
[ ] Are you collecting only the data you need (data minimization)?
[ ] Could this data be used to identify individuals? If so, what protections are in place?
[ ] Are you compensating participants fairly?

During Analysis

[ ] Have you pre-registered your hypotheses, or are you being transparent about which analyses are exploratory?
[ ] Are you testing only the hypotheses you planned, or are you searching for significant results (p-hacking)?
[ ] Are you reporting ALL analyses you ran, not just the ones with significant results?
[ ] Have you checked whether your results look different for different demographic subgroups?
[ ] Are you using appropriate statistical methods for your data type and research question?
[ ] Are you interpreting p-values correctly (the probability of data at least as extreme as what you observed, assuming H0 is true; NOT the probability that H0 is true)?
[ ] Are you distinguishing between statistical significance and practical significance?
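
Worked example (illustrative only): the checklist items on p-hacking and on reporting all analyses matter because false positives accumulate across tests. If every null hypothesis is true and the tests are independent, the chance of at least one "significant" result among k tests at level alpha is 1 - (1 - alpha)^k. The sketch below evaluates that expression; the function name familywise_error_rate and the choice of 1, 5, and 20 tests are assumptions for the illustration.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """P(at least one false positive) across independent tests when every H0 is true."""
    return 1 - (1 - alpha) ** num_tests


for k in (1, 5, 20):
    print(f"{k:>2} tests at alpha = 0.05: {familywise_error_rate(k):.3f}")
```

At alpha = 0.05, twenty independent tests of true null hypotheses produce at least one false positive about 64% of the time, which is why reporting only the significant ones misleads readers.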

Before Reporting Results

[ ] Does your visualization accurately represent the data (no truncated axes, misleading scales, or cherry-picked time windows)?
[ ] Are you honest about the limitations of your analysis?
[ ] Are you making causal claims only when the study design supports them?
[ ] Have you considered who might be harmed by your conclusions?
[ ] Are you using language that is precise and avoids sensationalism?
[ ] Is your analysis reproducible? Could someone else follow your steps and get the same results?
[ ] Have you acknowledged potential biases in the data collection and analysis process?

Special Considerations for Algorithmic / AI Applications

[ ] Have you evaluated model performance separately for different demographic groups?
[ ] Are there proxy variables that could introduce discrimination?
[ ] Who bears the cost of false positives? False negatives? Is that distribution fair?
[ ] Is there a human review mechanism for high-stakes decisions?
[ ] Have you considered the Chouldechova impossibility result (when base rates differ between groups, an imperfect classifier cannot equalize positive predictive value, false positive rate, and false negative rate all at once)?
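
Worked example (illustrative only): the sketch below evaluates error rates separately for two groups and shows the tension described in the last checklist item. The confusion-matrix counts are invented so that both groups share the same positive predictive value and false negative rate while having different base rates. The function name group_metrics is hypothetical.

```python
def group_metrics(tp, fp, fn, tn):
    """Basic error rates from one group's confusion-matrix counts."""
    return {
        "base_rate": (tp + fn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),  # positive predictive value
        "fnr": fn / (tp + fn),  # false negative rate
        "fpr": fp / (fp + tn),  # false positive rate
    }


# Hypothetical counts for two groups of 1,000 people each.
groups = {
    "A": group_metrics(tp=240, fp=60, fn=60, tn=640),
    "B": group_metrics(tp=400, fp=100, fn=100, tn=400),
}
for name, metrics in groups.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

In this made-up case both groups have a positive predictive value of 0.8 and a false negative rate of 0.2, yet the false positive rate is higher in group B, whose base rate is higher. Matching two of the metrics forces the third apart, so the checklist question is which disparity is most acceptable in context, not whether one exists.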