Key Takeaways: Analysis of Variance (ANOVA)
One-Sentence Summary
Analysis of variance (ANOVA) compares means across three or more groups by decomposing total data variability into between-group (explained) and within-group (unexplained) components, using the F-statistic ratio to determine whether group differences exceed what random variation alone would produce — followed by Tukey's HSD post-hoc tests to identify which specific groups differ while controlling the family-wise error rate.
Core Concepts at a Glance
| Concept | Definition | Why It Matters |
|---|---|---|
| Multiple comparisons problem | Running many tests inflates the probability of at least one false positive far beyond $\alpha$ | Explains why multiple t-tests are dangerous and ANOVA is necessary |
| Between-group variability | Variability in the data explained by group membership ($SS_B$) | The "signal" — differences among group means |
| Within-group variability | Variability in the data unexplained by groups ($SS_W$) — natural noise within each group | The "noise" — individual differences within groups |
| F-statistic | Ratio $MS_B / MS_W$ — signal divided by noise | Large F means group differences are unlikely due to chance alone |
| Decomposing variability | $SS_T = SS_B + SS_W$ — total variation splits exactly into explained and unexplained parts | The threshold concept; foundation for regression $R^2$ and all statistical modeling |
The ANOVA Procedure
Step by Step
1. State hypotheses:
   - $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ (all group means equal)
   - $H_a$: not all $\mu_i$ are equal (at least one group differs)
2. Check assumptions:
   - Independence (study design)
   - Normality within each group (histograms, QQ-plots, Shapiro-Wilk)
   - Equal variances (Levene's test, SD ratio $< 2$)
3. Compute the ANOVA table:
| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | $\sum n_i(\bar{x}_i - \bar{x})^2$ | $k - 1$ | $SS_B / (k-1)$ | $MS_B / MS_W$ |
| Within | $\sum\sum(x_{ij} - \bar{x}_i)^2$ | $N - k$ | $SS_W / (N-k)$ | |
| Total | $\sum\sum(x_{ij} - \bar{x})^2$ | $N - 1$ | | |
4. Find the p-value from the F-distribution with $df_1 = k-1$ and $df_2 = N-k$.
5. Compute the effect size: $\eta^2 = SS_B / SS_T$.
6. If the omnibus test is significant, run Tukey's HSD for pairwise comparisons.
7. Interpret in context, reporting descriptive statistics, the F-statistic, p-value, effect size, and post-hoc results.
Key Python Code
```python
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd
import numpy as np

# group1, group2, group3: 1-D arrays of observations, one array per group

# One-way ANOVA
F_stat, p_value = stats.f_oneway(group1, group2, group3)

# Check equal variances
stat, p_levene = stats.levene(group1, group2, group3)

# Effect size (eta-squared), computed manually
all_data = np.concatenate([group1, group2, group3])
grand_mean = np.mean(all_data)
ss_between = sum(len(g) * (np.mean(g) - grand_mean)**2
                 for g in [group1, group2, group3])
ss_total = np.sum((all_data - grand_mean)**2)
eta_squared = ss_between / ss_total

# Post-hoc: Tukey's HSD (reuses all_data from above)
labels = ['G1']*len(group1) + ['G2']*len(group2) + ['G3']*len(group3)
tukey = pairwise_tukeyhsd(endog=all_data, groups=labels, alpha=0.05)
print(tukey)
```
Excel: Data Analysis ToolPak
- Data tab → Data Analysis → Anova: Single Factor
- Set Input Range to all data columns
- Grouped By: Columns
- Output includes ANOVA table with SS, df, MS, F, p-value, and $F_{\text{critical}}$
The Threshold Concept: Decomposing Variability
Total variation = Explained variation + Unexplained variation
$$SS_T = SS_B + SS_W$$
This is not just an ANOVA formula. It's a universal principle:
| Context | Total | Explained | Unexplained |
|---|---|---|---|
| ANOVA | $SS_T$ | $SS_B$ (group differences) | $SS_W$ (within-group noise) |
| Regression (Ch.22) | $SS_T$ | $SS_{\text{Reg}}$ (predictor) | $SS_{\text{Res}}$ (residuals) |
| Effect size | 100% | $\eta^2$ or $R^2$ (% explained) | $1 - \eta^2$ (% unexplained) |
Getting this concept — really getting it — prepares you for regression, multiple regression, and the $R^2$ interpretation in Chapters 22-23.
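The identity can be verified numerically; the three small groups below are made up for illustration:

```python
import numpy as np

# Illustrative data: three small groups (values are made up)
groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([5.0, 6.0, 7.0])]

all_data = np.concatenate(groups)
grand_mean = all_data.mean()

ss_total = np.sum((all_data - grand_mean) ** 2)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(np.sum((g - g.mean()) ** 2) for g in groups)

# The decomposition holds exactly: SS_T = SS_B + SS_W
assert np.isclose(ss_total, ss_between + ss_within)
print(ss_total, ss_between, ss_within)  # prints 20.0 14.0 6.0
```

Here 70% of the total variation ($14/20$) is explained by group membership, which is exactly $\eta^2$.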
The Multiple Comparisons Problem
| Groups ($k$) | Pairwise Tests | $P(\geq 1 \text{ false positive})$ |
|---|---|---|
| 2 | 1 | 5.0% |
| 3 | 3 | 14.3% |
| 5 | 10 | 40.1% |
| 10 | 45 | 90.1% |
ANOVA avoids this inflation by testing all group means in a single omnibus test with a single p-value, keeping the Type I error rate at $\alpha$.
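The probabilities in the table come from $1 - (1-\alpha)^m$ with $m = \binom{k}{2}$ independent tests; a minimal sketch:

```python
from math import comb

alpha = 0.05
for k in [2, 3, 5, 10]:
    m = comb(k, 2)                   # number of pairwise tests
    fwer = 1 - (1 - alpha) ** m      # P(at least one false positive)
    print(f"k={k:2d}  tests={m:2d}  FWER={fwer:.1%}")
```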
Key Formulas
| Formula | Description |
|---|---|
| $SS_T = \sum\sum(x_{ij} - \bar{x})^2$ | Total sum of squares |
| $SS_B = \sum n_i(\bar{x}_i - \bar{x})^2$ | Between-group sum of squares |
| $SS_W = \sum\sum(x_{ij} - \bar{x}_i)^2$ | Within-group sum of squares |
| $MS_B = SS_B / (k-1)$ | Mean square between |
| $MS_W = SS_W / (N-k)$ | Mean square within |
| $F = MS_B / MS_W$ | F-statistic |
| $\eta^2 = SS_B / SS_T$ | Eta-squared (proportion of variance explained) |
| $\binom{k}{2} = k(k-1)/2$ | Number of pairwise comparisons |
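As a sketch on made-up data, the formulas above reproduce `scipy.stats.f_oneway` exactly:

```python
import numpy as np
from scipy import stats

# Illustrative groups (values are made up)
groups = [np.array([23., 25., 27., 24.]),
          np.array([30., 31., 29., 32.]),
          np.array([26., 28., 27., 25.])]

all_data = np.concatenate(groups)
N, k = len(all_data), len(groups)
grand_mean = all_data.mean()

ss_b = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_w = sum(np.sum((g - g.mean()) ** 2) for g in groups)
ms_b, ms_w = ss_b / (k - 1), ss_w / (N - k)
F = ms_b / ms_w
p = stats.f.sf(F, k - 1, N - k)   # upper tail of F(k-1, N-k)

F_ref, p_ref = stats.f_oneway(*groups)
assert np.isclose(F, F_ref) and np.isclose(p, p_ref)
```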
Effect Size Benchmarks (Cohen, 1988)
| $\eta^2$ | Cohen's $f$ | Interpretation |
|---|---|---|
| 0.01 | 0.10 | Small |
| 0.06 | 0.25 | Medium |
| 0.14 | 0.40 | Large |
Always interpret effect sizes in the context of your field — these benchmarks are starting points, not absolute standards.
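The two columns of the benchmark table are related by $f = \sqrt{\eta^2/(1-\eta^2)}$; a quick check of the benchmark rows:

```python
import math

def cohens_f(eta_squared):
    """Convert eta-squared to Cohen's f."""
    return math.sqrt(eta_squared / (1 - eta_squared))

for eta2 in (0.01, 0.06, 0.14):
    print(f"eta^2={eta2:.2f}  f={cohens_f(eta2):.2f}")
```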
Post-Hoc Tests: When and How
| Test | When to Use |
|---|---|
| Tukey's HSD | Default for ANOVA follow-up; all pairwise comparisons; less conservative than Bonferroni |
| Bonferroni | When you want to test only a few pre-planned comparisons; simpler but more conservative |
| Neither | When ANOVA is not significant — do not fish for pairwise differences |
Assumptions Checklist
| Assumption | How to Check | If Violated |
|---|---|---|
| Independence | Study design (random sampling, random assignment) | Use repeated-measures ANOVA or mixed models |
| Normality | Shapiro-Wilk, QQ-plots, histograms per group | Robust if $n \geq 15$-$20$ per group and balanced design; otherwise Kruskal-Wallis (Ch.21) |
| Equal variances | Levene's test, SD ratio $< 2$ | Welch's ANOVA; robust if group sizes are equal |
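A minimal sketch of the normality and equal-variance checks; the simulated groups stand in for real data:

```python
import numpy as np
from scipy import stats

# Simulated stand-in data (replace with your own groups)
rng = np.random.default_rng(0)
groups = [rng.normal(10, 2, 20), rng.normal(12, 2, 20), rng.normal(11, 2, 20)]

# Normality within each group (Shapiro-Wilk; large p = no evidence against normality)
for i, g in enumerate(groups, 1):
    w, p = stats.shapiro(g)
    print(f"group {i}: Shapiro-Wilk p = {p:.3f}")

# Equal variances: Levene's test plus the SD-ratio rule of thumb
stat, p_levene = stats.levene(*groups)
sds = [g.std(ddof=1) for g in groups]
print(f"Levene p = {p_levene:.3f}, SD ratio = {max(sds) / min(sds):.2f}")
```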
Common Mistakes
| Mistake | Correction |
|---|---|
| Running multiple t-tests instead of ANOVA | Use one-way ANOVA to test all groups simultaneously |
| "ANOVA is significant, so all groups differ" | ANOVA only tells you at least one group differs; use Tukey's HSD for specifics |
| Running post-hoc tests after non-significant ANOVA | Only run post-hoc tests after a significant omnibus test |
| Ignoring effect size | Always report $\eta^2$ alongside $F$ and $p$ |
| Reporting $F$ without both degrees of freedom | Correct format: $F(df_B, df_W) = \text{value}$, $p = \text{value}$, $\eta^2 = \text{value}$ |
Reporting Template (APA Style)
"A one-way ANOVA revealed a statistically significant difference in [outcome variable] across the [k] [grouping variable] groups, $F(df_B, df_W) = [F\text{-value}]$, $p [= \text{or} <] [p\text{-value}]$, $\eta^2 = [value]$. Tukey's HSD post-hoc comparisons indicated that [specific group differences with means and adjusted p-values]."
Connections
| Connection | Details |
|---|---|
| Ch.6 (Variance) | ANOVA literally analyzes variance; $MS_W$ is a pooled version of the sample variance |
| Ch.16 (Two-sample t-test) | ANOVA generalizes to $k \geq 2$ groups; when $k = 2$, $F = t^2$ |
| Ch.17 (Effect sizes, multiple comparisons) | $\eta^2$ parallels Cohen's $d$; FWER and Bonferroni correction applied to ANOVA context |
| Ch.21 (Nonparametric methods) | Kruskal-Wallis test is the nonparametric alternative when ANOVA assumptions fail |
| Ch.22 (Regression) | $R^2$ is the regression analogue of $\eta^2$; same decomposition of variability |
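The Ch.16 connection ($F = t^2$ when $k = 2$) can be verified directly; the two groups below are made up:

```python
import numpy as np
from scipy import stats

a = np.array([5.1, 4.9, 6.0, 5.5, 5.8])
b = np.array([6.2, 6.8, 6.5, 7.0, 6.1])

t_stat, p_t = stats.ttest_ind(a, b)     # pooled-variance two-sample t-test
F_stat, p_F = stats.f_oneway(a, b)      # one-way ANOVA on the same two groups

assert np.isclose(F_stat, t_stat ** 2)  # F = t^2 for two groups
assert np.isclose(p_t, p_F)             # identical p-values
```

Note that the equivalence requires the pooled-variance t-test (SciPy's default `equal_var=True`), matching ANOVA's equal-variance assumption.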