Key Takeaways: Analysis of Variance (ANOVA)

One-Sentence Summary

Analysis of variance (ANOVA) compares means across three or more groups by decomposing total data variability into between-group (explained) and within-group (unexplained) components, using the F-statistic ratio to determine whether group differences exceed what random variation alone would produce — followed by Tukey's HSD post-hoc tests to identify which specific groups differ while controlling the family-wise error rate.

Core Concepts at a Glance

| Concept | Definition | Why It Matters |
|---|---|---|
| Multiple comparisons problem | Running many tests inflates the probability of at least one false positive far beyond $\alpha$ | Explains why multiple t-tests are dangerous and ANOVA is necessary |
| Between-group variability | Variability in the data explained by group membership ($SS_B$) | The "signal" — differences among group means |
| Within-group variability | Variability in the data unexplained by groups ($SS_W$) — natural noise within each group | The "noise" — individual differences within groups |
| F-statistic | Ratio $MS_B / MS_W$ — signal divided by noise | Large F means group differences are unlikely due to chance alone |
| Decomposing variability | $SS_T = SS_B + SS_W$ — total variation splits exactly into explained and unexplained parts | The threshold concept; foundation for regression $R^2$ and all statistical modeling |

The ANOVA Procedure

Step by Step

  1. State hypotheses:
     - $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ (all group means equal)
     - $H_a$: Not all $\mu_i$ are equal (at least one group differs)

  2. Check assumptions:
     - Independence (study design)
     - Normality within each group (histograms, QQ-plots, Shapiro-Wilk)
     - Equal variances (Levene's test, SD ratio $< 2$)

  3. Compute the ANOVA table:

| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | $\sum n_i(\bar{x}_i - \bar{x})^2$ | $k - 1$ | $SS_B / (k-1)$ | $MS_B / MS_W$ |
| Within | $\sum\sum(x_{ij} - \bar{x}_i)^2$ | $N - k$ | $SS_W / (N-k)$ | |
| Total | $\sum\sum(x_{ij} - \bar{x})^2$ | $N - 1$ | | |
  4. Find the p-value from the F-distribution with $df_1 = k-1$ and $df_2 = N-k$

  5. Compute effect size: $\eta^2 = SS_B / SS_T$

  6. If significant: run Tukey's HSD for pairwise comparisons

  7. Interpret in context with descriptive statistics, F-statistic, p-value, effect size, and post-hoc results
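The table-building steps above can be worked through by hand; a minimal sketch with made-up illustrative data (the group values are invented for demonstration only):

```python
import numpy as np
from scipy import stats

# Hypothetical example data: three small groups (illustrative values only)
groups = [np.array([4.0, 5.0, 6.0, 5.5]),
          np.array([6.5, 7.0, 8.0, 7.5]),
          np.array([5.0, 5.5, 6.5, 6.0])]

all_data = np.concatenate(groups)
grand_mean = all_data.mean()
k, N = len(groups), len(all_data)

# Sums of squares: SS_T = SS_B + SS_W
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_total = ((all_data - grand_mean) ** 2).sum()

# Mean squares and the F-statistic
ms_between = ss_between / (k - 1)
ms_within = ss_within / (N - k)
F = ms_between / ms_within

# p-value from the F-distribution with df1 = k-1, df2 = N-k
p = stats.f.sf(F, k - 1, N - k)

print(f"F({k - 1}, {N - k}) = {F:.3f}, p = {p:.4f}")
```

The result matches what `stats.f_oneway` returns, which is a useful sanity check that the decomposition was computed correctly.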

Key Python Code

```python
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd
import numpy as np

# group1, group2, group3: array-likes of observations, one per group

# One-way ANOVA
F_stat, p_value = stats.f_oneway(group1, group2, group3)

# Check equal variances (Levene's test)
stat, p_levene = stats.levene(group1, group2, group3)

# Effect size (eta-squared) — manual calculation
all_data = np.concatenate([group1, group2, group3])
grand_mean = np.mean(all_data)
ss_between = sum(len(g) * (np.mean(g) - grand_mean)**2
                 for g in [group1, group2, group3])
ss_total = np.sum((all_data - grand_mean)**2)
eta_squared = ss_between / ss_total

# Post-hoc: Tukey's HSD (reuses all_data from above)
labels = ['G1']*len(group1) + ['G2']*len(group2) + ['G3']*len(group3)
tukey = pairwise_tukeyhsd(endog=all_data, groups=labels, alpha=0.05)
print(tukey)
```

Excel: Data Analysis ToolPak

  1. Data tab → Data Analysis → Anova: Single Factor
  2. Set Input Range to all data columns
  3. Grouped By: Columns
  4. Output includes ANOVA table with SS, df, MS, F, p-value, and $F_{\text{critical}}$

The Threshold Concept: Decomposing Variability

Total variation = Explained variation + Unexplained variation

$$SS_T = SS_B + SS_W$$

This is not just an ANOVA formula. It's a universal principle:

| Context | Total | Explained | Unexplained |
|---|---|---|---|
| ANOVA | $SS_T$ | $SS_B$ (group differences) | $SS_W$ (within-group noise) |
| Regression (Ch.22) | $SS_T$ | $SS_{\text{Reg}}$ (predictor) | $SS_{\text{Res}}$ (residuals) |
| Effect size | 100% | $\eta^2$ or $R^2$ (% explained) | $1 - \eta^2$ (% unexplained) |

Getting this concept — really getting it — prepares you for regression, multiple regression, and the $R^2$ interpretation in Chapters 22-23.
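The ANOVA/regression parallel can be verified directly: $\eta^2$ from the sum-of-squares decomposition equals the $R^2$ of a regression of the outcome on group dummies. A minimal sketch with made-up illustrative data (values and group labels are invented for demonstration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: outcome y measured in three groups (illustrative values only)
df = pd.DataFrame({
    "y": [4.0, 5.0, 6.0, 5.5, 6.5, 7.0, 8.0, 7.5, 5.0, 5.5, 6.5, 6.0],
    "group": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
})

# Eta-squared from the ANOVA decomposition: SS_B / SS_T
grand_mean = df["y"].mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2
                 for _, g in df.groupby("group")["y"])
ss_total = ((df["y"] - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total

# R-squared from a regression of y on group dummies
r_squared = smf.ols("y ~ C(group)", data=df).fit().rsquared

# Same explained/total decomposition, so the two coincide
print(f"eta^2 = {eta_squared:.4f}, R^2 = {r_squared:.4f}")
```

The two numbers agree because the dummy-coded regression's explained sum of squares is exactly $SS_B$.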

The Multiple Comparisons Problem

| Groups ($k$) | Pairwise Tests | $P(\geq 1 \text{ false positive})$ |
|---|---|---|
| 2 | 1 | 5.0% |
| 3 | 3 | 14.3% |
| 5 | 10 | 40.1% |
| 10 | 45 | 90.1% |

ANOVA solves this by testing all groups in a single omnibus test with a single p-value, keeping the overall Type I error rate at $\alpha$.
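The table values follow from $P(\geq 1 \text{ false positive}) = 1 - (1 - \alpha)^m$ with $m = \binom{k}{2}$ independent tests; a quick sketch:

```python
from math import comb

alpha = 0.05
# FWER for m = C(k, 2) independent tests: 1 - (1 - alpha)^m
fwer = {k: 1 - (1 - alpha) ** comb(k, 2) for k in (2, 3, 5, 10)}
for k, p in fwer.items():
    print(f"k = {k:2d}: {comb(k, 2):2d} pairwise tests, FWER = {100 * p:.1f}%")
```

This assumes the tests are independent; with correlated tests the inflation differs in detail, but the qualitative explosion with $k$ remains.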

Key Formulas

| Formula | Description |
|---|---|
| $SS_T = \sum\sum(x_{ij} - \bar{x})^2$ | Total sum of squares |
| $SS_B = \sum n_i(\bar{x}_i - \bar{x})^2$ | Between-group sum of squares |
| $SS_W = \sum\sum(x_{ij} - \bar{x}_i)^2$ | Within-group sum of squares |
| $MS_B = SS_B / (k-1)$ | Mean square between |
| $MS_W = SS_W / (N-k)$ | Mean square within |
| $F = MS_B / MS_W$ | F-statistic |
| $\eta^2 = SS_B / SS_T$ | Eta-squared (proportion of variance explained) |
| $\binom{k}{2} = k(k-1)/2$ | Number of pairwise comparisons |

Effect Size Benchmarks (Cohen, 1988)

| $\eta^2$ | Cohen's $f$ | Interpretation |
|---|---|---|
| 0.01 | 0.10 | Small |
| 0.06 | 0.25 | Medium |
| 0.14 | 0.40 | Large |

Always interpret effect sizes in the context of your field — these benchmarks are starting points, not absolute standards.
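The two columns of the benchmark table are related by $f = \sqrt{\eta^2 / (1 - \eta^2)}$; a small sketch confirming the correspondence:

```python
from math import sqrt

def cohens_f(eta_squared):
    """Convert eta-squared to Cohen's f: f = sqrt(eta^2 / (1 - eta^2))."""
    return sqrt(eta_squared / (1 - eta_squared))

# Cohen's (1988) benchmark values for eta-squared
for eta2 in (0.01, 0.06, 0.14):
    print(f"eta^2 = {eta2:.2f}  ->  f = {cohens_f(eta2):.2f}")
```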

Post-Hoc Tests: When and How

| Test | When to Use |
|---|---|
| Tukey's HSD | Default for ANOVA follow-up; all pairwise comparisons; less conservative than Bonferroni |
| Bonferroni | When you want to test only a few pre-planned comparisons; simpler but more conservative |
| Neither | When ANOVA is not significant — do not fish for pairwise differences |

Assumptions Checklist

| Assumption | How to Check | If Violated |
|---|---|---|
| Independence | Study design (random sampling, random assignment) | Use repeated-measures ANOVA or mixed models |
| Normality | Shapiro-Wilk, QQ-plots, histograms per group | Robust if $n \geq 15$-$20$ per group and balanced design; otherwise Kruskal-Wallis (Ch.21) |
| Equal variances | Levene's test, SD ratio $< 2$ | Welch's ANOVA; robust if group sizes are equal |
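The normality and equal-variance checks can be scripted with scipy; a minimal sketch using made-up normally distributed groups (the data are simulated for illustration only):

```python
import numpy as np
from scipy import stats

# Hypothetical groups: simulated normal data (illustrative only)
rng = np.random.default_rng(0)
groups = [rng.normal(10, 2, 20), rng.normal(12, 2, 20), rng.normal(11, 2, 20)]

# Normality within each group: Shapiro-Wilk (small p suggests non-normality)
for i, g in enumerate(groups, 1):
    w, p_shapiro = stats.shapiro(g)
    print(f"Group {i}: Shapiro-Wilk p = {p_shapiro:.3f}")

# Equal variances: Levene's test plus the SD-ratio rule of thumb
stat, p_levene = stats.levene(*groups)
sds = [g.std(ddof=1) for g in groups]
sd_ratio = max(sds) / min(sds)
print(f"Levene p = {p_levene:.3f}, SD ratio = {sd_ratio:.2f}")

# Fallback when normality fails: Kruskal-Wallis (nonparametric, Ch.21)
H, p_kw = stats.kruskal(*groups)
```

Note that scipy does not ship Welch's ANOVA directly; when Levene's test flags unequal variances, that alternative would need another package or a manual implementation.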

Common Mistakes

| Mistake | Correction |
|---|---|
| Running multiple t-tests instead of ANOVA | Use one-way ANOVA to test all groups simultaneously |
| "ANOVA is significant, so all groups differ" | ANOVA only tells you at least one group differs; use Tukey's HSD for specifics |
| Running post-hoc tests after non-significant ANOVA | Only run post-hoc tests after a significant omnibus test |
| Ignoring effect size | Always report $\eta^2$ alongside $F$ and $p$ |
| Reporting $F$ without both degrees of freedom | Correct format: $F(df_B, df_W) = \text{value}$, $p = \text{value}$, $\eta^2 = \text{value}$ |

Reporting Template (APA Style)

"A one-way ANOVA revealed a statistically significant difference in [outcome variable] across the [k] [grouping variable] groups, $F(df_B, df_W) = [F\text{-value}]$, $p [= \text{or} <] [p\text{-value}]$, $\eta^2 = [value]$. Tukey's HSD post-hoc comparisons indicated that [specific group differences with means and adjusted p-values]."

Connections

| Connection | Details |
|---|---|
| Ch.6 (Variance) | ANOVA literally analyzes variance; $MS_W$ is a pooled version of the sample variance |
| Ch.16 (Two-sample t-test) | ANOVA generalizes to $k \geq 2$ groups; when $k = 2$, $F = t^2$ |
| Ch.17 (Effect sizes, multiple comparisons) | $\eta^2$ parallels Cohen's $d$; FWER and Bonferroni correction applied to ANOVA context |
| Ch.21 (Nonparametric methods) | Kruskal-Wallis test is the nonparametric alternative when ANOVA assumptions fail |
| Ch.22 (Regression) | $R^2$ is the regression analogue of $\eta^2$; same decomposition of variability |
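The $F = t^2$ identity for $k = 2$ is easy to verify empirically; a minimal sketch with simulated two-group data (values are illustrative only):

```python
import numpy as np
from scipy import stats

# Hypothetical two-group data: simulated normal samples (illustrative only)
rng = np.random.default_rng(1)
a, b = rng.normal(5, 1, 15), rng.normal(6, 1, 15)

t_stat, p_t = stats.ttest_ind(a, b)   # pooled-variance two-sample t-test
F_stat, p_F = stats.f_oneway(a, b)    # one-way ANOVA with k = 2

# F equals t squared, and the two p-values coincide
print(f"t^2 = {t_stat**2:.4f}, F = {F_stat:.4f}")
```

The identity holds for the pooled-variance (equal-variance) t-test, which is `ttest_ind`'s default; with Welch's t-test the correspondence would instead be to Welch's ANOVA.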