Key Takeaways: Inference for Means

One-Sentence Summary

The one-sample t-test — the workhorse of statistical inference — tests claims about a population mean using the t-distribution (which honestly accounts for our uncertainty about $\sigma$), is robust to non-normality for moderate-to-large samples, and should be your default whenever $\sigma$ is unknown (which is almost always).

Core Concepts at a Glance

| Concept | Definition | Why It Matters |
|---|---|---|
| One-sample t-test | Tests whether a population mean $\mu$ equals a specific value $\mu_0$, using the t-distribution | The most commonly used statistical test in practice; applies whenever you have quantitative data and a reference value |
| Robustness | A procedure's ability to give approximately correct results even when assumptions aren't perfectly met | The t-test is remarkably robust to non-normality for $n \geq 30$, making it practical for real-world data |
| Paired data (preview) | Data that come in natural pairs (before/after, matched subjects); analyzed by computing differences and running a one-sample t-test on those differences | Eliminates person-to-person variability, often dramatically increasing power |
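
The paired-data idea previewed above can be sketched in a few lines (the before/after numbers here are hypothetical, invented for illustration): a paired analysis is literally a one-sample t-test on the within-pair differences, and `scipy.stats.ttest_rel` agrees exactly.

```python
import numpy as np
from scipy import stats

# Hypothetical before/after measurements for 8 matched subjects
before = np.array([72.0, 68.5, 75.2, 70.1, 69.8, 74.3, 71.0, 73.6])
after  = np.array([70.1, 67.0, 73.8, 69.5, 68.2, 72.9, 70.4, 71.8])

# Paired analysis = one-sample t-test on the within-pair differences
diffs = before - after
one_sample = stats.ttest_1samp(diffs, popmean=0)

# scipy's dedicated paired t-test gives the identical result
paired = stats.ttest_rel(before, after)

assert np.isclose(one_sample.statistic, paired.statistic)
assert np.isclose(one_sample.pvalue, paired.pvalue)
```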

The One-Sample t-Test Formula

$$\boxed{t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}}$$

where:

- $\bar{x}$ = sample mean
- $\mu_0$ = hypothesized population mean (from $H_0$)
- $s$ = sample standard deviation
- $n$ = sample size
- $df = n - 1$ (degrees of freedom)

In plain English: The t-statistic measures how many standard errors the sample mean is from the hypothesized value. Large values (positive or negative) mean the data are far from $H_0$; values near zero mean the data are consistent with $H_0$.

The Five-Step Procedure

| Step | Action | Key Question |
|---|---|---|
| 1 | State $H_0$ and $H_a$ | What's the claim? One- or two-tailed? |
| 2 | Check conditions | Random? Independent? Normal enough? |
| 3 | Compute $t = (\bar{x} - \mu_0) / (s/\sqrt{n})$ | How far from $H_0$, in SE units? |
| 4 | Find p-value from $t_{n-1}$ distribution | How surprising are these data if $H_0$ is true? |
| 5 | Conclude in context | Reject or fail to reject — and what does it mean? |
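
The five steps can be walked through end to end on a hypothetical example (simulated fill volumes, testing $H_0: \mu = 350$ ml against a two-tailed alternative); the hand computation is cross-checked against `scipy.stats.ttest_1samp`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Step 1: H0: mu = 350, Ha: mu != 350 (two-tailed)
mu_0 = 350

# Hypothetical sample: n = 40 bottle fill volumes (ml)
data = rng.normal(loc=348, scale=5, size=40)

# Step 2: conditions -- random sample assumed; n = 40 >= 30,
# so the CLT covers the normality condition

# Step 3: compute t = (x_bar - mu_0) / (s / sqrt(n))
n = len(data)
x_bar, s = data.mean(), data.std(ddof=1)
t_stat = (x_bar - mu_0) / (s / np.sqrt(n))

# Step 4: two-tailed p-value from the t distribution with n - 1 df
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)

# Cross-check against scipy's built-in test
res = stats.ttest_1samp(data, popmean=mu_0)
assert np.isclose(t_stat, res.statistic) and np.isclose(p_value, res.pvalue)

# Step 5: conclude in context
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # reject H0 at alpha = 0.05 only if p < 0.05
```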

Three Conditions for the t-Test

| Condition | What to Check | What Happens If Violated |
|---|---|---|
| 1. Randomness | Data from a random sample or random assignment | Results cannot be generalized; no statistical fix |
| 2. Independence | Observations don't influence each other; 10% condition for sampling without replacement | Standard error is wrong; p-values unreliable |
| 3. Normality | Sampling distribution of $\bar{x}$ is approximately normal | P-values may be inaccurate, especially for small $n$ |

The Normality Condition: Quick Guide

| Sample Size | Requirement | Rationale |
|---|---|---|
| $n < 15$ | Population must be approximately normal; no outliers or skewness | CLT can't compensate; t-test relies on normality directly |
| $15 \leq n < 30$ | Tolerate moderate skewness; check for extreme outliers | CLT partially compensates; extreme outliers still distort |
| $n \geq 30$ | CLT handles most population shapes; only extreme outliers are a concern | CLT nearly guarantees normality of $\bar{x}$ |

Robustness Summary

| The t-test IS robust to... | The t-test is NOT robust to... |
|---|---|
| Moderate skewness (especially $n \geq 30$) | Extreme outliers (any sample size) |
| Light-tailed distributions | Heavy-tailed distributions with small $n$ |
| Bimodal distributions (moderate $n$) | Strong skewness with small $n$ |
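
A quick simulation makes robustness concrete (illustrative setup, not from the chapter: an exponential population, which is strongly right-skewed, with $n = 30$). If the t-test were fragile, its actual type I error rate would be far from the nominal 5%; in practice it lands in the right ballpark, typically only modestly inflated by the skewness.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Population: exponential with mean 1 -- individual observations
# are far from normal, but n = 30 lets the CLT do its work
n, n_sims, alpha = 30, 20_000, 0.05
mu_true = 1.0

# Simulate many samples drawn under H0 (true mean really is mu_true)
samples = rng.exponential(scale=mu_true, size=(n_sims, n))
pvals = stats.ttest_1samp(samples, popmean=mu_true, axis=1).pvalue

# Fraction of false rejections at alpha = 0.05
type1_rate = np.mean(pvals < alpha)
print(f"Empirical type I error rate: {type1_rate:.3f}")
```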

z-Test vs. t-Test

| Feature | z-Test | t-Test |
|---|---|---|
| Uses | $\sigma$ (known) | $s$ (estimated from data) |
| Distribution | Standard normal | t with $df = n - 1$ |
| When to use | Almost never | Almost always |
| Effect of small $n$ | No extra penalty | Heavier tails → more conservative |
| Bottom line | Training wheels | The real deal |
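
The "heavier tails" row can be made concrete by comparing critical values: the t-distribution's 97.5th percentile starts well above $z^* \approx 1.96$ for small $df$ and converges to it as $df$ grows. A minimal sketch:

```python
from scipy import stats

# 97.5th percentile = two-sided 5% critical value
z_star = stats.norm.ppf(0.975)  # about 1.96

for df in (2, 5, 10, 30, 100, 1000):
    t_star = stats.t.ppf(0.975, df)
    print(f"df = {df:>4}: t* = {t_star:.3f}  (z* = {z_star:.3f})")
```

At $df = 5$ the critical value is about 2.57; by $df = 1000$ it is essentially indistinguishable from 1.96.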

Rule of thumb: Default to the t-test. You'll be right 99% of the time.

Confidence Interval for a Mean

$$\boxed{\bar{x} \pm t^*_{n-1} \cdot \frac{s}{\sqrt{n}}}$$

CI-Test Duality:

$$\text{Reject } H_0: \mu = \mu_0 \text{ at } \alpha = 0.05 \iff \mu_0 \text{ is NOT in the 95\% CI}$$

Always report the CI alongside the test — the CI tells you how large the effect might be, not just whether it exists.
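
The duality can be checked numerically on a hypothetical simulated sample: sweeping over candidate values of $\mu_0$, the two-tailed p-value drops below 0.05 exactly when $\mu_0$ falls outside the 95% confidence interval.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = rng.normal(loc=103, scale=10, size=25)  # hypothetical sample
n = len(data)
x_bar, s = data.mean(), data.std(ddof=1)

# 95% confidence interval: x_bar +/- t* * s / sqrt(n)
t_star = stats.t.ppf(0.975, df=n - 1)
lo, hi = x_bar - t_star * s / np.sqrt(n), x_bar + t_star * s / np.sqrt(n)
print(f"95% CI: ({lo:.2f}, {hi:.2f})")

# Duality: p < 0.05 exactly when mu_0 lies outside the CI
for mu_0 in np.linspace(x_bar - 3 * s, x_bar + 3 * s, 61):
    p = stats.ttest_1samp(data, popmean=mu_0).pvalue
    assert (p < 0.05) == (mu_0 < lo or mu_0 > hi)
```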

P-Value Calculation

| Alternative Hypothesis | P-Value |
|---|---|
| $H_a: \mu > \mu_0$ (right-tailed) | $P(T_{df} \geq t)$ |
| $H_a: \mu < \mu_0$ (left-tailed) | $P(T_{df} \leq t)$ |
| $H_a: \mu \neq \mu_0$ (two-tailed) | $2 \times P(T_{df} \geq |t|)$ |

Python Quick Reference

import numpy as np
import matplotlib.pyplot as plt   # needed for the QQ-plot below
from scipy import stats

# mu_0, x_bar, s, and n below stand for your hypothesized mean,
# sample mean, sample standard deviation, and sample size.

# --- One-sample t-test (raw data) ---
data = np.array([...])  # your data
result = stats.ttest_1samp(data, popmean=mu_0)
# result.statistic = t-value
# result.pvalue    = two-tailed p-value

# For one-tailed tests (SciPy ≥ 1.7):
result = stats.ttest_1samp(data, popmean=mu_0, alternative='greater')
result = stats.ttest_1samp(data, popmean=mu_0, alternative='less')

# --- t-test from summary statistics ---
t_stat = (x_bar - mu_0) / (s / np.sqrt(n))
p_two   = 2 * stats.t.sf(abs(t_stat), df=n-1)   # two-tailed
p_right = stats.t.sf(t_stat, df=n-1)            # right-tailed
p_left  = stats.t.cdf(t_stat, df=n-1)           # left-tailed

# --- Confidence interval ---
t_star = stats.t.ppf(0.975, df=n-1)             # critical value for 95% CI
margin = t_star * s / np.sqrt(n)
ci = (x_bar - margin, x_bar + margin)

# --- Normality check ---
stat, p = stats.shapiro(data)                   # Shapiro-Wilk test
stats.probplot(data, dist="norm", plot=plt)     # QQ-plot

Excel Quick Reference

| Task | Formula |
|---|---|
| t-statistic | =(AVERAGE(range) - mu_0) / (STDEV.S(range) / SQRT(COUNT(range))) |
| p-value (two-tailed) | =T.DIST.2T(ABS(t), df) |
| p-value (right-tailed) | =T.DIST.RT(t, df) |
| p-value (left-tailed) | =T.DIST(t, df, TRUE) |
| Critical value (95% CI) | =T.INV.2T(0.05, df) |
| Margin of error | =CONFIDENCE.T(0.05, STDEV.S(range), COUNT(range)) |

Common Misconceptions

| Misconception | Reality |
|---|---|
| "Use z-test for large $n$, t-test for small $n$" | Use z when $\sigma$ is known (rare), t when $\sigma$ is estimated (usual) |
| "The t-test requires normally distributed data" | It requires a normal sampling distribution of $\bar{x}$ — which the CLT provides for $n \geq 30$ |
| "The t-distribution's wider tails are a problem" | They're a feature — honest acknowledgment of uncertainty from estimating $\sigma$ |
| "A small p-value means a large effect" | P-values measure evidence, not effect size; report the CI for magnitude |
| "Fail to reject = the null is true" | It means insufficient evidence to reject; the effect might exist but be undetectable with your sample size |
| "Borderline p-values ($\approx$ 0.05) are meaningless" | They indicate suggestive but inconclusive evidence; context matters |

How This Chapter Connects

| This Chapter | Builds On | Leads To |
|---|---|---|
| t-test formula | z-test (Ch.13), SE (Ch.11), $\bar{x}$ and $s$ (Ch.6) | Two-sample t-test, paired t-test (Ch.16) |
| t-distribution | Introduced in Ch.12 for CIs | Deepened here for hypothesis testing; used through Ch.24 |
| Conditions | Normal model (Ch.10), CLT (Ch.11) | Same conditions apply to all future t-based procedures |
| Robustness | Distribution thinking (Ch.5), normality assessment (Ch.10) | Nonparametric alternatives (Ch.21), bootstrap (Ch.18) |
| CI-test duality | Established in Ch.13 | Applied in every inference chapter going forward |
| Paired data preview | One-sample t-test on differences | Full treatment in Ch.16 |

The Key Theme

The t-distribution embodies a fundamental statistical virtue: honesty about uncertainty. When $\sigma$ is unknown and must be estimated from data, the t-distribution widens its tails to say: "We're less certain than we'd be if we knew $\sigma$." As $n$ grows and our estimate improves, the t-distribution relaxes toward the normal. This isn't a weakness — it's intellectual integrity. The t-distribution doesn't pretend to know more than it does.

The One Thing to Remember

If you forget everything else from this chapter, remember this:

When you want to test whether a population mean equals a specific value, use the one-sample t-test: $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$, with $df = n - 1$. Use it whenever $\sigma$ is unknown, which is almost always. Check three conditions (random, independent, normal enough), and remember that the t-test is robust to non-normality for $n \geq 30$. Always pair the hypothesis test with a confidence interval — the test tells you WHETHER an effect exists; the CI tells you HOW LARGE it might be. The t-distribution's wider tails aren't a limitation — they're the honest price of admitting we don't know $\sigma$.

Key Terms

| Term | Definition |
|---|---|
| One-sample t-test | A hypothesis test for a population mean that uses the t-distribution because $\sigma$ is estimated by $s$; test statistic: $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$ |
| t-distribution (deepened) | A symmetric, bell-shaped distribution with heavier tails than the normal; indexed by degrees of freedom; used when $\sigma$ is estimated; converges to normal as $df \to \infty$ |
| Degrees of freedom (deepened) | For a one-sample t-test: $df = n - 1$; connects to $n - 1$ in the sample variance formula; determines which t-distribution to use; smaller df = heavier tails = more uncertainty |
| Robustness | A statistical procedure's ability to give approximately correct results even when its assumptions are not perfectly satisfied |
| Normality assumption | The condition that the sampling distribution of $\bar{x}$ is approximately normal; satisfied by population normality (small $n$) or by the CLT ($n \geq 30$) |
| Paired data (preview) | Observations that come in natural pairs; analyzed by computing within-pair differences and applying a one-sample t-test to the differences |