Case Study 1: Maya's Confidence Interval — Estimating Disease Prevalence in a Community

The Scenario

Dr. Maya Chen has been called to a meeting with the county health board. They have a simple question — or so they think.

"Dr. Chen, what is the prevalence of hypertension in our county?"

Maya pauses. She's been doing public health research long enough to know that "simple" questions in epidemiology are never simple. She has data from a sample of 500 adults screened at community health fairs over the past year. Of those 500, 185 had systolic blood pressure readings of 130 mmHg or higher, the lower bound of Stage 1 hypertension under the 2017 ACC/AHA guideline (Stage 1 covers systolic readings of 130–139 mmHg, so the count includes Stage 1 and above).

"The prevalence in our sample is 37%," she says.

"Great. So 37% of the county has hypertension."

"Not so fast," Maya says. "37% is our estimate. The true prevalence could be somewhat higher or lower. Let me show you exactly how uncertain that estimate is."

This is the moment Maya has been building toward — the first time she gets to use confidence intervals in a real public health decision.

Building the Confidence Interval

The Data

Quantity	Value
Sample size ($n$)	500
Number with hypertension	185
Sample proportion ($\hat{p}$)	$185/500 = 0.370$
County adult population ($N$)	$\approx 500{,}000$

Step 1: Check the Conditions

Random sample? The screening was conducted at 12 community health fairs across the county, with participants encouraged to attend regardless of health status. Maya worked with the county to advertise broadly and offer incentives for participation. While not a perfect random sample, the fairs were distributed across geographic and demographic groups, and Maya believes the sample is approximately representative. She notes this limitation in her report.

Independence (10% condition)? $500 \leq 0.10 \times 500{,}000 = 50{,}000$. Yes. ✓

Success-failure condition? $n\hat{p} = 500 \times 0.370 = 185 \geq 10$ ✓ and $n(1-\hat{p}) = 500 \times 0.630 = 315 \geq 10$ ✓

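The 10% and success-failure conditions are met, and the random-sample condition is approximately met, subject to the limitation Maya notes in her report.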
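The two numeric checks above can also be scripted. The following is a minimal sketch using only the values from the data table; the function name `check_conditions` and its dictionary keys are illustrative choices, not part of Maya's analysis. The random-sample condition cannot be verified in code and still has to be judged from the study design.

```python
def check_conditions(x, n, population, threshold=10):
    """Return the two numeric conditions for a one-proportion z-interval."""
    successes = x          # same as n * p_hat
    failures = n - x       # same as n * (1 - p_hat)
    return {
        # 10% condition: the sample is at most 10% of the population
        "ten_percent": n <= 0.10 * population,
        # Success-failure condition: at least `threshold` of each outcome
        "success_failure": successes >= threshold and failures >= threshold,
    }

result = check_conditions(x=185, n=500, population=500_000)
print(result)
```

Both entries come back `True` for Maya's data, matching the checkmarks above.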

Step 2: Choose the Confidence Level

Maya chooses 95% confidence. This is standard in public health — it balances precision with appropriate caution. For the board's purposes, 95% is sufficient.

$z^* = 1.960$
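The critical value comes from the standard normal distribution: for 95% confidence, 2.5% of the area is left in each tail. As a small sketch, assuming only Python's standard library, the helper below (the name `z_star` is illustrative) computes the value for a few common confidence levels.

```python
from statistics import NormalDist

def z_star(confidence):
    """Two-sided critical value: leaves (1 - confidence) / 2 in the upper tail."""
    upper_tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```

A higher confidence level gives a larger $z^*$, and therefore a wider interval from the same data.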

Step 3: Calculate the Standard Error

$$\text{SE} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.370 \times 0.630}{500}} = \sqrt{\frac{0.2331}{500}} = \sqrt{0.0004662} = 0.02159$$

Step 4: Calculate the Margin of Error

$$E = z^* \times \text{SE} = 1.960 \times 0.02159 = 0.04232$$

The margin of error is about 4.2 percentage points.

Step 5: Construct the Interval

$$\hat{p} \pm E = 0.370 \pm 0.042$$

$$95\%\ \text{CI: } (0.328, 0.412)$$
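The hand arithmetic in Steps 3 through 5 can be checked with a short function. This is a sketch of the normal-approximation (Wald) interval used in this case study, written with the standard library only; the function name `proportion_ci` and its return order are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def proportion_ci(x, n, confidence=0.95):
    """Normal-approximation interval for a proportion (Steps 3-5)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # critical value
    se = math.sqrt(p_hat * (1 - p_hat) / n)              # Step 3: standard error
    moe = z * se                                         # Step 4: margin of error
    return p_hat - moe, p_hat + moe, se, moe             # Step 5: interval

lower, upper, se, moe = proportion_ci(185, 500)
print(f"SE = {se:.5f}, E = {moe:.5f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Rounded to three decimals, the endpoints agree with the interval above.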

Step 6: Interpret

"We are 95% confident that the true prevalence of hypertension among adults in this county is between 32.8% and 41.2%."

The Presentation

Maya presents her findings to the health board with a visual:

       County estimate, 95% CI  National avg
        32.8%   37.0%   41.2%        47%
          ├───────●───────┤           │
  ──┼─────────┼─────────┼─────────┼───┴─────┼──
   30%       35%       40%       45%       50%

"Here's what this means in practical terms," Maya tells the board.

In terms of people:

County adult population: ~500,000
Estimated hypertension prevalence: 32.8% to 41.2%
Estimated number of adults with hypertension: 164,000 to 206,000

"That's a range of 42,000 people," a board member observes. "Can you narrow it down?"

"Absolutely," Maya says. "But it requires more data. Let me show you how much."

The Sample Size Conversation

Maya shows the board the relationship between sample size and precision:

Sample Size	Margin of Error	Prevalence Range	Range in People
500 (current)	±4.2%	32.8% – 41.2%	~42,000
1,000	±3.0%	34.0% – 40.0%	~30,000
2,000	±2.1%	34.9% – 39.1%	~21,000
5,000	±1.3%	35.7% – 38.3%	~13,000
10,000	±0.9%	36.1% – 37.9%	~9,000
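The rows of this table can be reproduced with a short loop. The sketch below assumes that a larger sample would give the same sample proportion (0.370) and uses the approximate county population of 500,000; the names `precision_row`, `P_HAT`, and `POPULATION` are illustrative.

```python
import math
from statistics import NormalDist

P_HAT = 185 / 500          # assumption: future samples give the same proportion
POPULATION = 500_000       # approximate county adult population
Z_STAR = NormalDist().inv_cdf(0.975)

def precision_row(n):
    """Margin of error, interval endpoints, and interval width in people."""
    moe = Z_STAR * math.sqrt(P_HAT * (1 - P_HAT) / n)
    people = 2 * moe * POPULATION
    return moe, P_HAT - moe, P_HAT + moe, people

for n in (500, 1_000, 2_000, 5_000, 10_000):
    moe, lo, hi, people = precision_row(n)
    print(f"n={n:>6,}  ±{moe:.1%}  {lo:.1%} to {hi:.1%}  ~{people:,.0f} people")
```

The people column agrees with the table once it is rounded to the nearest thousand.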

"To get within ±2 percentage points, I'd need about 2,200 people," Maya explains. "To get within ±1 percentage point, I'd need nearly 9,000."

She shows the calculation:

$$n = \left(\frac{z^*}{E}\right)^2 \hat{p}(1-\hat{p}) = \left(\frac{1.960}{0.02}\right)^2 \times 0.370 \times 0.630 = 9604 \times 0.2331 = 2238.7 \approx 2{,}239$$
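
For comparison, the same formula with $E = 0.01$ gives the sample size behind Maya's figure for ±1 percentage point:

$$n = \left(\frac{1.960}{0.01}\right)^2 \times 0.370 \times 0.630 = 38416 \times 0.2331 = 8954.8 \approx 8{,}955$$

In both cases the result is rounded up, since rounding down would leave the margin of error slightly above the target.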

"That's the diminishing returns principle," Maya adds. "To cut the margin of error in half — from ±4.2% to ±2.1% — I need four times as many people. To cut it in half again — to ±1% — I'd need four times more again."

The board decides to fund a larger study of 2,000 adults for next year's assessment.

Comparing to the National Rate

A board member asks: "How do we compare to the national average?"

The national prevalence of hypertension in U.S. adults is approximately 47% (according to the CDC). Maya's 95% CI for the county is (32.8%, 41.2%), entirely below the national average.

"Our county's hypertension prevalence appears to be significantly lower than the national average," Maya reports. "The entire confidence interval is below 47%, so a gap this large is very unlikely to be explained by random sampling variation alone."

But Maya adds a caveat: "Remember, our sample came from health fair attendees. People who attend health fairs may be more health-conscious than the general population. This could bias our estimate downward. The true county prevalence might be higher than our interval suggests."

This is an important lesson: confidence intervals protect against random sampling error, but not against systematic bias. The 95% confidence level means that 95% of intervals from random samples would capture the truth — but if the sample is systematically different from the population (selection bias), even a very narrow interval might miss the mark entirely.
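A small simulation can make this distinction concrete. The sketch below is illustrative only: the true prevalence of 0.40 and the biased sampling rate of 0.35 are hypothetical values, not estimates for the county, and the function name `coverage` is an assumption. It counts how often a 95% interval captures the true value when the sample mirrors the population and when it systematically under-represents hypertension.

```python
import math
import random
from statistics import NormalDist

Z_STAR = NormalDist().inv_cdf(0.975)

def coverage(true_p, sampling_p, n=500, trials=2000, seed=1):
    """Fraction of 95% intervals containing true_p when each sampled
    person has hypertension with probability sampling_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = sum(rng.random() < sampling_p for _ in range(n))
        p_hat = x / n
        moe = Z_STAR * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - moe <= true_p <= p_hat + moe:
            hits += 1
    return hits / trials

# Hypothetical values chosen only for illustration.
TRUE_P = 0.40
unbiased = coverage(TRUE_P, sampling_p=0.40)   # sample mirrors the population
biased = coverage(TRUE_P, sampling_p=0.35)     # sample under-represents hypertension

print(f"Coverage with representative sampling: {unbiased:.1%}")
print(f"Coverage with biased sampling:         {biased:.1%}")
```

With representative sampling the coverage should land close to 95%. With the biased sampling rate it falls far below that, even though every interval is computed the same way and has about the same width.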

Subgroup Analysis: Where CIs Get Interesting

Maya also collected data by age group:

Age Group	$n$	Hypertensive	$\hat{p}$	95% CI
18-39	150	24	0.160	(0.101, 0.219)
40-64	200	72	0.360	(0.293, 0.427)
65+	150	89	0.593	(0.515, 0.672)

Several observations:

The CIs don't overlap between the youngest and oldest groups. The 18-39 CI tops out at 21.9%, and the 65+ CI starts at 51.5%. This strongly suggests a real difference — formal testing in Chapter 16 will confirm it.
The 40-64 and 65+ CIs do not overlap either. The 40-64 CI goes up to 42.7%, and the 65+ CI starts at 51.5%, leaving a clear gap between them — more evidence of a real age difference. But a word of caution about the reverse situation: if two CIs had overlapped, we could not automatically conclude the groups are the same. Overlapping individual CIs do not necessarily mean "no difference" — the correct comparison is a CI for the difference between the two proportions, which requires a formal two-sample procedure (Chapter 16).
The subgroup CIs are wider than the overall CI. This makes sense — each subgroup has a smaller $n$, so the standard error is larger. Precision costs data, and splitting the data into subgroups costs precision.
The youngest group's MOE is largest relative to $\hat{p}$. With only 24 out of 150 classified as hypertensive, the CI ranges from 10.1% to 21.9% — that's more than a 2:1 ratio. If Maya needs a more precise estimate for young adults, she needs to sample more of them specifically.
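
These observations can be checked directly from the age-group table. The sketch below recomputes each interval, compares the margin of error with the estimate, and tests pairs of intervals for overlap. The names `summarize`, `overlaps`, and `stats_by_group` are illustrative, and the overlap check is only a rough screen, not a substitute for the two-sample procedure mentioned above.

```python
import math
from statistics import NormalDist

Z_STAR = NormalDist().inv_cdf(0.975)

# (sample size, number hypertensive) from the age-group table
groups = {"18-39": (150, 24), "40-64": (200, 72), "65+": (150, 89)}

def summarize(n, x):
    """Return p_hat, margin of error, and the 95% interval endpoints."""
    p_hat = x / n
    moe = Z_STAR * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, moe, p_hat - moe, p_hat + moe

stats_by_group = {name: summarize(n, x) for name, (n, x) in groups.items()}

# Relative margin of error: the margin compared with the estimate itself.
for name, (p_hat, moe, lo, hi) in stats_by_group.items():
    print(f"{name}: MOE = {moe:.3f}, MOE / p_hat = {moe / p_hat:.2f}")

def overlaps(a, b):
    """True if the two groups' intervals share any values."""
    _, _, lo_a, hi_a = stats_by_group[a]
    _, _, lo_b, hi_b = stats_by_group[b]
    return lo_a <= hi_b and lo_b <= hi_a

print("18-39 vs 65+ overlap:", overlaps("18-39", "65+"))
print("40-64 vs 65+ overlap:", overlaps("40-64", "65+"))
```

The youngest group has the largest ratio of margin of error to estimate, and neither pair of intervals overlaps.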

Python Implementation

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Overall data
n_total = 500
x_total = 185
p_hat = x_total / n_total

# 95% CI
z_star = stats.norm.ppf(0.975)
se = np.sqrt(p_hat * (1 - p_hat) / n_total)
moe = z_star * se
ci = (p_hat - moe, p_hat + moe)

print(f"Overall prevalence: {p_hat:.3f}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"Margin of error: ±{moe:.3f} ({moe*100:.1f} percentage points)")

# Subgroup analysis
subgroups = {
    '18-39': {'n': 150, 'x': 24},
    '40-64': {'n': 200, 'x': 72},
    '65+':   {'n': 150, 'x': 89},
}

print("\n--- Subgroup Analysis ---")
fig, ax = plt.subplots(figsize=(10, 4))

for i, (group, data) in enumerate(subgroups.items()):
    n_g = data['n']
    x_g = data['x']
    p_g = x_g / n_g
    se_g = np.sqrt(p_g * (1 - p_g) / n_g)
    moe_g = z_star * se_g
    ci_g = (p_g - moe_g, p_g + moe_g)

    print(f"{group}: p̂ = {p_g:.3f}, 95% CI = ({ci_g[0]:.3f}, {ci_g[1]:.3f})")

    # Plot
    ax.errorbar(p_g, i, xerr=moe_g, fmt='o', color='steelblue',
                capsize=5, capthick=2, linewidth=2, markersize=8)
    ax.text(p_g + moe_g + 0.01, i,
            f'{p_g:.1%} ({ci_g[0]:.1%}, {ci_g[1]:.1%})',
            va='center', fontsize=10)

ax.axvline(x=0.47, color='red', linestyle='--', alpha=0.7,
           label='National average (47%)')
ax.set_yticks(range(len(subgroups)))
ax.set_yticklabels(subgroups.keys())
ax.set_xlabel('Hypertension Prevalence')
ax.set_title('95% Confidence Intervals by Age Group',
             fontweight='bold')
ax.legend()
ax.set_xlim(0, 0.8)
plt.tight_layout()
plt.show()

# Sample size planning
print("\n--- Sample Size Planning ---")
for target_moe in [0.04, 0.03, 0.02, 0.01]:
    n_needed = int(np.ceil((z_star / target_moe)**2 * p_hat * (1-p_hat)))
    print(f"MOE = ±{target_moe:.0%}: n = {n_needed}")

Lessons from This Case Study

A single number isn't enough. "37% prevalence" sounds precise, but the CI reveals the true prevalence could reasonably be anywhere from 33% to 41% — a range of 42,000 people in resource planning terms.
Context determines precision requirements. For a rough county profile, ±4% might be fine. For allocating specific clinic resources, the board needed ±2% — requiring a much larger study.
The diminishing returns are real. Going from ±4% to ±2% required 4x the sample. Going from ±2% to ±1% required another 4x. Budget constraints force practical decisions about "how precise is precise enough."
Subgroup analysis costs precision. Splitting 500 people into three age groups gives less precise estimates for each group. Planning ahead for subgroup analysis means recruiting more participants.
CIs protect against sampling error, not bias. The health fair recruitment method might systematically underrepresent unhealthy adults who don't attend community events. No confidence interval, no matter how narrow, can fix this. Study design (Chapter 4) and statistical technique are complementary, not substitutes.

Discussion Questions

Maya's sample came from health fair attendees. How might this introduce selection bias? Would you expect the bias to make the prevalence estimate too high or too low?
The board wants to allocate $\$10$ million for hypertension treatment programs. Using the CI of (32.8%, 41.2%), what's the range of cost-per-person if they budget for the entire affected population?
If Maya used a 99% confidence level instead of 95%, how would that change her recommendation to the board? Would it be more or less useful for their decision?
The subgroup analysis reveals that hypertension prevalence increases dramatically with age. If you were Maya, would you recommend age-targeted screening programs? What additional data would you want?