Case Study 1: MediCore Multi-Method Analysis — IPW, IV, and DiD Applied to Drug X Efficacy
Context
MediCore Pharmaceuticals is under pressure. Drug X — the novel antihypertensive approved 18 months ago — shows strong real-world prescription rates (180,000 patients across 340 hospitals), but the pharmacovigilance team has flagged a concerning pattern: the naive analysis of EHR data shows Drug X recipients have higher 30-day readmission rates than non-recipients. A regulatory inquiry is imminent.
The causal inference team, led by Dr. Ananya Patel, knows from Chapters 15-17 that the naive comparison is confounded. Sicker patients receive Drug X more often, and they have higher readmission rates regardless of treatment. But the team needs more than theoretical arguments — they need rigorous estimates that will survive regulatory scrutiny.
Dr. Patel's strategy: apply three independent causal estimation methods, each relying on different identifying assumptions. If all three converge, the evidence is strong. If they diverge, the team must understand why.
Method 1: Inverse Probability Weighting (IPW)
Identification Strategy
The EHR contains rich patient-level data: demographics (age, sex, race), clinical characteristics (severity score, comorbidity index, creatinine, ejection fraction, blood pressure), utilization history (prior admissions, ED visits, length of stay), and prescribing patterns (concurrent medications). The team argues that these variables capture the physician's prescribing decision process.
Assumption: Conditional on the 14 measured covariates, Drug X prescription is as-good-as-random (conditional ignorability).
Analysis
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
def medicore_ipw_full(seed: int = 42) -> dict:
"""MediCore IPW analysis with rich covariate set.
Simulates 50,000 patients with 8 measured confounders and
applies IPW with trimming and diagnostics.
"""
rng = np.random.RandomState(seed)
n = 50000
# Patient covariates
age = rng.normal(65, 12, n).clip(30, 95)
severity = rng.exponential(1.0, n)
comorbidities = rng.poisson(2.5, n).clip(0, 10)
creatinine = rng.lognormal(0.3, 0.4, n).clip(0.5, 6.0)
ejection_fraction = rng.normal(55, 12, n).clip(15, 75)
prior_admissions = rng.poisson(1.2, n)
systolic_bp = rng.normal(140, 20, n).clip(90, 220)
n_medications = rng.poisson(4, n).clip(0, 15)
X = np.column_stack([
age, severity, comorbidities, creatinine,
ejection_fraction, prior_admissions, systolic_bp, n_medications,
])
cov_names = [
"age", "severity", "comorbidities", "creatinine",
"ejection_fraction", "prior_admissions", "systolic_bp", "n_medications",
]
# Treatment (confounded)
logit_t = (
-1.5 + 0.02 * (age - 65) + 0.8 * severity + 0.3 * comorbidities
+ 0.5 * creatinine - 0.02 * ejection_fraction + 0.2 * prior_admissions
+ 0.01 * (systolic_bp - 140) + 0.05 * n_medications
+ rng.normal(0, 0.5, n)
)
treatment = rng.binomial(1, 1 / (1 + np.exp(-logit_t)))
# Potential outcomes (true effect: ~5 pp reduction)
logit_y0 = (
-2.0 + 0.01 * (age - 65) + 0.6 * severity + 0.2 * comorbidities
+ 0.4 * creatinine - 0.01 * ejection_fraction
+ 0.15 * prior_admissions + rng.normal(0, 0.3, n)
)
y0 = rng.binomial(1, 1 / (1 + np.exp(-logit_y0)))
y1 = rng.binomial(1, 1 / (1 + np.exp(-logit_y0 - 0.5)))
y_obs = treatment * y1 + (1 - treatment) * y0
# Propensity score estimation
ps_model = LogisticRegression(C=1.0, max_iter=1000, solver="lbfgs")
ps_model.fit(X, treatment)
ps = ps_model.predict_proba(X)[:, 1]
# Trim and compute IPW
ps_trimmed = np.clip(ps, 0.02, 0.98)
w1 = treatment / ps_trimmed
w0 = (1 - treatment) / (1 - ps_trimmed)
mu1 = np.sum(w1 * y_obs) / np.sum(w1)
mu0 = np.sum(w0 * y_obs) / np.sum(w0)
ate_ipw = mu1 - mu0
# Standard error via influence function
scores = (
treatment * (y_obs - mu1) / ps_trimmed
- (1 - treatment) * (y_obs - mu0) / (1 - ps_trimmed)
)
se_ipw = np.sqrt(np.mean(scores ** 2) / n)
true_ate = (y1 - y0).mean()
naive = y_obs[treatment == 1].mean() - y_obs[treatment == 0].mean()
return {
"true_ate": float(true_ate),
"naive": float(naive),
"ipw_ate": float(ate_ipw),
"ipw_se": float(se_ipw),
"ess_treated": float(np.sum(w1) ** 2 / np.sum(w1 ** 2)),
"ess_control": float(np.sum(w0) ** 2 / np.sum(w0 ** 2)),
}
result_ipw = medicore_ipw_full()
print("Method 1: IPW")
print(f" True ATE: {result_ipw['true_ate']:.4f}")
print(f" Naive: {result_ipw['naive']:.4f}")
print(f" IPW ATE: {result_ipw['ipw_ate']:.4f} (SE: {result_ipw['ipw_se']:.4f})")
print(f" ESS ratio: treated={result_ipw['ess_treated']/31550:.2f}, "
f"control={result_ipw['ess_control']/18450:.2f}")
Method 1: IPW
True ATE: -0.0482
Naive: 0.0328
IPW ATE: -0.0461 (SE: 0.0058)
ESS ratio: treated=0.79, control=0.63
The IPW estimate reverses the sign of the naive analysis, confirming that Drug X reduces readmission. The effective sample size ratios are acceptable (both above 0.5), indicating that the weights are not dominated by extreme values.
Limitation acknowledged: If physician judgment captures information about patient prognosis beyond the 8 measured covariates, conditional ignorability fails and the IPW estimate is biased.
Method 2: Instrumental Variables (Hospital Distance)
Identification Strategy
Not all prescribing variation is driven by patient characteristics. Some variation is geographic: patients at certain hospitals receive Drug X at much higher rates due to institutional formulary preferences, physician habits, and supply chain factors. The team uses distance to the nearest high-Drug-X-prescribing hospital as an instrument.
Assumptions: (1) Distance affects Drug X receipt (relevance — testable). (2) Distance does not directly affect readmission (exclusion restriction — arguable). (3) Distance is uncorrelated with unmeasured patient health factors (independence — arguable after controlling for measured covariates).
Analysis
def medicore_iv_analysis(seed: int = 42) -> dict:
"""MediCore IV analysis using hospital distance."""
rng = np.random.RandomState(seed)
n = 50000
age = rng.normal(65, 12, n).clip(30, 95)
severity = rng.exponential(1.0, n)
comorbidities = rng.poisson(2.5, n).clip(0, 10)
# Unmeasured confounder: physician's private prognosis assessment
physician_prognosis = 0.5 * severity + rng.normal(0, 1, n)
# Instrument: distance to nearest Drug-X hospital
distance = rng.exponential(15, n).clip(1, 100)
logit_t = (
0.5 - 0.03 * distance + 0.4 * severity + 0.3 * comorbidities
+ 0.6 * physician_prognosis + rng.normal(0, 0.5, n)
)
treatment = rng.binomial(1, 1 / (1 + np.exp(-logit_t)))
logit_y0 = (
-1.5 + 0.01 * (age - 65) + 0.5 * severity + 0.15 * comorbidities
+ 0.4 * physician_prognosis + rng.normal(0, 0.3, n)
)
y0 = rng.binomial(1, 1 / (1 + np.exp(-logit_y0)))
y1 = rng.binomial(1, 1 / (1 + np.exp(-logit_y0 - 0.4)))
y_obs = treatment * y1 + (1 - treatment) * y0
# Stage 1
controls = np.column_stack([age, severity, comorbidities])
X_s1 = sm.add_constant(np.column_stack([distance, controls]))
s1 = sm.OLS(treatment, X_s1).fit()
d_hat = s1.fittedvalues
f_stat = s1.f_test("x1 = 0").fvalue
# Stage 2
X_s2 = sm.add_constant(np.column_stack([d_hat, controls]))
s2 = sm.OLS(y_obs, X_s2).fit(cov_type="HC1")
true_ate = (y1 - y0).mean()
return {
"true_ate": float(true_ate),
"iv_estimate": float(s2.params[1]),
"iv_se": float(s2.bse[1]),
"first_stage_f": float(f_stat),
"first_stage_coef": float(s1.params[1]),
}
result_iv = medicore_iv_analysis()
print("\nMethod 2: Instrumental Variables (Distance)")
print(f" True ATE: {result_iv['true_ate']:.4f}")
print(f" IV estimate: {result_iv['iv_estimate']:.4f} (SE: {result_iv['iv_se']:.4f})")
print(f" First-stage F: {result_iv['first_stage_f']:.1f}")
print(f" Distance coef: {result_iv['first_stage_coef']:.5f}")
Method 2: Instrumental Variables (Distance)
True ATE: -0.0398
IV estimate: -0.0371 (SE: 0.0081)
First-stage F: 289.7
Distance coef: -0.00312
The IV estimate is $-0.037$ (3.7 percentage point reduction), consistent with IPW. The first-stage F-statistic of 290 rules out weak instrument concerns. Each additional mile from the hospital decreases the probability of Drug X receipt by 0.31 percentage points.
Limitation acknowledged: The IV estimate is LATE, not ATE — it applies to compliers (patients whose treatment depends on hospital proximity). The exclusion restriction cannot be tested; the team documents the argument and conducts sensitivity analysis.
Method 3: Difference-in-Differences (Policy Change)
Identification Strategy
Six months into real-world use, MediCore's hospital partner network (100 hospitals) mandated Drug X as the default discharge prescription for heart failure patients. The remaining 240 hospitals maintained physician discretion. The policy created a sharp before-after contrast within the policy hospitals, with non-policy hospitals serving as controls.
Assumption: Absent the mandate, readmission trends at policy hospitals would have paralleled those at non-policy hospitals (parallel trends).
Analysis
def medicore_did_analysis(seed: int = 42) -> dict:
"""MediCore DiD around the Drug X mandate policy."""
rng = np.random.RandomState(seed)
n_hospitals = 200
n_periods = 12
policy_time = 6
rows = []
for h in range(n_hospitals):
is_policy = h < 100 # First 100 hospitals adopted policy
hosp_fe = rng.normal(0, 0.02)
for t in range(n_periods):
time_trend = -0.002 * t # Common declining trend
post = int(t >= policy_time)
treat_effect = -0.042 * post * is_policy # True effect: 4.2 pp
noise = rng.normal(0, 0.008)
rate = 0.175 + hosp_fe + time_trend + treat_effect + noise
rows.append({
"hospital_id": h, "period": t, "policy": int(is_policy),
"post": post, "event_time": t - policy_time,
"readmission_rate": np.clip(rate, 0, 1),
})
df = pd.DataFrame(rows)
# DiD regression with clustered SEs
df["policy_x_post"] = df["policy"] * df["post"]
X_did = sm.add_constant(df[["policy", "post", "policy_x_post"]].values)
model = sm.OLS(
df["readmission_rate"].values, X_did
).fit(cov_type="cluster", cov_kwds={"groups": df["hospital_id"].values})
# Event study (pre-trend check)
pre_coefs = []
for k in [-3, -2, -1]:
pre_data = df[df["event_time"] == k]
diff = (
pre_data.loc[pre_data["policy"] == 1, "readmission_rate"].mean()
- pre_data.loc[pre_data["policy"] == 0, "readmission_rate"].mean()
)
pre_coefs.append({"event_time": k, "diff": float(diff)})
return {
"did_estimate": float(model.params[3]),
"did_se": float(model.bse[3]),
"did_ci_lower": float(model.conf_int()[3, 0]),
"did_ci_upper": float(model.conf_int()[3, 1]),
"pre_trend_diffs": pre_coefs,
}
result_did = medicore_did_analysis()
print("\nMethod 3: Difference-in-Differences (Policy Mandate)")
print(f" DiD estimate: {result_did['did_estimate']:.4f} "
f"(SE: {result_did['did_se']:.4f})")
print(f" 95% CI: [{result_did['did_ci_lower']:.4f}, "
f"{result_did['did_ci_upper']:.4f}]")
print(" Pre-trend gaps (should be stable):")
for p in result_did["pre_trend_diffs"]:
print(f" k={p['event_time']}: {p['diff']:+.4f}")
Method 3: Difference-in-Differences (Policy Mandate)
DiD estimate: -0.0419 (SE: 0.0021)
95% CI: [-0.0461, -0.0378]
Pre-trend gaps (should be stable):
k=-3: +0.0012
k=-2: -0.0008
k=-1: +0.0005
The DiD estimate is $-0.042$ (4.2 percentage point reduction). Pre-trend differences are small and stable (all below 0.2 percentage points), supporting the parallel trends assumption.
Convergence and Conclusions
| Method | Estimate | 95% CI | Estimand | Key Assumption |
|---|---|---|---|---|
| IPW | $-0.046$ | $[-0.057, -0.035]$ | ATE | Conditional ignorability |
| IV (distance) | $-0.037$ | $[-0.053, -0.021]$ | LATE (compliers) | Exclusion restriction |
| DiD (policy) | $-0.042$ | $[-0.046, -0.038]$ | ATT (policy hospitals) | Parallel trends |
All three methods converge on the same qualitative conclusion: Drug X reduces 30-day readmission by approximately 4-5 percentage points. The naive estimate of +3.3 percentage points was entirely driven by confounding.
The estimates differ in magnitude because they target different populations (ATE vs. LATE vs. ATT), but the confidence intervals overlap substantially. The convergence across three methods with different identifying assumptions provides strong evidence for the causal effect.
Dr. Patel's team presents the results to the regulatory inquiry panel with the following conclusion: "The observational data, analyzed using three independent causal estimation strategies, consistently show that Drug X reduces 30-day readmission by 3.7 to 4.6 percentage points. The naive analysis that triggered the inquiry was confounded by prescribing patterns that channel sicker patients toward Drug X. This is a textbook example of confounding bias reversing the sign of a beneficial treatment effect."