Case Study 1: MediCore Multi-Method Analysis — IPW, IV, and DiD Applied to Drug X Efficacy

Context

MediCore Pharmaceuticals is under pressure. Drug X — the novel antihypertensive approved 18 months ago — shows strong real-world prescription rates (180,000 patients across 340 hospitals), but the pharmacovigilance team has flagged a concerning pattern: the naive analysis of EHR data shows Drug X recipients have higher 30-day readmission rates than non-recipients. A regulatory inquiry is imminent.

The causal inference team, led by Dr. Ananya Patel, knows from Chapters 15-17 that the naive comparison is confounded. Sicker patients receive Drug X more often, and they have higher readmission rates regardless of treatment. But the team needs more than theoretical arguments — they need rigorous estimates that will survive regulatory scrutiny.

Dr. Patel's strategy: apply three independent causal estimation methods, each relying on different identifying assumptions. If all three converge, the evidence is strong. If they diverge, the team must understand why.

Method 1: Inverse Probability Weighting (IPW)

Identification Strategy

The EHR contains rich patient-level data: demographics (age, sex, race), clinical characteristics (severity score, comorbidity index, creatinine, ejection fraction, blood pressure), utilization history (prior admissions, ED visits, length of stay), and prescribing patterns (concurrent medications). The team argues that these variables capture the physician's prescribing decision process.

Assumption: Conditional on the 14 measured covariates, Drug X prescription is as-good-as-random (conditional ignorability).

Analysis

import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression


def medicore_ipw_full(seed: int = 42) -> dict:
    """MediCore IPW analysis with rich covariate set.

    Simulates 50,000 patients with 8 measured confounders and
    applies IPW with trimming and diagnostics.
    """
    rng = np.random.RandomState(seed)
    n = 50000

    # Patient covariates
    age = rng.normal(65, 12, n).clip(30, 95)
    severity = rng.exponential(1.0, n)
    comorbidities = rng.poisson(2.5, n).clip(0, 10)
    creatinine = rng.lognormal(0.3, 0.4, n).clip(0.5, 6.0)
    ejection_fraction = rng.normal(55, 12, n).clip(15, 75)
    prior_admissions = rng.poisson(1.2, n)
    systolic_bp = rng.normal(140, 20, n).clip(90, 220)
    n_medications = rng.poisson(4, n).clip(0, 15)

    X = np.column_stack([
        age, severity, comorbidities, creatinine,
        ejection_fraction, prior_admissions, systolic_bp, n_medications,
    ])
    cov_names = [
        "age", "severity", "comorbidities", "creatinine",
        "ejection_fraction", "prior_admissions", "systolic_bp", "n_medications",
    ]

    # Treatment (confounded)
    logit_t = (
        -1.5 + 0.02 * (age - 65) + 0.8 * severity + 0.3 * comorbidities
        + 0.5 * creatinine - 0.02 * ejection_fraction + 0.2 * prior_admissions
        + 0.01 * (systolic_bp - 140) + 0.05 * n_medications
        + rng.normal(0, 0.5, n)
    )
    treatment = rng.binomial(1, 1 / (1 + np.exp(-logit_t)))

    # Potential outcomes (true effect: ~5 pp reduction)
    logit_y0 = (
        -2.0 + 0.01 * (age - 65) + 0.6 * severity + 0.2 * comorbidities
        + 0.4 * creatinine - 0.01 * ejection_fraction
        + 0.15 * prior_admissions + rng.normal(0, 0.3, n)
    )
    y0 = rng.binomial(1, 1 / (1 + np.exp(-logit_y0)))
    y1 = rng.binomial(1, 1 / (1 + np.exp(-logit_y0 - 0.5)))
    y_obs = treatment * y1 + (1 - treatment) * y0

    # Propensity score estimation
    ps_model = LogisticRegression(C=1.0, max_iter=1000, solver="lbfgs")
    ps_model.fit(X, treatment)
    ps = ps_model.predict_proba(X)[:, 1]

    # Trim and compute IPW
    ps_trimmed = np.clip(ps, 0.02, 0.98)
    w1 = treatment / ps_trimmed
    w0 = (1 - treatment) / (1 - ps_trimmed)
    mu1 = np.sum(w1 * y_obs) / np.sum(w1)
    mu0 = np.sum(w0 * y_obs) / np.sum(w0)
    ate_ipw = mu1 - mu0

    # Standard error via influence function
    scores = (
        treatment * (y_obs - mu1) / ps_trimmed
        - (1 - treatment) * (y_obs - mu0) / (1 - ps_trimmed)
    )
    se_ipw = np.sqrt(np.mean(scores ** 2) / n)

    true_ate = (y1 - y0).mean()
    naive = y_obs[treatment == 1].mean() - y_obs[treatment == 0].mean()

    return {
        "true_ate": float(true_ate),
        "naive": float(naive),
        "ipw_ate": float(ate_ipw),
        "ipw_se": float(se_ipw),
        "ess_treated": float(np.sum(w1) ** 2 / np.sum(w1 ** 2)),
        "ess_control": float(np.sum(w0) ** 2 / np.sum(w0 ** 2)),
    }


result_ipw = medicore_ipw_full()
print("Method 1: IPW")
print(f"  True ATE:  {result_ipw['true_ate']:.4f}")
print(f"  Naive:     {result_ipw['naive']:.4f}")
print(f"  IPW ATE:   {result_ipw['ipw_ate']:.4f} (SE: {result_ipw['ipw_se']:.4f})")
print(f"  ESS ratio: treated={result_ipw['ess_treated']/31550:.2f}, "
      f"control={result_ipw['ess_control']/18450:.2f}")
Method 1: IPW
  True ATE:  -0.0482
  Naive:     0.0328
  IPW ATE:   -0.0461 (SE: 0.0058)
  ESS ratio: treated=0.79, control=0.63

The IPW estimate reverses the sign of the naive analysis, confirming that Drug X reduces readmission. The effective sample size ratios are acceptable (both above 0.5), indicating that the weights are not dominated by extreme values.

Limitation acknowledged: If physician judgment captures information about patient prognosis beyond the 8 measured covariates, conditional ignorability fails and the IPW estimate is biased.

Method 2: Instrumental Variables (Hospital Distance)

Identification Strategy

Not all prescribing variation is driven by patient characteristics. Some variation is geographic: patients at certain hospitals receive Drug X at much higher rates due to institutional formulary preferences, physician habits, and supply chain factors. The team uses distance to the nearest high-Drug-X-prescribing hospital as an instrument.

Assumptions: (1) Distance affects Drug X receipt (relevance — testable). (2) Distance does not directly affect readmission (exclusion restriction — arguable). (3) Distance is uncorrelated with unmeasured patient health factors (independence — arguable after controlling for measured covariates).

Analysis

def medicore_iv_analysis(seed: int = 42) -> dict:
    """MediCore IV analysis using hospital distance."""
    rng = np.random.RandomState(seed)
    n = 50000

    age = rng.normal(65, 12, n).clip(30, 95)
    severity = rng.exponential(1.0, n)
    comorbidities = rng.poisson(2.5, n).clip(0, 10)

    # Unmeasured confounder: physician's private prognosis assessment
    physician_prognosis = 0.5 * severity + rng.normal(0, 1, n)

    # Instrument: distance to nearest Drug-X hospital
    distance = rng.exponential(15, n).clip(1, 100)

    logit_t = (
        0.5 - 0.03 * distance + 0.4 * severity + 0.3 * comorbidities
        + 0.6 * physician_prognosis + rng.normal(0, 0.5, n)
    )
    treatment = rng.binomial(1, 1 / (1 + np.exp(-logit_t)))

    logit_y0 = (
        -1.5 + 0.01 * (age - 65) + 0.5 * severity + 0.15 * comorbidities
        + 0.4 * physician_prognosis + rng.normal(0, 0.3, n)
    )
    y0 = rng.binomial(1, 1 / (1 + np.exp(-logit_y0)))
    y1 = rng.binomial(1, 1 / (1 + np.exp(-logit_y0 - 0.4)))
    y_obs = treatment * y1 + (1 - treatment) * y0

    # Stage 1
    controls = np.column_stack([age, severity, comorbidities])
    X_s1 = sm.add_constant(np.column_stack([distance, controls]))
    s1 = sm.OLS(treatment, X_s1).fit()
    d_hat = s1.fittedvalues
    f_stat = s1.f_test("x1 = 0").fvalue

    # Stage 2
    X_s2 = sm.add_constant(np.column_stack([d_hat, controls]))
    s2 = sm.OLS(y_obs, X_s2).fit(cov_type="HC1")

    true_ate = (y1 - y0).mean()

    return {
        "true_ate": float(true_ate),
        "iv_estimate": float(s2.params[1]),
        "iv_se": float(s2.bse[1]),
        "first_stage_f": float(f_stat),
        "first_stage_coef": float(s1.params[1]),
    }


result_iv = medicore_iv_analysis()
print("\nMethod 2: Instrumental Variables (Distance)")
print(f"  True ATE:        {result_iv['true_ate']:.4f}")
print(f"  IV estimate:     {result_iv['iv_estimate']:.4f} (SE: {result_iv['iv_se']:.4f})")
print(f"  First-stage F:   {result_iv['first_stage_f']:.1f}")
print(f"  Distance coef:   {result_iv['first_stage_coef']:.5f}")
Method 2: Instrumental Variables (Distance)
  True ATE:        -0.0398
  IV estimate:     -0.0371 (SE: 0.0081)
  First-stage F:   289.7
  Distance coef:   -0.00312

The IV estimate is $-0.037$ (3.7 percentage point reduction), consistent with IPW. The first-stage F-statistic of 290 rules out weak instrument concerns. Each additional mile from the hospital decreases the probability of Drug X receipt by 0.31 percentage points.

Limitation acknowledged: The IV estimate is LATE, not ATE — it applies to compliers (patients whose treatment depends on hospital proximity). The exclusion restriction cannot be tested; the team documents the argument and conducts sensitivity analysis.

Method 3: Difference-in-Differences (Policy Change)

Identification Strategy

Six months into real-world use, MediCore's hospital partner network (100 hospitals) mandated Drug X as the default discharge prescription for heart failure patients. The remaining 240 hospitals maintained physician discretion. The policy created a sharp before-after contrast within the policy hospitals, with non-policy hospitals serving as controls.

Assumption: Absent the mandate, readmission trends at policy hospitals would have paralleled those at non-policy hospitals (parallel trends).

Analysis

def medicore_did_analysis(seed: int = 42) -> dict:
    """MediCore DiD around the Drug X mandate policy."""
    rng = np.random.RandomState(seed)

    n_hospitals = 200
    n_periods = 12
    policy_time = 6

    rows = []
    for h in range(n_hospitals):
        is_policy = h < 100  # First 100 hospitals adopted policy
        hosp_fe = rng.normal(0, 0.02)

        for t in range(n_periods):
            time_trend = -0.002 * t  # Common declining trend
            post = int(t >= policy_time)
            treat_effect = -0.042 * post * is_policy  # True effect: 4.2 pp
            noise = rng.normal(0, 0.008)
            rate = 0.175 + hosp_fe + time_trend + treat_effect + noise
            rows.append({
                "hospital_id": h, "period": t, "policy": int(is_policy),
                "post": post, "event_time": t - policy_time,
                "readmission_rate": np.clip(rate, 0, 1),
            })

    df = pd.DataFrame(rows)

    # DiD regression with clustered SEs
    df["policy_x_post"] = df["policy"] * df["post"]
    X_did = sm.add_constant(df[["policy", "post", "policy_x_post"]].values)
    model = sm.OLS(
        df["readmission_rate"].values, X_did
    ).fit(cov_type="cluster", cov_kwds={"groups": df["hospital_id"].values})

    # Event study (pre-trend check)
    pre_coefs = []
    for k in [-3, -2, -1]:
        pre_data = df[df["event_time"] == k]
        diff = (
            pre_data.loc[pre_data["policy"] == 1, "readmission_rate"].mean()
            - pre_data.loc[pre_data["policy"] == 0, "readmission_rate"].mean()
        )
        pre_coefs.append({"event_time": k, "diff": float(diff)})

    return {
        "did_estimate": float(model.params[3]),
        "did_se": float(model.bse[3]),
        "did_ci_lower": float(model.conf_int()[3, 0]),
        "did_ci_upper": float(model.conf_int()[3, 1]),
        "pre_trend_diffs": pre_coefs,
    }


result_did = medicore_did_analysis()
print("\nMethod 3: Difference-in-Differences (Policy Mandate)")
print(f"  DiD estimate:    {result_did['did_estimate']:.4f} "
      f"(SE: {result_did['did_se']:.4f})")
print(f"  95% CI:          [{result_did['did_ci_lower']:.4f}, "
      f"{result_did['did_ci_upper']:.4f}]")
print("  Pre-trend gaps (should be stable):")
for p in result_did["pre_trend_diffs"]:
    print(f"    k={p['event_time']}: {p['diff']:+.4f}")
Method 3: Difference-in-Differences (Policy Mandate)
  DiD estimate:    -0.0419 (SE: 0.0021)
  95% CI:          [-0.0461, -0.0378]
  Pre-trend gaps (should be stable):
    k=-3: +0.0012
    k=-2: -0.0008
    k=-1: +0.0005

The DiD estimate is $-0.042$ (4.2 percentage point reduction). Pre-trend differences are small and stable (all below 0.2 percentage points), supporting the parallel trends assumption.

Convergence and Conclusions

Method Estimate 95% CI Estimand Key Assumption
IPW $-0.046$ $[-0.057, -0.035]$ ATE Conditional ignorability
IV (distance) $-0.037$ $[-0.053, -0.021]$ LATE (compliers) Exclusion restriction
DiD (policy) $-0.042$ $[-0.046, -0.038]$ ATT (policy hospitals) Parallel trends

All three methods converge on the same qualitative conclusion: Drug X reduces 30-day readmission by approximately 4-5 percentage points. The naive estimate of +3.3 percentage points was entirely driven by confounding.

The estimates differ in magnitude because they target different populations (ATE vs. LATE vs. ATT), but the confidence intervals overlap substantially. The convergence across three methods with different identifying assumptions provides strong evidence for the causal effect.

Dr. Patel's team presents the results to the regulatory inquiry panel with the following conclusion: "The observational data, analyzed using three independent causal estimation strategies, consistently show that Drug X reduces 30-day readmission by 3.7 to 4.6 percentage points. The naive analysis that triggered the inquiry was confounded by prescribing patterns that channel sicker patients toward Drug X. This is a textbook example of confounding bias reversing the sign of a beneficial treatment effect."