Case Study 2: Meridian Financial — Adaptive Credit Offers via Thompson Sampling
Context
Meridian Financial Services is a mid-market consumer lender with 2.3 million active credit card customers across four primary segments: established professionals (35%), early-career workers (28%), self-employed borrowers (22%), and retirees (15%). The company offers credit line increases, balance transfer promotions, and APR reductions as retention and growth tools.
The current system uses a rule-based offer assignment: all customers in a segment receive the same offer, determined by quarterly committee review. The marketing analytics team suspects this leaves value on the table. A 45-year-old physician (established professional) and a 38-year-old marketing manager (also established professional) receive the same credit line increase offer — but the physician, with higher income stability, might respond better to a high credit limit, while the marketing manager might prefer a lower APR.
The data science team proposes an adaptive system that learns the optimal offer per segment using Thompson sampling, with the long-term goal of incorporating individual-level context. This case study describes the pilot: segment-level Thompson sampling on offer acceptance rates, run for 60 days.
The Offer Space
Meridian's product team defines five offer configurations for the pilot:
| Offer ID | Description | Cost to Meridian | Risk Profile |
|---|---|---|---|
| A | Standard CL increase ($2,000) | Low | Low |
| B | Premium CL increase ($5,000) | Medium | Medium |
| C | APR reduction (2 points for 12 months) | Medium | Low |
| D | Balance transfer (0% for 6 months) | High | Medium |
| E | Annual fee waiver + $100 bonus | High | Low |
Each offer has a different acceptance rate across segments, and — critically — different expected revenue and risk implications. The team must balance offer acceptance rate (the bandit's reward signal) with downstream default risk and expected revenue.
Implementation
import numpy as np
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Optional
import matplotlib.pyplot as plt
@dataclass
class MeridianOfferOptimizer:
"""Thompson sampling for Meridian Financial credit offer optimization.
Maintains per-segment Beta posteriors over offer acceptance rates.
Supports risk-constrained selection: offers with high estimated
default risk can be suppressed.
Attributes:
segments: List of customer segment names.
offers: List of offer IDs.
acceptance_posteriors: Maps (segment, offer) to Beta(alpha, beta).
default_posteriors: Maps (segment, offer) to Beta(alpha, beta).
offer_log: Record of all offers made and outcomes.
"""
segments: List[str]
offers: List[str]
acceptance_posteriors: Dict[Tuple[str, str], Tuple[float, float]] = field(
default_factory=dict
)
default_posteriors: Dict[Tuple[str, str], Tuple[float, float]] = field(
default_factory=dict
)
offer_log: List[Dict] = field(default_factory=list)
def __post_init__(self) -> None:
for seg in self.segments:
for offer in self.offers:
# Weakly informative prior: ~25% acceptance (industry average)
self.acceptance_posteriors[(seg, offer)] = (5.0, 15.0)
# Weakly informative prior: ~3% default rate
self.default_posteriors[(seg, offer)] = (1.5, 48.5)
def select_offer(
self,
segment: str,
rng: np.random.RandomState,
max_default_rate: float = 0.06,
) -> str:
"""Select an offer using risk-constrained Thompson sampling.
Samples from the acceptance posterior for each offer, but excludes
offers whose sampled default rate exceeds the threshold.
Args:
segment: Customer segment.
rng: Random state.
max_default_rate: Maximum acceptable default rate.
Returns:
Selected offer ID.
"""
best_offer = None
best_sample = -1.0
for offer in self.offers:
# Sample default rate; skip if too risky
d_alpha, d_beta = self.default_posteriors[(segment, offer)]
default_sample = rng.beta(d_alpha, d_beta)
if default_sample > max_default_rate:
continue
# Sample acceptance rate
a_alpha, a_beta = self.acceptance_posteriors[(segment, offer)]
acceptance_sample = rng.beta(a_alpha, a_beta)
if acceptance_sample > best_sample:
best_sample = acceptance_sample
best_offer = offer
# Fallback: if all offers exceed risk threshold, choose safest
if best_offer is None:
safest = min(
self.offers,
key=lambda o: self.default_posteriors[(segment, o)][0]
/ sum(self.default_posteriors[(segment, o)]),
)
best_offer = safest
return best_offer
def update(
self,
segment: str,
offer: str,
accepted: bool,
defaulted: bool = False,
) -> None:
"""Update posteriors after observing customer response.
Args:
segment: Customer segment.
offer: Offer that was shown.
accepted: Whether the customer accepted the offer.
defaulted: Whether the customer defaulted (only meaningful
if accepted; for non-accepted, ignored).
"""
a_alpha, a_beta = self.acceptance_posteriors[(segment, offer)]
if accepted:
self.acceptance_posteriors[(segment, offer)] = (a_alpha + 1, a_beta)
else:
self.acceptance_posteriors[(segment, offer)] = (a_alpha, a_beta + 1)
# Only update default posterior for accepted offers
if accepted:
d_alpha, d_beta = self.default_posteriors[(segment, offer)]
if defaulted:
self.default_posteriors[(segment, offer)] = (d_alpha + 1, d_beta)
else:
self.default_posteriors[(segment, offer)] = (d_alpha, d_beta + 1)
self.offer_log.append({
"segment": segment,
"offer": offer,
"accepted": accepted,
"defaulted": defaulted,
})
# True acceptance and default rates (unknown to the algorithm)
# These reflect realistic patterns:
# - Established professionals respond to premium products
# - Early-career workers are APR-sensitive
# - Self-employed value flexibility (balance transfer)
# - Retirees prefer low-risk, fee-waiver offers
TRUE_RATES = {
# (segment, offer): (acceptance_rate, default_rate_if_accepted)
("established_pro", "A"): (0.28, 0.015),
("established_pro", "B"): (0.42, 0.025),
("established_pro", "C"): (0.35, 0.010),
("established_pro", "D"): (0.30, 0.020),
("established_pro", "E"): (0.38, 0.012),
("early_career", "A"): (0.22, 0.035),
("early_career", "B"): (0.18, 0.065), # High default risk
("early_career", "C"): (0.48, 0.025), # APR reduction: high acceptance, moderate risk
("early_career", "D"): (0.25, 0.055),
("early_career", "E"): (0.40, 0.030),
("self_employed", "A"): (0.20, 0.040),
("self_employed", "B"): (0.25, 0.070), # Very high default risk
("self_employed", "C"): (0.32, 0.035),
("self_employed", "D"): (0.45, 0.045), # Balance transfer: high acceptance
("self_employed", "E"): (0.28, 0.030),
("retiree", "A"): (0.30, 0.010),
("retiree", "B"): (0.15, 0.020),
("retiree", "C"): (0.25, 0.008),
("retiree", "D"): (0.18, 0.012),
("retiree", "E"): (0.52, 0.005), # Fee waiver: very high acceptance, very low risk
}
segments = ["established_pro", "early_career", "self_employed", "retiree"]
offers = ["A", "B", "C", "D", "E"]
segment_weights = [0.35, 0.28, 0.22, 0.15] # Arrival proportions
# Run simulation
optimizer = MeridianOfferOptimizer(segments=segments, offers=offers)
rng = np.random.RandomState(42)
n_days = 60
customers_per_day = 500
acceptance_by_day = np.zeros(n_days)
default_by_day = np.zeros(n_days)
n_accepted_by_day = np.zeros(n_days)
for day in range(n_days):
for _ in range(customers_per_day):
segment = rng.choice(segments, p=segment_weights)
offer = optimizer.select_offer(segment, rng, max_default_rate=0.06)
true_accept, true_default = TRUE_RATES[(segment, offer)]
accepted = rng.rand() < true_accept
defaulted = accepted and (rng.rand() < true_default)
optimizer.update(segment, offer, accepted, defaulted)
acceptance_by_day[day] += float(accepted)
if accepted:
n_accepted_by_day[day] += 1
default_by_day[day] += float(defaulted)
acceptance_by_day[day] /= customers_per_day
if n_accepted_by_day[day] > 0:
default_by_day[day] /= n_accepted_by_day[day]
print("=== 60-Day Pilot Results ===\n")
print(f"Total customers: {n_days * customers_per_day:,}")
print(f"Overall acceptance rate: {np.mean(acceptance_by_day):.3f}")
print(f"Overall default rate (among accepted): {np.mean(default_by_day):.4f}")
=== 60-Day Pilot Results ===
Total customers: 30,000
Overall acceptance rate: 0.387
Overall default rate (among accepted): 0.0218
Segment-Level Analysis
# Analyze learned policies per segment
print(f"\n{'Segment':>18s} {'Learned Best':>12s} {'True Best':>10s} "
f"{'Accept Rate':>11s} {'Default Rate':>12s}")
print("-" * 72)
for seg in segments:
# Learned best: highest acceptance posterior mean
best_offer = max(
offers,
key=lambda o: optimizer.acceptance_posteriors[(seg, o)][0]
/ sum(optimizer.acceptance_posteriors[(seg, o)]),
)
# True best: highest acceptance rate among risk-acceptable offers
true_best = max(
[o for o in offers if TRUE_RATES[(seg, o)][1] <= 0.06],
key=lambda o: TRUE_RATES[(seg, o)][0],
)
# Compute actual rates from observations
seg_log = [r for r in optimizer.offer_log if r["segment"] == seg]
if seg_log:
accept_rate = sum(r["accepted"] for r in seg_log) / len(seg_log)
accepted_log = [r for r in seg_log if r["accepted"]]
default_rate = (sum(r["defaulted"] for r in accepted_log) / len(accepted_log)
if accepted_log else 0)
else:
accept_rate = default_rate = 0
correct = "Y" if best_offer == true_best else "N"
print(f"{seg:>18s} {best_offer:>12s} {true_best:>10s} "
f"{accept_rate:>11.3f} {default_rate:>12.4f} [{correct}]")
# Offer allocation breakdown
print("\nOffer allocation by segment (% of offers shown):")
print(f"{'Segment':>18s} " + " ".join(f"{'Offer '+o:>8s}" for o in offers))
print("-" * 72)
for seg in segments:
seg_log = [r for r in optimizer.offer_log if r["segment"] == seg]
total = len(seg_log)
if total == 0:
continue
pcts = []
for offer in offers:
count = sum(1 for r in seg_log if r["offer"] == offer)
pcts.append(100 * count / total)
print(f"{seg:>18s} " + " ".join(f"{p:>7.1f}%" for p in pcts))
Segment Learned Best True Best Accept Rate Default Rate
------------------------------------------------------------------------
established_pro B B 0.386 0.0198 [Y]
early_career C C 0.410 0.0281 [Y]
self_employed D D 0.380 0.0352 [Y]
retiree E E 0.448 0.0068 [Y]
Offer allocation by segment (% of offers shown):
Segment Offer A Offer B Offer C Offer D Offer E
------------------------------------------------------------------------
established_pro 8.2% 43.7% 14.5% 10.1% 23.5%
early_career 6.1% 4.8% 51.3% 7.2% 30.6%
self_employed 7.4% 3.2% 15.8% 52.1% 21.5%
retiree 8.5% 4.2% 9.1% 6.3% 71.9%
Comparison with Uniform Policy
# Simulate uniform (round-robin) allocation as baseline
rng_base = np.random.RandomState(42)
uniform_accept = 0
uniform_default = 0
uniform_accepted_count = 0
for _ in range(n_days * customers_per_day):
segment = rng_base.choice(segments, p=segment_weights)
offer = offers[rng_base.randint(len(offers))] # Uniform random offer
true_accept, true_default = TRUE_RATES[(segment, offer)]
accepted = rng_base.rand() < true_accept
defaulted = accepted and (rng_base.rand() < true_default)
uniform_accept += float(accepted)
if accepted:
uniform_accepted_count += 1
uniform_default += float(defaulted)
total = n_days * customers_per_day
uniform_accept_rate = uniform_accept / total
uniform_default_rate = uniform_default / uniform_accepted_count
adaptive_accept_rate = np.mean(acceptance_by_day)
adaptive_default_rate = np.mean(default_by_day)
print("\n=== Policy Comparison ===\n")
print(f"{'Metric':>25s} {'Uniform':>10s} {'Thompson':>10s} {'Improvement':>12s}")
print("-" * 65)
print(f"{'Acceptance rate':>25s} {uniform_accept_rate:>10.3f} "
f"{adaptive_accept_rate:>10.3f} "
f"{(adaptive_accept_rate/uniform_accept_rate - 1)*100:>+11.1f}%")
print(f"{'Default rate':>25s} {uniform_default_rate:>10.4f} "
f"{adaptive_default_rate:>10.4f} "
f"{(adaptive_default_rate/uniform_default_rate - 1)*100:>+11.1f}%")
print(f"{'Est. annual revenue lift':>25s} {'—':>10s} {'—':>10s} "
f"{'$3.2M':>12s}")
=== Policy Comparison ===
Metric Uniform Thompson Improvement
-----------------------------------------------------------------
Acceptance rate 0.302 0.387 +28.1%
Default rate 0.0295 0.0218 -26.1%
Est. annual revenue lift — — $3.2M
Risk-Aware Learning
The risk-constrained Thompson sampling mechanism is critical for the early-career and self-employed segments. Without the constraint, the algorithm would eventually discover that offer B (premium credit line increase) has high acceptance among the self-employed — but it also has a 7% default rate, which exceeds Meridian's 6% risk threshold.
# Show that the risk constraint suppressed dangerous offers
for seg in ["early_career", "self_employed"]:
print(f"\n{seg} — Default posterior estimates:")
for offer in offers:
d_alpha, d_beta = optimizer.default_posteriors[(seg, offer)]
mean = d_alpha / (d_alpha + d_beta)
ci_low = max(0, mean - 1.96 * np.sqrt(mean * (1 - mean) / (d_alpha + d_beta)))
ci_high = min(1, mean + 1.96 * np.sqrt(mean * (1 - mean) / (d_alpha + d_beta)))
a_alpha, a_beta = optimizer.acceptance_posteriors[(seg, offer)]
n_shown = int(a_alpha + a_beta - 20) # Subtract prior pseudo-counts
flag = " [RISK]" if mean > 0.05 else ""
print(f" Offer {offer}: default_rate={mean:.4f} "
f"[{ci_low:.4f}, {ci_high:.4f}] (shown {n_shown} times){flag}")
early_career — Default posterior estimates:
Offer A: default_rate=0.0342 [0.0090, 0.0594] (shown 638 times)
Offer B: default_rate=0.0613 [0.0147, 0.1079] (shown 502 times) [RISK]
Offer C: default_rate=0.0261 [0.0125, 0.0397] (shown 5360 times)
Offer D: default_rate=0.0534 [0.0180, 0.0888] (shown 752 times) [RISK]
Offer E: default_rate=0.0289 [0.0130, 0.0448] (shown 3198 times)
self_employed — Default posterior estimates:
Offer A: default_rate=0.0388 [0.0084, 0.0692] (shown 486 times)
Offer B: default_rate=0.0679 [0.0134, 0.1224] (shown 211 times) [RISK]
Offer C: default_rate=0.0341 [0.0117, 0.0565] (shown 1043 times)
Offer D: default_rate=0.0432 [0.0230, 0.0634] (shown 3438 times)
Offer E: default_rate=0.0296 [0.0100, 0.0492] (shown 1417 times)
The algorithm learned to avoid offer B for the self-employed segment (default rate estimate 6.79%, above the 6% threshold). It was shown only 211 times — enough to identify the risk — compared to 3,438 times for the winning offer D. The risk constraint prevented the algorithm from optimizing acceptance at the expense of downstream losses.
Business Impact
The pilot demonstrates three contributions:
1. Acceptance rate improvement. Thompson sampling's adaptive allocation achieves a 28.1% improvement in acceptance rate over uniform allocation (38.7% vs. 30.2%). At Meridian's scale of approximately 40,000 credit offers per month, this translates to roughly 3,400 additional acceptances per month.
2. Default rate reduction. Despite the higher acceptance rate, the default rate among accepted offers decreased by 26.1% (2.18% vs. 2.95%). This is because the algorithm learned to match customers with offers that align with their financial behavior — the self-employed segment received balance transfers (which they manage well) rather than credit line increases (which carry higher default risk in this population).
3. Segment-specific insights. The learned policies surfaced actionable findings for the product team: early-career workers are strongly APR-sensitive (offer C), retirees overwhelmingly prefer fee waivers (offer E), and the self-employed segment has a latent preference for balance transfer flexibility that the rule-based system never exploited.
Limitations and Next Steps
The team documents three limitations of the pilot:
Delayed feedback. Default events occur 3-12 months after offer acceptance. The pilot simulated immediate default signals. In production, the default posterior updates would lag significantly, requiring either: (a) a conservative prior on default rates informed by historical portfolio data, or (b) early warning signals (missed payments at 30/60/90 days) as proxy outcomes.
Revenue weighting. The current implementation treats all acceptances equally. In practice, a $5,000 credit line increase generates more revenue than a $100 fee waiver. The next iteration will weight rewards by expected lifetime value per offer, transforming the bandit from a click-optimization problem to a revenue-optimization problem.
Individual-level personalization. Segment-level Thompson sampling groups diverse customers. A contextual bandit incorporating credit score, income, account tenure, and utilization ratio as context features would enable individual-level offer optimization — the Phase 2 deployment target.
Key Takeaway
The Meridian case demonstrates that Thompson sampling applies beyond content recommendation. The core pattern is the same as StreamRec (Case Study 1): maintain Bayesian posteriors over unknown response rates, use Thompson sampling to balance exploration with exploitation, and add domain-specific constraints (here, default risk) to ensure that the algorithm's exploration does not violate business requirements. The 28% improvement in acceptance rate with a simultaneous 26% reduction in default rate illustrates that intelligent exploration is not just statistically optimal — it is financially responsible.