Case Study 2: StreamFlow Unusual Usage Patterns

Detecting Early Churn Indicators Through Anomaly Detection


Background

StreamFlow is a SaaS video streaming platform with 2.1 million subscribers. The data science team has built a strong supervised churn prediction model (Chapters 11-19), but the VP of Customer Success has a different request:

"The churn model tells me who is likely to leave. But by the time the model is confident, the customer is already mentally gone. I need earlier signals. I need to know when a customer's behavior changes --- when something unusual happens --- not when the model is 85% sure they will churn."

The churn model predicts based on current feature values: low engagement, payment failures, support tickets. The VP wants something upstream: detecting the behavioral shift itself. A customer who goes from watching 30 hours a month to 8 hours a month is exhibiting anomalous behavior, even if 8 hours is not yet low enough to trigger the churn model.

This is an anomaly detection problem. Instead of modeling "who will churn," we model "whose behavior is unusual for them."

The Data

import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

np.random.seed(42)
n = 50000

streamflow = pd.DataFrame({
    'account_id': range(1, n + 1),
    'monthly_hours_watched': np.random.exponential(18, n).round(1),
    'sessions_last_30d': np.random.poisson(14, n),
    'avg_session_minutes': np.random.exponential(28, n).round(1),
    'unique_titles_watched': np.random.poisson(8, n),
    'content_completion_rate': np.random.beta(3, 2, n).round(3),
    'binge_sessions_30d': np.random.poisson(2, n),
    'weekend_ratio': np.random.beta(2.5, 3, n).round(3),
    'peak_hour_ratio': np.random.beta(3, 2, n).round(3),
    'hours_change_pct': np.random.normal(0, 30, n).round(1),
    'sessions_change_pct': np.random.normal(0, 25, n).round(1),
    'months_active': np.random.randint(1, 60, n),
    'plan_price': np.random.choice(
        [9.99, 14.99, 19.99, 24.99], n, p=[0.35, 0.35, 0.20, 0.10]
    ),
    'devices_used': np.random.randint(1, 6, n),
    'profiles_active': np.random.randint(1, 5, n),
    'payment_failures_6m': np.random.poisson(0.3, n),
    'support_tickets_90d': np.random.poisson(0.8, n),
    'days_since_last_session': np.random.exponential(5, n).round(0).clip(0, 60),
    'recommendation_click_rate': np.random.beta(2, 8, n).round(3),
    'search_frequency_30d': np.random.poisson(6, n),
    'download_count_30d': np.random.poisson(3, n),
    'share_count_30d': np.random.poisson(1, n),
    'rating_count_30d': np.random.poisson(2, n),
    'free_trial_convert': np.random.binomial(1, 0.65, n),
    'referral_source': np.random.choice(
        [0, 1, 2, 3], n, p=[0.50, 0.25, 0.15, 0.10]
    ),
})

# Generate churn outcome (used for evaluation, not for training)
churn_logit = (
    -3.0
    + 0.08 * streamflow['days_since_last_session']
    - 0.02 * streamflow['monthly_hours_watched']
    - 0.04 * streamflow['sessions_last_30d']
    + 0.15 * streamflow['payment_failures_6m']
    + 0.10 * streamflow['support_tickets_90d']
    - 0.01 * streamflow['months_active']
    + 0.03 * np.abs(streamflow['hours_change_pct'])
)
churn_prob = 1 / (1 + np.exp(-churn_logit))
streamflow['churned'] = np.random.binomial(1, churn_prob)

print(f"Dataset: {len(streamflow)} accounts")
print(f"Churn rate: {streamflow['churned'].mean():.3f}")

Step 1: Define "Unusual" for StreamFlow

Before running any algorithm, clarify what kinds of anomalies matter. Not all unusual accounts are pre-churn. Some anomalies are positive (a customer who suddenly increases usage after discovering a new show) and some are operational (a bot account, a shared corporate login).

The Customer Success team identifies three anomaly archetypes they care about:

  1. Disengagement shift: Previously active accounts showing sudden drops in engagement (hours, sessions, completion rate).
  2. Payment distress: Accounts with unusual payment patterns (multiple failures, plan downgrades combined with reduced usage).
  3. Behavioral outliers: Accounts whose combination of features does not match any normal usage pattern (potential account sharing, bot activity, or data quality issues).

We will build a detection system for all three, then filter the output to focus on churn-relevant anomalies.
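Before any model runs, these archetypes can be sanity-checked with simple rules. The sketch below uses illustrative, untuned thresholds on toy data shaped like the StreamFlow columns; a rule baseline like this is also useful later, as a point of comparison for the Isolation Forest's flags.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000
toy = pd.DataFrame({
    'hours_change_pct': rng.normal(0, 30, n),
    'payment_failures_6m': rng.poisson(0.3, n),
    'monthly_hours_watched': rng.exponential(18, n),
})

# Archetype 1: disengagement shift -- a large drop in watch hours
disengaged = toy['hours_change_pct'] < -40

# Archetype 2: payment distress -- repeated failures plus reduced usage
distressed = (toy['payment_failures_6m'] >= 2) & (toy['hours_change_pct'] < 0)

# Archetype 3: behavioral outlier -- extreme usage vs. the population
z_hours = (toy['monthly_hours_watched'] - toy['monthly_hours_watched'].mean())
z_hours = z_hours / toy['monthly_hours_watched'].std()
outliers = z_hours.abs() > 3

print(f"Disengaged: {disengaged.sum()}, "
      f"Distressed: {distressed.sum()}, Outliers: {outliers.sum()}")
```

If the Isolation Forest's flags overlap heavily with one of these rules, the model is mostly rediscovering that rule; the value of the model is in the multivariate cases the rules miss.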


Step 2: Feature Selection and Engineering

The raw 24 features contain a mix of usage, behavioral, and account characteristics. For anomaly detection, we want features that capture the behavioral patterns relevant to our three archetypes.

# Engagement features: current usage levels
engagement_features = [
    'monthly_hours_watched', 'sessions_last_30d', 'avg_session_minutes',
    'unique_titles_watched', 'content_completion_rate', 'binge_sessions_30d',
]

# Change features: behavioral shifts
change_features = [
    'hours_change_pct', 'sessions_change_pct',
]

# Distress signals
distress_features = [
    'payment_failures_6m', 'support_tickets_90d', 'days_since_last_session',
]

# Interaction patterns
interaction_features = [
    'recommendation_click_rate', 'search_frequency_30d',
    'download_count_30d', 'share_count_30d', 'rating_count_30d',
]

# Combined feature set for anomaly detection
anomaly_features = (
    engagement_features + change_features + distress_features + interaction_features
)

print(f"Features for anomaly detection: {len(anomaly_features)}")
for f in anomaly_features:
    print(f"  {f}: mean={streamflow[f].mean():.2f}, std={streamflow[f].std():.2f}")

Step 3: Isolation Forest Detection

scaler = StandardScaler()
X = scaler.fit_transform(streamflow[anomaly_features])

iso = IsolationForest(
    n_estimators=200,
    max_samples=256,
    contamination=0.03,   # expect ~3% unusual accounts
    random_state=42,
)
streamflow['iso_anomaly'] = iso.fit_predict(X) == -1
streamflow['iso_score'] = -iso.decision_function(X)  # higher = more anomalous

n_flagged = streamflow['iso_anomaly'].sum()
churn_rate_flagged = streamflow.loc[streamflow['iso_anomaly'], 'churned'].mean()
churn_rate_normal = streamflow.loc[~streamflow['iso_anomaly'], 'churned'].mean()

print("Isolation Forest results:")
print(f"  Accounts flagged: {n_flagged} ({n_flagged/len(streamflow)*100:.1f}%)")
print(f"  Churn rate (flagged):  {churn_rate_flagged:.3f}")
print(f"  Churn rate (normal):   {churn_rate_normal:.3f}")
print(f"  Churn lift: {churn_rate_flagged / churn_rate_normal:.1f}x")
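The contamination=0.03 setting deserves a sensitivity check, since the flagged fraction tracks the parameter almost exactly. The sweep below runs on a toy matrix standing in for the scaled features; rerunning the churn-lift comparison at each setting would show how stable the lift is.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
X_toy = rng.normal(size=(2000, 5))  # stand-in for the scaled feature matrix

for contamination in [0.01, 0.03, 0.05]:
    iso_c = IsolationForest(n_estimators=100, contamination=contamination,
                            random_state=42)
    flagged = iso_c.fit_predict(X_toy) == -1
    print(f"contamination={contamination:.2f} -> "
          f"flagged fraction={flagged.mean():.3f}")
```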

Profiling Anomalous Accounts

What makes flagged accounts different? This is the information the Customer Success team needs.

# Compare flagged vs. normal accounts: mean values per feature
comparison = streamflow.groupby('iso_anomaly')[anomaly_features].mean().T
comparison.columns = ['Normal', 'Anomalous']
comparison['Ratio'] = comparison['Anomalous'] / comparison['Normal']
comparison['Abs_Diff_Std'] = (
    (comparison['Anomalous'] - comparison['Normal']) /
    streamflow[anomaly_features].std()
).abs()

# Sort by standardized difference
comparison = comparison.sort_values('Abs_Diff_Std', ascending=False)

print("\nAnomaly Profile (sorted by standardized difference):")
print(comparison.round(3).to_string())

Step 4: Anomaly Archetypes via Clustering

Not all anomalies are the same. Cluster the flagged accounts to identify distinct anomaly types.

from sklearn.cluster import KMeans

flagged = streamflow[streamflow['iso_anomaly']].copy()
X_flagged = scaler.transform(flagged[anomaly_features])

# Cluster the anomalous accounts into archetypes
kmeans = KMeans(n_clusters=4, random_state=42, n_init=10)
flagged['archetype'] = kmeans.fit_predict(X_flagged)

# Profile each archetype
print("Anomaly Archetypes:")
print("=" * 80)
for arch in sorted(flagged['archetype'].unique()):
    subset = flagged[flagged['archetype'] == arch]
    print(f"\nArchetype {arch}: {len(subset)} accounts "
          f"(churn rate: {subset['churned'].mean():.3f})")

    # Top distinguishing features (vs. normal population)
    arch_means = subset[anomaly_features].mean()
    pop_means = streamflow[anomaly_features].mean()
    pop_stds = streamflow[anomaly_features].std()
    z_diff = ((arch_means - pop_means) / pop_stds).abs().nlargest(5)

    print(f"  Top distinguishing features:")
    for feat, z in z_diff.items():
        direction = "above" if arch_means[feat] > pop_means[feat] else "below"
        print(f"    {feat}: {arch_means[feat]:.2f} "
              f"({z:.1f} std {direction} population mean)")
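The choice of n_clusters=4 above is an assumption. A silhouette sweep is one way to check it; the sketch below runs the sweep on toy blobs rather than the flagged StreamFlow accounts, but the same loop applies to X_flagged.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(42)
# Toy stand-in for flagged accounts: three well-separated groups
X_toy = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 4))
                   for c in (0.0, 5.0, 10.0)])

sil = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, random_state=42, n_init=10).fit_predict(X_toy)
    sil[k] = silhouette_score(X_toy, labels)
    print(f"k={k}: silhouette={sil[k]:.3f}")

best_k = max(sil, key=sil.get)
print(f"Best k by silhouette: {best_k}")
```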

Naming the Archetypes

# Generate a summary table for the Customer Success team
archetype_summary = flagged.groupby('archetype').agg(
    count=('account_id', 'count'),
    churn_rate=('churned', 'mean'),
    avg_hours=('monthly_hours_watched', 'mean'),
    avg_sessions=('sessions_last_30d', 'mean'),
    avg_hours_change=('hours_change_pct', 'mean'),
    avg_days_inactive=('days_since_last_session', 'mean'),
    avg_payment_failures=('payment_failures_6m', 'mean'),
    avg_support_tickets=('support_tickets_90d', 'mean'),
).round(3)

print("\nArchetype Summary:")
print(archetype_summary.to_string())

Practical Note --- The archetype labels (0, 1, 2, 3) are meaningless until a domain expert names them. The Customer Success team might label them: "Disengaging Power Users," "Payment-Distressed," "Ghost Accounts," and "Hyper-Active Outliers." The data science team provides the clusters and profiles; the business team provides the interpretation and action plan.


Step 5: Early Warning Value

The VP's original question was about early detection. Does flagging anomalous usage patterns provide advance warning of churn beyond the supervised model?

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Model 1: Supervised churn model (uses engagement features directly)
supervised_features = anomaly_features + ['months_active', 'plan_price']
X_supervised = StandardScaler().fit_transform(streamflow[supervised_features])
y = streamflow['churned']

lr_supervised = LogisticRegression(random_state=42, max_iter=1000)
auc_supervised = cross_val_score(
    lr_supervised, X_supervised, y, cv=5, scoring='roc_auc'
).mean()

# Model 2: Supervised model + anomaly score as additional feature
X_with_anomaly = np.column_stack([X_supervised, streamflow['iso_score'].values])
lr_with_anomaly = LogisticRegression(random_state=42, max_iter=1000)
auc_with_anomaly = cross_val_score(
    lr_with_anomaly, X_with_anomaly, y, cv=5, scoring='roc_auc'
).mean()

# Model 3: Anomaly score alone
X_score_only = streamflow[['iso_score']].values
lr_score_only = LogisticRegression(random_state=42, max_iter=1000)
auc_score_only = cross_val_score(
    lr_score_only, X_score_only, y, cv=5, scoring='roc_auc'
).mean()

print("Churn Prediction Comparison:")
print(f"  Supervised model (features):         AUC = {auc_supervised:.4f}")
print(f"  Supervised + anomaly score:          AUC = {auc_with_anomaly:.4f}")
print(f"  Anomaly score only:                  AUC = {auc_score_only:.4f}")
print(f"  Lift from adding anomaly score:      {auc_with_anomaly - auc_supervised:+.4f}")
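With churn a minority class, ROC AUC can look flattering; average precision (area under the precision-recall curve) is a useful companion metric. The sketch below compares both on a toy imbalanced problem; swapping in X_supervised and y from above gives the StreamFlow numbers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_toy = 5000
X_toy = rng.normal(size=(n_toy, 3))
# Toy imbalanced outcome driven by the first feature
p = 1 / (1 + np.exp(-(X_toy[:, 0] - 2.5)))
y_toy = rng.binomial(1, p)

lr = LogisticRegression(max_iter=1000)
auc = cross_val_score(lr, X_toy, y_toy, cv=5, scoring='roc_auc').mean()
ap = cross_val_score(lr, X_toy, y_toy, cv=5,
                     scoring='average_precision').mean()
print(f"Base rate: {y_toy.mean():.3f}  ROC AUC: {auc:.3f}  "
      f"Avg precision: {ap:.3f}")
```

ROC AUC answers "does the score rank churners above non-churners"; average precision answers "how pure is the top of the ranking," which is closer to how an alert feed is actually consumed.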

Time-Based Early Warning Analysis

The real test: among accounts that eventually churn, are the anomaly scores elevated before the churn event? Lacking longitudinal data in this snapshot, we approximate it by checking whether anomaly scores and flag rates rise with the severity of the engagement change among churned accounts.

# Among churned accounts: how does anomaly score relate to engagement change?
churned = streamflow[streamflow['churned'] == 1].copy()

# Bin by engagement change severity
churned['change_severity'] = pd.cut(
    churned['hours_change_pct'],
    bins=[-np.inf, -50, -25, -10, 10, np.inf],
    labels=['Severe drop', 'Moderate drop', 'Mild drop', 'Stable', 'Increasing']
)

severity_analysis = churned.groupby('change_severity', observed=False).agg(
    count=('account_id', 'count'),
    mean_anomaly_score=('iso_score', 'mean'),
    pct_flagged=('iso_anomaly', 'mean'),
).round(3)

print("\nAnomaly Detection by Engagement Change Severity (churned accounts):")
print(severity_analysis.to_string())

Step 6: The Alert System

The final deliverable is a daily alert feed for the Customer Success team.

def generate_daily_alerts(streamflow, iso_model, scaler, anomaly_features,
                          top_k=50):
    """
    Generate a daily alert DataFrame for the Customer Success team.

    Parameters
    ----------
    streamflow : DataFrame with account data
    iso_model : fitted IsolationForest
    scaler : fitted StandardScaler
    anomaly_features : list of feature names
    top_k : number of alerts to generate

    Returns
    -------
    DataFrame with alert details
    """
    X = scaler.transform(streamflow[anomaly_features])
    scores = -iso_model.decision_function(X)

    # Top k accounts by anomaly score
    top_idx = np.argsort(scores)[-top_k:][::-1]
    alerts = streamflow.iloc[top_idx].copy()
    alerts['anomaly_score'] = scores[top_idx]
    alerts['anomaly_rank'] = range(1, top_k + 1)

    # Identify top 3 contributing features per account
    pop_means = streamflow[anomaly_features].mean()
    pop_stds = streamflow[anomaly_features].std()

    top_features_list = []
    for idx in top_idx:
        account_values = streamflow.iloc[idx][anomaly_features]
        z_scores = ((account_values - pop_means) / pop_stds).abs()
        top_3 = z_scores.nlargest(3)
        top_features_list.append(
            '; '.join([f"{feat} (z={z:.1f})" for feat, z in top_3.items()])
        )

    alerts['top_anomalous_features'] = top_features_list

    # Recommended action based on anomaly profile
    actions = []
    for idx in top_idx:
        row = streamflow.iloc[idx]
        if row['payment_failures_6m'] >= 2:
            actions.append('Payment outreach: resolve billing issues')
        elif row['hours_change_pct'] < -40:
            actions.append('Re-engagement: personalized content recommendation')
        elif row['days_since_last_session'] > 20:
            actions.append('Win-back: send "we miss you" campaign')
        elif row['support_tickets_90d'] >= 3:
            actions.append('Support follow-up: check issue resolution')
        else:
            actions.append('Review: unusual pattern, investigate manually')

    alerts['recommended_action'] = actions

    # Select output columns
    output_cols = [
        'account_id', 'anomaly_rank', 'anomaly_score',
        'monthly_hours_watched', 'sessions_last_30d',
        'hours_change_pct', 'days_since_last_session',
        'payment_failures_6m', 'support_tickets_90d',
        'top_anomalous_features', 'recommended_action',
    ]

    return alerts[output_cols].reset_index(drop=True)

# Generate today's alerts
alerts = generate_daily_alerts(streamflow, iso, scaler, anomaly_features, top_k=50)

print("Daily Alert Feed (top 10 of 50):")
print(alerts.head(10).to_string(index=False))

# Summary statistics
print(f"\nAlert Summary:")
print(f"  Total alerts: {len(alerts)}")
print(f"  Mean anomaly score: {alerts['anomaly_score'].mean():.3f}")
print(f"  Action breakdown:")
print(alerts['recommended_action'].value_counts().to_string())
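One operational wrinkle the daily feed will hit immediately: the same accounts tend to top the ranking day after day. A cooldown filter keeps the feed fresh by suppressing accounts alerted within the last N days. The function and history-table column names below are hypothetical, not part of the case study code.

```python
import pandas as pd

def apply_cooldown(alerts, alert_history, cooldown_days=14, today=None):
    """Drop accounts already alerted within the cooldown window."""
    today = pd.Timestamp(today)
    cutoff = today - pd.Timedelta(days=cooldown_days)
    recent = set(
        alert_history.loc[alert_history['alert_date'] >= cutoff, 'account_id']
    )
    return alerts[~alerts['account_id'].isin(recent)].reset_index(drop=True)

# Toy feed: account 2 was alerted 5 days ago, account 3 was alerted 30 days ago
todays_alerts = pd.DataFrame({'account_id': [1, 2, 3],
                              'anomaly_score': [0.9, 0.8, 0.7]})
history = pd.DataFrame({
    'account_id': [2, 3],
    'alert_date': pd.to_datetime(['2024-06-10', '2024-05-16']),
})
fresh = apply_cooldown(todays_alerts, history, cooldown_days=14,
                       today='2024-06-15')
print(fresh['account_id'].tolist())  # account 2 is suppressed
```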

Step 7: Measuring Alert Effectiveness

After deploying the alert system, the Customer Success team needs to track whether acting on alerts actually reduces churn.

# Simulate alert effectiveness tracking
# In production, this would use actual intervention and outcome data

def simulate_alert_effectiveness(streamflow, alerts, intervention_effect=0.30):
    """
    Simulate the effect of intervening on alerted accounts.

    Parameters
    ----------
    intervention_effect : float
        Fraction of churns prevented by successful intervention
    """
    alerted_ids = set(alerts['account_id'])

    alerted = streamflow[streamflow['account_id'].isin(alerted_ids)]
    not_alerted = streamflow[~streamflow['account_id'].isin(alerted_ids)]

    baseline_churn_alerted = alerted['churned'].mean()
    baseline_churn_not_alerted = not_alerted['churned'].mean()

    # Simulated effect: intervention prevents some churns
    prevented = int(alerted['churned'].sum() * intervention_effect)
    projected_churn = alerted['churned'].sum() - prevented

    print("Alert Effectiveness Simulation:")
    print(f"  Alerted accounts:       {len(alerted)}")
    print(f"  Baseline churn (alerted):    {baseline_churn_alerted:.3f}")
    print(f"  Baseline churn (not alerted): {baseline_churn_not_alerted:.3f}")
    print(f"  Churns in alerted group:      {alerted['churned'].sum()}")
    print(f"  Churns prevented (at {intervention_effect:.0%} rate): {prevented}")
    print(f"  Projected churn (alerted):    {projected_churn}")

    # Revenue impact
    avg_monthly_revenue = streamflow['plan_price'].mean()
    monthly_revenue_saved = prevented * avg_monthly_revenue
    annual_revenue_saved = monthly_revenue_saved * 12
    print(f"\n  Average monthly revenue per account: ${avg_monthly_revenue:.2f}")
    print(f"  Monthly revenue saved: ${monthly_revenue_saved:,.2f}")
    print(f"  Projected annual revenue saved: ${annual_revenue_saved:,.2f}")

simulate_alert_effectiveness(streamflow, alerts, intervention_effect=0.30)
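The limitations that follow recommend tracking precision@50; a minimal implementation is straightforward. This sketch uses toy scores and outcomes; in production the outcomes would be observed churn (or "saved after outreach") for previously alerted accounts.

```python
import numpy as np

def precision_at_k(scores, outcomes, k=50):
    """Fraction of the top-k scored accounts with a positive outcome."""
    top_k_idx = np.argsort(scores)[-k:]
    return outcomes[top_k_idx].mean()

rng = np.random.default_rng(42)
toy_scores = rng.random(1000)
# Toy outcome loosely correlated with the score
toy_outcomes = rng.binomial(1, 0.05 + 0.3 * toy_scores)

p50 = precision_at_k(toy_scores, toy_outcomes, k=50)
print(f"precision@50 = {p50:.3f} vs. base rate {toy_outcomes.mean():.3f}")
```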

Results and Recommendations

What the Anomaly Detection System Provides

  1. Earlier signals than the churn model. The anomaly detector flags behavioral shifts --- sudden drops in engagement, unusual payment patterns, erratic usage --- before the account reaches the low-engagement state that triggers the supervised churn model.

  2. Actionable archetypes. Clustering anomalous accounts into archetypes enables targeted interventions: billing outreach for payment-distressed accounts, content recommendations for disengaging users, account security checks for behavioral outliers.

  3. A daily alert feed with explanations. Each alert includes the anomaly score, the top contributing features, and a recommended action. The Customer Success team does not need to interpret raw scores.

Limitations and Caution

  1. Anomalous is not the same as pre-churn. Some anomalous accounts are power users with unusual patterns, new accounts exploring the platform, or seasonal variation. The alert system will have false positives. Track precision@50 over time and adjust the threshold.

  2. The 3% contamination rate is an assumption. If the actual rate of concerning anomalies is 1%, at best one alert in three is real and the rest are noise. If it is 5%, even a perfect ranking misses two-fifths of them. Use feedback from the Customer Success team to calibrate.

  3. Feature engineering matters more than algorithm choice. The behavioral change features (hours_change_pct, sessions_change_pct) are the most predictive of churn-relevant anomalies. Investing in richer change features (week-over-week trends, rolling averages, deviation from personal baseline) would improve detection more than switching to a fancier algorithm.
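The "deviation from personal baseline" idea in point 3 can be sketched concretely. Assuming a weekly usage history table exists (the table and column names here are hypothetical), a per-account z-score turns "unusual for this customer" into a feature the detector can use:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical weekly history: one row per account per trailing week
history = pd.DataFrame({
    'account_id': np.repeat(np.arange(100), 12),
    'weekly_hours': rng.exponential(5, 100 * 12),
})

# Personal baseline: each account's own mean and spread
baseline = history.groupby('account_id')['weekly_hours'].agg(['mean', 'std'])

# Current week, scored against the account's own history
current = pd.DataFrame({
    'account_id': np.arange(100),
    'weekly_hours': rng.exponential(5, 100),
})
current = current.join(baseline, on='account_id')
current['personal_z'] = ((current['weekly_hours'] - current['mean'])
                         / current['std'])

print(current[['account_id', 'weekly_hours', 'personal_z']].head())
```

A personal_z of -2 means this week is two standard deviations below that customer's own norm, even if the absolute hours look normal for the population, which is exactly the 30-hours-to-8-hours case from the introduction.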

Integration with Existing Churn Model

The anomaly detection system complements the supervised churn model rather than replacing it. The recommended integration:

  • Churn model: Runs weekly, scores all accounts, feeds the retention marketing pipeline.
  • Anomaly detection: Runs daily, flags the top 50 behavioral shifts, feeds the Customer Success team for manual outreach.
  • Combined scoring: Add the anomaly score as a feature in the next churn model retrain. The anomaly score captures multivariate unusualness that individual features may miss.

This case study uses the StreamFlow SaaS churn dataset from the textbook's running example. Return to the chapter for algorithm details.