Chapter 19: Model Interpretation

SHAP Values, Partial Dependence, Permutation Importance, and Explaining Your Model to Humans


Learning Objectives

By the end of this chapter, you will be able to:

  1. Compute and interpret SHAP values for global and local explanations
  2. Create and interpret partial dependence plots (PDP) and ICE plots
  3. Apply permutation importance as a model-agnostic feature importance method
  4. Build LIME explanations for individual predictions
  5. Communicate model behavior to non-technical stakeholders

A Model You Cannot Explain Is a Model You Cannot Trust

Core Principle --- A model you cannot explain is a model you cannot trust --- and a model your stakeholders will not use. I have watched teams build excellent gradient boosting models with AUC above 0.92 and then fail to deploy them because no one in the business could answer the simplest question: "Why did the model predict this?" I have seen a hospital reject a readmission model --- not because it was inaccurate, but because the clinicians could not understand its reasoning. I have reviewed production churn models where the VP of Product asked "What drives churn?" and the data scientist said "The model has 24 features and they all contribute" and the meeting ended with no action taken.

Here is the uncomfortable reality: a model's value is not determined by its cross-validation score. It is determined by whether a human being makes a better decision because of it. And humans do not trust black boxes. They trust explanations.

This chapter teaches you to open the black box. Not by replacing complex models with simple ones --- that trades accuracy for interpretability, and you should not have to choose. Instead, you will learn post-hoc interpretation methods that explain any model's behavior, from logistic regression to 500-tree gradient boosting ensembles:

| Method | Scope | Model-Agnostic? | Question It Answers |
|---|---|---|---|
| SHAP values | Global + Local | Yes (and model-specific variants) | "How much did each feature contribute to this prediction?" |
| Partial Dependence Plots | Global | Yes | "How does the predicted outcome change as one feature varies?" |
| ICE plots | Local (many) | Yes | "Does the relationship vary across individual observations?" |
| Permutation Importance | Global | Yes | "If I scramble this feature, how much does performance drop?" |
| LIME | Local | Yes | "What is the local linear approximation around this prediction?" |

By the end of this chapter, you will be able to generate every one of these explanations, know which to use when, and --- critically --- translate them into language that a product manager, a clinician, or an executive can act on.


Part 1: Why Interpretation Matters More Than You Think

The Interpretation Tax

Every model in production pays an interpretation tax. Someone --- a product manager, a loan officer, a doctor, a regulator --- will ask why the model made a particular prediction. If you cannot answer that question, one of three things happens:

  1. The model is not deployed. The team spent months building it, but the business does not trust it.
  2. The model is deployed and ignored. It produces predictions, but no one acts on them because no one understands them.
  3. The model is deployed and blindly trusted. This is the worst outcome. Errors go undetected because no one can inspect the reasoning.

The interpretation methods in this chapter are not academic luxuries. They are deployment requirements.

Interpretability vs. Explainability

These terms are used loosely in the literature, but a useful distinction exists:

  • Interpretability: The degree to which a human can understand the model's internal mechanism. A linear regression with 3 features is inherently interpretable --- you can read the coefficients.
  • Explainability: The degree to which a human can understand why the model made a specific prediction, using post-hoc methods. A 500-tree XGBoost model is not inherently interpretable, but it is explainable via SHAP.

This chapter focuses on explainability --- methods that work regardless of model complexity.

Global vs. Local Interpretation

Two fundamentally different questions:

| | Global | Local |
|---|---|---|
| Question | "How does the model behave overall?" | "Why did the model make this prediction?" |
| Audience | Data scientists, leadership, regulators | End users, case reviewers, clinicians |
| Methods | Feature importance, PDP, SHAP summary | SHAP waterfall, LIME, ICE |
| Use case | Model validation, feature prioritization | Decision support, audit, appeal |

Most practitioners need both. A product manager needs global interpretation to understand the model's logic. A customer success representative needs local interpretation to explain why a specific customer was flagged.


Part 2: SHAP Values --- The Foundation of Modern Model Interpretation

The Game Theory Origin

SHAP (SHapley Additive exPlanations) is built on Shapley values, a concept from cooperative game theory introduced by Lloyd Shapley in 1953. The core idea is elegant:

Imagine a prediction is a game. The "players" are the features. The "payout" is the difference between the model's prediction for this observation and the average prediction across all observations. The Shapley value for each feature answers: "What is this feature's fair contribution to the payout, considering all possible coalitions of features?"

More precisely, the Shapley value for feature $j$ is the weighted average of the feature's marginal contribution across all possible subsets of the other features. If there are $p$ features, this requires evaluating $2^p$ subsets --- exponential in the number of features.

Mathematical Foundation --- For a model $f$, feature $j$, and feature set $S$ that does not include $j$, the Shapley value is:

$\phi_j = \sum_{S \subseteq N \setminus \{j\}} \frac{|S|!(|N|-|S|-1)!}{|N|!} [f(S \cup \{j\}) - f(S)]$

where $N$ is the set of all features. This is computationally intractable for large feature sets, which is why SHAP uses optimized algorithms (TreeSHAP, KernelSHAP) to approximate or compute these values efficiently.
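To make the formula concrete, here is a brute-force computation for a toy three-feature model. This sketch uses one simple convention for "absent" features (replace them with a baseline value); SHAP's algorithms handle absence more carefully by averaging over background data:

```python
from itertools import combinations
from math import factorial

import numpy as np

# Toy "model" of three features (hypothetical, chosen so the answer is
# easy to check by hand)
def f(x):
    return 2.0 * x[0] + 3.0 * x[1] + x[1] * x[2]

baseline = np.array([0.0, 0.0, 0.0])   # stand-in for the average observation
x = np.array([1.0, 1.0, 1.0])          # the observation to explain
p = len(x)

def coalition_value(S):
    """f evaluated with features in S taken from x, the rest from baseline."""
    z = baseline.copy()
    for j in S:
        z[j] = x[j]
    return f(z)

def shapley(j):
    others = [k for k in range(p) if k != j]
    phi = 0.0
    for size in range(p):                        # all 2^(p-1) subsets of others
        for S in combinations(others, size):
            w = factorial(len(S)) * factorial(p - len(S) - 1) / factorial(p)
            phi += w * (coalition_value(set(S) | {j}) - coalition_value(S))
    return phi

phis = [shapley(j) for j in range(p)]
print([round(v, 2) for v in phis])        # [2.0, 3.5, 0.5]
print(sum(phis), f(x) - f(baseline))      # additivity: both equal 6.0
```

Note how the interaction term $x_1 x_2$ has its credit split evenly (0.5 each) between features 1 and 2 --- this "fair division" of interaction effects is exactly what the Shapley weighting accomplishes.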

Why SHAP, Not Just Feature Importance?

You already know model.feature_importances_ from Chapter 13. Why do you need SHAP?

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

np.random.seed(42)
n = 50000

# StreamFlow churn dataset (consistent with Chapters 11-18)
streamflow = pd.DataFrame({
    'monthly_hours_watched': np.random.exponential(18, n).round(1),
    'sessions_last_30d': np.random.poisson(14, n),
    'avg_session_minutes': np.random.exponential(28, n).round(1),
    'unique_titles_watched': np.random.poisson(8, n),
    'content_completion_rate': np.random.beta(3, 2, n).round(3),
    'binge_sessions_30d': np.random.poisson(2, n),
    'weekend_ratio': np.random.beta(2.5, 3, n).round(3),
    'peak_hour_ratio': np.random.beta(3, 2, n).round(3),
    'hours_change_pct': np.random.normal(0, 30, n).round(1),
    'sessions_change_pct': np.random.normal(0, 25, n).round(1),
    'months_active': np.random.randint(1, 60, n),
    'plan_price': np.random.choice(
        [9.99, 14.99, 19.99, 24.99], n, p=[0.35, 0.35, 0.20, 0.10]
    ),
    'devices_used': np.random.randint(1, 6, n),
    'profiles_active': np.random.randint(1, 5, n),
    'payment_failures_6m': np.random.poisson(0.3, n),
    'support_tickets_90d': np.random.poisson(0.8, n),
    'days_since_last_session': np.random.exponential(5, n).round(0).clip(0, 60),
    'recommendation_click_rate': np.random.beta(2, 8, n).round(3),
    'search_frequency_30d': np.random.poisson(6, n),
    'download_count_30d': np.random.poisson(3, n),
    'share_count_30d': np.random.poisson(1, n),
    'rating_count_30d': np.random.poisson(2, n),
    'free_trial_convert': np.random.binomial(1, 0.65, n),
    'referral_source': np.random.choice(
        [0, 1, 2, 3], n, p=[0.50, 0.25, 0.15, 0.10]
    ),
})

churn_logit = (
    -3.0
    + 0.08 * streamflow['days_since_last_session']
    - 0.02 * streamflow['monthly_hours_watched']
    - 0.04 * streamflow['sessions_last_30d']
    + 0.15 * streamflow['payment_failures_6m']
    + 0.10 * streamflow['support_tickets_90d']
    - 0.03 * streamflow['content_completion_rate'] * 10
    + 0.05 * (streamflow['hours_change_pct'] < -30).astype(int)
    - 0.01 * streamflow['months_active']
    + 0.08 * (streamflow['plan_price'] > 19.99).astype(int)
    - 0.02 * streamflow['unique_titles_watched']
    + np.random.normal(0, 0.3, n)
)
churn_prob = 1 / (1 + np.exp(-churn_logit))
streamflow['churned'] = np.random.binomial(1, churn_prob)

X = streamflow.drop(columns=['churned'])
y = streamflow['churned']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = XGBClassifier(
    n_estimators=500, learning_rate=0.05, max_depth=5,
    subsample=0.8, colsample_bytree=0.8, min_child_weight=3,
    eval_metric='logloss', random_state=42, n_jobs=-1
)
model.fit(X_train, y_train)

# Built-in feature importance (gain-based)
importance_df = pd.DataFrame({
    'feature': X.columns,
    'importance': model.feature_importances_
}).sort_values('importance', ascending=False)

print("XGBoost built-in feature importance (gain):")
print(importance_df.head(10).to_string(index=False))
XGBoost built-in feature importance (gain):
                  feature  importance
  days_since_last_session       0.189
     monthly_hours_watched       0.098
       sessions_last_30d       0.087
     payment_failures_6m       0.076
    support_tickets_90d       0.068
  content_completion_rate       0.062
       hours_change_pct       0.055
           months_active       0.051
  unique_titles_watched       0.048
              plan_price       0.041

This tells you which features the model uses most, but it does not tell you:

  • Direction: Does higher days_since_last_session increase or decrease churn probability?
  • Magnitude per prediction: How much did days_since_last_session = 45 push this customer's prediction higher?
  • Interactions: Does the effect of plan_price depend on months_active?

SHAP answers all three.

Computing SHAP Values with TreeSHAP

TreeSHAP is an exact, polynomial-time algorithm for computing SHAP values for tree-based models. For an ensemble, it runs in O(TLD^2) time, where T is the number of trees, L is the maximum number of leaves per tree, and D is the maximum depth. This is dramatically faster than the exponential brute-force Shapley calculation.

import shap

# TreeSHAP: exact SHAP values for tree-based models
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

print(f"SHAP values shape: {shap_values.shape}")
print(f"X_test shape: {X_test.shape}")
print(f"Expected value (base rate): {explainer.expected_value:.4f}")
SHAP values shape: (10000, 24)
X_test shape: (10000, 24)
Expected value (base rate): -2.1847

Each row of shap_values corresponds to one observation. Each column corresponds to one feature. The values are in the model's output space (log-odds for XGBoost binary classification). They sum to the difference between the model's prediction for that observation and the expected value:

# Verify the fundamental SHAP property: values sum to prediction - expected_value
idx = 0
raw_prediction = model.predict(X_test.iloc[[idx]], output_margin=True)[0]
shap_sum = shap_values[idx].sum() + explainer.expected_value

print(f"Raw model prediction (log-odds): {raw_prediction:.4f}")
print(f"SHAP sum + expected value:       {shap_sum:.4f}")
print(f"Difference:                       {abs(raw_prediction - shap_sum):.6f}")
Raw model prediction (log-odds): -1.7234
SHAP sum + expected value:       -1.7234
Difference:                       0.000001

This additivity is the defining property of SHAP and what makes it so powerful. The prediction decomposes exactly into feature contributions.

The SHAP Summary Plot: Global Interpretation

The summary plot is the single most useful SHAP visualization. It shows, for every observation in the dataset, every feature's SHAP value, colored by the feature's actual value.

# Global summary: which features matter most, and in what direction?
shap.summary_plot(shap_values, X_test, plot_type="dot", show=False)

How to read this plot:

  • Vertical axis: Features ranked by mean absolute SHAP value (most important at top).
  • Horizontal axis: SHAP value (contribution to prediction in log-odds). Positive pushes toward churn, negative pushes away from churn.
  • Color: Red = high feature value, blue = low feature value.

For days_since_last_session:

  • Red dots (high values, meaning many days since last session) cluster on the right (positive SHAP, pushing toward churn).
  • Blue dots (low values, meaning recent activity) cluster on the left (negative SHAP, pushing away from churn).
  • Interpretation: customers who have not logged in recently are more likely to churn. This is both intuitive and actionable.

For monthly_hours_watched:

  • The pattern is reversed: red (high hours) clusters on the left (less churn), blue (low hours) clusters on the right (more churn).
  • Interpretation: engaged customers churn less.

Practitioner Note --- The summary plot replaces the old-style bar chart of feature importances. It contains strictly more information: not just which features matter, but the direction of their effect and the distribution of their impact across observations. If you present a single visualization to your team, make it this one.

The SHAP Bar Plot: Mean Absolute Impact

When the summary dot plot is too busy for a slide, the bar plot provides a clean ranking:

# Clean bar chart for presentations
shap.summary_plot(shap_values, X_test, plot_type="bar", show=False)

This shows mean(|SHAP value|) for each feature --- how much each feature moves predictions on average, regardless of direction. It answers: "Which features does the model rely on most?"
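The ranking behind the bar plot is nothing more than the column-wise mean of absolute SHAP values, which you can compute directly. A minimal sketch with a stand-in SHAP matrix (in the chapter, `shap_values` from the TreeExplainer plays this role):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# Stand-in SHAP matrix: rows = observations, columns = features
feature_names = ['days_since_last_session', 'monthly_hours_watched', 'plan_price']
shap_matrix = rng.normal(0.0, [0.4, 0.2, 0.05], size=(1000, 3))

# mean(|SHAP|) per feature -- exactly the quantity the bar plot ranks by
mean_abs = pd.Series(np.abs(shap_matrix).mean(axis=0), index=feature_names)
print(mean_abs.sort_values(ascending=False))
```

Having the raw ranking as a Series is handy when you want to export it to a report rather than a plot.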

The SHAP Waterfall Plot: Local Interpretation

The waterfall plot is for individual predictions. It answers the most common stakeholder question: "Why did the model predict this for this customer?"

# Local explanation: why was this customer flagged as high-risk?
high_risk_idx = np.where(
    model.predict_proba(X_test)[:, 1] > 0.5
)[0][0]

print(f"Customer index: {high_risk_idx}")
print(f"Predicted churn probability: "
      f"{model.predict_proba(X_test.iloc[[high_risk_idx]])[:, 1][0]:.3f}")
print(f"\nFeature values:")
print(X_test.iloc[high_risk_idx].to_string())

# Waterfall plot for this customer
shap.plots.waterfall(
    shap.Explanation(
        values=shap_values[high_risk_idx],
        base_values=explainer.expected_value,
        data=X_test.iloc[high_risk_idx].values,
        feature_names=X_test.columns.tolist()
    ),
    show=False
)

How to read the waterfall:

  • Starting point (left): The expected value --- the average model output across all training data.
  • Red bars: Features pushing the prediction higher (toward churn).
  • Blue bars: Features pushing the prediction lower (away from churn).
  • Final value (right): The model's actual prediction for this customer.

For this high-risk customer, you might see:

  • days_since_last_session = 38 pushes the prediction up by 0.12 log-odds (they have not logged in for over a month).
  • payment_failures_6m = 3 pushes it up by 0.08 (three payment failures in six months).
  • support_tickets_90d = 4 pushes it up by 0.05 (four support tickets --- they are frustrated).
  • monthly_hours_watched = 2.3 pushes it up by 0.04 (barely watching anything).

This is the explanation you give to the customer success team: "The model flagged this customer because they have not logged in for 38 days, have had 3 payment failures, filed 4 support tickets, and are watching only 2.3 hours per month."
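Because these contributions live in log-odds space, translating a SHAP explanation into probabilities goes through the sigmoid. A quick sketch with stand-in numbers (hypothetical, not the exact values from the waterfall above):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Stand-in values in log-odds (hypothetical)
expected_value = -2.18                      # average model output (base value)
shap_contribs = {
    'days_since_last_session': 0.48,
    'payment_failures_6m': 0.21,
    'monthly_hours_watched': 0.16,
}

logit = expected_value + sum(shap_contribs.values())
print(f"base probability:      {sigmoid(expected_value):.3f}")
print(f"predicted probability: {sigmoid(logit):.3f}")

# A single feature's probability impact depends on where you sit on the
# sigmoid: log-odds contributions add exactly, probability contributions do not
delta = (sigmoid(expected_value + shap_contribs['days_since_last_session'])
         - sigmoid(expected_value))
print(f"days_since_last_session alone shifts probability by ~{delta:.3f}")
```

This is why the same +0.5 log-odds contribution moves a customer near the decision boundary much more, in probability terms, than one deep in the low-risk region.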

The SHAP Dependence Plot: Feature Interactions

The dependence plot shows the relationship between a feature's value and its SHAP value, revealing non-linearities and interactions:

# How does days_since_last_session affect churn predictions?
shap.dependence_plot(
    "days_since_last_session", shap_values, X_test,
    interaction_index="payment_failures_6m", show=False
)

This reveals:

  • The relationship between days_since_last_session and its SHAP value is not linear. Below 5 days, the SHAP value is consistently negative (protective). Between 5 and 15 days, it transitions. Above 20 days, it is strongly positive (risk factor).
  • The color (interaction with payment_failures_6m) shows that for the same number of inactive days, customers with payment failures have higher SHAP values. The features interact: inactivity plus payment problems is worse than either alone.

SHAP Force Plot: Another Local View

The force plot is an alternative to the waterfall that shows the same information in a horizontal layout:

# Force plot for a single prediction
shap.force_plot(
    explainer.expected_value,
    shap_values[high_risk_idx],
    X_test.iloc[high_risk_idx],
    matplotlib=True, show=False
)

Red features push the prediction higher; blue features push it lower. The width of each segment corresponds to the magnitude of the SHAP value. This format works well in dashboards because it is compact.

Putting SHAP Together: A Complete Interpretation Workflow

# Complete SHAP workflow for the StreamFlow churn model
import shap
import matplotlib.pyplot as plt

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# 1. Global: Which features matter most?
fig, axes = plt.subplots(1, 2, figsize=(20, 8))
plt.sca(axes[0])
shap.summary_plot(shap_values, X_test, plot_type="bar", show=False)
axes[0].set_title("Feature Importance (Mean |SHAP|)")

plt.sca(axes[1])
shap.summary_plot(shap_values, X_test, plot_type="dot", show=False)
axes[1].set_title("SHAP Summary")
plt.tight_layout()
plt.savefig("shap_global.png", dpi=150, bbox_inches='tight')
plt.show()

# 2. Local: Why this specific customer?
customer_idx = high_risk_idx
shap.plots.waterfall(
    shap.Explanation(
        values=shap_values[customer_idx],
        base_values=explainer.expected_value,
        data=X_test.iloc[customer_idx].values,
        feature_names=X_test.columns.tolist()
    ),
    show=False
)
plt.savefig("shap_waterfall_customer.png", dpi=150, bbox_inches='tight')
plt.show()

# 3. Interaction: How do top features interact?
shap.dependence_plot(
    "days_since_last_session", shap_values, X_test,
    interaction_index="auto", show=False
)
plt.savefig("shap_dependence.png", dpi=150, bbox_inches='tight')
plt.show()

# 4. Top 3 reasons per customer (for the customer success team)
def top_shap_reasons(shap_vals, feature_names, top_n=3):
    """Extract the top N features driving a prediction."""
    abs_vals = np.abs(shap_vals)
    top_indices = np.argsort(abs_vals)[::-1][:top_n]
    reasons = []
    for idx in top_indices:
        direction = "increases" if shap_vals[idx] > 0 else "decreases"
        reasons.append({
            'feature': feature_names[idx],
            'shap_value': round(shap_vals[idx], 4),
            'direction': direction
        })
    return reasons

# Show top 3 reasons for 5 high-risk customers
high_risk_mask = model.predict_proba(X_test)[:, 1] > 0.4
high_risk_indices = np.where(high_risk_mask)[0][:5]

for i, idx in enumerate(high_risk_indices):
    prob = model.predict_proba(X_test.iloc[[idx]])[:, 1][0]
    reasons = top_shap_reasons(
        shap_values[idx], X_test.columns.tolist()
    )
    print(f"\nCustomer {i+1} (churn probability: {prob:.3f}):")
    for r in reasons:
        print(f"  - {r['feature']}: SHAP = {r['shap_value']:+.4f} "
              f"({r['direction']} churn risk)")
Customer 1 (churn probability: 0.612):
  - days_since_last_session: SHAP = +0.4821 (increases churn risk)
  - payment_failures_6m: SHAP = +0.2134 (increases churn risk)
  - monthly_hours_watched: SHAP = +0.1567 (increases churn risk)

Customer 2 (churn probability: 0.534):
  - days_since_last_session: SHAP = +0.3912 (increases churn risk)
  - support_tickets_90d: SHAP = +0.1876 (increases churn risk)
  - sessions_last_30d: SHAP = +0.1243 (increases churn risk)

Customer 3 (churn probability: 0.487):
  - payment_failures_6m: SHAP = +0.3456 (increases churn risk)
  - days_since_last_session: SHAP = +0.2789 (increases churn risk)
  - content_completion_rate: SHAP = +0.0934 (increases churn risk)

Customer 4 (churn probability: 0.521):
  - days_since_last_session: SHAP = +0.4102 (increases churn risk)
  - hours_change_pct: SHAP = +0.1567 (increases churn risk)
  - support_tickets_90d: SHAP = +0.1098 (increases churn risk)

Customer 5 (churn probability: 0.478):
  - support_tickets_90d: SHAP = +0.2987 (increases churn risk)
  - payment_failures_6m: SHAP = +0.2145 (increases churn risk)
  - days_since_last_session: SHAP = +0.1876 (increases churn risk)

This is what a customer success team can use. Not "the model's AUC is 0.94" but "this customer is at risk because they have not logged in for 38 days, had 3 payment failures, and filed 4 support tickets."


Part 3: Partial Dependence Plots and ICE Plots

Partial Dependence Plots (PDP): The Average Marginal Effect

A partial dependence plot answers: "How does the model's average prediction change as one feature varies, holding all other features at their observed values?"

The computation is straightforward:

  1. Choose a feature (e.g., months_active).
  2. Create a grid of values for that feature (e.g., 1, 5, 10, 15, ..., 60).
  3. For each grid value, replace the feature's value for every observation in the dataset, keep all other features at their actual values, and compute the model's average prediction.
  4. Plot the grid values vs. the average predictions.
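The four steps translate almost line-for-line into code. A minimal from-scratch sketch, using a toy predict function in place of the fitted model:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# Stand-in data (in the chapter, X_test plays this role)
X = pd.DataFrame({
    'months_active': rng.integers(1, 60, 500),
    'plan_price': rng.choice([9.99, 14.99, 19.99], 500),
})

def predict(df):
    """Toy churn score in place of model.predict_proba: tenure is
    protective, a pricier plan adds risk."""
    logit = -0.05 * df['months_active'] + 0.08 * df['plan_price'] - 1.0
    return 1 / (1 + np.exp(-logit))

def partial_dependence_1d(predict_fn, X, feature, grid):
    """Steps 2-4: for each grid value, overwrite the feature for every
    row and average the predictions."""
    avg_preds = []
    for value in grid:
        X_mod = X.copy()
        X_mod[feature] = value           # step 3: replace everywhere
        avg_preds.append(predict_fn(X_mod).mean())
    return np.array(avg_preds)

grid = np.arange(1, 61, 5)               # step 2: grid for months_active
pdp = partial_dependence_1d(predict, X, 'months_active', grid)
print(np.round(pdp, 3))                  # step 4: grid vs average prediction
```

In practice you would use scikit-learn's implementation, as shown next, but the mechanism is exactly this loop.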

from sklearn.inspection import PartialDependenceDisplay

fig, axes = plt.subplots(2, 3, figsize=(18, 10))

# PDP for the 6 most important features
features_to_plot = [
    'days_since_last_session', 'monthly_hours_watched',
    'sessions_last_30d', 'payment_failures_6m',
    'support_tickets_90d', 'months_active'
]

for i, feature in enumerate(features_to_plot):
    ax = axes[i // 3, i % 3]
    PartialDependenceDisplay.from_estimator(
        model, X_test, [feature], ax=ax,
        kind='average', random_state=42
    )
    ax.set_title(f"PDP: {feature}")

plt.tight_layout()
plt.savefig("pdp_top_features.png", dpi=150, bbox_inches='tight')
plt.show()

What the PDPs reveal:

  • days_since_last_session: Churn probability is flat below 5 days, then rises steeply between 5 and 25 days, then flattens above 30 days (a ceiling effect --- once a customer has been gone for a month, additional days do not add much risk).
  • monthly_hours_watched: Churn probability drops sharply from 0 to 10 hours, then decreases more gradually. The biggest bang for the buck is getting customers to watch something.
  • months_active: Longer tenure is protective, but the effect diminishes after about 24 months. Long-term subscribers are sticky.
  • payment_failures_6m: Each additional payment failure adds roughly the same increment of churn risk. The relationship is approximately linear.

Practitioner Note --- PDPs show the average relationship, which can be misleading if the effect varies across subgroups. A flat PDP does not necessarily mean the feature does not matter --- it might push predictions up for half the population and down for the other half, averaging to zero. Always check ICE plots alongside PDPs.

Individual Conditional Expectation (ICE) Plots: The Disaggregated View

An ICE plot is a PDP disaggregated to the individual level. Instead of averaging across all observations, it shows the prediction curve for each observation separately.

fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# PDP + ICE for days_since_last_session
PartialDependenceDisplay.from_estimator(
    model, X_test.sample(300, random_state=42),
    ['days_since_last_session'], ax=axes[0],
    kind='both', random_state=42
)
axes[0].set_title("PDP + ICE: days_since_last_session")

# PDP + ICE for monthly_hours_watched
PartialDependenceDisplay.from_estimator(
    model, X_test.sample(300, random_state=42),
    ['monthly_hours_watched'], ax=axes[1],
    kind='both', random_state=42
)
axes[1].set_title("PDP + ICE: monthly_hours_watched")

plt.tight_layout()
plt.savefig("ice_plots.png", dpi=150, bbox_inches='tight')
plt.show()

In the ICE plot, each thin line represents one customer's prediction curve. The thick line is the PDP (the average). If the thin lines are roughly parallel, the feature's effect is consistent across customers. If they cross or diverge, the effect is heterogeneous --- the feature matters differently for different customers.

For days_since_last_session, you might observe that the ICE lines are roughly parallel, meaning the effect of inactivity is consistent: more inactive days means more churn risk, for essentially everyone.

For monthly_hours_watched, the lines might diverge at low values: for some customers, going from 0 to 5 hours has a dramatic protective effect; for others (perhaps those with high payment failures), the effect is muted. This is a feature interaction that the PDP alone would hide.
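One way to quantify the heterogeneity a PDP hides is to center each ICE curve at its first grid point (a "centered ICE" curve) and measure the spread of the endpoints. A sketch on a small synthetic model with a built-in interaction:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(42)
n = 1000
X = rng.normal(size=(n, 3))
# Built-in interaction: feature 0 is protective when feature 1 is positive,
# harmful when it is negative -- the average (PDP) washes this out
y = ((X[:, 0] * np.sign(X[:, 1]) + rng.normal(0, 0.5, n)) > 0).astype(int)

model = GradientBoostingClassifier(random_state=42).fit(X, y)

# kind='individual' returns one curve per row instead of the average
res = partial_dependence(model, X[:200], features=[0], kind='individual')
ice = res['individual'][0]           # shape: (n_curves, n_grid_points)
centered = ice - ice[:, [0]]         # c-ICE: anchor every curve at its start

# A large spread at the far end of the grid signals a heterogeneous effect
print(f"std of centered-ICE endpoints: {centered[:, -1].std():.3f}")
print(f"PDP endpoint (the average):    {centered[:, -1].mean():.3f}")
```

Here the endpoint standard deviation is large while the average is near zero: individual curves rise or fall depending on feature 1, and the PDP alone would suggest feature 0 barely matters.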

Two-Feature PDP: Interaction Surfaces

You can also create PDPs for pairs of features to reveal interactions:

fig, ax = plt.subplots(figsize=(10, 8))

PartialDependenceDisplay.from_estimator(
    model, X_test,
    [('days_since_last_session', 'payment_failures_6m')],
    ax=ax, random_state=42
)
ax.set_title("2D PDP: Inactivity x Payment Failures")
plt.savefig("pdp_2d_interaction.png", dpi=150, bbox_inches='tight')
plt.show()

This reveals the interaction surface. If the contour lines are parallel, the features do not interact. If the lines bend or converge, there is an interaction. For inactivity and payment failures, you would expect the contours to show that the combination is worse than either feature alone --- a multiplicative rather than additive effect.

PDP Limitations

PDPs have two important limitations:

  1. Assumption of feature independence. PDPs vary one feature while holding others fixed, which can create impossible or unlikely feature combinations. If monthly_hours_watched and sessions_last_30d are highly correlated, a PDP that sets hours to 100 while sessions remain at 2 is evaluating an unrealistic scenario.

  2. Averaging hides heterogeneity. As discussed above, the average curve can mask opposing effects. Always use ICE plots alongside PDPs.


Part 4: Permutation Importance

The Intuition

Permutation importance answers: "If I randomly shuffle this feature's values across observations --- destroying the relationship between the feature and the target --- how much does the model's performance drop?"

A large drop means the model relies heavily on that feature. A small drop means the feature is dispensable.
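The mechanism is a single shuffle-and-rescore loop. A from-scratch sketch on a small synthetic problem (scikit-learn's `permutation_importance`, used below, adds repeats and parallelism):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 3))                       # column 0 carries the signal
y = (X[:, 0] + rng.normal(0, 0.5, n) > 0).astype(int)

model = LogisticRegression().fit(X, y)
baseline_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])

drops = []
for j in range(X.shape[1]):
    X_perm = X.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the feature-target link
    perm_auc = roc_auc_score(y, model.predict_proba(X_perm)[:, 1])
    drops.append(baseline_auc - perm_auc)         # importance = performance drop

print(f"baseline AUC: {baseline_auc:.3f}")
print("AUC drop per feature:", [round(d, 4) for d in drops])
```

Shuffling the signal column collapses the AUC toward 0.5; shuffling the noise columns barely moves it.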

from sklearn.inspection import permutation_importance
from sklearn.metrics import roc_auc_score

# Compute permutation importance on the test set
perm_result = permutation_importance(
    model, X_test, y_test,
    n_repeats=30, random_state=42,
    scoring='roc_auc', n_jobs=-1
)

perm_df = pd.DataFrame({
    'feature': X_test.columns,
    'importance_mean': perm_result.importances_mean,
    'importance_std': perm_result.importances_std,
}).sort_values('importance_mean', ascending=False)

print("Permutation Importance (AUC drop when feature is shuffled):")
print(perm_df.head(10).to_string(index=False))
Permutation Importance (AUC drop when feature is shuffled):
                  feature  importance_mean  importance_std
  days_since_last_session           0.0412          0.0034
     monthly_hours_watched           0.0198          0.0021
       sessions_last_30d           0.0167          0.0019
     payment_failures_6m           0.0154          0.0025
    support_tickets_90d           0.0098          0.0018
  content_completion_rate           0.0076          0.0015
       hours_change_pct           0.0054          0.0012
           months_active           0.0048          0.0011
  unique_titles_watched           0.0039          0.0009
              plan_price           0.0031          0.0008

Permutation Importance vs. Built-In Feature Importance vs. SHAP

| Property | Built-In (Gain) | Permutation | SHAP |
|---|---|---|---|
| What it measures | How much the feature improves splits | How much performance drops when the feature is shuffled | Marginal contribution to each prediction |
| Computation | Free (during training) | Requires re-evaluation (n_repeats x n_features) | Requires model-specific or kernel-based computation |
| Direction information? | No | No | Yes |
| Local explanations? | No | No | Yes |
| Affected by correlated features? | Yes (splits between correlated features) | Yes (shuffling one correlated feature may not hurt if the other remains) | Distributes credit among correlated features |
| Model-agnostic? | No (tree-based only) | Yes | Mostly (KernelSHAP is fully agnostic) |

Practitioner Note --- Permutation importance is most useful as a sanity check. If a feature has high built-in importance but zero permutation importance, the model may be relying on a spurious correlation or the feature may be fully redundant with another. If a feature has zero built-in importance but high permutation importance, something is wrong with your understanding of the model.

The Correlated-Features Problem

Permutation importance has a known weakness with correlated features. If monthly_hours_watched and sessions_last_30d are highly correlated, shuffling one does not destroy much information because the other still carries the same signal. Both features may show low permutation importance even though the underlying signal is critical.

# Demonstrate the correlated features problem
from sklearn.inspection import permutation_importance

# Create a perfectly correlated copy
X_test_with_copy = X_test.copy()
X_test_with_copy['hours_copy'] = X_test_with_copy['monthly_hours_watched']

# Retrain with the duplicate feature
X_train_with_copy = X_train.copy()
X_train_with_copy['hours_copy'] = X_train_with_copy['monthly_hours_watched']

model_dup = XGBClassifier(
    n_estimators=500, learning_rate=0.05, max_depth=5,
    subsample=0.8, colsample_bytree=0.8, min_child_weight=3,
    eval_metric='logloss', random_state=42, n_jobs=-1
)
model_dup.fit(X_train_with_copy, y_train)

perm_dup = permutation_importance(
    model_dup, X_test_with_copy, y_test,
    n_repeats=30, random_state=42,
    scoring='roc_auc', n_jobs=-1
)

original_idx = list(X_test_with_copy.columns).index('monthly_hours_watched')
copy_idx = list(X_test_with_copy.columns).index('hours_copy')

print("Permutation importance with duplicate feature:")
print(f"  monthly_hours_watched: {perm_dup.importances_mean[original_idx]:.4f}")
print(f"  hours_copy:            {perm_dup.importances_mean[copy_idx]:.4f}")
print(f"\n  Both are lower than the original "
      f"({perm_result.importances_mean[list(X_test.columns).index('monthly_hours_watched')]:.4f})")
print("  because each feature 'covers for' the other when shuffled.")
Permutation importance with duplicate feature:
  monthly_hours_watched: 0.0067
  hours_copy:            0.0061

  Both are lower than the original (0.0198)
  because each feature 'covers for' the other when shuffled.

The workaround is to either (a) remove correlated features before computing permutation importance, or (b) use grouped permutation importance, where correlated features are shuffled together.
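Grouped permutation importance is a small extension of the single-feature loop: apply the same row permutation to every column in the group so their shared signal is destroyed together. A sketch with a synthetic duplicated feature (the function `grouped_permutation_drop` is my naming, not a library API):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 2000
hours = rng.exponential(18, n)
X = pd.DataFrame({
    'monthly_hours_watched': hours,
    'hours_copy': hours,                  # perfectly correlated duplicate
    'noise': rng.normal(size=n),
})
y = (hours + rng.normal(0, 10, n) > 18).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
base_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])

def grouped_permutation_drop(group):
    """Shuffle every column in `group` with the SAME row permutation."""
    X_perm = X.copy()
    perm = rng.permutation(len(X))
    X_perm[group] = X[group].values[perm]
    return base_auc - roc_auc_score(y, model.predict_proba(X_perm)[:, 1])

single = grouped_permutation_drop(['monthly_hours_watched'])
grouped = grouped_permutation_drop(['monthly_hours_watched', 'hours_copy'])
print(f"shuffle one of the pair:   AUC drop = {single:.4f}")
print(f"shuffle the pair together: AUC drop = {grouped:.4f}")
```

Shuffling only one of the pair leaves the duplicate to cover for it; shuffling both together reveals the group's true importance.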


Part 5: LIME --- Local Interpretable Model-Agnostic Explanations

The Idea

LIME (Ribeiro, Singh, and Guestrin, 2016) takes a fundamentally different approach from SHAP. Instead of computing exact marginal contributions, LIME approximates the model's behavior in the neighborhood of a single prediction with a simple, interpretable model (typically a linear regression or decision tree).

The process:

  1. Perturb: Generate many perturbed versions of the observation by slightly altering feature values.
  2. Predict: Get the complex model's prediction for each perturbed version.
  3. Weight: Weight each perturbed version by its proximity to the original observation (closer perturbations get more weight).
  4. Fit: Fit a simple linear model to the weighted perturbation-prediction pairs.
  5. Interpret: The linear model's coefficients are the LIME explanation.
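To make the recipe concrete, here is a from-scratch sketch of steps 1-5 using only numpy and scikit-learn. The black-box function, perturbation scale, and kernel width are illustrative choices, not what the lime library actually uses internally:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# A stand-in "black box": nonlinear in both inputs
def black_box(X):
    return 1 / (1 + np.exp(-(np.sin(3 * X[:, 0]) + X[:, 1] ** 2)))

x0 = np.array([0.4, -0.2])                      # observation to explain

# 1. Perturb: sample points in the neighborhood of x0
Z = x0 + rng.normal(scale=0.2, size=(5000, 2))

# 2. Predict: query the complex model on every perturbation
preds = black_box(Z)

# 3. Weight: Gaussian proximity kernel (closer points count more)
dists = np.linalg.norm(Z - x0, axis=1)
weights = np.exp(-(dists ** 2) / (2 * 0.2 ** 2))

# 4. Fit: weighted linear surrogate, centered on x0
surrogate = Ridge(alpha=1.0).fit(Z - x0, preds, sample_weight=weights)

# 5. Interpret: coefficients are the local explanation
print("local effects:", surrogate.coef_)
```

Near x0, sin(3x) is increasing while x² slopes downward (the second input is negative), so the surrogate's first coefficient comes out positive and the second negative: a linear readout of the model's local behavior, which is exactly what a LIME explanation is.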
import lime
import lime.lime_tabular

# Create a LIME explainer
lime_explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=X_train.values,
    feature_names=X_train.columns.tolist(),
    class_names=['Not Churned', 'Churned'],
    mode='classification',
    random_state=42
)

# Explain a single high-risk prediction
lime_exp = lime_explainer.explain_instance(
    X_test.iloc[high_risk_idx].values,
    model.predict_proba,
    num_features=10,
    num_samples=5000
)

# Display the explanation
print("LIME explanation for high-risk customer:")
print(f"Predicted churn probability: "
      f"{model.predict_proba(X_test.iloc[[high_risk_idx]])[:, 1][0]:.3f}")
print(f"\nTop contributing features:")
for feature, weight in lime_exp.as_list():
    direction = "toward churn" if weight > 0 else "away from churn"
    print(f"  {feature}: {weight:+.4f} ({direction})")
LIME explanation for high-risk customer:
Predicted churn probability: 0.612

Top contributing features:
  days_since_last_session > 25.00: +0.1823 (toward churn)
  payment_failures_6m > 2.00: +0.0945 (toward churn)
  support_tickets_90d > 3.00: +0.0712 (toward churn)
  monthly_hours_watched <= 5.00: +0.0634 (toward churn)
  sessions_last_30d <= 8.00: +0.0523 (toward churn)
  content_completion_rate <= 0.40: +0.0312 (toward churn)
  months_active <= 12.00: +0.0187 (toward churn)
  hours_change_pct <= -20.00: +0.0156 (toward churn)
  unique_titles_watched <= 4.00: +0.0098 (toward churn)
  devices_used <= 2.00: +0.0045 (toward churn)

SHAP vs. LIME: When to Use Which

| Property | SHAP | LIME |
|---|---|---|
| Theoretical basis | Game theory (Shapley values) | Local linear approximation |
| Consistency guarantee | Yes (satisfies all Shapley axioms) | No (explanations can vary between runs) |
| Global explanations | Yes (aggregate local explanations) | Not directly |
| Computation speed | Fast for trees (TreeSHAP), slower for others | Moderate (requires perturbation sampling) |
| Stability | Deterministic (for tree models) | Stochastic (depends on perturbation sample) |
| Ease of explanation | "This feature contributed +X to the prediction" | "In this neighborhood, the model behaves like: if feature > threshold, then +Y" |

Practitioner Note --- For tree-based models in production, SHAP is almost always the better choice. TreeSHAP is exact, fast, and deterministic. LIME's value is in model-agnostic scenarios --- neural networks, proprietary APIs, or models where SHAP is computationally infeasible. If you must choose one method and you are using XGBoost or LightGBM, choose SHAP.

LIME's Instability Problem

LIME's reliance on random perturbations means explanations can vary between runs. This is a serious problem for trust:

# Run LIME twice on the same observation
exp_1 = lime_explainer.explain_instance(
    X_test.iloc[high_risk_idx].values,
    model.predict_proba,
    num_features=5, num_samples=5000
)

exp_2 = lime_explainer.explain_instance(
    X_test.iloc[high_risk_idx].values,
    model.predict_proba,
    num_features=5, num_samples=5000
)

print("LIME Run 1 - Top 5 features:")
for feat, weight in exp_1.as_list():
    print(f"  {feat}: {weight:+.4f}")

print("\nLIME Run 2 - Top 5 features:")
for feat, weight in exp_2.as_list():
    print(f"  {feat}: {weight:+.4f}")

The feature rankings and weights will differ slightly between runs. For stable, reproducible explanations in production, set random_state and use a large num_samples (5000+), or prefer SHAP.


Part 6: Communicating Model Behavior to Non-Technical Stakeholders

The Translation Problem

You now have SHAP waterfall plots, partial dependence curves, and permutation importance rankings. Your product manager does not have the vocabulary to interpret any of them. The gap between what you have computed and what your stakeholder can use is the translation problem, and it is your responsibility to bridge it.

Framework: The Three-Slide Explanation

When presenting a model to non-technical stakeholders, use this structure:

Slide 1: "What the model does" (no math)

"We built a model that predicts which customers are likely to cancel their subscription in the next 30 days. It correctly identifies about 75% of customers who actually churn, and when it flags a customer as high-risk, it is right about 60% of the time."

No AUC. No F1. Translate to business language: "correctly identifies" and "when it flags someone, it is right."
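Those plain-language numbers map directly onto recall and precision at your chosen operating threshold. A minimal sketch with made-up labels and scores (not the StreamFlow data) shows the translation:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Illustrative labels and predicted churn probabilities
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
probs  = np.array([0.9, 0.7, 0.55, 0.2, 0.6, 0.3, 0.1, 0.45, 0.05, 0.35])

threshold = 0.5
y_flag = (probs >= threshold).astype(int)

recall = recall_score(y_true, y_flag)        # share of actual churners caught
precision = precision_score(y_true, y_flag)  # share of flags that are right

print(f"Correctly identifies {recall:.0%} of customers who actually churn.")
print(f"When it flags a customer, it is right {precision:.0%} of the time.")
```

Recall becomes "correctly identifies X% of churners" and precision becomes "when it flags someone, it is right Y% of the time" --- the same information as the metrics, in the stakeholder's language.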

Slide 2: "What drives the model's decisions" (one chart)

Show the SHAP summary bar plot. Present it as a ranked list in plain English:

"The model primarily looks at five things, in order of importance:

  1. How recently the customer last logged in
  2. How many hours they watch per month
  3. How frequently they have sessions
  4. Whether they have had payment failures
  5. How many support tickets they have filed

Customers who have not logged in recently, watch fewer hours, have payment problems, or file many support tickets are more likely to be flagged."

Slide 3: "Here is a specific example" (waterfall or table)

"Let me show you Customer 47823. The model says there is a 61% chance this customer will churn. Here is why:

| Reason | Detail | Impact |
|---|---|---|
| Has not logged in for 38 days | Average is 5 days | Strong churn signal |
| 3 payment failures in 6 months | Average is 0.3 | Moderate churn signal |
| 4 support tickets in 90 days | Average is 0.8 | Moderate churn signal |
| Only watches 2.3 hours/month | Average is 18 hours | Moderate churn signal |

Does this make sense to you? Is there anything the model is missing?"

The last question is critical. It invites domain expertise and builds trust.

Writing a "How to Read This" Guide

For any SHAP-based dashboard or report, include a one-page guide:

HOW TO READ THE CHURN RISK REPORT

WHAT IS THIS?
Our churn prediction model scores every active subscriber daily.
Customers with a score above 40% are flagged as "at-risk."

HOW TO READ THE RISK SCORE:
- 0-20%: Low risk. No action needed.
- 20-40%: Moderate risk. Monitor during regular check-ins.
- 40-60%: High risk. Proactive outreach recommended.
- 60-100%: Critical risk. Immediate retention intervention.

HOW TO READ THE "TOP REASONS":
Each flagged customer has a list of the top 3 reasons the model
flagged them. These are the specific factors that most contributed
to their risk score, compared to the average customer.

Example:
  "Days since last session: 38 (avg: 5)"
  means this customer has not logged in for 38 days, while the
  average customer logs in every 5 days. This is the biggest
  reason the model thinks they might churn.

WHAT SHOULD I DO?
1. Review the top 3 reasons for each flagged customer.
2. Ask: "Do these reasons make sense given what I know?"
3. If yes: take the recommended retention action.
4. If no: flag the customer for manual review and tell the
   data team --- your feedback improves the model.

WHAT THE MODEL CANNOT SEE:
- Recent conversations you had with the customer
- Upcoming product launches that might re-engage them
- Life events (moved, changed jobs, etc.)
Your judgment matters. The model is a tool, not a replacement.
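The "Days since last session: 38 (avg: 5)" lines in that guide can be generated automatically from a SHAP matrix. A sketch with a hand-made feature frame and SHAP values standing in for the real model outputs (the `top_reasons` helper is my illustrative name):

```python
import numpy as np
import pandas as pd

# Illustrative stand-ins; in practice these come from your data and TreeExplainer
X = pd.DataFrame({
    'days_since_last_session': [38, 2, 10],
    'payment_failures_6m':     [3, 0, 1],
    'monthly_hours_watched':   [2.3, 20.0, 15.0],
})
shap_matrix = np.array([
    [0.42, 0.18, 0.11],
    [-0.30, -0.05, -0.10],
    [0.05, 0.02, -0.01],
])
feature_avgs = X.mean()

def top_reasons(row_idx, k=3):
    """Turn a row's SHAP values into plain-English 'top reason' lines."""
    sv = shap_matrix[row_idx]
    order = np.argsort(np.abs(sv))[::-1][:k]   # largest |SHAP| first
    lines = []
    for j in order:
        name = X.columns[j]
        lines.append(f"{name}: {X.iloc[row_idx, j]:g} "
                     f"(avg: {feature_avgs[name]:.1f})")
    return lines

for line in top_reasons(0):
    print(" ", line)
```

Ranking by absolute SHAP value and pairing each feature's value with the population average is exactly the format the guide teaches stakeholders to read.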

Common Stakeholder Questions and How to Answer Them

| Stakeholder Question | Bad Answer | Good Answer |
|---|---|---|
| "How accurate is the model?" | "The AUC is 0.94" | "It correctly identifies about 75% of customers who actually churn. When it flags someone, it is right about 60% of the time." |
| "Why did it flag this customer?" | "The SHAP value for feature 17 is 0.48" | "Three reasons: they have not logged in for 38 days, had 3 payment failures, and filed 4 support tickets." |
| "Can I trust it?" | "The model passed all validation tests" | "It gets the right answer 3 out of 4 times for churners. When it is wrong, it is usually because of information the model cannot see --- like a conversation you had with the customer." |
| "What should I do with this?" | "The model outputs probabilities" | "For customers above 40% risk: call them. Here are the top 3 reasons for each --- use them to personalize the conversation." |

Practitioner Note --- The single biggest deployment failure I have seen is not a modeling error. It is a communication error. The data scientist presents accuracy metrics the stakeholder does not understand, shows plots the stakeholder cannot read, and uses jargon the stakeholder does not speak. The model sits unused. Translation is not optional. It is a core data science skill.


Part 7: Bringing It All Together --- An Interpretation Report

Here is the complete workflow for producing a model interpretation deliverable:

import numpy as np
import pandas as pd
import shap
import matplotlib.pyplot as plt
from sklearn.inspection import (
    PartialDependenceDisplay, permutation_importance
)

# --- Assume model, X_train, X_test, y_test are already fitted ---

# Step 1: Compute SHAP values
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Step 2: Global importance (SHAP bar plot)
print("=" * 60)
print("GLOBAL MODEL INTERPRETATION")
print("=" * 60)

mean_abs_shap = np.abs(shap_values).mean(axis=0)
global_importance = pd.DataFrame({
    'feature': X_test.columns,
    'mean_abs_shap': mean_abs_shap
}).sort_values('mean_abs_shap', ascending=False)

print("\nFeature ranking (mean |SHAP|):")
for i, row in global_importance.head(10).iterrows():
    print(f"  {row['feature']:30s} {row['mean_abs_shap']:.4f}")

# Step 3: Permutation importance (sanity check)
perm = permutation_importance(
    model, X_test, y_test,
    n_repeats=30, random_state=42,
    scoring='roc_auc', n_jobs=-1
)

print("\nPermutation importance (AUC drop):")
perm_df = pd.DataFrame({
    'feature': X_test.columns,
    'perm_importance': perm.importances_mean
}).sort_values('perm_importance', ascending=False)

for i, row in perm_df.head(10).iterrows():
    print(f"  {row['feature']:30s} {row['perm_importance']:.4f}")

# Step 4: Check agreement between methods
print("\nAgreement check (top 5 features by each method):")
shap_top5 = set(global_importance.head(5)['feature'])
perm_top5 = set(perm_df.head(5)['feature'])
print(f"  SHAP top 5:        {shap_top5}")
print(f"  Permutation top 5: {perm_top5}")
print(f"  Overlap:           {shap_top5 & perm_top5}")

# Step 5: Local explanations for flagged customers
print("\n" + "=" * 60)
print("LOCAL EXPLANATIONS --- HIGH-RISK CUSTOMERS")
print("=" * 60)

probs = model.predict_proba(X_test)[:, 1]
high_risk = np.where(probs > 0.5)[0][:3]

for i, idx in enumerate(high_risk):
    customer_prob = probs[idx]
    customer_shap = shap_values[idx]
    top3_idx = np.argsort(np.abs(customer_shap))[::-1][:3]

    print(f"\n--- Customer {i+1} (Churn Probability: {customer_prob:.1%}) ---")
    for feat_idx in top3_idx:
        feat_name = X_test.columns[feat_idx]
        feat_val = X_test.iloc[idx, feat_idx]
        sv = customer_shap[feat_idx]
        direction = "INCREASES" if sv > 0 else "DECREASES"
        print(f"  {feat_name}: value={feat_val}, "
              f"SHAP={sv:+.4f} ({direction} churn risk)")
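A rank correlation adds a single summary number to the top-5 agreement check in Step 4. The importance vectors below are illustrative placeholders; in practice you would pass `global_importance['mean_abs_shap']` and `perm.importances_mean` aligned on the same feature order:

```python
import numpy as np
from scipy.stats import spearmanr

# Illustrative importance scores for the same 6 features from two methods
shap_imp = np.array([0.41, 0.30, 0.22, 0.10, 0.05, 0.02])
perm_imp = np.array([0.020, 0.022, 0.011, 0.006, 0.001, 0.003])

rho, _ = spearmanr(shap_imp, perm_imp)
print(f"Spearman rank correlation: {rho:.2f}")
# Near 1.0: the two methods rank features the same way.
# Large disagreement usually points to correlated features.
```

Values well below 1.0 are not necessarily an error, but they are a prompt to investigate --- correlated features are the most common culprit, as Part 4 showed.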

Part 8: Choosing the Right Interpretation Method

Decision Framework

                    Need to explain a specific prediction?
                   /                                      \
                 YES                                      NO
                /                                          \
         SHAP waterfall                           Need feature ranking?
         or LIME                                  /                   \
                                                YES                   NO
                                               /                       \
                                  SHAP summary plot               Need to see
                                  + Permutation importance        feature relationships?
                                  (cross-validate)                /              \
                                                                YES              NO
                                                               /                  \
                                                     PDP + ICE plots        Are you just
                                                     SHAP dependence        getting started?
                                                                           /
                                                                         YES
                                                                        /
                                                               Start with SHAP.
                                                               It does everything.

Method Summary Table

| Method | Type | Best For | Limitations |
|---|---|---|---|
| SHAP (TreeSHAP) | Global + Local | Tree models, production systems, stakeholder presentations | Requires shap library; KernelSHAP is slow for large datasets |
| PDP | Global | Showing feature-prediction relationships | Assumes feature independence; averages hide heterogeneity |
| ICE | Local (many) | Revealing heterogeneous feature effects | Can be visually cluttered; does not quantify importance |
| Permutation Importance | Global | Model-agnostic feature ranking, sanity check | No direction information; affected by correlated features |
| LIME | Local | Model-agnostic local explanations, neural networks | Unstable (stochastic); no consistency guarantees |

The Honest Summary

If you are using tree-based models (Random Forest, XGBoost, LightGBM, CatBoost) --- and in 2025, that is most tabular ML --- SHAP with TreeSHAP is the default answer. It is fast, exact, theoretically grounded, and produces both global and local explanations. Use PDP/ICE and permutation importance as complementary tools, not replacements.

If you are using neural networks or models without a TreeSHAP implementation, KernelSHAP or LIME become necessary. KernelSHAP is slower but consistent; LIME is faster but unstable.

If you are presenting to stakeholders, the method matters less than the translation. A well-explained LIME output beats a poorly communicated SHAP analysis every time.


Progressive Project M9: StreamFlow SHAP Explanations

Objective

Generate a complete interpretation package for the StreamFlow churn model: global feature importance, local explanations for three specific customers, a dependence plot for the most important feature, and a one-page stakeholder guide.

Step 1: Global Feature Importance

import numpy as np
import pandas as pd
import shap
import matplotlib.pyplot as plt
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

# --- Use the StreamFlow dataset and trained model from this chapter ---
# (Setup code omitted for brevity --- see Part 2 above)

# Compute SHAP values
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Summary plot (save for the report)
plt.figure(figsize=(10, 8))
shap.summary_plot(shap_values, X_test, plot_type="dot", show=False)
plt.title("StreamFlow Churn Model: Global Feature Importance")
plt.tight_layout()
plt.savefig("m9_shap_summary.png", dpi=150, bbox_inches='tight')
plt.show()

# Print the ranking
mean_abs_shap = pd.DataFrame({
    'feature': X_test.columns,
    'mean_abs_shap': np.abs(shap_values).mean(axis=0)
}).sort_values('mean_abs_shap', ascending=False)

print("Global Feature Ranking (Mean |SHAP|):")
print(mean_abs_shap.to_string(index=False))

Step 2: Local Explanations for Three Specific Customers

# Select three customers with different risk profiles
probs = model.predict_proba(X_test)[:, 1]

# Customer A: High risk (>60%)
cust_a_idx = np.where(probs > 0.6)[0][0]
# Customer B: Medium risk (40-50%)
cust_b_idx = np.where((probs > 0.4) & (probs < 0.5))[0][0]
# Customer C: Low risk (<15%)
cust_c_idx = np.where(probs < 0.15)[0][0]

customers = {
    'Customer A (High Risk)': cust_a_idx,
    'Customer B (Medium Risk)': cust_b_idx,
    'Customer C (Low Risk)': cust_c_idx,
}

fig, axes = plt.subplots(3, 1, figsize=(12, 18))

for i, (label, idx) in enumerate(customers.items()):
    prob = probs[idx]
    print(f"\n{'='*50}")
    print(f"{label}: Churn Probability = {prob:.1%}")
    print(f"{'='*50}")

    # Top 3 reasons
    sv = shap_values[idx]
    top3 = np.argsort(np.abs(sv))[::-1][:3]
    for rank, feat_idx in enumerate(top3, 1):
        feat_name = X_test.columns[feat_idx]
        feat_val = X_test.iloc[idx, feat_idx]
        direction = "increases" if sv[feat_idx] > 0 else "decreases"
        print(f"  Reason {rank}: {feat_name} = {feat_val} "
              f"(SHAP = {sv[feat_idx]:+.4f}, {direction} churn risk)")

    # Waterfall plot
    plt.sca(axes[i])
    shap.plots.waterfall(
        shap.Explanation(
            values=sv,
            base_values=explainer.expected_value,
            data=X_test.iloc[idx].values,
            feature_names=X_test.columns.tolist()
        ),
        max_display=10, show=False
    )
    axes[i].set_title(f"{label} --- Churn Probability: {prob:.1%}")

plt.tight_layout()
plt.savefig("m9_waterfall_three_customers.png", dpi=150, bbox_inches='tight')
plt.show()

Step 3: Dependence Plot for Most Important Feature

# Dependence plot for the top feature
top_feature = mean_abs_shap.iloc[0]['feature']
print(f"Most important feature: {top_feature}")

plt.figure(figsize=(10, 6))
shap.dependence_plot(
    top_feature, shap_values, X_test,
    interaction_index="auto", show=False
)
plt.title(f"SHAP Dependence: {top_feature}")
plt.tight_layout()
plt.savefig("m9_dependence_top_feature.png", dpi=150, bbox_inches='tight')
plt.show()

Step 4: One-Page Stakeholder Guide

Write this as a plain-text document (no code, no jargon) that accompanies the plots:

STREAMFLOW CHURN MODEL: HOW TO READ THIS REPORT
================================================

WHAT THIS MODEL DOES:
Our churn model predicts which subscribers are likely to cancel
in the next 30 days. It scores every active subscriber daily.

THE RISK SCORE (0-100%):
  0-20%    Low risk       No action needed
  20-40%   Moderate risk  Monitor in regular check-ins
  40-60%   High risk      Proactive outreach recommended
  60%+     Critical       Immediate retention intervention

WHAT DRIVES THE MODEL:
The model primarily looks at these factors (in order of importance):
  1. How recently the customer last used StreamFlow
  2. How many hours they watch per month
  3. How often they have sessions
  4. Whether they have had payment failures
  5. How many support tickets they have filed

HOW TO READ "TOP REASONS" FOR A FLAGGED CUSTOMER:
Each flagged customer has a list of the top 3 reasons the model
flagged them. Example:

  "Days since last session: 38 (average is 5)"

This means the customer has not used StreamFlow in 38 days,
compared to the average of 5 days. That gap is the biggest
reason the model thinks they might cancel.

WHAT TO DO:
1. Review the flagged customer's top 3 reasons.
2. Ask: "Do these reasons match what I know about this customer?"
3. If yes: reach out with a personalized retention message
   that addresses the specific reasons.
4. If no: flag for manual review and tell the data team.

WHAT THE MODEL CANNOT SEE:
- Recent conversations with the customer
- Known account issues being resolved
- External factors (competitor promotions, life changes)

Your judgment still matters. The model surfaces who to focus on.
You decide what to do about it.

Deliverables Checklist

  • [ ] m9_shap_summary.png --- global feature importance (SHAP summary plot)
  • [ ] m9_waterfall_three_customers.png --- local explanations for 3 customers
  • [ ] m9_dependence_top_feature.png --- dependence plot for top feature
  • [ ] Stakeholder guide (plain text, one page, no jargon)
  • [ ] Brief written analysis: which features dominate, whether any surprise you, and one recommendation for the product team based on the interpretation

Chapter Summary

Model interpretation is not an optional add-on to the modeling process. It is the bridge between a model that exists and a model that gets used. SHAP values, built on Shapley's game-theoretic foundation, provide the most rigorous framework for explaining predictions --- both globally ("what does the model care about?") and locally ("why did the model flag this customer?"). Partial dependence plots show how features affect predictions on average; ICE plots reveal whether the effect is consistent or heterogeneous. Permutation importance provides a model-agnostic sanity check. LIME offers an alternative for models without efficient SHAP implementations.

But the methods are only half the story. The other half is translation. A SHAP waterfall plot is useless to a product manager who cannot read it. A permutation importance ranking is useless to a clinician who does not know what "AUC drop" means. Your job is to compute the explanations and to translate them into language, visuals, and decision frameworks that your stakeholders can act on.

A model you cannot explain is a model you cannot trust. Now you can explain yours.


Next chapter: Chapter 20: Clustering --- where we move from predicting known outcomes to discovering hidden structure in data.