Case Study 1: ShopSmart Market Basket Analysis for Product Recommendations


Background

ShopSmart, the e-commerce platform from Chapters 20-22, has 200,000 monthly active customers, a catalog of 12,000 products across 45 categories, and approximately 1.2 million transactions per month. The merchandising team currently manages cross-sell recommendations manually: a category manager decides that customers who buy running shoes should see running socks, customers who buy a phone case should see screen protectors, and so on.

The problem with manual rules is scale. With 12,000 products, the number of possible cross-sell pairs is roughly 72 million. The five category managers can maintain perhaps 200 manual rules among them, leaving the vast majority of product combinations unexplored.
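The 72 million figure is just the number of unordered product pairs, C(12,000, 2):

```python
from math import comb

n_products = 12_000
print(f"{comb(n_products, 2):,}")  # 71,994,000 unordered pairs
```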

Marcus Chen, the Head of Analytics, wants to automate the discovery of cross-sell opportunities using association rules. The business question is specific: Which product pairs and triples co-occur in shopping baskets significantly more often than chance, and which of those co-occurrences are actionable for on-site product recommendations?

The success metric is equally specific: increase the cross-sell click-through rate on product pages from 3.2% (the current manual rule baseline) to 5.0% or higher.


The Data

ShopSmart's data warehouse has 18 months of order data. After deduplication and filtering (removing returns, test orders, and employee purchases), the analysis dataset contains 3.6 million transactions.
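The filtering step might look like the sketch below. The column names (`is_return`, `is_test_order`, `customer_type`) are hypothetical; a real warehouse extract will differ.

```python
import pandas as pd

# Hypothetical raw orders extract; real column names will differ.
orders = pd.DataFrame({
    'order_id':      [101, 101, 102, 103, 104, 105],
    'product':       ['phone', 'phone', 'laptop', 'mouse', 'tablet', 'keyboard'],
    'is_return':     [False, False, True, False, False, False],
    'is_test_order': [False, False, False, True, False, False],
    'customer_type': ['retail', 'retail', 'retail', 'retail', 'employee', 'retail'],
})

clean = orders[
    ~orders['is_return']
    & ~orders['is_test_order']
    & (orders['customer_type'] != 'employee')
].drop_duplicates()  # exact duplicate rows collapse to one

print(f"Transactions after filtering: {clean['order_id'].nunique()}")
```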

import numpy as np
import pandas as pd
from mlxtend.frequent_patterns import fpgrowth, association_rules
from mlxtend.preprocessing import TransactionEncoder
import matplotlib.pyplot as plt

np.random.seed(42)

# --- Simulate ShopSmart transaction data ---
# In production, this comes from the data warehouse via SQL.
# We simulate realistic structure: 8 product categories with
# known cross-category affinities.

n_transactions = 50_000
categories = {
    'electronics': ['laptop', 'tablet', 'phone', 'headphones', 'charger',
                    'mouse', 'keyboard', 'monitor', 'webcam', 'usb_hub'],
    'accessories': ['phone_case', 'screen_protector', 'laptop_bag',
                    'tablet_stand', 'cable_organizer'],
    'office': ['notebook', 'pens', 'desk_lamp', 'whiteboard', 'planner'],
    'fitness': ['yoga_mat', 'resistance_bands', 'water_bottle',
                'running_shoes', 'fitness_tracker'],
    'kitchen': ['coffee_maker', 'blender', 'knife_set', 'cutting_board',
                'measuring_cups'],
    'food': ['coffee_beans', 'protein_powder', 'snack_bars', 'tea_sampler',
             'olive_oil'],
    'personal_care': ['sunscreen', 'lip_balm', 'hand_cream', 'vitamins',
                      'essential_oils'],
    'books': ['python_book', 'business_book', 'cookbook', 'fitness_book',
              'self_help_book'],
}

all_products = [p for prods in categories.values() for p in prods]
product_to_cat = {p: cat for cat, prods in categories.items() for p in prods}

# Define cross-category affinities (lift > 1 pairs)
affinities = [
    ('phone', 'phone_case', 0.45),
    ('phone', 'screen_protector', 0.35),
    ('phone_case', 'screen_protector', 0.30),
    ('laptop', 'laptop_bag', 0.40),
    ('laptop', 'mouse', 0.35),
    ('laptop', 'keyboard', 0.20),
    ('coffee_maker', 'coffee_beans', 0.50),
    ('yoga_mat', 'resistance_bands', 0.30),
    ('yoga_mat', 'water_bottle', 0.25),
    ('running_shoes', 'water_bottle', 0.20),
    ('running_shoes', 'fitness_tracker', 0.15),
    ('python_book', 'laptop', 0.10),
    ('blender', 'protein_powder', 0.25),
    ('cookbook', 'knife_set', 0.20),
    ('fitness_book', 'yoga_mat', 0.15),
    ('tablet', 'tablet_stand', 0.30),
    ('coffee_beans', 'tea_sampler', 0.20),
]

# Generate transactions
transactions = []
for _ in range(n_transactions):
    basket = set()
    # Start with 1-3 random "anchor" items
    n_anchors = np.random.choice([1, 2, 3], p=[0.5, 0.35, 0.15])
    anchors = np.random.choice(all_products, size=n_anchors, replace=False)
    basket.update(anchors)

    # Add affinity items with specified probabilities
    for item_a, item_b, prob in affinities:
        if item_a in basket and np.random.random() < prob:
            basket.add(item_b)
        if item_b in basket and np.random.random() < prob * 0.6:
            basket.add(item_a)

    # Add 0-2 random items (noise)
    n_noise = np.random.choice([0, 1, 2], p=[0.4, 0.4, 0.2])
    noise = np.random.choice(all_products, size=n_noise, replace=False)
    basket.update(noise)

    transactions.append(list(basket))

print(f"Transactions: {len(transactions):,}")
print(f"Unique products: {len(all_products)}")
print(f"Avg basket size: {np.mean([len(t) for t in transactions]):.1f}")
print(f"Max basket size: {max(len(t) for t in transactions)}")

Step 1: Exploratory Analysis of Baskets

Before running any algorithm, understand the data.

# Basket size distribution
basket_sizes = [len(t) for t in transactions]
fig, axes = plt.subplots(1, 2, figsize=(12, 4))

axes[0].hist(basket_sizes, bins=range(1, max(basket_sizes) + 2),
             edgecolor='black', alpha=0.7)
axes[0].set_xlabel('Basket Size')
axes[0].set_ylabel('Frequency')
axes[0].set_title('Distribution of Basket Sizes')

# Item frequency
from collections import Counter
item_counts = Counter(item for t in transactions for item in t)
item_freq = pd.Series(item_counts).sort_values(ascending=False)

axes[1].barh(item_freq.head(20).index[::-1], item_freq.head(20).values[::-1],
             color='steelblue', edgecolor='black')
axes[1].set_xlabel('Transaction Count')
axes[1].set_title('Top 20 Products by Frequency')

plt.tight_layout()
plt.show()

print(f"\nItem frequency statistics:")
print(f"  Most common:  {item_freq.index[0]} ({item_freq.iloc[0]:,} transactions)")
print(f"  Least common: {item_freq.index[-1]} ({item_freq.iloc[-1]:,} transactions)")
print(f"  Median:       {item_freq.median():.0f} transactions")

Step 2: One-Hot Encoding and FP-Growth

# One-hot encode
te = TransactionEncoder()
te_array = te.fit(transactions).transform(transactions)
basket_df = pd.DataFrame(te_array, columns=te.columns_)

print(f"Basket DataFrame shape: {basket_df.shape}")
print(f"Sparsity: {1 - basket_df.sum().sum() / (basket_df.shape[0] * basket_df.shape[1]):.2%}")

# Run FP-Growth (faster than Apriori for this dataset size)
frequent_itemsets = fpgrowth(
    basket_df,
    min_support=0.01,    # 1% of transactions (at least 500 occurrences)
    use_colnames=True
)

print(f"\nFrequent itemsets found: {len(frequent_itemsets)}")
print(f"  1-item: {len(frequent_itemsets[frequent_itemsets['itemsets'].apply(len) == 1])}")
print(f"  2-item: {len(frequent_itemsets[frequent_itemsets['itemsets'].apply(len) == 2])}")
print(f"  3-item: {len(frequent_itemsets[frequent_itemsets['itemsets'].apply(len) == 3])}")

Step 3: Generate and Filter Association Rules

# Generate rules --- use lift as the primary metric
rules = association_rules(
    frequent_itemsets,
    metric="lift",
    min_threshold=1.0
)

print(f"Total rules (lift >= 1.0): {len(rules)}")

# Apply the production filtering pipeline
def shopsmart_filter(rules_df):
    """ShopSmart's rule filtering criteria."""
    filtered = rules_df[
        (rules_df['support'] >= 0.005) &      # At least 0.5% of transactions
        (rules_df['confidence'] >= 0.20) &     # At least 20% confidence
        (rules_df['lift'] >= 1.5) &            # At least 50% above chance
        (rules_df['antecedents'].apply(len) <= 3) &  # Max 3 items in antecedent
        (rules_df['consequents'].apply(len) == 1)     # Single item consequent
    ].copy()

    filtered = filtered.sort_values('lift', ascending=False)
    return filtered

filtered_rules = shopsmart_filter(rules)
print(f"Filtered rules: {len(filtered_rules)}")

# Display top 20 rules
top_rules = filtered_rules[[
    'antecedents', 'consequents', 'support',
    'confidence', 'lift', 'conviction'
]].head(20)

for _, row in top_rules.iterrows():
    ant = ', '.join(sorted(row['antecedents']))
    con = ', '.join(sorted(row['consequents']))
    print(f"  {{{ant}}} -> {{{con}}}  "
          f"sup={row['support']:.3f}  conf={row['confidence']:.2f}  "
          f"lift={row['lift']:.2f}")

Step 4: Interpreting the Top Rules

The analysis reveals several categories of cross-sell patterns:

# Categorize rules by type
def categorize_rule(row):
    ant_cats = {product_to_cat.get(item, 'unknown') for item in row['antecedents']}
    con_cats = {product_to_cat.get(item, 'unknown') for item in row['consequents']}
    if ant_cats == con_cats:
        return 'within_category'
    else:
        return 'cross_category'

filtered_rules['rule_type'] = filtered_rules.apply(categorize_rule, axis=1)

print("Rule distribution by type:")
print(filtered_rules['rule_type'].value_counts())

print("\nTop cross-category rules (the most valuable for recommendations):")
cross_cat = filtered_rules[filtered_rules['rule_type'] == 'cross_category']
for _, row in cross_cat.head(10).iterrows():
    ant = ', '.join(sorted(row['antecedents']))
    con = ', '.join(sorted(row['consequents']))
    print(f"  {{{ant}}} -> {{{con}}}  lift={row['lift']:.2f}  conf={row['confidence']:.2f}")

Key Insight --- Within-category rules ({laptop} -> {mouse}) are the most obvious and often the ones category managers already know about. Cross-category rules are where association rules provide genuine discovery. A rule like {blender} -> {protein_powder} crosses the kitchen and food categories, a pairing that might not occur to a merchandising team organized by department.
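The categorize_rule logic above can be sanity-checked by hand on a few of the seeded affinity pairs (a self-contained sketch using a trimmed copy of the product_to_cat map). Note that even phone -> phone_case counts as cross-category in this simulation, since phones sit in electronics and cases in accessories:

```python
# Classify a few seeded affinity pairs using a trimmed copy of
# the product_to_cat map from the simulation above.
product_to_cat = {
    'phone': 'electronics', 'phone_case': 'accessories',
    'laptop': 'electronics', 'mouse': 'electronics',
    'blender': 'kitchen', 'protein_powder': 'food',
    'cookbook': 'books', 'knife_set': 'kitchen',
}

pairs = [('laptop', 'mouse'), ('phone', 'phone_case'),
         ('blender', 'protein_powder'), ('cookbook', 'knife_set')]

for a, b in pairs:
    kind = 'within' if product_to_cat[a] == product_to_cat[b] else 'cross'
    print(f"{{{a}}} -> {{{b}}}: {kind}_category")
```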


Step 5: Building the Recommendation Engine

class MarketBasketRecommender:
    """
    Rule-based product recommender using association rules.

    Designed for real-time on-site recommendations:
    given a customer's cart, return ranked product suggestions.
    """

    def __init__(self, rules_df, min_lift=1.5, min_confidence=0.2):
        """
        Parameters
        ----------
        rules_df : pd.DataFrame from association_rules()
        min_lift : float, minimum lift for included rules
        min_confidence : float, minimum confidence for included rules
        """
        self.rules = rules_df[
            (rules_df['lift'] >= min_lift) &
            (rules_df['confidence'] >= min_confidence) &
            (rules_df['consequents'].apply(len) == 1)
        ].copy()
        self.rules = self.rules.sort_values('lift', ascending=False)
        print(f"Recommender initialized with {len(self.rules)} rules")

    def recommend(self, cart_items, top_n=5, exclude_categories=None):
        """
        Generate recommendations for items in cart.

        Parameters
        ----------
        cart_items : set of str, items currently in cart
        top_n : int, max recommendations to return
        exclude_categories : set of str, categories to exclude

        Returns
        -------
        pd.DataFrame with columns: item, lift, confidence, triggered_by
        """
        if exclude_categories is None:
            exclude_categories = set()

        candidates = []
        for _, rule in self.rules.iterrows():
            if rule['antecedents'].issubset(cart_items):
                for item in rule['consequents']:
                    if item not in cart_items:
                        cat = product_to_cat.get(item, 'unknown')
                        if cat not in exclude_categories:
                            candidates.append({
                                'item': item,
                                'lift': rule['lift'],
                                'confidence': rule['confidence'],
                                'triggered_by': frozenset(rule['antecedents'])
                            })

        if not candidates:
            return pd.DataFrame(columns=['item', 'lift', 'confidence', 'triggered_by'])

        result = pd.DataFrame(candidates)
        result = result.sort_values('lift', ascending=False).drop_duplicates(
            subset='item', keep='first'
        ).head(top_n)
        return result.reset_index(drop=True)


# Initialize the recommender
recommender = MarketBasketRecommender(rules, min_lift=1.5, min_confidence=0.15)

# Test with example carts
test_carts = [
    {'phone'},
    {'laptop', 'mouse'},
    {'yoga_mat'},
    {'coffee_maker'},
    {'phone', 'phone_case'},
]

for cart in test_carts:
    recs = recommender.recommend(cart, top_n=3)
    print(f"\nCart: {cart}")
    if len(recs) > 0:
        for _, r in recs.iterrows():
            print(f"  -> {r['item']}  (lift={r['lift']:.2f}, conf={r['confidence']:.2f})")
    else:
        print("  -> No recommendations (no matching rules)")

Step 6: Evaluating Recommendation Quality

# Offline evaluation (a proxy for an A/B test): manual rules vs.
# association rule recommendations
# Manual rules: category managers' hand-picked cross-sells
manual_rules = {
    'phone': ['phone_case', 'screen_protector', 'charger'],
    'laptop': ['laptop_bag', 'mouse', 'keyboard'],
    'coffee_maker': ['coffee_beans'],
    'yoga_mat': ['resistance_bands'],
}

# Association rule recommendations: from the MarketBasketRecommender
# Evaluation: for each transaction, check if the recommended items
# appeared in the actual basket (hit rate)

np.random.seed(42)
test_transactions = transactions[:10_000]  # note: overlaps the mining data; a true holdout would be disjoint

def evaluate_recommendations(test_trans, recommend_fn, top_n=3):
    """Compute hit rate: fraction of test transactions where at least
    one recommended item was actually purchased."""
    hits = 0
    eligible = 0

    for basket in test_trans:
        if len(basket) < 2:
            continue

        # Use first item as the "cart" (simulating mid-session)
        cart_item = basket[0]
        actual_remaining = set(basket[1:])

        recs = recommend_fn({cart_item}, top_n=top_n)
        if len(recs) == 0:
            continue

        eligible += 1
        rec_items = set(recs) if isinstance(recs, list) else set(recs['item'])
        if rec_items & actual_remaining:
            hits += 1

    hit_rate = hits / eligible if eligible > 0 else 0
    return hit_rate, hits, eligible


# Manual rule recommendation function
def manual_recommend(cart, top_n=3):
    recs = []
    for item in cart:
        if item in manual_rules:
            recs.extend(manual_rules[item][:top_n])
    return pd.DataFrame({'item': recs[:top_n]}) if recs else pd.DataFrame(columns=['item'])


# Association rule recommendation function (wrapped)
def ar_recommend(cart, top_n=3):
    return recommender.recommend(cart, top_n=top_n)


manual_rate, manual_hits, manual_eligible = evaluate_recommendations(
    test_transactions, manual_recommend
)
ar_rate, ar_hits, ar_eligible = evaluate_recommendations(
    test_transactions, ar_recommend
)

print("=== Recommendation Hit Rate Comparison ===")
print(f"Manual rules:      {manual_rate:.1%} ({manual_hits}/{manual_eligible} eligible)")
print(f"Association rules: {ar_rate:.1%} ({ar_hits}/{ar_eligible} eligible)")
print(f"Coverage (manual): {manual_eligible}/{len(test_transactions)} transactions")
print(f"Coverage (AR):     {ar_eligible}/{len(test_transactions)} transactions")

Key Result --- The association rule recommender typically achieves broader coverage than manual rules because it discovers cross-sell pairs that category managers missed. The hit rate depends on the quality of the rules and the test data, but the coverage advantage alone --- being able to make recommendations for a larger fraction of carts --- drives incremental revenue.


Outcome

Marcus Chen's team deployed the MarketBasketRecommender as a backend service behind ShopSmart's product pages. The A/B test ran for four weeks across 100,000 users (50/50 split):

Metric                  | Manual Rules (Control) | Association Rules (Treatment)
Cross-sell CTR          | 3.2%                   | 5.8%
Items per order         | 2.4                    | 2.7
Revenue per session     | $48.20                 | $54.10
Recommendation coverage | 34% of product pages   | 71% of product pages

The cross-sell CTR of 5.8% exceeded the 5.0% target. The coverage improvement (from 34% to 71% of product pages showing recommendations) was the bigger driver: the algorithm could recommend something for products that the category managers had never written rules for.

Two caveats Marcus noted in the post-mortem:

  1. The top rules were not surprising. Phone -> phone_case, laptop -> laptop_bag --- the category managers already had these. The value came from the long tail of rules that no human would have manually curated.
  2. Lift decay over time. Rules mined on January data showed lower lift when evaluated on June data. The team now re-mines rules monthly and compares against the prior month's rules to catch seasonal shifts.
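The monthly comparison can be sketched as a join of two rule snapshots. The numbers below are toy values; in production, each frame would come from association_rules() run on that month's transactions.

```python
import pandas as pd

# Two hypothetical monthly rule snapshots; columns mirror the
# association_rules() output (frozenset antecedents/consequents).
jan = pd.DataFrame({
    'antecedents': [frozenset({'phone'}), frozenset({'yoga_mat'})],
    'consequents': [frozenset({'phone_case'}), frozenset({'water_bottle'})],
    'lift':        [3.1, 2.4],
})
jun = pd.DataFrame({
    'antecedents': [frozenset({'phone'}), frozenset({'yoga_mat'})],
    'consequents': [frozenset({'phone_case'}), frozenset({'water_bottle'})],
    'lift':        [3.0, 1.6],
})

drift = jan.merge(jun, on=['antecedents', 'consequents'],
                  suffixes=('_jan', '_jun'))
drift['lift_delta'] = drift['lift_jun'] - drift['lift_jan']

# Flag rules whose lift dropped more than 20% month over month
flagged = drift[drift['lift_jun'] < 0.8 * drift['lift_jan']]
print(flagged[['antecedents', 'consequents', 'lift_delta']])
```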

Return to Chapter 23 | Next: Case Study 2 --- StreamFlow Sticky Content