Key Takeaways: Chapter 23

DataField.Dev

Association Rules and Market Basket Analysis

Association rules find co-occurrences, not causes. The rule {diapers} -> {beer} means these items appear in the same baskets more often than chance. It does not mean buying diapers causes beer purchases. The causal mechanism (new parents consolidating errands) requires domain knowledge. Treat every rule as a hypothesis to be validated, not a conclusion to be acted on blindly.
Lift > 1 is the minimum bar for actionability. A rule with high confidence but lift near 1.0 is misleading: the consequent is just popular, and the antecedent adds no predictive value. Lift corrects for base rates by comparing the observed co-occurrence to what independence would predict. In practice, most teams filter to lift > 1.2 or higher. If you remember one metric, remember lift.
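
As a minimal sketch of how lift corrects for base rates, the snippet below computes support and lift directly from a list of baskets. The `baskets` data and the helper names `support` and `lift` are hypothetical, chosen only for illustration; the helpers assume every item in the rule appears in at least one basket.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def lift(transactions, antecedent, consequent):
    """Observed co-occurrence divided by the co-occurrence expected under independence."""
    both = support(transactions, set(antecedent) | set(consequent))
    expected = support(transactions, antecedent) * support(transactions, consequent)
    return both / expected


# Hypothetical toy baskets, not data from the chapter.
baskets = [
    frozenset({"diapers", "beer", "milk"}),
    frozenset({"diapers", "beer"}),
    frozenset({"diapers", "milk"}),
    frozenset({"beer", "chips"}),
    frozenset({"milk", "bread"}),
    frozenset({"milk"}),
    frozenset({"bread", "chips"}),
    frozenset({"milk", "chips"}),
]

print(round(lift(baskets, {"diapers"}, {"beer"}), 3))  # 1.778
print(round(lift(baskets, {"diapers"}, {"milk"}), 3))  # 1.067
```

In this toy data, diapers and milk share as many baskets as diapers and beer, but milk is popular on its own, so the lift of {diapers} -> {milk} stays close to 1.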
Support controls the tradeoff between coverage and noise. High min_support finds only rules among popular items, which the merchandising team already knows about. Low min_support uncovers niche patterns with high lift but also generates thousands of spurious rules. Start with min_support = 0.01 for large retail datasets and adjust based on the number of rules generated. The goal is tens to hundreds of actionable rules, not thousands.
Confidence alone is insufficient. A rule {bread} -> {milk} with confidence 80% seems strong until you learn that milk appears in 78% of all transactions. The confidence is barely above the base rate. Always pair confidence with lift. Confidence tells you "how often is the rule correct?"; lift tells you "how much more often than chance?"
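
The bread and milk numbers above can be checked with a few lines of arithmetic. The variable names are illustrative; the two input figures are the ones quoted in the takeaway.

```python
confidence = 0.80      # P(milk | bread), from the example above
milk_base_rate = 0.78  # share of all transactions that contain milk

# Lift is confidence divided by the consequent's base rate.
lift_value = confidence / milk_base_rate
gain = confidence - milk_base_rate

print(f"lift = {lift_value:.3f}")            # lift = 1.026
print(f"gain over base rate = {gain:.2f}")   # gain over base rate = 0.02
```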
FP-Growth is the practical default for datasets above 10,000 transactions. Apriori requires one database scan per itemset level and generates candidate itemsets combinatorially. FP-Growth compresses the database into an FP-tree and mines it directly, eliminating candidate generation. Both algorithms find the same frequent itemsets; FP-Growth is simply faster. Use Apriori for pedagogical clarity and small datasets; use FP-Growth for everything else.
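
To make the "one scan per itemset level" cost concrete, here is a small teaching sketch of Apriori's level-wise loop in plain Python. It is not FP-Growth and not a production implementation; the function name `apriori_frequent_itemsets`, the `scans` counter, and the toy `baskets` are assumptions made for illustration.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return ({itemset: support}, number of passes over the transactions)."""
    n = len(transactions)
    min_count = min_support * n

    # Level 1: one scan to count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    scans, k, frequent = 1, 1, {}

    while current:
        frequent.update({s: c / n for s, c in current.items()})
        k += 1
        # Candidate generation: join frequent (k-1)-itemsets and keep a
        # k-itemset only if all of its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        if not candidates:
            break
        # One more full scan to count this level's candidates.
        scans += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_count}

    return frequent, scans


# Hypothetical toy baskets, not data from the chapter.
baskets = [
    frozenset({"diapers", "beer", "milk"}),
    frozenset({"diapers", "beer"}),
    frozenset({"diapers", "milk"}),
    frozenset({"beer", "chips"}),
    frozenset({"milk", "bread"}),
    frozenset({"milk"}),
    frozenset({"bread", "chips"}),
    frozenset({"milk", "chips"}),
]

for threshold in (0.25, 0.125):
    itemsets, scans = apriori_frequent_itemsets(baskets, threshold)
    print(threshold, len(itemsets), scans)
```

On this toy data, halving the threshold from 0.25 to 0.125 raises the number of frequent itemsets from 7 to 13 and adds a third scan, a small-scale version of the coverage-versus-noise tradeoff that min_support controls.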
Conviction and Zhang's metric add directionality and boundedness. Lift is symmetric: lift(A -> B) = lift(B -> A). But business actions are directional: recommending wine to cheese buyers is different from recommending cheese to wine buyers. Conviction is asymmetric and captures directional strength. Zhang's metric is bounded between -1 and +1, making it easier to compare across rules. Use lift as the primary filter, then conviction and Zhang's for ranking among filtered rules.
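
A short sketch of the two metrics, using their standard definitions in terms of support: conviction(A -> B) = (1 - supp(B)) / (1 - conf(A -> B)), and Zhang's metric = (supp(AB) - supp(A) supp(B)) / max(supp(AB)(1 - supp(A)), supp(A)(supp(B) - supp(AB))). The cheese and wine support values below are hypothetical, and the function names are illustrative.

```python
import math


def conviction(supp_a, supp_b, supp_ab):
    """Conviction of A -> B; infinite when the rule is never wrong."""
    conf = supp_ab / supp_a
    if conf >= 1.0:
        return math.inf
    return (1.0 - supp_b) / (1.0 - conf)


def zhang(supp_a, supp_b, supp_ab):
    """Zhang's metric of A -> B, bounded between -1 and +1."""
    numerator = supp_ab - supp_a * supp_b
    denominator = max(supp_ab * (1 - supp_a), supp_a * (supp_b - supp_ab))
    return numerator / denominator if denominator else 0.0


# Hypothetical supports: cheese in 20% of baskets, wine in 10%, both in 6%.
cheese, wine, both = 0.20, 0.10, 0.06

lift = both / (cheese * wine)  # the same value in either direction
print(f"lift                       = {lift:.2f}")
print(f"conviction(cheese -> wine) = {conviction(cheese, wine, both):.2f}")
print(f"conviction(wine -> cheese) = {conviction(wine, cheese, both):.2f}")
print(f"zhang(cheese -> wine)      = {zhang(cheese, wine, both):.2f}")
print(f"zhang(wine -> cheese)      = {zhang(wine, cheese, both):.2f}")
```

With these inputs lift is about 3.0 for both directions, while conviction is about 1.29 for cheese -> wine and 2.0 for wine -> cheese, which is the directional difference that lift alone cannot show.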
The value of association rules is in the long tail, not the top rules. The highest-lift rules are often obvious to domain experts (phone -> phone_case, coffee_maker -> coffee_beans). Category managers already know these. The business value comes from the next 50-200 rules: cross-category patterns that no single department manager would have guessed. This is where the algorithm earns its keep.
Association rules extend beyond retail baskets. Any domain with basket-like data can be analyzed with association rules: streaming genre combinations, insurance product bundles, SaaS feature usage patterns, course enrollment sequences, medical diagnosis co-occurrences. The framework is the same; the business interpretation changes. The StreamFlow case study shows how linking co-occurrence patterns to a downstream outcome (churn) extends the technique beyond simple cross-selling.
Rules decay over time. A rule mined on January data may not hold in July. Seasonal products, promotional effects, inventory changes, and shifting customer behavior all erode rule stability. Production deployments should re-mine rules on a regular cadence (monthly for most retailers) and compare the current rule set to the previous one. Stable rules (present across many months) justify permanent merchandising changes; seasonal rules justify temporary promotions.
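
One way to sketch the month-over-month comparison is with plain set operations on the mined rule sets. Everything here is an assumption for illustration: the `classify_rules` helper, the 0.75 `stable_share` threshold, and the four monthly rule sets.

```python
from collections import Counter


def classify_rules(monthly_rule_sets, stable_share=0.75):
    """Split rules into stable and intermittent by the share of months they appear in."""
    months = len(monthly_rule_sets)
    counts = Counter(rule for rules in monthly_rule_sets for rule in rules)
    stable = {r for r, c in counts.items() if c / months >= stable_share}
    intermittent = set(counts) - stable
    return stable, intermittent


# Hypothetical rule sets, one per month, as (antecedent, consequent) pairs.
jan = {("phone", "phone_case"), ("cocoa", "marshmallows")}
feb = {("phone", "phone_case"), ("cocoa", "marshmallows")}
mar = {("phone", "phone_case")}
apr = {("phone", "phone_case"), ("sunscreen", "towel")}

stable, intermittent = classify_rules([jan, feb, mar, apr])
new_this_month = apr - mar   # rules that appeared since the previous run
dropped_this_month = mar - apr  # rules that disappeared since the previous run

print("stable:", sorted(stable))
print("intermittent:", sorted(intermittent))
print("new in April:", sorted(new_this_month))
print("dropped in April:", sorted(dropped_this_month))
```

Rules in the stable set are candidates for permanent changes; intermittent rules are worth checking against the calendar before treating them as seasonal.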
Twenty rules with clear actions beat two thousand rules in a spreadsheet. The algorithm generates hundreds or thousands of rules. The deliverable is a curated list with a recommended business action for each rule: "Place X next to Y," "Recommend Y when X is in the cart," "Bundle X and Y at a 10% discount." If you cannot state the action, the rule is not ready for production.

If You Remember One Thing

Lift > 1 is the key filter. An association rule with high confidence but lift near 1.0 is an artifact of popularity, not a genuine pattern. Lift measures how much more likely the consequent is when the antecedent is present compared to its base rate. A rule with lift 3.0 means the co-occurrence is three times more frequent than chance alone would predict. That is a pattern worth acting on. A rule with lift 1.05 means the co-occurrence is barely distinguishable from chance. That is noise dressed up as insight. Filter by lift first, then refine by confidence and support.
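
The filtering order described above can be sketched as follows. The rule records and the `min_confidence` default are hypothetical; the lift threshold of 1.2 and the support threshold of 0.01 are the starting points mentioned in the takeaways.

```python
def filter_rules(rules, min_lift=1.2, min_confidence=0.3, min_support=0.01):
    """Filter by lift first, then refine by confidence and support; rank by lift."""
    by_lift = [r for r in rules if r["lift"] > min_lift]
    refined = [
        r for r in by_lift
        if r["confidence"] >= min_confidence and r["support"] >= min_support
    ]
    return sorted(refined, key=lambda r: r["lift"], reverse=True)


# Hypothetical mined rules, not results from the chapter.
rules = [
    {"rule": "bread -> milk", "support": 0.40, "confidence": 0.80, "lift": 1.03},
    {"rule": "yogurt -> granola", "support": 0.02, "confidence": 0.35, "lift": 1.05},
    {"rule": "cheese -> wine", "support": 0.03, "confidence": 0.45, "lift": 2.50},
    {"rule": "phone -> phone_case", "support": 0.04, "confidence": 0.60, "lift": 3.00},
    {"rule": "candles -> batteries", "support": 0.004, "confidence": 0.50, "lift": 4.00},
]

for r in filter_rules(rules):
    print(r["rule"], r["lift"])
```

The high-confidence bread -> milk rule is dropped at the lift step, and the high-lift candles -> batteries rule is dropped for insufficient support, leaving the two rules that clear all three thresholds.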

These takeaways summarize Chapter 23: Association Rules and Market Basket Analysis. Return to the chapter for full context.