Chapter 10 Exercises: Recommendation Systems
Section A: Conceptual Foundations
Exercise 10.1 — The Business Case for Recommendations
A mid-size online bookstore (500,000 monthly visitors, 80,000 unique titles, average order value of $28) currently displays the same "Staff Picks" and "Bestsellers" lists to every visitor. The CEO asks you whether a recommendation system is worth the investment.
a) Calculate the potential revenue impact if recommendations increase average order value by 15 percent and conversion rate by 10 percent, assuming a current conversion rate of 3.2 percent and 500,000 monthly visitors.
b) Identify three risks or costs that should be weighed against this potential revenue.
c) The bookstore has only 12 months of transaction data and no explicit ratings. Is this sufficient to build a recommendation system? What kind of feedback data could they use?
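For checking the arithmetic in part (a), a small helper along these lines may be useful. It is only a sketch: the function names are hypothetical, and it assumes both lifts are relative increases (10 percent of the current conversion rate, not 10 percentage points) that compound multiplicatively. The usage example deliberately uses toy numbers rather than the bookstore's.

```python
def monthly_revenue(visitors, conversion_rate, avg_order_value):
    """Expected monthly revenue = visitors x conversion rate x average order value."""
    return visitors * conversion_rate * avg_order_value


def revenue_uplift(visitors, conversion_rate, avg_order_value, conv_lift, aov_lift):
    """Return (baseline, improved, difference) in currency units per month.

    Assumption: conv_lift and aov_lift are relative lifts (0.10 means +10 percent)
    and are independent of each other.
    """
    baseline = monthly_revenue(visitors, conversion_rate, avg_order_value)
    improved = monthly_revenue(
        visitors,
        conversion_rate * (1 + conv_lift),
        avg_order_value * (1 + aov_lift),
    )
    return baseline, improved, improved - baseline


# Toy example (not the exercise's numbers): 100,000 visitors, 2% conversion, $50 AOV.
baseline, improved, delta = revenue_uplift(100_000, 0.02, 50.0, conv_lift=0.10, aov_lift=0.05)
print(f"baseline={baseline:,.0f} improved={improved:,.0f} delta={delta:,.0f}")
```

Because the two lifts multiply, the combined effect in the toy example is 1.10 x 1.05 = 1.155, slightly more than the 15 percent obtained by adding them.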
Exercise 10.2 — The Long Tail
A music streaming service has 90 million tracks in its catalog. Analysis shows that the top 1 percent of tracks (900,000 tracks) account for 78 percent of total streams. The remaining 89.1 million tracks account for 22 percent.
a) Explain how a recommendation system could shift this distribution.
b) Why would the streaming service want to shift it? Consider both user experience and business strategy (licensing costs, artist relations, competitive differentiation).
c) What is the risk of shifting too aggressively toward the long tail?
Exercise 10.3 — Recommendation Business Models
For each of the following businesses, identify whether the primary recommendation objective is revenue optimization, engagement optimization, or discovery/curation. Then identify the key metric you would use to evaluate recommendation success.
a) An online grocery delivery service
b) A social media platform funded by advertising
c) A job search platform
d) A news aggregator with a subscription model
e) A luxury fashion e-commerce site
Section B: Collaborative Filtering
Exercise 10.4 — Ratings Matrix Computation
Consider the following user-item ratings matrix (1-5 scale, "?" = unrated):
| | Movie A | Movie B | Movie C | Movie D | Movie E |
|---|---|---|---|---|---|
| User 1 | 5 | 4 | ? | 2 | ? |
| User 2 | 4 | ? | 5 | 1 | 3 |
| User 3 | ? | 3 | 4 | ? | 5 |
| User 4 | 5 | 4 | ? | 1 | 4 |
a) Calculate the cosine similarity between User 1 and User 2 using their overlapping ratings (Movies A and D). Show your work.
b) Calculate the cosine similarity between User 1 and User 4 using their overlapping ratings (Movies A, B, and D). Show your work.
c) Based on your similarity calculations, which user is more similar to User 1? Use the more similar user's ratings to predict User 1's rating for Movie E.
d) Explain why relying on only two overlapping items (as in part a) is problematic. What minimum overlap would you require in a production system?
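After working parts (a) and (b) by hand, a short function like the one below can be used to verify the results. It is a sketch under one stated assumption: the dot product and both norms are computed over the co-rated items only, which is the convention the exercise implies (another common convention keeps each user's full vector and treats unrated items as zero). The function name and the two toy users are hypothetical and are not taken from the table.

```python
import math


def overlap_cosine(ratings_u, ratings_v):
    """Cosine similarity restricted to items both users have rated.

    ratings_u, ratings_v: dicts mapping item -> rating.
    Returns (similarity, n_overlap); similarity is None when there is no overlap.
    """
    shared = sorted(set(ratings_u) & set(ratings_v))
    if not shared:
        return None, 0
    dot = sum(ratings_u[i] * ratings_v[i] for i in shared)
    norm_u = math.sqrt(sum(ratings_u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(ratings_v[i] ** 2 for i in shared))
    if norm_u == 0 or norm_v == 0:
        return None, len(shared)
    return dot / (norm_u * norm_v), len(shared)


# Toy users: "Z" is ignored because only one of them rated it.
alice = {"X": 3, "Y": 4}
bob = {"X": 4, "Y": 3, "Z": 5}
similarity, n_overlap = overlap_cosine(alice, bob)
print(similarity, n_overlap)
```

Returning the overlap count alongside the similarity makes it easy to apply a minimum-overlap rule of the kind part (d) asks about.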
Exercise 10.5 — User-Based vs. Item-Based
A social media platform is deciding between user-based and item-based collaborative filtering for its content recommendation system. The platform has 50 million active users and 10 million pieces of content created daily.
a) Explain why user-based collaborative filtering would be computationally impractical in this setting.
b) Explain why item-based collaborative filtering is also challenging given the volume of daily content creation.
c) Propose an alternative approach that addresses both scalability concerns. (Hint: think about the candidate generation stage discussed in Section 10.5.)
Exercise 10.6 — Similarity Metric Selection
For each scenario, recommend the most appropriate similarity metric (cosine, Pearson, or Jaccard) and explain your reasoning:
a) A movie streaming platform with 1-5 star ratings where some users consistently rate higher than others
b) A grocery delivery app tracking which items customers add to their carts (no explicit ratings)
c) A B2B software marketplace where enterprise buyers rate products on a standardized 1-10 scale with detailed rubrics
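To experiment with how the metrics behave before choosing among them, it may help to have minimal reference implementations of Pearson correlation and Jaccard similarity at hand. The sketch below uses only the standard library; the function names and toy inputs are hypothetical, and Pearson is computed over two equal-length lists of ratings for the same items.

```python
import math


def pearson(u, v):
    """Pearson correlation: cosine similarity of the mean-centered rating lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    du = [x - mean_u for x in u]
    dv = [y - mean_v for y in v]
    numerator = sum(a * b for a, b in zip(du, dv))
    denominator = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return numerator / denominator if denominator else 0.0


def jaccard(a, b):
    """Jaccard similarity of two item sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two raters with the same taste but different baselines (second is 2 stars lower).
print(pearson([5, 4, 5, 3], [3, 2, 3, 1]))
# Two baskets sharing 2 of 4 distinct items.
print(jaccard({"milk", "eggs", "bread"}, {"milk", "bread", "tea"}))
```

Trying each function on small hand-made inputs is a quick way to see which properties of the data (rating offsets, binary membership) each metric is sensitive to.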
Exercise 10.7 — The Sparsity Challenge
An e-commerce platform has 2 million customers and 500,000 products. The average customer has purchased 12 items.
a) Calculate the density of the user-item interaction matrix (percentage of cells that are filled).
b) Why does this level of sparsity make collaborative filtering difficult?
c) Propose two techniques to mitigate the sparsity problem. Explain the tradeoff of each.
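The density in part (a) is the number of filled cells divided by the total number of cells. A minimal sketch of that calculation follows; the function name is hypothetical, and it assumes each purchase is of a distinct item, so that the average number of purchases equals the average number of filled cells per user. The example uses toy sizes rather than the platform's.

```python
def interaction_density(n_users, n_items, avg_interactions_per_user):
    """Fraction of user-item cells that contain an interaction.

    Assumption: every interaction is with a distinct item (no repeat purchases).
    """
    filled_cells = n_users * avg_interactions_per_user
    total_cells = n_users * n_items
    return filled_cells / total_cells


# Toy example: 1,000 users, 200 items, 5 purchases per user on average.
density = interaction_density(1_000, 200, 5)
print(f"{density:.2%} of cells are filled")
```

Note that the number of users cancels out: density depends only on the average interactions per user and the catalog size.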
Section C: Content-Based and Hybrid Approaches
Exercise 10.8 — Content Feature Engineering
You are building a content-based recommendation system for an online course platform. Design a feature vector for each course that would enable content-based similarity computation. Your feature vector should include:
a) At least 5 categorical features with example values
b) At least 3 numerical features with example ranges
c) A strategy for handling the course description text (a free-text field averaging 200 words)
d) A justification for each feature's inclusion: how does it relate to user preferences?
Exercise 10.9 — TF-IDF Intuition
Two product descriptions on Athena's website:
Product X: "Lightweight waterproof trail running shoe with breathable mesh upper and aggressive lug pattern for technical terrain. Features responsive cushioning and a reinforced toe cap."
Product Y: "Comfortable everyday walking shoe with soft cushioning and classic styling. Features a breathable mesh upper and durable rubber outsole for urban surfaces."
a) Without computing exact values, identify which words would have high TF-IDF scores for Product X (present in X but rare across the catalog).
b) Which words would have high TF-IDF scores for Product Y?
c) Which words would have low TF-IDF scores for both products? Why?
d) If a customer purchased Product X, would a TF-IDF-based content filter likely recommend Product Y? Explain.
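Although the exercise asks for intuition rather than exact values, a tiny from-scratch TF-IDF can make the mechanics concrete. The sketch below uses the plain textbook definition, tf = count / document length and idf = log(N / df); libraries such as scikit-learn apply smoothed variants by default, so their numbers will differ. The function name and the three-document toy corpus are hypothetical and much smaller than a real catalog.

```python
import math
from collections import Counter


def tfidf(docs):
    """Compute TF-IDF for a list of tokenized documents.

    docs: list of token lists. Returns one {term: score} dict per document.
    Uses unsmoothed idf = log(N / df), so a term present in every document scores 0.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores


corpus = [
    ["trail", "shoe", "waterproof"],
    ["walking", "shoe", "classic"],
    ["running", "shoe", "mesh"],
]
for doc_scores in tfidf(corpus):
    print(sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True))
```

In this toy corpus, "shoe" appears in every document and therefore receives a score of zero, while the terms unique to one document receive the highest scores.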
Exercise 10.10 — Hybrid Design
You are the data science lead at a streaming music platform. Design a hybrid recommendation system by specifying:
a) Which approach (collaborative, content-based, or popularity) you would use for each of the following page types: (i) homepage for a new user, (ii) homepage for a returning user, (iii) "Similar Artists" page, (iv) "Discover Weekly" playlist.
b) For the returning user homepage, describe how you would combine collaborative and content-based scores. What weight would you assign to each, and why?
c) How would you handle a newly released album from an established artist versus a newly released album from an unknown artist?
Section D: Cold Start and Implicit Feedback
Exercise 10.11 — Cold Start Strategy
An online fashion retailer launches in a new country (Brazil) where it has zero historical data. The company has extensive data from its existing markets (US, UK, Germany), but fashion preferences in Brazil are likely to differ.
a) Design a phased cold-start strategy for the first 6 months of operation. Identify what data you would use at each phase and when you would transition between approaches.
b) What assumptions from the existing markets might be transferable? What assumptions would be dangerous to transfer?
c) The retailer considers a "style quiz" during onboarding to collect initial preference data. Design a 5-question quiz that maximizes information gain while minimizing user effort.
Exercise 10.12 — Implicit Feedback Interpretation
For each user behavior below, assess: (i) what it likely signals about user preference, (ii) what alternative explanations exist, and (iii) how you would weight it, on a scale of 0.0 to 1.0, relative to an explicit 5-star rating.
a) Customer spends 4 minutes on a product page, scrolls through all images, reads reviews, but does not purchase
b) Customer adds an item to their cart but removes it 20 minutes later
c) Customer purchases an item and returns it within 48 hours
d) Customer purchases an item and purchases it again 3 months later
e) Customer clicks on a product from a recommendation module but immediately clicks "Back"
Exercise 10.13 — Feedback Loop Analysis
Draw a diagram of (or describe in words) the feedback loop that occurs when a recommendation system optimizes purely for click-through rate. Your analysis should address:
a) How does the system's behavior change over time?
b) At what point does the loop become harmful to the user experience?
c) What signal would indicate that the loop is becoming problematic?
d) Propose a mechanism to break the loop without abandoning personalization.
Section E: Evaluation
Exercise 10.14 — Precision vs. Coverage Tradeoff
System A recommends the same 50 bestselling products to every user and achieves Precision@10 of 0.35 and Coverage of 0.04 (4 percent of catalog). System B uses personalized collaborative filtering and achieves Precision@10 of 0.22 and Coverage of 0.38.
a) Which system would you deploy if your primary goal is maximizing short-term conversion rate? Why?
b) Which system would you deploy if your primary goal is maximizing long-term catalog utilization and customer retention? Why?
c) Design a composite metric that balances both objectives. Define the formula and justify the weights.
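For readers who want to see how the two reported metrics are typically computed, here is a minimal sketch. The function names and toy data are hypothetical. Precision@k is taken as the number of relevant items in the top k divided by k, and coverage as the share of the catalog that appears in at least one user's recommendation list; other definitions exist.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are in the relevant set."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


def catalog_coverage(recommendation_lists, catalog_size):
    """Share of the catalog that appears in at least one recommendation list."""
    distinct_items = set().union(*recommendation_lists)
    return len(distinct_items) / catalog_size


# Toy data: two users, a 10-item catalog.
recs = {"u1": ["a", "b", "c"], "u2": ["a", "d", "e"]}
print(precision_at_k(recs["u1"], relevant={"a", "c"}, k=3))
print(catalog_coverage(recs.values(), catalog_size=10))
```

A system that shows every user the same list can score well on the first function while the second stays pinned at list length divided by catalog size, which is the tension the exercise explores.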
Exercise 10.15 — NDCG Calculation
A recommendation system returns the following 5 items for a user. The user's actual relevance scores (1 = relevant, 0 = not relevant) are shown:
| Position | Item | Relevant? |
|---|---|---|
| 1 | Item A | 1 |
| 2 | Item B | 0 |
| 3 | Item C | 1 |
| 4 | Item D | 1 |
| 5 | Item E | 0 |
a) Calculate DCG@5 using the formula: DCG = sum of (relevance_i / log2(i + 1)) for positions i = 1 to 5.
b) Calculate the ideal DCG@5 (IDCG), that is, what the DCG would be if all relevant items were ranked first.
c) Calculate NDCG@5 = DCG / IDCG.
d) If Item A and Item C swapped positions (C at position 1, A at position 3), how would NDCG change? What does this tell you about the importance of ranking order?
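The formula in part (a) translates directly into a few lines of code, which can serve as a check on hand calculations. This is a sketch with hypothetical function names. It computes the ideal ordering by sorting the relevance scores of the returned list, matching the exercise's definition of IDCG, and the example uses a different three-item list so the exercise itself is left to the reader.

```python
import math


def dcg_at_k(relevances, k):
    """DCG@k = sum over positions i = 1..k of relevance_i / log2(i + 1)."""
    return sum(
        rel / math.log2(position + 1)
        for position, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k):
    """NDCG@k = DCG@k divided by the DCG@k of the ideal (sorted) ordering."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy ranking: the first result is irrelevant, the next two are relevant.
ranking = [0, 1, 1]
print(round(dcg_at_k(ranking, 3), 4), round(ndcg_at_k(ranking, 3), 4))
```

Reordering the input list and rerunning the functions is a quick way to explore questions like the one in part (d).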
Exercise 10.16 — A/B Test Design
Athena wants to A/B test three recommendation strategies: (1) popularity-based, (2) collaborative filtering, and (3) hybrid. Design the experiment:
a) What is the null hypothesis?
b) What primary metric would you use? What secondary metrics would you track?
c) How would you handle the cold-start problem: should new users be included in the test or excluded?
d) How long should the test run? What factors determine the minimum duration?
e) What are the risks of ending the test too early?
Section F: Ethics and Architecture
Exercise 10.17 — Filter Bubble Audit
You discover that your e-commerce recommendation system has created distinct "filter bubbles" for different customer segments. Customers who initially purchased budget items see almost exclusively budget recommendations, while customers who initially purchased premium items see almost exclusively premium options.
a) Is this behavior a bug or a feature? Argue both sides.
b) Design a "diversity injection" mechanism that ensures all customers see a range of price tiers without completely undermining personalization.
c) What metrics would you track to ensure your diversity injection is working as intended?
Exercise 10.18 — Transparency Design
Design explanation templates for the following recommendation scenarios at Athena:
a) An item recommended because similar customers purchased it
b) An item recommended because it is similar to something the customer previously purchased
c) An item recommended because it is trending in the customer's geographic region
d) An item recommended because it complements an item already in the customer's cart
e) A sponsored item placed in the recommendation feed by a brand partner
For each, write the user-facing explanation text and assess whether it is truthful, helpful, and complete.
Exercise 10.19 — Architecture Decision
A food delivery app serves 2 million orders per day across 50 cities. Each city has different restaurant options, and restaurant menus change frequently (daily specials, out-of-stock items). The app currently uses batch recommendations updated every 6 hours.
a) Identify three problems with the current batch approach given the domain's characteristics.
b) Design a hybrid batch/real-time architecture for this use case. Specify what is computed in batch, what is computed in real time, and what triggers a real-time update.
c) What is the maximum acceptable latency for recommendation delivery on the restaurant listing page? Justify your answer.
Section G: Python and Applied
Exercise 10.20 — Extending the RecommendationEngine
Using the RecommendationEngine class from Section 10.9, complete the following tasks:
a) Add a recommend_for_segment method that generates recommendations for a group of customers (a segment from Chapter 9's CustomerSegmenter) by aggregating their preferences.
b) Add a diversity_score method that calculates the average pairwise cosine distance between items in a recommendation list — higher values indicate more diverse recommendations.
c) Modify the recommend_hybrid method to accept a min_diversity parameter. If the initial recommendations do not meet the diversity threshold, iteratively swap the most similar pair for the next-best candidate until the threshold is met (or the candidate pool is exhausted).
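As a starting point for part (b), the metric itself can be prototyped as a standalone function before it is wired into the class. The sketch below makes no assumptions about the internals of RecommendationEngine; it takes a plain list of item feature vectors, and the function names are hypothetical. Treating a zero vector as maximally distant is a design choice made here, not something the chapter specifies.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """1 - cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an all-zero vector is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def diversity_score(item_vectors):
    """Average pairwise cosine distance; 0.0 for lists with fewer than two items."""
    pairs = list(combinations(item_vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


# Toy list: two identical items and one orthogonal item.
print(diversity_score([[1, 0], [0, 1], [1, 0]]))
```

In the toy list, two of the three pairs are orthogonal (distance 1) and one pair is identical (distance 0), so the score is two thirds.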
Exercise 10.21 — Implicit Feedback Engine
Modify the RecommendationEngine to work with implicit feedback instead of explicit ratings:
a) Replace the ratings matrix with a binary interaction matrix (1 = purchased, 0 = no interaction).
b) Implement confidence weighting: assign higher confidence to items with multiple purchases (repeat buyers) and lower confidence to items with a single purchase.
c) Compare the recommendations generated by the implicit model to the explicit model for the same customer. Discuss the differences.
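One common way to express parts (a) and (b) is to keep a binary preference alongside a confidence that grows linearly with the purchase count, confidence = 1 + alpha x count, a form widely used in implicit-feedback matrix factorization. The sketch below illustrates that idea on plain dictionaries; it is not the RecommendationEngine's own code, the function name is hypothetical, and alpha = 2.0 is an arbitrary illustrative value that would need tuning.

```python
def to_implicit(purchase_counts, alpha=2.0):
    """Convert purchase counts into binary preferences and confidence weights.

    purchase_counts: {user: {item: number_of_purchases}}
    Returns (preference, confidence), both shaped {user: {item: value}}.
    Items with zero purchases are omitted, i.e. treated as "no interaction".
    """
    preference, confidence = {}, {}
    for user, items in purchase_counts.items():
        bought = {item: count for item, count in items.items() if count > 0}
        preference[user] = {item: 1 for item in bought}
        # Assumed linear scheme: repeat purchases raise confidence, not preference.
        confidence[user] = {item: 1.0 + alpha * count for item, count in bought.items()}
    return preference, confidence


counts = {"u1": {"tent": 1, "socks": 3, "stove": 0}}
preference, confidence = to_implicit(counts)
print(preference)
print(confidence)
```

With this scheme a repeat purchase does not change what the model believes the customer likes, only how strongly that observation is weighted during training.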
Exercise 10.22 — Business Impact Simulation
Using the synthetic data from Section 10.9, simulate the business impact of the recommendation engine:
a) Define a "conversion probability" function where the probability of purchase increases with the predicted score (e.g., P(purchase) = predicted_score / 5).
b) Simulate 1,000 customer sessions where each customer is shown 10 recommendations and purchases according to your conversion probability function.
c) Calculate the simulated average order value, items per basket, and revenue per session.
d) Compare these results to a baseline where 10 random products are shown to each customer.
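The simulation loop in parts (a) through (c) might be skeletonized as follows. This sketch does not use the synthetic data from Section 10.9; instead it assumes each session is a list of hypothetical (predicted_score, price) pairs, uses the example rule P(purchase) = predicted_score / 5 from part (a), and treats each recommendation as an independent purchase decision. All function names are illustrative.

```python
import random


def purchase_probability(predicted_score, max_score=5.0):
    """Example rule from part (a), clipped to the range [0, 1]."""
    return min(max(predicted_score / max_score, 0.0), 1.0)


def simulate_session(recs, rng):
    """recs: list of (predicted_score, price). Returns (items_bought, revenue)."""
    purchased_prices = [
        price for score, price in recs
        if rng.random() < purchase_probability(score)
    ]
    return len(purchased_prices), sum(purchased_prices)


def simulate(recs_per_session, seed=0):
    """Run every session once and report average items and revenue per session."""
    rng = random.Random(seed)  # seeded so repeated runs give identical results
    results = [simulate_session(recs, rng) for recs in recs_per_session]
    n_sessions = len(results)
    return {
        "items_per_session": sum(items for items, _ in results) / n_sessions,
        "revenue_per_session": sum(revenue for _, revenue in results) / n_sessions,
    }


# Toy input: 1,000 identical sessions showing three hypothetical products.
sessions = [[(4.5, 20.0), (3.0, 35.0), (1.0, 60.0)]] * 1000
print(simulate(sessions))
```

For the baseline in part (d), the same `simulate` function could be called with sessions built from randomly chosen products and their predicted scores, so that both arms share one purchase model.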
Section H: Integration and Critical Thinking
Exercise 10.23 — The Merchandising Team's Objection
Athena's merchandising team pushes back on the recommendation engine with the following arguments. For each, write a two-paragraph response that acknowledges the concern and proposes a solution:
a) "The algorithm keeps recommending last season's inventory instead of the new arrivals we need to move."
b) "Our brand partners pay for premium placement, and the algorithm is burying their products."
c) "The algorithm doesn't understand that some product combinations are dangerous: it recommended a climbing harness to someone buying their first pair of hiking boots."
Exercise 10.24 — Competitive Analysis
Research and compare the recommendation approaches of the two companies in one of the following pairs (use publicly available information):
a) Amazon vs. Etsy (e-commerce, different scales)
b) Netflix vs. TikTok (video, different content models)
c) Spotify vs. Apple Music (music, different philosophical approaches)
For your chosen pair, address: (i) what type of recommendation system each uses, (ii) how each handles the cold start problem, (iii) what each optimizes for, and (iv) what ethical concerns each faces.
Exercise 10.25 — Full System Design
You are hired as a data science consultant for a regional hospital network that wants to build a recommendation system for patient wellness content. The system would recommend educational articles, videos, and programs based on a patient's health conditions, demographics, and engagement history.
a) Design the recommendation system architecture, specifying data sources, filtering approach (collaborative, content-based, hybrid), and evaluation metrics.
b) Identify at least five ethical constraints unique to healthcare that do not apply in e-commerce (e.g., "do not recommend content that contradicts a patient's treatment plan").
c) How would you handle the cold-start problem for a new patient? What information could you use, and what information would be ethically inappropriate to use?
d) How would HIPAA regulations affect your data handling? Identify specific architectural decisions driven by privacy requirements.
Selected answers appear in Appendix B (Answers to Selected Exercises). Full solutions with code are available in the online supplement.