Chapter 6 Quiz: The Business of Machine Learning

DataField.Dev

Multiple Choice

Question 1. Which stage of the ML project lifecycle typically consumes 60 to 80 percent of total project time?

a) Modeling and experimentation
b) Business problem definition
c) Data preparation and feature engineering
d) Deployment and integration

Question 2. Tom Kowalski's real-time pricing engine story illustrates which common ML failure mode?

a) Data leakage
b) Overfitting to the wrong objective
c) Wrong problem framing
d) Insufficient feedback loops

Question 3. A fraud detection model for an e-commerce platform correctly identifies 95 out of 100 fraudulent transactions. However, it also flags 500 legitimate transactions as fraudulent, out of 9,900 legitimate transactions. Which metric best captures the model's tendency to flag legitimate transactions incorrectly?

a) Recall (95%)
b) Accuracy (94.95%)
c) Precision (16%)
d) F1 score (27%)
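
The percentages quoted in the options for Question 3 can be checked from the counts given in the question. The following is a minimal Python sketch of that arithmetic, assuming the standard definitions of recall, precision, accuracy, and F1. The function name `confusion_metrics` is illustrative and does not come from the chapter.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return recall, precision, accuracy, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    recall = tp / (tp + fn)        # share of actual fraud that was caught
    precision = tp / (tp + fp)     # share of flagged transactions that were fraud
    accuracy = (tp + tn) / total   # share of all transactions classified correctly
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Counts taken from Question 3: 100 fraudulent and 9,900 legitimate transactions.
tp = 95            # fraudulent transactions correctly flagged
fn = 100 - tp      # fraudulent transactions missed
fp = 500           # legitimate transactions flagged as fraud
tn = 9900 - fp     # legitimate transactions correctly passed

metrics = confusion_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

The sketch only reproduces the numbers. Deciding which metric answers the question is still a matter of asking what each denominator measures.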

Question 4. According to Professor Okonkwo's Five Questions framework, which of the following is NOT one of the five evaluation criteria for ML use cases?

a) Is the data available?
b) Is the algorithm state-of-the-art?
c) Is the value measurable?
d) Is the organization ready?

Question 5. A churn prediction model has the following cost structure: each retention offer costs $20, the average lifetime value of a retained customer is $500, and the cost of a missed churner is $500. Which metric should the business prioritize?

a) Precision, because the cost of false positives is high relative to false negatives
b) Recall, because the cost of false negatives ($500) far exceeds the cost of false positives ($20)
c) Accuracy, because overall correctness is the most important metric
d) F1 score, because the costs are balanced

Question 6. In the build-vs-buy decision framework, which factor most strongly favors building a custom ML solution?

a) The organization has limited ML talent
b) The business needs a solution within 30 days
c) The ML capability is a core source of competitive differentiation
d) The ML problem involves standard, widely available data

Question 7. Which of the following is an example of data leakage?

a) Using a customer's total purchase history to predict their next purchase
b) Including "reason for account closure" as a feature in a customer churn prediction model
c) Using weather data to predict ice cream sales
d) Training a model on three years of data instead of one year

Question 8. What is the primary purpose of a feasibility sprint in ML project planning?

a) To build the final production model in a short timeframe
b) To determine whether the data contains sufficient signal to make the prediction useful
c) To demonstrate the model to executive stakeholders
d) To select the optimal algorithm for the problem

Question 9. The "POC trap" refers to which organizational failure pattern?

a) Spending too much time on proof-of-concept without ever deploying
b) Skipping the POC phase and going directly to production
c) Underestimating the gap between a successful POC and a production system, leading to unrealistic deployment expectations
d) Building too many POCs simultaneously and spreading resources thin

Question 10. According to the chapter, which cost category is most consistently underestimated in ML projects?

a) Talent costs
b) Compute costs
c) Data acquisition costs
d) Maintenance costs

Question 11. In Ravi Mehta's prioritization matrix for Athena Retail Group, projects were evaluated on three dimensions. Which of the following is NOT one of those dimensions?

a) Business Impact
b) Algorithm Complexity
c) Technical Feasibility
d) Data Readiness

Question 12. Which role is described as bridging the gap between model development and production deployment?

a) Data Scientist
b) ML Engineer
c) Data Analyst
d) Product Manager

Question 13. A model that always predicts "not fraud" in a dataset where 1% of transactions are fraudulent achieves what accuracy?

a) 1%
b) 50%
c) 95%
d) 99%

Question 14. Which ML project governance mechanism involves formal review points where a project must demonstrate progress before proceeding to the next phase?

a) Model Review Board
b) Stage gates
c) RACI matrix
d) Model cards

Question 15. The concept of "model drift" refers to:

a) A model's accuracy gradually decreasing as the development team makes changes
b) The tendency for ML projects to exceed their budgets over time
c) The degradation of a deployed model's performance as real-world conditions change
d) The movement of data scientists between companies

Short Answer

Question 16. Explain the difference between model metrics and business metrics. Provide one example of each for a customer churn prediction model.

Question 17. List Professor Okonkwo's Five Questions for evaluating ML use cases. For one question of your choice, describe a scenario where the answer would be "no" and explain its implications for the project.

Question 18. Describe two key differences between a proof of concept (POC) and a production ML system. Why does the POC-to-production gap matter for project planning?

Question 19. A manufacturing company is deciding whether to build a custom quality inspection model or purchase a vendor solution. The company has thousands of proprietary product images, two data scientists on staff, and needs the system operational within 6 months. Using the build-vs-buy framework, which option would you recommend and why?

Question 20. Explain why the "cone of uncertainty" concept is relevant to ML project estimation. How does a feasibility sprint help narrow this uncertainty?

Scenario-Based Questions

Question 21. You are advising a mid-size bank that wants to use ML to automate loan approval decisions. The bank's compliance officer is concerned about regulatory risk.

a) Which of Professor Okonkwo's Five Questions is most critical for this use case? Why?
b) What governance structures from Section 6.10 would you recommend?
c) What is the most dangerous failure mode for this application? How would you mitigate it?

Question 22. A startup CEO tells you: "We don't need ML governance — that's for big companies. We need to move fast." Write a three-to-four-sentence response that acknowledges the need for speed while making the case for proportional governance.

Question 23. Consider Athena's churn prediction pilot. If the model achieves 80% precision and 60% recall at the 0.7 probability threshold:

a) What does 80% precision mean in business terms for Athena?
b) What does 60% recall mean in business terms for Athena?
c) Given the cost structure described in the chapter ($15 per retention offer, $340 customer lifetime value), should Athena lower or raise the probability threshold? Explain your reasoning.

Question 24. Rank the following three ML project proposals for a grocery delivery service from highest to lowest priority, using the prioritization criteria from the chapter. Justify your ranking.

Project A: Predict which customers will subscribe to the premium delivery membership (Impact: High, Feasibility: Medium, Data Readiness: High)
Project B: Use computer vision to automatically assess produce freshness (Impact: High, Feasibility: Low, Data Readiness: Low)
Project C: Predict optimal delivery routes based on traffic patterns (Impact: Medium, Feasibility: High, Data Readiness: High)

Question 25. Reflect on NK Adeyemi's realization that she "doesn't need to be the data scientist" but rather "the person who makes sure the data scientists are solving the right problem." In two to three sentences, explain why this perspective is valuable in an MBA context and how it relates to the chapter's central argument about the business of machine learning.

Answer key for selected questions is available in the appendix (Answers to Selected Exercises).