Chapter 6 Further Reading: The Business of Machine Learning


ML Project Lifecycle and Management

  1. Sculley, D., et al. (2015). "Hidden Technical Debt in Machine Learning Systems." Advances in Neural Information Processing Systems 28 (NIPS 2015). The seminal paper on the long-term costs of ML systems in production. Argues that ML systems accumulate "technical debt" at an accelerated rate compared to traditional software. Essential reading for anyone responsible for budgeting or maintaining ML projects. Directly relevant to Section 6.11 on ML economics and the maintenance cost discussion.

  2. Paleyes, A., Urma, R.G., & Lawrence, N.D. (2022). "Challenges in Deploying Machine Learning: A Survey of Case Studies." ACM Computing Surveys, 55(6), 1-29. A comprehensive survey of 99 published case studies documenting real-world ML deployment challenges. Finds that organizational and infrastructure issues are reported more frequently than modeling issues. Provides empirical support for the lifecycle perspective advocated in this chapter.

  3. Amershi, S., et al. (2019). "Software Engineering for Machine Learning: A Case Study." IEEE/ACM 41st International Conference on Software Engineering (ICSE-SEIP), 291-300. Microsoft researchers document the software engineering challenges specific to ML systems, based on internal experience across dozens of ML projects. Covers data management, model evaluation, deployment, and monitoring. Practical and well-grounded in real engineering practice.

  4. Polyzotis, N., Roy, S., Whang, S.E., & Zinkevich, M. (2018). "Data Lifecycle Challenges in Production Machine Learning: A Survey." ACM SIGMOD Record, 47(2), 17-28. Google researchers describe the data management challenges they encounter in production ML pipelines. Focuses on data validation, feature management, and the underappreciated complexity of keeping data flowing reliably. Reinforces the argument that data engineering is at least as important as modeling.


Problem Framing and Scoping

  1. Dorard, L. (2020). The ML Canvas. mlcanvas.com. The original ML Canvas resource, including templates, examples, and a community of practitioners using the framework. A practical tool for implementing the scoping methodology described in Section 6.3.

  2. Lakshmanan, V., Robinson, S., & Munn, M. (2020). Machine Learning Design Patterns. O'Reilly Media. Catalogs 30 design patterns for ML systems, organized around data representation, problem framing, training, deployment, and responsible AI. The "Hashed Feature" and "Bridged Schema" patterns are particularly relevant for teams navigating the data challenges described in this chapter. Practical and richly illustrated with examples from Google's ML practice.

  3. Huyen, C. (2022). Designing Machine Learning Systems. O'Reilly Media. One of the best end-to-end resources on ML system design, covering problem framing, data engineering, feature engineering, model development, deployment, and monitoring. Written by a Stanford lecturer and ML practitioner. Accessible to readers with basic ML knowledge and invaluable for readers transitioning from model building to system design.


Build vs. Buy and ML Strategy

  1. Iansiti, M. & Lakhani, K.R. (2020). Competing in the Age of AI. Harvard Business Review Press. Examines how AI-native companies (including Stitch Fix, Ant Financial, and others) build competitive advantages through data and algorithms. Relevant to the build-vs-buy discussion (Section 6.6) and the broader question of when ML becomes a strategic capability versus a commodity function.

  2. Davenport, T.H. & Ronanki, R. (2018). "Artificial Intelligence for the Real World." Harvard Business Review, 96(1), 108-116. Based on a study of 152 AI projects, categorizes AI applications into three types: process automation, cognitive insight, and cognitive engagement. Provides a practical framework for evaluating and prioritizing AI use cases — complementary to Ravi's prioritization matrix in Section 6.12.
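A value-versus-feasibility prioritization of the kind Davenport & Ronanki (and Ravi's matrix in Section 6.12) recommend can be made concrete with a simple scoring pass. The sketch below is illustrative only: the project names, 1-5 scores, and 0.6/0.4 weights are invented, and in practice each organization would set its own.

```python
# Rank candidate AI use cases by business value and technical feasibility,
# in the spirit of a value/feasibility prioritization matrix.
# All project names and scores (1-5) are hypothetical illustrations.

candidates = [
    {"name": "invoice processing automation", "value": 4, "feasibility": 5},
    {"name": "churn prediction",              "value": 5, "feasibility": 4},
    {"name": "demand forecasting",            "value": 4, "feasibility": 3},
    {"name": "open-ended support chatbot",    "value": 3, "feasibility": 2},
]

# A simple weighted composite; the weights are a judgment call, not a standard.
for c in candidates:
    c["score"] = 0.6 * c["value"] + 0.4 * c["feasibility"]

for c in sorted(candidates, key=lambda c: c["score"], reverse=True):
    print(f'{c["score"]:.1f}  {c["name"]}')
```

The point of such a pass is not the arithmetic but the conversation it forces: scoring "feasibility" honestly requires answering the data-readiness and organizational-readiness questions from the Five Questions framework.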

  3. Ng, A. (2018). "AI Transformation Playbook." Landing AI White Paper. Andrew Ng's concise guide to implementing AI in enterprises, based on his experience at Google Brain and Baidu. Covers pilot project selection, building an AI team, providing broad AI training, and developing an AI strategy. The pilot selection advice aligns closely with the feasibility sprint approach described in Section 6.8.


Team Building and Organization

  1. Patil, DJ & Mason, H. (2015). Data Driven: Creating a Data Culture. O'Reilly Media. Short, practical guide to building data-literate organizations. Covers hiring, team structure, and the cultural prerequisites for data-driven decision making. Useful context for the "Is the organization ready?" question from Professor Okonkwo's Five Questions.

  2. Colson, E. (2019). "What AI-Driven Companies Can Teach Us About Building Algorithms." Harvard Business Review, January-February 2019. Written by Stitch Fix's Chief Algorithms Officer, this article describes how to organize data science teams for maximum business impact. Argues for embedding data scientists in business teams rather than centralizing them. Provides the organizational design perspective behind the Stitch Fix case study.

  3. Shankar, S., Garcia, R., Hellerstein, J.M., & Parameswaran, A.G. (2022). "Operationalizing Machine Learning: An Interview Study." arXiv:2209.09125. Qualitative research based on interviews with 18 ML engineers across a range of organizations. Documents the practical challenges of putting models into production and the skills required. Valuable for understanding the ML Engineer role described in Section 6.7 and the deployment challenges in Section 6.5.


Metrics and Evaluation

  1. Provost, F. & Fawcett, T. (2013). Data Science for Business. O'Reilly Media. The standard textbook on connecting data science to business decision-making. Chapters on evaluation metrics, cost-sensitive modeling, and expected value calculations provide deeper treatment of the model-metrics-to-business-metrics translation discussed in Section 6.4. Accessible to business readers without strong technical backgrounds.
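The expected value framework at the heart of Provost & Fawcett's treatment is compact enough to sketch: weight each outcome's probability (estimated from a confusion matrix) by a business value for that outcome. The churn scenario and dollar figures below are hypothetical, chosen only to show the mechanics.

```python
# Expected value per decision: sum over outcomes of p(outcome) * value(outcome).
# Counts come from a confusion matrix on a holdout set; the dollar values
# are invented for illustration, not taken from any cited source.

def expected_value(confusion, values):
    """confusion and values are dicts keyed by outcome: tp, fp, fn, tn."""
    n = sum(confusion.values())
    return sum(confusion[k] / n * values[k] for k in confusion)

# Example: a churn model evaluated on 1,000 customers.
confusion = {"tp": 120, "fp": 80, "fn": 60, "tn": 740}
values = {
    "tp": 90.0,    # retained customer, net of retention-incentive cost
    "fp": -10.0,   # incentive wasted on a customer who wasn't churning
    "fn": -100.0,  # lost customer the model failed to flag
    "tn": 0.0,     # correctly left alone
}

print(f"Expected value per customer: ${expected_value(confusion, values):.2f}")
```

Note that the same confusion matrix can yield a positive or negative expected value depending on the cost-benefit numbers, which is exactly the model-metrics-to-business-metrics translation Section 6.4 emphasizes.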

  2. Mitchell, M., et al. (2019). "Model Cards for Model Reporting." Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*), 220-229. Proposes a standardized framework for documenting ML models, including intended use, performance metrics, limitations, and ethical considerations. Referenced in Section 6.10 on documentation requirements. The model card format has become an industry standard and is used at Google, Microsoft, Hugging Face, and many other organizations.
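The discipline a model card imposes is easiest to see as a data structure. The sketch below is a stripped-down rendering of the main sections the paper proposes (intended use, metrics, limitations, ethics); the field contents are placeholder values for a hypothetical model, not an official schema.

```python
from dataclasses import dataclass

# A minimal model card capturing the main sections proposed by
# Mitchell et al. (2019). All field values below are placeholders.

@dataclass
class ModelCard:
    model_name: str
    intended_use: str
    out_of_scope_uses: list
    metrics: dict          # metric name -> value on the evaluation set
    evaluation_data: str
    limitations: str
    ethical_considerations: str = ""

card = ModelCard(
    model_name="churn-classifier-v2",
    intended_use="Ranking existing customers for retention outreach.",
    out_of_scope_uses=["credit decisions", "employment screening"],
    metrics={"AUC": 0.81, "precision@10%": 0.54},
    evaluation_data="Holdout sample of 50k accounts, Q3 snapshot.",
    limitations="Performance degrades for accounts under 90 days old.",
)

print(card.model_name, card.metrics["AUC"])
```

In practice model cards are usually written as documents rather than code (Hugging Face, for instance, uses Markdown files), but forcing every field to be filled in, even with "unknown," is the point of the exercise.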

  3. Gebru, T., et al. (2021). "Datasheets for Datasets." Communications of the ACM, 64(12), 86-92. The companion to Model Cards, proposing standardized documentation for training datasets. Covers dataset composition, collection methodology, intended use, and known biases. Essential reading for responsible ML practice and referenced in Section 6.10.


Failure Modes and Case Studies

  1. Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). "The Parable of Google Flu: Traps in Big Data Analysis." Science, 343(6176), 1203-1205. The definitive critique of Google Flu Trends and the primary source for Case Study 1. Introduces the concept of "big data hubris" and documents how GFT's predictions diverged from reality. Short (3 pages) and highly readable.

  2. Narayanan, A. (2019). "How to Recognize AI Snake Oil." Widely circulated talk slides. Categorizes AI applications by their maturity and reliability. Distinguishes between "AI that works" (e.g., content recommendation, game playing) and "AI snake oil" (e.g., predicting recidivism, predicting job performance from facial expressions). Useful for calibrating expectations and identifying overhyped use cases.

  3. Sambasivan, N., et al. (2021). "'Everyone Wants to Do the Model Work, Not the Data Work': Data Cascades in High-Stakes AI." Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1-15. Documents how data quality problems cascade through ML systems, creating compounding failures. Based on 53 interviews with ML practitioners across multiple industries and countries. Provides empirical evidence for the data-preparation emphasis in Section 6.1 and the data readiness concerns in the Five Questions framework.


ML Economics and ROI

  1. Bughin, J., et al. (2018). "Notes from the AI Frontier: Modeling the Impact of AI on the World Economy." McKinsey Global Institute Discussion Paper. Large-scale analysis of AI's potential economic impact across industries and geographies. Useful for contextualizing the business value estimates in Section 6.12 and for executive-level discussions about ML investment.

  2. Bessen, J. & Righi, C. (2019). "Shocking Technology: What Happens When Firms Make Large IT Investments?" Boston University School of Law, Law and Economics Paper No. 19-6. Examines the financial impact of large IT and AI investments using firm-level data. Finds that large IT investments (including AI) often take 3-5 years to generate measurable returns. Relevant to the total cost of ownership analysis in Section 6.11 and the patience required for ML programs.
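The 3-5 year payback pattern Bessen & Righi document can be made concrete with a discounted cash flow over a hypothetical ML program: a large up-front build cost, ongoing maintenance, and benefits that ramp up over several years. Every figure below, including the 10% discount rate, is an invented illustration, not a number from the paper.

```python
# Net present value of a hypothetical ML program. All dollar figures
# and the discount rate are illustrative assumptions.

def npv(cash_flows, rate):
    """cash_flows[t] is the net cash flow in year t (year 0 = today)."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

build = -1_500_000                       # year-0 build cost
yearly = [-200_000 + benefit             # annual maintenance vs. ramping benefit
          for benefit in (300_000, 700_000, 1_100_000, 1_300_000, 1_300_000)]

flows = [build] + yearly
for horizon in range(1, len(flows) + 1):
    print(f"NPV through year {horizon - 1}: ${npv(flows[:horizon], 0.10):,.0f}")
```

Under these assumptions the program's cumulative NPV stays negative through year 3 and only turns positive in year 4, which is the kind of multi-year payback horizon Section 6.11's total cost of ownership analysis asks decision-makers to budget for.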


Broader Perspectives

  1. O'Neil, C. (2016). Weapons of Math Destruction. Crown. Examines how poorly designed and unaccountable ML systems can amplify inequality and cause harm. Relevant to the failure modes discussion (Section 6.5), the governance requirements (Section 6.10), and the broader responsibility themes in Part 5 of this textbook. Accessible to general readers.

  2. Christin, A. (2020). Metrics at Work: Journalism and the Contested Meaning of Algorithms. Princeton University Press. An ethnographic study of how algorithmic metrics reshape professional practice. While focused on journalism, the insights about how optimization targets distort behavior are directly relevant to the "overfitting to the wrong objective" failure mode (Section 6.5). A reminder that the metrics we choose shape the organizations we build.

  3. Agrawal, A., Gans, J., & Goldfarb, A. (2018). Prediction Machines: The Simple Economics of Artificial Intelligence. Harvard Business Review Press. Three economists reframe AI as a technology that reduces the cost of prediction. This framework provides an elegant way to evaluate ML use cases: wherever prediction is currently expensive (human experts, manual processes), ML can create value by making prediction cheap. Relevant to the Five Questions, the economics section, and the overall business-case analysis.

  4. Kelleher, J.D. & Tierney, B. (2018). Data Science. MIT Press Essential Knowledge Series. A concise, accessible introduction to data science for non-technical readers. At 280 pages, it covers the key concepts without the depth of a full textbook. Useful as supplementary reading for MBA students who want a broader foundation before diving into Part 2's technical chapters.


These readings span academic papers, practitioner guides, business strategy books, and critical perspectives. For readers short on time, start with Huyen (2022) for practical system design, Sculley et al. (2015) for the economics of maintenance, and Lazer et al. (2014) for the most important cautionary tale in modern data science.