Chapter 39: Further Reading

Essential Sources

1. DJ Patil and Hilary Mason, Data Driven: Creating a Data Culture (O'Reilly, 2015)

The foundational text on building a data-driven organization, written by the first U.S. Chief Data Scientist (Patil) and the former Chief Data Scientist at Bitly (Mason). At 42 pages, it is more manifesto than textbook — but its brevity is a feature. Every page addresses a decision that a DS leader will face in the first year.

Reading guidance: Chapter 2 ("Collect the Right Data") addresses a problem this textbook deliberately deferred: what happens when the data science team is ready to build but the data infrastructure does not exist? Patil and Mason's pragmatic advice — start with the data you have, demonstrate value, then earn the investment for better data — is the organizational equivalent of Theme 6 (Simplest Model That Works). Chapter 3 ("Communicate Effectively") extends Section 39.6.2's executive communication framework with concrete examples of how data scientists at LinkedIn and other organizations built executive trust through structured reporting. Chapter 5 ("Building the Data Team") covers hiring and team structure at a level complementary to this chapter: where Section 39.2 provides taxonomies and heuristics, Patil and Mason provide war stories and judgment. The book's most enduring insight is its framing of data science as a culture, not a function — the same argument that closes this chapter. For a DS leader in their first 90 days, this is the single most useful thing to read.

2. Stefan Thomke, Experimentation Works: The Surprising Power of Business Experiments (Harvard Business Review Press, 2020)

The definitive academic treatment of experimentation culture in business organizations. Thomke, a Harvard Business School professor, draws on detailed case studies of Booking.com (running 25,000+ experiments per year), Microsoft (the Bing experimentation platform), and State Farm (experimentation in a regulated industry) to show how organizations build, sustain, and scale experimentation practices.

Reading guidance: Chapter 3 ("What Makes a Good Business Experiment") extends Chapter 33's treatment of rigorous experimentation from the statistical to the organizational level: how to design experiments that answer the right business question, how to build the organizational infrastructure that enables rapid experimentation, and how to create incentives that reward learning rather than just positive results. Chapter 5 ("Leading with Experiments") directly addresses the experimentation maturity spectrum from Section 39.4.1 — Thomke describes the transition from Level 2 (test-sometimes) to Level 4 (evidence-pervasive) with specific organizational mechanisms, including the "experimentation council" model that Booking.com uses to govern test volume, quality, and resource allocation. Chapter 7 ("Experimentation and Ethics") is particularly relevant for Meridian Financial and MediCore, covering how regulated industries can practice experimentation within legal and ethical boundaries. For readers who found Section 39.4.1's maturity spectrum useful, Thomke provides the detailed operational guidance for moving between levels.

3. Will Larson, An Elegant Puzzle: Systems of Engineering Management (Stripe Press, 2019)

A practitioner's guide to engineering management at scale, written by the former VP of Engineering at Calm and engineering leader at Stripe, Uber, and Digg. While focused on software engineering rather than data science specifically, the organizational design principles — team sizing, organizational structure, technical strategy, career frameworks — transfer directly.

Reading guidance: Chapter 3 ("Organizations") provides the theoretical framework behind Section 39.2's team structure discussion: Larson formalizes the concept of "organizational debt" (the organizational analog of technical debt) and describes how to restructure teams without destroying productivity. His "four states of a team" model — falling behind, treading water, repaying debt, innovating — is a diagnostic tool for assessing whether the DS team's current challenges are structural (requiring reorganization) or resource-driven (requiring hiring). Chapter 5 ("Culture") extends Section 39.4 with specific mechanisms for building engineering culture at scale: architecture reviews, postmortem practices, writing culture, and the distinction between "values" (what you believe) and "practices" (what you do). Chapter 7 ("Careers") provides the framework for the career ladder exercise in Exercise 39.23 — Larson's treatment of level expectations, scope of impact, and promotion criteria is the most practical available for technical organizations. For the progressive project (designing StreamRec's 30-person organization), Chapters 3 and 5 together provide the organizational design toolkit that complements this chapter's DS-specific content.
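Larson describes the four states qualitatively; as a rough diagnostic, they can be sketched as a classifier over a few team-level metrics. The metrics and thresholds below are this sketch's assumptions, not Larson's:

```python
from dataclasses import dataclass
from enum import Enum


class TeamState(Enum):
    """Larson's four states of a team (An Elegant Puzzle, ch. 3)."""
    FALLING_BEHIND = "falling behind"   # backlog grows faster than throughput
    TREADING_WATER = "treading water"   # keeping up, but no debt paydown
    REPAYING_DEBT = "repaying debt"     # spare capacity goes to debt
    INNOVATING = "innovating"           # debt is low; capacity funds new work


@dataclass
class TeamSnapshot:
    # Illustrative inputs; Larson does not prescribe these metrics.
    incoming_work_per_week: float   # new asks arriving
    completed_work_per_week: float  # asks closed
    debt_backlog_weeks: float       # estimated weeks of organizational/technical debt


def diagnose(s: TeamSnapshot) -> TeamState:
    """Classify a team into one of Larson's four states."""
    if s.incoming_work_per_week > s.completed_work_per_week:
        return TeamState.FALLING_BEHIND   # structural or hiring problem
    slack = s.completed_work_per_week - s.incoming_work_per_week
    if slack == 0:
        return TeamState.TREADING_WATER   # no capacity left over
    if s.debt_backlog_weeks > 0:
        return TeamState.REPAYING_DEBT    # spare capacity pays down debt
    return TeamState.INNOVATING


# A team closing 8 units/week against 10 arriving is falling behind.
print(diagnose(TeamSnapshot(10, 8, 6)).value)  # -> falling behind
```

The point of the diagnostic is the one Larson makes in prose: the falling-behind state calls for hiring, while the other three call for different management moves, so misdiagnosing the state leads to the wrong fix.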

4. Emily Glassberg Sands and Michael Li, "Building a Data Science Team" (O'Reilly Report, 2022) and Emily Robinson and Jacqueline Nolis, Build a Career in Data Science (Manning, 2020)

Two complementary resources on the human side of data science organizations. Glassberg Sands and Li's report (originally based on work at Coursera and The Data Incubator) covers organizational design from the leadership perspective: team structure, role definitions, hiring pipelines, and success metrics. Robinson and Nolis cover the same territory from the individual contributor perspective: how to choose the right organization, how to navigate organizational politics, how to communicate value, and how to build a career path.

Reading guidance: Glassberg Sands and Li's taxonomy of DS roles — data analyst, data scientist, machine learning engineer, data engineer, research scientist — provides a more granular role decomposition than Section 39.7's hiring plan. Their treatment of the "full-stack data scientist" myth is particularly relevant: the idea that one person can do data engineering, modeling, deployment, monitoring, and stakeholder communication is unrealistic for production work, and building a team requires explicit role differentiation. Robinson and Nolis's Chapter 12 ("Making Effective Presentations") extends Section 39.6.2's executive communication framework from the individual contributor's perspective — how to present results when you are not the VP of DS but a mid-career data scientist who needs to convince a skeptical product manager. Their Chapter 14 ("Leaving Your Job") is unusually honest about when the organizational environment is unfixable and the right decision is to leave — a perspective rarely found in management-focused resources. For readers of this chapter who are building their own careers rather than building organizations, Robinson and Nolis is the essential complement.

5. Board of Governors of the Federal Reserve System and Office of the Comptroller of the Currency, "Supervisory Guidance on Model Risk Management" (SR 11-7, OCC 2011-12, 2011)

The regulatory document that defines the organizational structure of model risk management in U.S. financial institutions. While not a data science textbook, this 22-page guidance document has shaped the organizational design of every DS team in U.S. banking, insurance, and consumer lending — including Meridian Financial.

Reading guidance: Section III ("Model Validation") defines the independence requirement that mandates Meridian's separate MRM function — the requirement that model validation be conducted by parties independent of the model development process. Section IV ("Governance, Policies, and Controls") describes the board-level oversight requirements that drive the quarterly Model Risk Committee reporting and annual model inventory review. For data scientists in regulated industries, understanding SR 11-7 is as important as understanding gradient descent — it defines the organizational constraints within which all modeling work must operate. For data scientists in unregulated industries, SR 11-7 provides a useful template for what voluntary model governance could look like: model inventories, independent validation, documentation standards, and board-level reporting. Many of the practices that SR 11-7 mandates (model cards, validation reports, monitoring dashboards) are best practices for any organization that deploys models at scale — the regulation simply makes them non-optional.
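For teams adopting SR 11-7's practices voluntarily, the core governance artifact is the model inventory. A minimal sketch of one inventory record follows; the field names and the annual revalidation cycle are this sketch's assumptions, since the guidance prescribes the controls (inventory, independence, periodic review, documentation) but not a schema:

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ModelInventoryRecord:
    # Illustrative schema; SR 11-7 requires an inventory but does not
    # prescribe field names.
    model_id: str
    owner: str                      # model development (first line of defense)
    validator: str                  # independent validation (second line)
    risk_tier: int                  # tier drives depth and frequency of validation
    last_validated: date
    docs: list[str] = field(default_factory=list)  # model card, validation report

    def independence_ok(self) -> bool:
        """SR 11-7's independence requirement: validation must not be
        performed by the model's developers."""
        return self.validator != self.owner

    def validation_overdue(self, as_of: date, cycle_days: int = 365) -> bool:
        """Periodic-review check; an annual cycle is a common convention,
        not a number stated in the guidance."""
        return (as_of - self.last_validated).days > cycle_days


# Example: a record whose validator sits outside the development team,
# checked against a review date.
rec = ModelInventoryRecord("pd-001", "credit-ds", "mrm", 1, date(2023, 1, 15))
print(rec.independence_ok(), rec.validation_overdue(date(2024, 6, 1)))
```

Even this toy record makes the organizational point concrete: the `validator != owner` check is the independence requirement of Section III, and the overdue check is what feeds the quarterly committee reporting described in Section IV.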