Part I: The ML Mindset
You know how to analyze data. You can compute averages, build visualizations, and run regressions. That puts you ahead of most people who claim to "work with data." But analysis and prediction are fundamentally different activities, and the gap between them is where most aspiring data scientists get stuck.
Analysis asks: What happened? Prediction asks: What will happen? The shift sounds minor. It is not. It changes how you frame problems, how you evaluate success, how you structure code, and even how you think about truth. A statistician who establishes that smoking causes cancer and a data scientist who predicts which patients will develop cancer are answering related but different questions, and they need different tools, different workflows, and different definitions of "correct."
Part I builds the mental framework you need before you touch an algorithm. Four chapters. Four foundations.
Chapter 1: From Analysis to Prediction reframes how you think about data problems. You will learn to distinguish between descriptive, inferential, and predictive tasks — and why the techniques that excel at one often fail at another.
Chapter 2: The Machine Learning Workflow maps the end-to-end lifecycle of an ML project. Not the textbook lifecycle (get data → train model → done) but the real one: problem framing, baseline establishment, iteration, evaluation, deployment, and the maintenance that never ends.
Chapter 3: Experimental Design and A/B Testing grounds your predictions in evidence. Building a model that predicts well offline is step one. Proving that the model improves outcomes in the real world — through properly designed experiments — is where the value is created.
Chapter 4: The Math Behind ML gives you the mathematical vocabulary for everything that follows. Probability, linear algebra, calculus, and loss functions — not as abstract exercises, but as the language your models speak. Every formula gets three presentations: the intuition, the notation, and the code.
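As a rough sketch of what that three-part presentation could look like, take mean squared error. It is chosen here only for illustration; the text above does not say which formulas Chapter 4 covers, and the function name and sample numbers below are hypothetical. The intuition: measure how far each prediction misses, and punish large misses more than small ones. The notation: MSE = (1/n) Σ (y_i − ŷ_i)², where y_i is the actual value, ŷ_i is the predicted value, and n is the number of examples. The code:

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    # Squaring makes every error non-negative and weights large misses more heavily.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Made-up numbers, purely for illustration.
y_true = [3.0, 5.0, 2.0]
y_pred = [2.0, 5.0, 4.0]

# Errors are 1, 0, and -2; squared they are 1, 0, and 4; their mean is 5/3.
mse = mean_squared_error(y_true, y_pred)
print(f"MSE = {mse:.4f}")
```

All three say the same thing: the sentence explains why the formula exists, the notation pins it down, and the code shows it is only a few lines of arithmetic.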
By the end of Part I, you will think about data problems differently than you did before. You will understand why models fail, how to evaluate them honestly, and what the math is actually doing under the hood. That mental model is worth more than any algorithm.
Progressive Project Milestone
In Part I, you meet StreamFlow — a subscription streaming analytics platform with 2.4 million subscribers and an 8.2% monthly churn rate. You will frame the churn prediction problem, define the target variable, and design the A/B test that will validate your model's business impact. No code yet. Just thinking — the kind of thinking that separates data scientists from data analysts.
What You Need
- Chapters 1–3 require no math beyond basic probability
- Chapter 4 assumes familiarity with high school algebra and the concept of a function
- A working Python and Jupyter setup (see Prerequisites)