Part II: Machine Learning Fundamentals
"All models are wrong, but some are useful." — George E. P. Box
With mathematical foundations in place, we now turn to the core question of machine learning: how do we build systems that learn patterns from data and use those patterns to make predictions on new, unseen examples?
Part II covers the classical machine learning algorithms that remain essential to every AI engineer. These methods are not relics of a pre-deep-learning era — they are practical tools used daily in industry, and they provide the conceptual scaffolding on which deep learning is built. Understanding why linear regression works, how decision boundaries form, and what cross-validation actually measures will make you a better deep learning practitioner.
We begin with supervised learning, covering regression and classification algorithms from linear models to ensemble methods. We then explore unsupervised learning and dimensionality reduction — techniques for finding structure in unlabeled data. Model evaluation and validation teach you how to honestly assess whether your models work. Feature engineering shows you how to transform raw data into representations that algorithms can exploit. We close with probabilistic and Bayesian methods, which provide a principled framework for incorporating prior knowledge and quantifying uncertainty.
Chapters in This Part
| Chapter | Title | Key Question |
|---|---|---|
| 6 | Supervised Learning: Regression and Classification | How do we learn input-output mappings from labeled data? |
| 7 | Unsupervised Learning and Dimensionality Reduction | How do we discover structure in data without labels? |
| 8 | Model Evaluation, Selection, and Validation | How do we know if our model actually works? |
| 9 | Feature Engineering and Data Pipelines | How do we transform raw data into useful model inputs? |
| 10 | Probabilistic and Bayesian Methods | How do we incorporate uncertainty into our predictions? |
What You Will Be Able to Do After Part II
- Train and evaluate regression and classification models using scikit-learn (a minimal sketch of this workflow follows this list)
- Apply clustering and dimensionality reduction to explore datasets
- Use cross-validation, proper metrics, and statistical tests to compare models
- Build reproducible data processing pipelines
- Reason about models probabilistically using Bayesian inference
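To make these outcomes concrete, here is a minimal sketch of the kind of workflow Part II builds toward, assuming scikit-learn is installed. The dataset, model, and metric are illustrative placeholders, not choices made in the chapters themselves.

```python
# A minimal preview of the Part II workflow: a preprocessing-plus-model
# pipeline evaluated with cross-validation. The breast cancer dataset and
# logistic regression are illustrative stand-ins for whatever data and
# model a real project would use.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Labeled data: feature matrix X and target vector y (supervised learning).
X, y = load_breast_cancer(return_X_y=True)

# A reproducible pipeline: scale features, then fit a linear classifier.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validation estimates how the model generalizes to unseen data.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"Cross-validated accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Wrapping preprocessing and the model in a single pipeline keeps the scaling step inside each cross-validation fold, which avoids leaking information from the held-out data into training.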
Prerequisites
- Part I (mathematical foundations and Python skills)
- Familiarity with basic data manipulation in pandas