Further Reading: What's Next

This is the final further reading section of the book — and appropriately, it's the most comprehensive. Think of it as your "next-step library," organized by career path and skill area. You don't need to read everything here. Pick the resources that match your six-month roadmap and dive in.


Tier 1: Verified Sources

General Data Science (Next Level)

Aurélien Géron, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (O'Reilly, 3rd edition, 2022). The natural next step after this book's machine learning chapters. Géron covers everything from classical ML to deep learning (including CNNs, RNNs, transformers, and reinforcement learning) with clear explanations and practical code. This is the single most recommended "next book" for readers who've completed an introductory data science course.

Wes McKinney, Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter (O'Reilly, 3rd edition, 2022). You've been using pandas throughout this book, but McKinney's definitive reference goes deeper into advanced features: MultiIndex, advanced groupby, time series functionality, and performance optimization. Invaluable as a desk reference for professional pandas work.

Jake VanderPlas, Python Data Science Handbook: Essential Tools for Working with Data (O'Reilly, 2nd edition, 2023). Comprehensive coverage of NumPy, pandas, matplotlib, and scikit-learn. Well-organized for reference use — when you need to look up "how do I do X in matplotlib?", VanderPlas usually has the answer.

SQL

Anthony DeBarros, Practical SQL: A Beginner's Guide to Storytelling with Data (No Starch Press, 2nd edition, 2022). The best introductory SQL book for data professionals. DeBarros teaches SQL through real-world data analysis examples — not abstract exercises. Covers PostgreSQL specifically but the concepts transfer to any SQL database. If you buy one SQL book, make it this one.

Alan Beaulieu, Learning SQL: Generate, Manipulate, and Retrieve Data (O'Reilly, 3rd edition, 2020). A more traditional SQL reference that covers the language comprehensively. Less narrative than DeBarros but broader in its coverage of SQL features. Good as a second book or reference.

Statistics and Inference

Richard McElreath, Statistical Rethinking: A Bayesian Course with Examples in R and Stan (CRC Press, 2nd edition, 2020). The best bridge between introductory statistics and Bayesian thinking. McElreath is an exceptional teacher who makes complex statistical ideas accessible through clear reasoning and vivid examples. The code examples are in R and Stan, but the concepts are language-agnostic, and Python translations are available online.

Joshua D. Angrist and Jörn-Steffen Pischke, Mastering 'Metrics: The Path from Cause to Effect (Princeton University Press, 2015). If you want to understand causal inference — moving beyond "correlation is not causation" to "here's how we actually establish causation" — this is the most accessible entry point. Covers instrumental variables, regression discontinuity, differences-in-differences, and other causal inference methods through clear examples. Essential for anyone interested in A/B testing, experimental design, or policy analysis.

David Spiegelhalter, The Art of Statistics: How to Learn from Data (Basic Books, 2019). Previously recommended in Chapter 1, but worth revisiting now that you have more context. Spiegelhalter's explanations of statistical concepts through real-world cases become even more valuable once you've actually done statistical analysis yourself.

Machine Learning and Deep Learning

Jeremy Howard and Sylvain Gugger, Deep Learning for Coders with Fastai and PyTorch (O'Reilly, 2020). The companion book to the fast.ai course. Takes a practical, top-down approach to deep learning — you build working systems before diving into theory. Excellent for people who learn by doing rather than by reading equations.

Christopher M. Bishop, Pattern Recognition and Machine Learning (Springer, 2006). A more mathematical and comprehensive treatment of machine learning. Not for everyone — requires comfort with linear algebra and probability theory — but invaluable if you want to truly understand why ML algorithms work, not just how to use them. Consider this a reference for when you want deeper understanding of a specific method.

Software Engineering for Data Scientists

Al Sweigart, Beyond the Basic Stuff with Python (No Starch Press, 2020). Bridges the gap between "I can write Python scripts" and "I can write professional, maintainable Python code." Covers object-oriented programming, code formatting, documentation, testing, and version control at a depth that data scientists need but courses rarely provide.

Chip Huyen, Designing Machine Learning Systems (O'Reilly, 2022). For anyone considering ML engineering. Covers the full lifecycle of ML in production: data engineering, feature engineering, model training, deployment, monitoring, and iteration. Bridges the gap between building a model in a notebook and running a model in production.

Communication and Visualization

Cole Nussbaumer Knaflic, Storytelling with Data: A Data Visualization Guide for Business Professionals (Wiley, 2015). Referenced throughout this book, and worth reading in full for professional data work. Knaflic's framework for creating effective data presentations (identify your audience, choose appropriate charts, eliminate clutter, and direct attention) will make you a better communicator immediately.

Edward Tufte, The Visual Display of Quantitative Information (Graphics Press, 2nd edition, 2001). The classic text on information design. Dense and opinionated, but Tufte's principles are timeless. Best appreciated now that you've created dozens of visualizations yourself.

Career and Professional Development

Emily Robinson and Jacqueline Nolis, Build a Career in Data Science (Manning, 2020). The most comprehensive data science career guide available. Covers finding jobs, preparing for interviews, negotiating offers, succeeding in your first role, and advancing as a senior practitioner. Written by two experienced data scientists who understand both sides of the hiring table. If you read one career book, make it this one.


Tier 2: Attributed Resources

Courses and Online Learning

fast.ai, "Practical Deep Learning for Coders" (Jeremy Howard and Rachel Thomas). The best free deep learning course, taught with a philosophy similar to this book: learn by doing, start with practical applications, understand theory as needed. Available at course.fast.ai. Includes a companion book listed above.

Andrew Ng's Machine Learning and Deep Learning courses (Coursera/DeepLearning.AI). Ng's courses are among the most popular in the world for good reason: clear explanations, well-structured content, and a focus on intuition alongside mathematics. The original Machine Learning course (now offered as a newer, Python-based Machine Learning Specialization) and the five-course Deep Learning Specialization are both excellent.

Stanford CS229 (Machine Learning) and CS231n (Computer Vision) lecture recordings. Available free on YouTube. These are actual Stanford lectures — rigorous, mathematical, and comprehensive. Best for learners who want academic depth alongside practical skill.

Mode Analytics SQL Tutorial. A practical, free SQL tutorial designed for data analysts. Covers basic queries through advanced window functions using real datasets. Good for building SQL skills quickly.

Kaggle Learn. Kaggle offers free micro-courses on Python, pandas, SQL, machine learning, and more. Most courses take a few hours and include hands-on exercises. Good for quick skill building and review.

Blogs and Publications Worth Following

Towards Data Science (Medium). The largest collection of data science blog posts. Quality varies, but the best posts are excellent. Good for seeing how practitioners approach real problems and communicate their work.

FiveThirtyEight. Data-driven journalism at its best. Excellent models of how to communicate statistical analysis to a general audience. Read their election, sports, and social science coverage for inspiration.

Distill (distill.pub). Interactive, beautifully designed explanations of machine learning concepts. Less active now, but the archived articles on neural networks, attention mechanisms, and representation learning are among the best explanatory resources on the internet.

Chip Huyen's blog. Huyen writes clearly about ML systems, career development, and the practical challenges of deploying ML in production. Her "Machine Learning Interviews" resource is also invaluable.

Julia Evans (jvns.ca). Evans writes delightful, accessible explanations of technical concepts (networking, Linux, debugging, SQL). Her "zine" format makes complex topics approachable. Not data-science-specific but excellent for building the systems understanding that data professionals need.

Communities

Reddit: r/datascience, r/learnpython, r/learnmachinelearning, r/statistics — active communities for asking questions, sharing projects, and getting career advice.

dbt Community Slack. If you're interested in analytics engineering or data engineering, this is one of the most active and welcoming data communities online.

MLOps Community. Focused on the practical challenges of deploying and maintaining ML systems. Valuable for anyone interested in ML engineering.

PyData meetups. A global network of local meetups focused on Python for data. Many host talks, workshops, and networking events. Check meetup.com for your city.

Women in Data Science (WiDS). A global initiative centered at Stanford that hosts an annual conference, local events, and community resources. Open to participants of all genders.

Podcasts

DataFramed (DataCamp). Interviews with data scientists across industries. Good for understanding the range of data science applications.

Not So Standard Deviations (Roger Peng and Hilary Parker). A conversational podcast about data science in practice. Both hosts are experienced data scientists with strong statistical backgrounds.

Linear Digressions. A (concluded but archived) podcast that explained one machine learning or data science concept per episode in accessible terms.


Path: Data Analyst

  1. Start with Practical SQL by DeBarros
  2. Learn Tableau or Power BI (free tutorials available on both platforms)
  3. Read Storytelling with Data by Knaflic
  4. Build two portfolio projects demonstrating SQL + visualization
  5. Apply to jobs while continuing to learn

Path: Data Scientist

  1. Start with Practical SQL (SQL is still critical for data scientists)
  2. Work through Hands-On Machine Learning by Géron (Chapters 1-9 for classical ML, then Chapters 10-14 for deep learning foundations)
  3. Study A/B testing and experimental design (Angrist and Pischke's Mastering 'Metrics or an online course)
  4. Read Statistical Rethinking by McElreath for Bayesian thinking
  5. Build two to three portfolio projects demonstrating advanced analytical skills

Path: ML Engineer

  1. Work through Deep Learning for Coders (fast.ai book) or the fast.ai course
  2. Read Beyond the Basic Stuff with Python by Sweigart for software engineering fundamentals
  3. Read Designing Machine Learning Systems by Huyen
  4. Learn Docker, cloud deployment (pick one: AWS, GCP, or Azure), and basic MLOps
  5. Build a project that deploys a model as a web service

Path: Data Engineer

  1. Master SQL with Practical SQL and then advanced resources (window functions, query optimization)
  2. Learn dbt (data build tool) through the free dbt Learn course
  3. Learn Apache Spark through a course or Spark: The Definitive Guide (O'Reilly, 2018)
  4. Study a cloud data warehouse (pick one: Amazon Redshift, Google BigQuery, or Snowflake)
  5. Build a data pipeline project using Airflow or a similar orchestration tool

A Final Note on Learning

You've spent months learning data science through a single book. That's an achievement worth celebrating.

But this book was always meant to be a beginning, not an end. The resources listed here are your next steps — not obligations, but invitations. Pick the ones that match your goals, your interests, and your learning style. Build things. Break things. Ask questions. Share your work.

The data science community is vast, welcoming, and always hungry for new voices and new perspectives. Your voice matters. Your perspective matters. The questions only you would think to ask, the analyses only you would design, the stories only you would tell — those are what the field needs.

Welcome to data science. For real, this time.

The adventure continues.