Further Reading: Setting Up Your Toolkit

You now have a working data science environment and know your way around a Jupyter notebook. If you want to go deeper on any of the tools before moving on, here's where to look. As always, pick what interests you — there's no requirement to read any of these before Chapter 3.


Tier 1: Verified Sources

These are published books and established references with full bibliographic details.

Al Sweigart, Automate the Boring Stuff with Python: Practical Programming for Total Beginners (No Starch Press, 2nd edition, 2019). If you enjoyed running your first Python code and want more of that immediate gratification, Sweigart's book is a good fit. It teaches Python through practical projects (renaming files, sending emails, scraping websites) that produce visible results quickly. It assumes zero programming experience and keeps a friendly, encouraging tone throughout. The book is also free to read online at the author's website, which makes it one of the most accessible Python resources around. We'll draw on Python fundamentals in Chapters 3-5, and Sweigart's book is an excellent companion for that material.

Allen B. Downey, Think Python: How to Think Like a Computer Scientist (O'Reilly, 3rd edition, 2024). Downey's book takes a more conceptual approach than Sweigart's: it doesn't just teach you Python syntax; it teaches you to think like a programmer. If you found yourself enjoying the "why" behind the tools (Why does Jupyter use a browser? Why does Python have two modes?), Think Python will satisfy that curiosity. The third edition uses Python 3.12 and covers modern Python idioms. Like Sweigart's book, it's available online for free. Think Python is particularly valuable preparation for Chapters 3 and 4 of our textbook.

Wes McKinney, Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter (O'Reilly, 3rd edition, 2022). Also mentioned in Chapter 1's Further Reading, McKinney's book includes an excellent section on Jupyter Notebook usage and best practices. McKinney created the pandas library, so his perspective on how notebooks fit into the data science workflow carries weight. You won't need this until Part II (Chapters 7-13), but the early chapters on the Python environment and Jupyter are relevant now.

Jake VanderPlas, Python Data Science Handbook: Essential Tools for Working with Data (O'Reilly, 2nd edition, 2023). Chapter 1 of VanderPlas's handbook covers IPython (the enhanced Python shell that Jupyter is built on) and Jupyter notebooks in more depth than most introductory textbooks. If you want to understand the magic commands (%timeit, %matplotlib inline), advanced shell features, and the full scope of what the IPython kernel can do, this is the reference. Like McKinney's book, it will become increasingly useful as you progress through our textbook.
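
To give a feel for what a timing magic like %timeit reports, here is a minimal plain-Python sketch using the standard library's timeit module. It is an analogue rather than the magic itself, and the statement being timed (sum(range(1000))) and the run count of 1,000 are arbitrary choices for illustration.

```python
import timeit

# Run the statement 1,000 times and get the total elapsed time in seconds.
total_seconds = timeit.timeit("sum(range(1000))", number=1000)

# Divide by the run count for a rough per-run average.
per_run = total_seconds / 1000
print(f"about {per_run * 1e6:.1f} microseconds per run")
```

In a notebook cell, the roughly equivalent one-liner is %timeit sum(range(1000)), which chooses the number of runs for you. The exact timing will differ from machine to machine.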


Tier 2: Attributed Resources

These are online resources and documentation that are well-known in the community. We provide enough detail for you to find them, but give at most a site name rather than full URLs, since links may change.

The Official Jupyter Documentation. The Jupyter project maintains comprehensive documentation covering the classic Notebook interface, JupyterLab, kernels, configuration, and more. Search for "Jupyter Notebook documentation" to find the official docs. The "Notebook Basics" section is a good complement to what you learned in this chapter, with interactive examples you can try in your own notebook.

The Official Python Tutorial (docs.python.org). Python's official tutorial, maintained by the Python Software Foundation, is one of the best language tutorials ever written. It's thorough, well-organized, and freely available. Start with the section "An Informal Introduction to Python" for a gentle entry point that parallels what you did in Section 2.4. We'll be building on this material heavily in Chapters 3 and 4.

The Anaconda Documentation (docs.anaconda.com). If you had any installation issues or want to learn about managing environments, updating packages, or using conda from the command line, the Anaconda documentation is the authoritative source. The "Getting Started" section walks through common tasks like creating environments, installing packages, and switching between Python versions.
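
If you want a quick look at which Python your notebook is actually using before digging into the documentation, a short check like the one below can help. It is only a sketch: the package list is a hypothetical example, so swap in whatever you expect to have installed.

```python
import importlib.util
import sys

# Show which Python interpreter is running this code.
print("Python version:", sys.version.split()[0])
print("Interpreter path:", sys.executable)

# Hypothetical checklist of packages; adjust it to your own setup.
packages = ["numpy", "pandas", "matplotlib", "notebook"]

# find_spec returns None when a top-level package cannot be found.
status = {}
for name in packages:
    status[name] = importlib.util.find_spec(name) is not None

for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If the interpreter path does not point inside your Anaconda folder, the notebook is probably running a different Python installation than the one you set up in this chapter.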

"A Gallery of Interesting Jupyter Notebooks" (GitHub). This curated collection on GitHub showcases notebooks used in data science, journalism, education, and research. Browsing through these will give you a sense of what's possible with notebooks and expose you to good organizational practices. Search for "A gallery of interesting Jupyter Notebooks GitHub" to find it. Even if you can't understand the code yet, pay attention to how the authors use Markdown to structure their narratives.

Corey Schafer's Python tutorials (YouTube). Schafer's YouTube channel includes clear, well-paced tutorials on Python basics, Jupyter notebooks, Anaconda installation, and many other topics. His videos are widely recommended in the Python learning community for their accuracy and production quality. If you prefer video instruction as a supplement to reading, search for "Corey Schafer Jupyter Notebook tutorial" or "Corey Schafer Python tutorial for beginners."


Depending on what caught your attention in this chapter:

  • If you want more practice with the Jupyter interface: Work through all the exercises in this chapter, especially the keyboard shortcut drills. Then browse the "Gallery of Interesting Jupyter Notebooks" to see how experienced practitioners use the tool.

  • If you want to get ahead on Python before Chapter 3: Read the first few chapters of either Automate the Boring Stuff or Think Python. Both are free online, and both cover the same fundamentals we'll address in Chapter 3 — but from a different angle, which can reinforce your understanding.

  • If you had installation problems: Consult the Anaconda documentation's troubleshooting section, or try Corey Schafer's installation video tutorial. If your computer can't run Anaconda at all, set up Google Colab (free, runs in a browser) and continue with the book — you'll learn the same concepts.

  • If you're curious about JupyterLab: Launch it from Anaconda Navigator or by typing jupyter lab in your terminal. Explore the interface. Everything you learned about cells, Markdown, and kernels works identically. JupyterLab just adds a file browser, tabbed editing, and a more modern layout.

  • If you can't wait to start analyzing real data: Patience. Chapter 3 teaches you variables and data types. Chapter 4 teaches you control flow and functions. Chapter 5 teaches you data structures. And Chapter 6 — that's where you load your first real dataset. Each step is building toward that moment, and rushing will leave gaps that slow you down later.

Happy exploring. And remember — the hardest part is already behind you. You installed the tools. You ran the code. Everything else is practice.