Prerequisites
This textbook assumes a modest but specific body of prior knowledge. The material progresses from accessible foundations to genuinely advanced techniques, and we have worked to make the early chapters approachable for readers without extensive quantitative training. However, a minimum baseline in programming and mathematics is necessary to engage productively with the text. This chapter describes what you need to know, provides a short self-assessment, and offers guidance for readers who need to strengthen their preparation before diving in.
Python Programming
We use Python as the implementation language throughout the book. You do not need to be an expert programmer, but you should be comfortable with the following:
- Basic syntax and data types. Variables, strings, integers, floats, booleans, lists, dictionaries, and tuples. You should be able to write and interpret assignment statements, f-strings, and basic type conversions without hesitation.
- Control flow. If/elif/else statements, for loops, while loops, and list comprehensions. You should understand how to iterate over a list, filter elements conditionally, and build new data structures from existing ones.
- Functions. Defining functions with `def`, passing arguments, returning values, and understanding scope. You should be comfortable writing a function that takes numerical inputs, performs a calculation, and returns a result.
- Imports and libraries. Using `import` to bring in external packages, calling functions from those packages, and reading basic library documentation. Prior experience with NumPy or pandas is helpful but not required; we introduce these tools as we use them.
- File I/O. Reading data from CSV files and writing results to files. The `pandas.read_csv()` function handles most of our data loading needs, and we walk through its usage when it first appears.
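To calibrate the level of fluency described in the list above, here is a minimal sketch that combines a list of dictionaries, a function, a list comprehension, a loop, and f-strings. The data and the names `bets` and `total_profit` are hypothetical and chosen only for illustration. If every line reads naturally to you, your grasp of these basics is likely sufficient.

```python
# Hypothetical sample data: each bet records a stake and a net result.
bets = [
    {"team": "Hawks", "stake": 50.0, "profit": 45.0},
    {"team": "Owls", "stake": 20.0, "profit": -20.0},
    {"team": "Bears", "stake": 100.0, "profit": 90.0},
]


def total_profit(records):
    """Return the summed net profit across a list of bet records."""
    return sum(record["profit"] for record in records)


# List comprehension: keep only the team names of winning bets.
winners = [bet["team"] for bet in bets if bet["profit"] > 0]

# Control flow plus f-strings for formatted output.
for bet in bets:
    outcome = "won" if bet["profit"] > 0 else "lost"
    print(f"{bet['team']}: {outcome} {abs(bet['profit']):.2f} on a {bet['stake']:.2f} stake")

print(f"Winning teams: {winners}")
print(f"Net profit: {total_profit(bets):.2f}")
```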
Python Self-Test
If you can complete the following task without consulting a reference, your Python skills are sufficient for this book:
Write a function called calculate_payout that takes two arguments: a wager amount (a float) and American odds (an integer). The function should return the net profit for a winning bet. Recall that for positive odds (e.g., +150), the profit on a \$100 wager equals the odds value. For negative odds (e.g., -150), you must wager the absolute value of the odds to win \$100. Your function should handle both cases and work for any wager amount, not just \$100.
If you can write this function confidently, you are ready. If you struggled or could not complete it, we recommend working through an introductory Python course before proceeding. Several excellent free resources exist, including the official Python Tutorial at docs.python.org, Al Sweigart's Automate the Boring Stuff with Python, and the Python track on freeCodeCamp. Investing two to four weeks in Python fundamentals will pay substantial dividends throughout this book.
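Once you have attempted the self-test on your own, you may want something to compare your answer against. The following is one possible solution, not the only correct one. Raising an error for odds of zero is our own assumption, since the task statement does not say how to treat that input.

```python
def calculate_payout(wager, odds):
    """Return the net profit on a winning bet at American odds.

    wager: amount staked (float)
    odds:  American odds (int), e.g. +150 or -150
    """
    if odds > 0:
        # Positive odds: profit per 100 wagered equals the odds value.
        return wager * odds / 100
    if odds < 0:
        # Negative odds: you must wager abs(odds) to win 100.
        return wager * 100 / abs(odds)
    # Assumption: zero is not a meaningful American odds value.
    raise ValueError("American odds cannot be zero")


print(calculate_payout(100.0, 150))   # 150.0
print(calculate_payout(150.0, -150))  # 100.0
print(calculate_payout(50.0, 200))    # 100.0
```

Your version may differ in structure. What matters is that it scales correctly with the wager and handles both signs of the odds.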
Mathematics
The mathematical demands of this book vary by chapter. The foundational chapters (Part I) require only algebra and basic probability. The modeling chapters (Parts II and III) require a working knowledge of statistics. The most advanced chapters (Parts IV, V, and VII) occasionally draw on calculus and linear algebra, though we provide enough context for readers without that background to follow the reasoning.
At minimum, you should be comfortable with:
- Algebra. Manipulating equations, solving for unknowns, working with fractions and percentages, and understanding functional notation (e.g., f(x) = 2x + 3). These skills are used constantly throughout the book for odds conversions, expected value calculations, and model specifications.
- Basic probability. The concept of probability as a number between 0 and 1, the complement rule (P(not A) = 1 - P(A)), the addition rule for mutually exclusive events, and the multiplication rule for independent events. You should understand what it means to say that an event has a 30% probability of occurring.
- Descriptive statistics. Mean, median, standard deviation, and variance. You should understand what these quantities measure and be able to interpret them. For example, you should understand that a mean prediction error of zero does not imply that a model is accurate, because the errors could be large in magnitude and cancel out.
- Basic statistical inference. A conceptual understanding of hypothesis testing, p-values, and confidence intervals is helpful for Parts II and III. You do not need to be able to derive these from first principles—we develop the relevant theory as needed—but prior exposure to the ideas will make the material easier to absorb.
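The probability rules and the point about mean prediction error in the list above can be checked with a few lines of arithmetic. This is a minimal sketch using only the standard library. All probabilities and error values are hypothetical numbers chosen for illustration.

```python
import statistics

# Complement rule: P(not A) = 1 - P(A).
p_win = 0.30
p_not_win = 1 - p_win

# Addition rule for mutually exclusive events: a match cannot end
# as both a home win and a draw, so the probabilities add.
p_home_win = 0.45
p_draw = 0.25
p_home_win_or_draw = p_home_win + p_draw

# Multiplication rule for independent events: two unrelated games
# in which each favorite wins with the stated probability.
p_fav_a = 0.60
p_fav_b = 0.50
p_both_favorites = p_fav_a * p_fav_b

print(f"P(not win) = {p_not_win:.2f}")
print(f"P(home win or draw) = {p_home_win_or_draw:.2f}")
print(f"P(both favorites win) = {p_both_favorites:.2f}")

# A mean error of zero does not imply accuracy: these errors cancel
# out on average, yet every individual prediction misses.
errors = [-8.0, 8.0, -3.0, 3.0]
mean_error = statistics.mean(errors)
mean_abs_error = statistics.mean([abs(e) for e in errors])

print(f"Mean error = {mean_error:.1f}")             # 0.0
print(f"Mean absolute error = {mean_abs_error:.1f}")  # 5.5
```

The mean absolute error is used here only as one simple way to measure error magnitude. Other summaries, such as the standard deviation of the errors, would expose the same problem.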
For readers who wish to strengthen their mathematical preparation, we recommend the following resources: Naked Statistics by Charles Wheelan for an intuitive, non-technical introduction to statistical thinking; OpenIntro Statistics (freely available at openintro.org) for a more formal treatment at the introductory undergraduate level; and the Khan Academy courses on probability and statistics for interactive, self-paced learning.
Calculus appears occasionally in derivations, particularly when we optimize objective functions or work with continuous probability distributions. If you have taken a first course in calculus, these sections will be straightforward. If you have not, you can follow the arguments by accepting the optimization results without working through every derivative yourself—we always state the results clearly and explain their intuitive meaning.
Linear algebra is used in the chapters on regression, machine learning, and portfolio optimization. Familiarity with vectors, matrices, matrix multiplication, and the concept of a system of linear equations is helpful. The NumPy library handles the computational heavy lifting, so you will not need to perform matrix operations by hand, but understanding what the operations represent will deepen your comprehension of the models.
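To make concrete what a matrix operation represents, the sketch below multiplies a matrix by a vector in plain Python and uses the result to check a solution to a small system of linear equations. The helper `mat_vec` and the example system are hypothetical and written only for illustration. In practice you would let NumPy do this work.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers).

    Each entry of the result is the dot product of one row with the vector.
    """
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Hypothetical system of linear equations:
#   2x +  y = 5
#    x + 3y = 10
A = [[2.0, 1.0],
     [1.0, 3.0]]
candidate = [1.0, 3.0]  # proposed solution: x = 1, y = 3

# Multiplying A by the candidate reproduces the right-hand sides,
# which confirms that the candidate solves the system.
print(mat_vec(A, candidate))  # [5.0, 10.0]

# With NumPy arrays, the same product is written A @ candidate
# (not executed here, to keep this sketch free of dependencies).
```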
Computing Setup
Before starting Chapter 1, we recommend having the following in place:
- Python 3.10 or later installed on your machine. Verify your installation by opening a terminal and running `python --version` (or `python3 --version` on some systems).
- A package manager. If you installed Python via Anaconda, you already have `conda`. Otherwise, `pip` (which ships with Python) is sufficient. Verify with `pip --version`.
- A text editor or IDE. Any environment where you can write and execute Python scripts will work. We recommend VS Code (free, cross-platform, excellent Python support) or JupyterLab (ideal for interactive exploration). PyCharm is another strong choice if you prefer a full-featured IDE.
- Core packages installed. Run the following command to install the packages used in the first several parts of the book:
pip install numpy pandas matplotlib seaborn scikit-learn statsmodels scipy requests
Verify the installation by opening a Python interpreter and running:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
print("Setup complete. NumPy version:", np.__version__)
If this runs without errors, your environment is ready.
- Companion repository cloned. The companion code repository contains all datasets, code listings, and Jupyter notebooks organized by chapter. We recommend cloning it to a local directory before you begin so that you can run examples and work through exercises without any additional setup steps.
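The verification snippet in the checklist above imports only three of the installed packages. If you would like a more thorough check, the following sketch tests the Python version and looks for every package named in the install command without importing them. The function name `check_environment` and the structure of its report are our own choices for illustration. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util
import sys

# Maps each package name from the pip install command to its import name.
PACKAGES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "statsmodels": "statsmodels",
    "scipy": "scipy",
    "requests": "requests",
}


def check_environment(min_version=(3, 10)):
    """Report whether Python is recent enough and which packages are missing."""
    python_ok = sys.version_info >= min_version
    missing = [
        pip_name
        for pip_name, import_name in PACKAGES.items()
        if importlib.util.find_spec(import_name) is None
    ]
    return {"python_ok": python_ok, "missing": missing}


report = check_environment()
if not report["python_ok"]:
    print("Python 3.10 or later is recommended; found", sys.version.split()[0])
if report["missing"]:
    print("Missing packages:", ", ".join(report["missing"]))
if report["python_ok"] and not report["missing"]:
    print("Setup complete.")
```

Any package reported as missing can be installed by rerunning the `pip install` command for that name.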
A Final Note on Preparation
If you meet the prerequisites described above—basic Python fluency, algebraic comfort, and a conceptual grasp of probability and statistics—you are ready to begin. Do not wait until you feel perfectly prepared; some of the most effective learning happens when you encounter a technique slightly beyond your current comfort zone and work through it with the help of the text. The early chapters are deliberately paced to build confidence and fill gaps. Trust the progression, do the exercises, and you will find your footing quickly.