Chapter 3 Exercises: Python Environment Setup

Introduction

These exercises will help you solidify your understanding of Python environment setup and configuration. Complete them in order, as later exercises build on earlier ones. Each exercise carries a difficulty rating to help you gauge your progress.

Difficulty Levels:

- Beginner: Essential skills everyone should master
- Intermediate: Skills for independent work
- Advanced: Professional-level competencies

Section A: Python Installation and Basics (Exercises 1-8)

Exercise 1: Verify Python Installation

Difficulty: Beginner

Open your command prompt or terminal and complete the following tasks:

Check your Python version using the command line
Check your pip version
Determine the path where Python is installed
List the first 10 packages currently installed in your base environment

Expected Commands:

python --version
pip --version
python -c "import sys; print(sys.executable)"
pip list | head -12  # two header lines, then the first 10 packages

Deliverable: Screenshot or text output of all four commands.

Exercise 2: Python Interactive Mode

Difficulty: Beginner

Enter Python's interactive mode and perform the following calculations:

Calculate the field goal percentage for a player who made 432 shots out of 892 attempts
Calculate true shooting attempts (TSA) using the formula TSA = FGA + 0.44 * FTA for a player with 756 FGA and 412 FTA
Create a list of the five traditional basketball positions
Create a dictionary with three players and their points per game

Deliverable: A text file with your Python commands and outputs.
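
As a way to check the arithmetic in the first two tasks, the two formulas can be sketched as small functions. The function names below are illustrative and not part of the exercise; in interactive mode the same expressions can be typed directly at the prompt.

```python
def field_goal_pct(made: int, attempts: int) -> float:
    """Return field goal percentage as a fraction between 0.0 and 1.0."""
    if attempts == 0:
        return 0.0  # avoid dividing by zero for a player with no attempts
    return made / attempts


def true_shooting_attempts(fga: int, fta: int) -> float:
    """Return TSA = FGA + 0.44 * FTA."""
    return fga + 0.44 * fta


fg_pct = field_goal_pct(432, 892)
tsa = true_shooting_attempts(756, 412)
print(f"FG%: {fg_pct:.3f}")  # prints FG%: 0.484
print(f"TSA: {tsa:.2f}")     # prints TSA: 937.28
```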

Exercise 3: pip Fundamentals

Difficulty: Beginner

Without installing anything, use pip commands to answer the following:

What is the latest version of pandas available on PyPI?
What are the dependencies of the seaborn package?
Find all packages in your current environment that contain "data" in their name
Generate a requirements.txt file from your current environment

Commands to use:

pip index versions pandas
pip show seaborn
pip list | grep -i data
pip freeze > my_requirements.txt

Deliverable: Answers to each question with supporting command output.

Exercise 4: Understanding Package Versions

Difficulty: Beginner

Create a file called version_check.py that:

Imports pandas, numpy, matplotlib, and seaborn
Prints the version of each library
Prints whether each version meets the minimum requirements:
- pandas >= 2.0.0
- numpy >= 1.24.0
- matplotlib >= 3.7.0
- seaborn >= 0.12.0

Expected Output:

Package Version Check
====================
pandas: 2.0.3 [PASS]
numpy: 1.24.3 [PASS]
matplotlib: 3.7.2 [PASS]
seaborn: 0.12.2 [PASS]

Exercise 5: Command Line Python Scripts

Difficulty: Intermediate

Create a Python script called quick_stats.py that accepts command-line arguments:

python quick_stats.py --points 28 --fga 22 --fta 8 --rebounds 10 --assists 7

The script should calculate and display:

- Field goal attempts
- True shooting attempts
- A simple efficiency rating (points + rebounds + assists)

Hint: Use the argparse module.
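
One possible shape for the script is sketched below, assuming every flag takes an integer. The helper names build_parser and summarize are hypothetical; the demo call at the bottom passes the example values explicitly so the sketch runs without a command line.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Define the command-line flags used in the example invocation."""
    parser = argparse.ArgumentParser(description="Quick single-game stat line.")
    parser.add_argument("--points", type=int, required=True)
    parser.add_argument("--fga", type=int, required=True, help="field goal attempts")
    parser.add_argument("--fta", type=int, required=True, help="free throw attempts")
    parser.add_argument("--rebounds", type=int, default=0)
    parser.add_argument("--assists", type=int, default=0)
    return parser


def summarize(args: argparse.Namespace) -> dict:
    """Compute the three values the exercise asks for."""
    return {
        "fga": args.fga,
        "tsa": args.fga + 0.44 * args.fta,
        "efficiency": args.points + args.rebounds + args.assists,
    }


def main(argv=None) -> None:
    # argv=None makes argparse read sys.argv, which is what a real script wants.
    args = build_parser().parse_args(argv)
    for name, value in summarize(args).items():
        print(f"{name}: {value}")


if __name__ == "__main__":
    # Demo call with the example values; in quick_stats.py, call main()
    # with no arguments so the flags come from the command line.
    main(["--points", "28", "--fga", "22", "--fta", "8",
          "--rebounds", "10", "--assists", "7"])
```

Separating parsing (build_parser) from calculation (summarize) keeps the arithmetic testable without going through the command line.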

Exercise 6: pip Advanced Usage

Difficulty: Intermediate

Perform the following pip operations:

Install a specific older version of requests (2.28.0)
Check which packages depend on this version
Upgrade to the latest version
Create a constraints file that pins numpy to version 1.24.3

Deliverable: Document each command and its output.

Exercise 7: Investigating Package Metadata

Difficulty: Intermediate

Write a Python script that uses importlib.metadata (preferred) or pkg_resources (deprecated in recent setuptools releases) to:

List all installed packages with versions
Find all packages installed after a certain date
Calculate the total size of installed packages
Identify which packages have updates available
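
A minimal starting point for the first task, using only the standard library, might look like the sketch below. The function name installed_packages is an assumption. The remaining tasks are not covered: install dates and sizes need file system inspection, and update checks need a query against the package index.

```python
from importlib import metadata


def installed_packages() -> list:
    """Return (name, version) pairs for installed distributions, sorted by name."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            pairs.add((name, dist.version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


# Show the first ten entries in requirements-file style.
for name, ver in installed_packages()[:10]:
    print(f"{name}=={ver}")
```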

Exercise 8: Building a Custom Package Index

Difficulty: Advanced

Research and document how you would:

Create a private PyPI server for your organization
Configure pip to use both public PyPI and your private server
Publish a private package to your server

Deliverable: A written plan with specific tools and configuration files needed.

Section B: Virtual Environments (Exercises 9-15)

Exercise 9: Creating Your First Virtual Environment

Difficulty: Beginner

Create a virtual environment for a basketball analytics project:

Create a new directory called nba_analysis
Create a virtual environment inside it called venv
Activate the environment
Verify that the environment is active
Install pandas and numpy
Deactivate the environment

Document: All commands used and the output at each step.

Exercise 10: Requirements File Management

Difficulty: Beginner

Create three different requirements files for a basketball analytics project:

requirements.txt - Production dependencies with pinned versions
requirements-dev.txt - Development dependencies (pytest, black, flake8)
requirements-docs.txt - Documentation dependencies (sphinx, sphinx-rtd-theme)

Each file should include appropriate version specifiers (==, >=, ~=, etc.)

Deliverable: Three requirements files with at least 5 packages each.

Exercise 11: Environment Recreation

Difficulty: Intermediate

Your colleague has shared their requirements.txt file:

pandas==2.0.3
numpy==1.24.3
matplotlib==3.7.2
seaborn==0.12.2
scikit-learn==1.3.0
jupyter==1.0.0

Create a new virtual environment
Install the requirements
Verify all packages are correctly installed
Export a new requirements file with all dependencies (including transitive ones)
Compare the two requirements files and explain the differences
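
The comparison in the last step can be done by eye or with diff, but a short script makes the differences explicit. The sketch below assumes both files contain only name==version lines; the frozen excerpt is hypothetical, and real pip freeze output will list different packages and versions.

```python
def parse_requirements(text: str) -> dict:
    """Map package name to pinned version for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, ver = line.split("==", 1)
            pins[name.strip().lower()] = ver.strip()
    return pins


original = parse_requirements("""\
pandas==2.0.3
numpy==1.24.3
""")

# Hypothetical excerpt of a frozen environment; real output will differ.
frozen = parse_requirements("""\
numpy==1.24.3
pandas==2.0.3
python-dateutil==2.8.2
pytz==2023.3
""")

# Packages present only in the frozen file were pulled in as dependencies.
transitive = sorted(set(frozen) - set(original))
# Packages present in both files but pinned to different versions.
changed = {
    name: (original[name], frozen[name])
    for name in original
    if name in frozen and original[name] != frozen[name]
}
print("Only in frozen file:", transitive)
print("Pinned differently:", changed)
```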

Exercise 12: Conda Environment Setup

Difficulty: Intermediate

If you have Conda installed, create an environment using a YAML file:

Write an environment.yml file for basketball analytics
Create the environment from this file
List all packages in the environment
Export the environment to a new YAML file
Compare the original and exported files

Starter YAML:

name: nba_analytics
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pandas>=2.0
  - numpy>=1.24
  - matplotlib>=3.7

Exercise 13: Managing Multiple Environments

Difficulty: Intermediate

Create two separate environments to simulate working on different projects:

Environment legacy_project with:
- Python 3.9
- pandas 1.5.0
- numpy 1.23.0

Environment modern_project with:
- Python 3.11
- pandas 2.0.3
- numpy 1.24.3

Write a script that:
- Checks which environment is currently active
- Lists the versions of key packages
- Warns if you are using outdated packages
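
A rough sketch of such a script follows. It assumes that "outdated" means older than the modern_project versions, and it compares only the major and minor version numbers; the names MINIMUMS, active_environment, and is_outdated are illustrative. CONDA_DEFAULT_ENV is the variable Conda sets on activation, and the sys.prefix comparison detects a venv.

```python
import os
import re
import sys
from importlib import metadata

# Assumed thresholds, chosen to match the modern_project versions.
MINIMUMS = {"pandas": (2, 0), "numpy": (1, 24)}


def active_environment() -> str:
    """Describe the active environment, checking Conda first and then venv."""
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda: {conda_env}"
    if sys.prefix != sys.base_prefix:  # true inside a venv
        return f"venv: {sys.prefix}"
    return "no virtual environment detected"


def major_minor(ver: str) -> tuple:
    """Crude parse of the leading 'X.Y' of a version string."""
    match = re.match(r"(\d+)\.(\d+)", ver)
    return (int(match.group(1)), int(match.group(2))) if match else (0, 0)


def is_outdated(installed: str, minimum: tuple) -> bool:
    return major_minor(installed) < minimum


def package_version(name: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


print("Environment:", active_environment())
for name, minimum in MINIMUMS.items():
    installed = package_version(name)
    if installed is None:
        print(f"{name}: not installed")
    elif is_outdated(installed, minimum):
        print(f"WARNING: {name} {installed} is older than {minimum[0]}.{minimum[1]}")
    else:
        print(f"{name}: {installed}")
```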

Exercise 14: Virtual Environment Troubleshooting

Difficulty: Intermediate

Debug the following scenarios (describe how you would fix each):

You activate your virtual environment but import pandas still fails
Your requirements.txt installs but your code throws version incompatibility errors
Your virtual environment folder was accidentally deleted but you have requirements.txt
pip install hangs when trying to install a package
Two packages in your requirements have conflicting dependency versions

Exercise 15: Cross-Platform Environment Sharing

Difficulty: Advanced

Research and implement a solution for sharing environments across different operating systems:

Create an environment on your current OS
Generate platform-independent requirements
Document how a colleague on a different OS would recreate the environment
Handle platform-specific packages (if any)

Consider: pip-tools, poetry, or pipenv as alternatives to plain requirements.txt

Section C: Jupyter Notebooks (Exercises 16-22)

Exercise 16: Jupyter Notebook Basics

Difficulty: Beginner

Create a Jupyter notebook called basketball_basics.ipynb that:

Has a title in a markdown cell
Imports pandas, numpy, and matplotlib
Creates a simple DataFrame with 5 players and their statistics
Calculates the mean and standard deviation of points per game
Creates a bar chart of points by player
Includes markdown cells explaining each step

Exercise 17: Jupyter Magic Commands

Difficulty: Beginner

Create a notebook that demonstrates the following magic commands:

%timeit - Time a pandas operation
%matplotlib inline - Display plots inline
%%writefile - Write a cell to a Python file
%run - Run an external Python script
%who - List variables in the namespace
%load - Load code from an external file
%history - Show command history

Deliverable: Notebook with examples and explanations of each magic command.

Exercise 18: Notebook Best Practices

Difficulty: Intermediate

Refactor the following poorly structured notebook code into a well-organized notebook:

Original (messy) code:

import pandas as pd
import numpy as np
df = pd.read_csv('data.csv')
df.head()
x = df['points'].mean()
print(x)
import matplotlib.pyplot as plt
plt.plot(df['games'], df['points'])
df2 = df[df['points'] > 20]
df2.to_csv('output.csv')

Create a properly structured notebook with:

- Clear section headers
- Import statements grouped at the top
- Explanatory markdown cells
- Proper variable names
- Output cells showing results

Exercise 19: Interactive Widgets

Difficulty: Intermediate

Create a notebook with interactive widgets that allow users to:

Select a player from a dropdown
Choose a date range with sliders
Toggle between different statistics with radio buttons
Update a plot based on the selections

Use: ipywidgets library

import ipywidgets as widgets
from IPython.display import display

# Your code here

Exercise 20: Notebook as a Report

Difficulty: Intermediate

Create a professional-looking notebook that serves as an analysis report:

Include a title page with author, date, and abstract
Add a table of contents using markdown
Include an executive summary
Create publication-quality figures with proper labels
Add a references section
Export to HTML and PDF formats

Topic: "Analysis of Three-Point Shooting Trends Over 10 Seasons"

Exercise 21: Converting Notebooks to Scripts

Difficulty: Intermediate

Create a notebook with reusable analysis functions
Convert it to a Python script using jupyter nbconvert
Refactor the script into a proper module with:
- Docstrings
- Type hints
- Main function
- Command-line interface

Commands:

jupyter nbconvert --to script notebook.ipynb

Exercise 22: Notebook Testing and CI/CD

Difficulty: Advanced

Set up a testing pipeline for Jupyter notebooks:

Install nbval for notebook testing
Create tests that verify notebook cells execute without errors
Create a pytest configuration for notebook testing
Write a GitHub Actions workflow that tests notebooks on push

Section D: Version Control with Git (Exercises 23-30)

Exercise 23: Git Basics

Difficulty: Beginner

Initialize a Git repository for a basketball analytics project:

Create a new directory and initialize Git
Create a .gitignore file appropriate for Python projects
Create a simple Python script
Stage and commit the files
View the commit history
Create a README.md and commit it

Document: Each command and its purpose.

Exercise 24: Branching and Merging

Difficulty: Beginner

Practice branching workflow:

Create a new branch called feature/player-analysis
Make changes to a Python file on this branch
Commit the changes
Switch back to the main branch
Merge the feature branch into main
Delete the feature branch

Exercise 25: Handling Merge Conflicts

Difficulty: Intermediate

Intentionally create and resolve a merge conflict:

Create two branches from main
Modify the same line of the same file in both branches
Merge one branch into main
Attempt to merge the second branch
Resolve the conflict manually
Complete the merge

Document: The conflict markers and how you resolved them.

Exercise 26: Git History and Logs

Difficulty: Intermediate

Use Git log commands to explore a repository's history:

View the last 5 commits in one-line format
View commits by a specific author
View commits that changed a specific file
View commits between two dates
Find a commit that contains a specific word in the message
View a graphical representation of branches

Commands to explore:

git log --oneline -5
git log --author="Name"
git log --since="2023-01-01" --until="2023-12-31"
git log --grep="feature"
git log --graph --all --oneline

Exercise 27: Working with Remote Repositories

Difficulty: Intermediate

Practice working with GitHub:

Create a new repository on GitHub
Add the remote to your local repository
Push your local commits to GitHub
Make changes on GitHub (edit a file in the browser)
Pull the changes to your local repository
Create a feature branch and push it to GitHub

Exercise 28: Git Tags and Releases

Difficulty: Intermediate

Manage project versions with Git tags:

Create an annotated tag for version 1.0.0
Push tags to remote
View all tags
Checkout a specific tag
Create a release on GitHub from a tag

Exercise 29: Git Stash and Recovery

Difficulty: Intermediate

Practice using Git stash for work-in-progress:

Make changes to files without committing
Stash the changes with a descriptive message
Switch to another branch and do some work
Return to the original branch and apply the stash
Clear the stash

Also practice:

- Recovering deleted files with git checkout
- Undoing the last commit with git reset
- Viewing the reflog to find lost commits

Exercise 30: Git Hooks and Automation

Difficulty: Advanced

Set up Git hooks for a Python project:

Create a pre-commit hook that runs black (code formatter)
Create a pre-commit hook that runs flake8 (linter)
Create a pre-push hook that runs pytest
Document how hooks work and where they are stored

Alternative: Use the pre-commit framework:

pip install pre-commit

Section E: Project Organization and Reproducibility (Exercises 31-37)

Exercise 31: Project Template Creation

Difficulty: Beginner

Create a complete project template for basketball analytics:

Create the full directory structure (data, notebooks, src, tests, output)
Create placeholder files in each directory
Create README.md with project description
Create .gitignore
Create requirements.txt
Initialize Git repository

Exercise 32: Documentation with Docstrings

Difficulty: Intermediate

Write comprehensive docstrings for the following functions:

def calculate_per(pts, reb, ast, stl, blk, fgm, fga, ftm, fta, to, pf, mins):
    # Calculate Player Efficiency Rating
    pass

def calculate_win_shares(player_stats, team_stats, league_stats):
    # Calculate Win Shares
    pass

def predict_game_outcome(home_team_stats, away_team_stats, model):
    # Predict game outcome
    pass

Include:

- Description
- Parameters with types
- Returns with types
- Examples
- Notes on formula sources
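
To show what those sections can look like, here is a docstring in NumPy style for a much simpler, hypothetical function (points_per_game is not one of the three functions above). Google style or reStructuredText style would serve equally well; the point is the set of sections, not the layout.

```python
def points_per_game(total_points: int, games_played: int) -> float:
    """Compute a player's scoring average.

    Parameters
    ----------
    total_points : int
        Points scored across all games.
    games_played : int
        Number of games played; must be positive.

    Returns
    -------
    float
        Points per game.

    Raises
    ------
    ValueError
        If ``games_played`` is not positive.

    Examples
    --------
    >>> points_per_game(2000, 80)
    25.0

    Notes
    -----
    Formula: total_points / games_played. For published metrics, cite the
    source of the formula here.
    """
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return total_points / games_played


if __name__ == "__main__":
    import doctest

    # Runs the example in the docstring and reports any mismatch.
    doctest.testmod()
```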

Exercise 33: Configuration Management

Difficulty: Intermediate

Implement configuration management for an analytics project:

Create a config.py file with project settings
Create a config.yaml file for environment-specific settings
Write a function to load configuration from YAML
Implement environment variable overrides
Handle secrets securely (API keys, database passwords)

Do not commit secrets to version control!
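
The environment variable override step can be sketched with the standard library alone. The setting names, the NBA_ prefix, and the helper names below are all assumptions for illustration; loading the YAML file itself would need a third-party parser and is left out.

```python
import os

# Hypothetical project defaults; in the exercise these would come from
# config.py or config.yaml.
DEFAULTS = {"season": "2023-24", "data_dir": "data", "api_key": None}


def load_config(env=None, prefix="NBA_"):
    """Overlay environment variables such as NBA_DATA_DIR onto DEFAULTS."""
    env = os.environ if env is None else env
    config = dict(DEFAULTS)
    for key in config:
        override = env.get(prefix + key.upper())
        if override is not None:
            config[key] = override
    return config


def redacted(config):
    """Return a copy that is safe to print or log, with secrets masked."""
    secret_words = ("key", "password", "token")
    return {
        name: "***" if value is not None and any(w in name for w in secret_words) else value
        for name, value in config.items()
    }


# Demo with an explicit mapping instead of the real environment.
demo = load_config({"NBA_DATA_DIR": "/tmp/nba", "NBA_API_KEY": "not-a-real-key"})
print(redacted(demo))
```

Reading the secret from the environment keeps it out of the repository, and printing only the redacted copy keeps it out of logs and notebook output.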

Exercise 34: Logging Implementation

Difficulty: Intermediate

Add logging to a basketball analytics script:

Set up logging with different levels (DEBUG, INFO, WARNING, ERROR)
Log to both console and file
Include timestamps and function names in log messages
Create rotating log files (max size, max files)
Log the start and end of major operations

import logging

# Your logging configuration here
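
One way to meet these requirements with the standard library is sketched below. The logger name, file name, size limit, and backup count are arbitrary choices, not values the exercise prescribes.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def configure_logging(log_path="analysis.log", level=logging.DEBUG):
    """Attach a console handler and a rotating file handler to one logger."""
    logger = logging.getLogger("nba_analysis")
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate output if called twice

    # Timestamp, level, and calling function name in every message.
    fmt = logging.Formatter("%(asctime)s %(levelname)s %(funcName)s: %(message)s")

    console = logging.StreamHandler()
    console.setLevel(logging.INFO)  # keep the console less noisy than the file
    console.setFormatter(fmt)

    # Roll over at roughly 1 MB and keep three old files.
    rotating = RotatingFileHandler(log_path, maxBytes=1_000_000, backupCount=3)
    rotating.setLevel(logging.DEBUG)
    rotating.setFormatter(fmt)

    logger.addHandler(console)
    logger.addHandler(rotating)
    return logger


def load_data(logger):
    """Stand-in for a major operation, logged at its start and end."""
    logger.info("Starting data load")
    logger.debug("Reading rows (written to the file only)")
    logger.info("Finished data load")


# Demo: write the log to the system temp directory.
log = configure_logging(os.path.join(tempfile.gettempdir(), "nba_analysis_demo.log"))
load_data(log)
```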

Exercise 35: Testing Your Analysis Code

Difficulty: Intermediate

Write tests for basketball analytics functions:

Install pytest
Create a tests directory with test files
Write tests for statistical calculation functions
Write tests for data loading functions
Run tests and generate a coverage report

Example functions to test:

def calculate_true_shooting_pct(points, fga, fta):
    if fga + fta == 0:
        return 0.0
    return points / (2 * (fga + 0.44 * fta))

def calculate_usage_rate(fga, fta, to, team_fga, team_fta, team_to, mins, team_mins):
    # Usage rate calculation
    pass
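
A few tests for the first function might look like the sketch below. The test names and input values are illustrative. pytest collects functions whose names start with test_, so the same file works under pytest; the loop at the bottom only exists so the sketch also runs without it.

```python
import math


def calculate_true_shooting_pct(points, fga, fta):
    if fga + fta == 0:
        return 0.0
    return points / (2 * (fga + 0.44 * fta))


def test_typical_stat_line():
    # 28 points on 22 FGA and 8 FTA: 28 / (2 * 25.52)
    assert math.isclose(calculate_true_shooting_pct(28, 22, 8), 28 / 51.04)


def test_no_attempts_returns_zero():
    # The guard clause should prevent a ZeroDivisionError.
    assert calculate_true_shooting_pct(0, 0, 0) == 0.0


def test_free_throws_only():
    # 2 points on 2 FTA: 2 / (2 * 0.88)
    assert math.isclose(calculate_true_shooting_pct(2, 0, 2), 2 / 1.76)


if __name__ == "__main__":
    for test in (test_typical_stat_line, test_no_attempts_returns_zero, test_free_throws_only):
        test()
    print("all tests passed")
```

Floating-point results are compared with math.isclose rather than ==, since values such as 0.44 * 8 are not represented exactly.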

Exercise 36: Reproducibility Checklist

Difficulty: Intermediate

Audit an analysis project for reproducibility. Create a checklist document that verifies:

All dependencies are documented with versions
Random seeds are set and documented
Data sources are documented with access dates
All preprocessing steps are scripted (not manual)
Configuration is separate from code
README includes complete setup instructions
Tests verify key calculations

Apply this checklist to your own project.

Exercise 37: Makefile for Automation

Difficulty: Advanced

Create a Makefile that automates common project tasks:

.PHONY: install test clean lint format run

install:
    # Install dependencies

test:
    # Run tests

clean:
    # Clean generated files

lint:
    # Run linting

format:
    # Format code

run:
    # Run main analysis

Implement each target and document usage in the README. Recipe lines in a Makefile must be indented with a tab character, not spaces.

Section F: Integration Challenges (Exercises 38-40)

Exercise 38: Complete Environment Setup Challenge

Difficulty: Advanced

Starting from scratch on a new machine (or in a Docker container):

Document the complete process to set up a basketball analytics environment
Install Python, pip, and all necessary tools
Create a virtual environment
Install all required packages
Verify the installation
Run a test analysis script

Time yourself and aim for under 15 minutes.

Exercise 39: Debugging Environment Issues

Difficulty: Advanced

You receive the following error messages. Diagnose and fix each:

ModuleNotFoundError: No module named 'pandas'
ImportError: cannot import name 'RandomForestClassifier' from 'sklearn'
RuntimeError: Python 3.10 is required, but Python 3.8.10 is installed
ERROR: pip's dependency resolver does not currently take into account all packages
jupyter: command not found

Document your debugging process and solutions.
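
Several of these errors come down to running a different interpreter than the one you expect. A small diagnostic script, sketched below with a hypothetical diagnose function, collects the facts worth checking first: which interpreter is running, its version, whether a virtual environment is active, and whether a module or command can be found.

```python
import importlib.util
import shutil
import sys


def diagnose(modules=("pandas", "sklearn"), commands=("jupyter", "pip")):
    """Collect basic facts about the running interpreter and its environment."""
    report = {
        "executable": sys.executable,
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "in_venv": sys.prefix != sys.base_prefix,
    }
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[f"module:{name}"] = importlib.util.find_spec(name) is not None
    for cmd in commands:
        # shutil.which returns the full path, or None if not on PATH.
        report[f"command:{cmd}"] = shutil.which(cmd)
    return report


for key, value in diagnose().items():
    print(f"{key}: {value}")
```

If the executable path points outside your project's environment, the fix is usually activation or reinstalling into the correct environment rather than anything in the code.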

Exercise 40: Full Project Setup

Difficulty: Advanced

Create a complete, production-ready project setup for analyzing NBA player efficiency:

Initialize project with proper structure
Set up virtual environment with all dependencies
Create sample data files
Write analysis code in src/ directory
Create Jupyter notebook that imports from src/
Write comprehensive tests
Set up pre-commit hooks for code quality
Initialize Git with proper .gitignore
Write complete README with setup instructions
Verify a fresh clone can be set up and run successfully

Deliverable: GitHub repository (can be private) with all components.

Answer Key for Selected Exercises

Exercise 4: Version Check Script Solution

"""Check installed package versions against requirements."""

from packaging import version
import importlib

REQUIREMENTS = {
    'pandas': '2.0.0',
    'numpy': '1.24.0',
    'matplotlib': '3.7.0',
    'seaborn': '0.12.0'
}

def check_version(package_name, min_version):
    """Check if installed version meets minimum requirement."""
    try:
        module = importlib.import_module(package_name)
        installed = module.__version__
        meets_requirement = version.parse(installed) >= version.parse(min_version)
        status = "PASS" if meets_requirement else "FAIL"
        return installed, status
    except ImportError:
        return "NOT INSTALLED", "FAIL"

def main():
    print("Package Version Check")
    print("=" * 30)

    all_pass = True
    for package, min_ver in REQUIREMENTS.items():
        installed, status = check_version(package, min_ver)
        print(f"{package}: {installed} [{status}]")
        if status == "FAIL":
            all_pass = False

    print("=" * 30)
    if all_pass:
        print("All requirements satisfied!")
    else:
        print("Some requirements not met.")

if __name__ == "__main__":
    main()

Exercise 9: Virtual Environment Commands

# 1. Create directory
mkdir nba_analysis
cd nba_analysis

# 2. Create virtual environment
python -m venv venv

# 3. Activate (Windows)
venv\Scripts\activate

# 3. Activate (macOS/Linux)
source venv/bin/activate

# 4. Verify
which python  # Should show path in venv (on Windows, use: where python)
python --version

# 5. Install packages
pip install pandas numpy

# 6. Deactivate
deactivate

Exercise 23: Git Basics Commands

# 1. Create and initialize
mkdir basketball_analytics
cd basketball_analytics
git init

# 2. Create .gitignore
cat > .gitignore << 'EOF'
__pycache__/
*.py[cod]
venv/
.ipynb_checkpoints/
*.csv
.env
EOF

# 3. Create Python script
cat > analysis.py << 'EOF'
"""Basic basketball analysis script."""
import pandas as pd

def calculate_ppg(total_points, games_played):
    return total_points / games_played
EOF

# 4. Stage and commit
git add .gitignore analysis.py
git commit -m "Initial commit: Add gitignore and analysis script"

# 5. View history
git log --oneline

# 6. Create and commit README
cat > README.md << 'EOF'
# Basketball Analytics

A Python project for analyzing basketball statistics.
EOF

git add README.md
git commit -m "Add README documentation"

Submission Guidelines

For each exercise:

1. Save your work in a clearly named folder (e.g., exercise_01/)
2. Include all code files, notebooks, and documentation
3. Add a brief reflection on what you learned
4. Note any challenges you encountered and how you solved them

Grading Rubric:

- Beginner exercises: Completion and correctness
- Intermediate exercises: Code quality, documentation, and best practices
- Advanced exercises: Comprehensive solution, professional quality, innovative approaches
