Chapter 3 Exercises: Python Environment Setup
Introduction
These exercises will help you solidify your understanding of Python environment setup and configuration. Complete them in order, as later exercises build on earlier ones. Each exercise includes a difficulty rating to help you gauge your progress.
Difficulty Levels:
- Beginner: Essential skills everyone should master
- Intermediate: Skills for independent work
- Advanced: Professional-level competencies
Section A: Python Installation and Basics (Exercises 1-8)
Exercise 1: Verify Python Installation
Difficulty: Beginner
Open your command prompt or terminal and complete the following tasks:
- Check your Python version using the command line
- Check your pip version
- Determine the path where Python is installed
- List the first 10 packages currently installed in your base environment
Expected Commands:
python --version
pip --version
python -c "import sys; print(sys.executable)"
pip list | head -n 12  # 2 header lines + 10 packages
Deliverable: Screenshot or text output of all four commands.
Exercise 2: Python Interactive Mode
Difficulty: Beginner
Enter Python's interactive mode and perform the following calculations:
- Calculate the field goal percentage for a player who made 432 shots out of 892 attempts
- Calculate the true shooting attempts (TSA) formula: FGA + 0.44 * FTA for a player with 756 FGA and 412 FTA
- Create a list of the five traditional basketball positions
- Create a dictionary with three players and their points per game
Deliverable: A text file with your Python commands and outputs.
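As a warm-up before opening the interpreter, the first two calculations can be sketched as below. This is only an illustrative sketch: the variable names are assumptions rather than part of the exercise, and the position list and player dictionary are left for you to write.

```python
# Field goal percentage: made shots divided by attempts.
made, attempts = 432, 892
fg_pct = made / attempts
print(f"FG%: {fg_pct:.3f}")   # FG%: 0.484

# True shooting attempts: FGA + 0.44 * FTA.
fga, fta = 756, 412
tsa = fga + 0.44 * fta
print(f"TSA: {tsa:.2f}")      # TSA: 937.28
```

Typing the same lines at the >>> prompt one at a time gives you the commands and outputs needed for the deliverable.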
Exercise 3: pip Fundamentals
Difficulty: Beginner
Without installing anything, use pip commands to answer the following:
- What is the latest version of pandas available on PyPI?
- What are the dependencies of the seaborn package?
- Find all packages in your current environment that contain "data" in their name
- Generate a requirements.txt file from your current environment
Commands to use:
pip index versions pandas
pip show seaborn
pip list | grep -i data
pip freeze > my_requirements.txt
Deliverable: Answers to each question with supporting command output.
Exercise 4: Understanding Package Versions
Difficulty: Beginner
Create a file called version_check.py that:
- Imports pandas, numpy, matplotlib, and seaborn
- Prints the version of each library
- Prints whether each version meets the minimum requirements:
  - pandas >= 2.0.0
  - numpy >= 1.24.0
  - matplotlib >= 3.7.0
  - seaborn >= 0.12.0
Expected Output:
Package Version Check
====================
pandas: 2.0.3 [PASS]
numpy: 1.24.3 [PASS]
matplotlib: 3.7.2 [PASS]
seaborn: 0.12.2 [PASS]
Exercise 5: Command Line Python Scripts
Difficulty: Intermediate
Create a Python script called quick_stats.py that accepts command-line arguments:
python quick_stats.py --points 28 --fga 22 --fta 8 --rebounds 10 --assists 7
The script should calculate and display:
- Field goal attempts
- True shooting attempts
- A simple efficiency rating (points + rebounds + assists)
Hint: Use the argparse module.
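One possible shape for the argparse part is sketched below. The function names build_parser and summarize, and the choice to make rebounds and assists optional, are assumptions made for illustration; the exercise only fixes the flag names.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in the example command."""
    parser = argparse.ArgumentParser(description="Quick basketball stat line")
    parser.add_argument("--points", type=float, required=True)
    parser.add_argument("--fga", type=float, required=True)
    parser.add_argument("--fta", type=float, required=True)
    parser.add_argument("--rebounds", type=float, default=0.0)
    parser.add_argument("--assists", type=float, default=0.0)
    return parser


def summarize(args):
    """Return the derived values as a dictionary."""
    return {
        "fga": args.fga,
        "tsa": args.fga + 0.44 * args.fta,
        "efficiency": args.points + args.rebounds + args.assists,
    }


# Parse an explicit list so the sketch runs without a real command line;
# in quick_stats.py you would call build_parser().parse_args() instead.
demo = build_parser().parse_args(
    "--points 28 --fga 22 --fta 8 --rebounds 10 --assists 7".split()
)
for name, value in summarize(demo).items():
    print(f"{name}: {value:.2f}")
```

Keeping the calculation in its own function, separate from the parsing, makes it easier to test without touching the command line.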
Exercise 6: pip Advanced Usage
Difficulty: Intermediate
Perform the following pip operations:
- Install a specific older version of requests (2.28.0)
- Check which packages depend on this version
- Upgrade to the latest version
- Create a constraints file that pins numpy to version 1.24.3
Deliverable: Document each command and its output.
Exercise 7: Investigating Package Metadata
Difficulty: Intermediate
Write a Python script that uses importlib.metadata (preferred) or pkg_resources (deprecated in recent setuptools releases) to:
- List all installed packages with versions
- Find all packages installed after a certain date
- Calculate the total size of installed packages
- Identify which packages have updates available
Exercise 8: Building a Custom Package Index
Difficulty: Advanced
Research and document how you would:
- Create a private PyPI server for your organization
- Configure pip to use both public PyPI and your private server
- Publish a private package to your server
Deliverable: A written plan with specific tools and configuration files needed.
Section B: Virtual Environments (Exercises 9-15)
Exercise 9: Creating Your First Virtual Environment
Difficulty: Beginner
Create a virtual environment for a basketball analytics project:
- Create a new directory called nba_analysis
- Create a virtual environment inside it called venv
- Activate the environment
- Verify that the environment is active
- Install pandas and numpy
- Deactivate the environment
Document: All commands used and the output at each step.
Exercise 10: Requirements File Management
Difficulty: Beginner
Create three different requirements files for a basketball analytics project:
- requirements.txt - Production dependencies with pinned versions
- requirements-dev.txt - Development dependencies (pytest, black, flake8)
- requirements-docs.txt - Documentation dependencies (sphinx, sphinx-rtd-theme)
Each file should include appropriate version specifiers (==, >=, ~=, etc.)
Deliverable: Three requirements files with at least 5 packages each.
Exercise 11: Environment Recreation
Difficulty: Intermediate
Your colleague has shared their requirements.txt file:
pandas==2.0.3
numpy==1.24.3
matplotlib==3.7.2
seaborn==0.12.2
scikit-learn==1.3.0
jupyter==1.0.0
- Create a new virtual environment
- Install the requirements
- Verify all packages are correctly installed
- Export a new requirements file with all dependencies (including transitive ones)
- Compare the two requirements files and explain the differences
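For the comparison step, a small script can do the bookkeeping. The sketch below assumes both files contain only name==version lines (as pip freeze normally produces); the function names and the sample "frozen" contents are hypothetical and stand in for your real export.

```python
def parse_pins(text):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, pinned = line.split("==", 1)
            pins[name.strip().lower()] = pinned.strip()
    return pins


def compare(original, exported):
    """Return (packages only in exported, packages whose pin changed)."""
    added = sorted(set(exported) - set(original))
    changed = sorted(
        name for name in original
        if name in exported and exported[name] != original[name]
    )
    return added, changed


shared = parse_pins("pandas==2.0.3\nnumpy==1.24.3\n")
# Hypothetical pip freeze output; your real list will be much longer.
frozen = parse_pins(
    "numpy==1.24.3\npandas==2.0.3\npython-dateutil==2.8.2\npytz==2023.3\n"
)

added, changed = compare(shared, frozen)
print("Transitive or extra packages:", added)
print("Pins that differ:", changed)
```

In practice you would read the two files from disk and pass their contents to parse_pins; packages that appear only in the export are typically the transitive dependencies.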
Exercise 12: Conda Environment Setup
Difficulty: Intermediate
If you have Conda installed, create an environment using a YAML file:
- Write an environment.yml file for basketball analytics
- Create the environment from this file
- List all packages in the environment
- Export the environment to a new YAML file
- Compare the original and exported files
Starter YAML:
name: nba_analytics
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pandas>=2.0
  - numpy>=1.24
  - matplotlib>=3.7
Exercise 13: Managing Multiple Environments
Difficulty: Intermediate
Create two separate environments to simulate working on different projects:
- Environment legacy_project with:
  - Python 3.9
  - pandas 1.5.0
  - numpy 1.23.0
- Environment modern_project with:
  - Python 3.11
  - pandas 2.0.3
  - numpy 1.24.3
Write a script that:
- Checks which environment is currently active
- Lists the versions of key packages
- Warns if you are using outdated packages
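A starting point for such a script is sketched below, using only the standard library. It assumes that conda sets the CONDA_DEFAULT_ENV variable on activation and that a venv makes sys.prefix differ from sys.base_prefix; the MINIMUM_MAJOR thresholds are made-up values for illustration, so substitute your own definition of "outdated".

```python
import os
import sys
from importlib.metadata import PackageNotFoundError, version

MINIMUM_MAJOR = {"pandas": 2, "numpy": 1}  # assumed thresholds


def active_environment():
    """Describe the active environment as a short string."""
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda:{conda_env}"
    if sys.prefix != sys.base_prefix:  # true inside a venv
        return f"venv:{sys.prefix}"
    return "system interpreter"


def package_versions(names):
    """Return {name: version or None} for the requested packages."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


def is_outdated(name, installed):
    """Compare only the major version number against the threshold."""
    major = installed.split(".")[0]
    return major.isdigit() and int(major) < MINIMUM_MAJOR[name]


print(f"Active environment: {active_environment()}")
for name, installed in package_versions(MINIMUM_MAJOR).items():
    if installed is None:
        print(f"{name}: not installed")
    elif is_outdated(name, installed):
        print(f"{name}: {installed} (WARNING: outdated)")
    else:
        print(f"{name}: {installed}")
```

Running the same script inside legacy_project and modern_project should produce different reports, which is a quick way to confirm the environments are truly isolated.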
Exercise 14: Virtual Environment Troubleshooting
Difficulty: Intermediate
Debug the following scenarios (describe how you would fix each):
- You activate your virtual environment but import pandas still fails
- Your requirements.txt installs but your code throws version incompatibility errors
- Your virtual environment folder was accidentally deleted but you have requirements.txt
- pip install hangs when trying to install a package
- Two packages in your requirements have conflicting dependency versions
Exercise 15: Cross-Platform Environment Sharing
Difficulty: Advanced
Research and implement a solution for sharing environments across different operating systems:
- Create an environment on your current OS
- Generate platform-independent requirements
- Document how a colleague on a different OS would recreate the environment
- Handle platform-specific packages (if any)
Consider: pip-tools, poetry, or pipenv as alternatives to plain requirements.txt
Section C: Jupyter Notebooks (Exercises 16-22)
Exercise 16: Jupyter Notebook Basics
Difficulty: Beginner
Create a Jupyter notebook called basketball_basics.ipynb that:
- Has a title in a markdown cell
- Imports pandas, numpy, and matplotlib
- Creates a simple DataFrame with 5 players and their statistics
- Calculates the mean and standard deviation of points per game
- Creates a bar chart of points by player
- Includes markdown cells explaining each step
Exercise 17: Jupyter Magic Commands
Difficulty: Beginner
Create a notebook that demonstrates the following magic commands:
- %timeit - Time a pandas operation
- %matplotlib inline - Display plots inline
- %%writefile - Write a cell to a Python file
- %run - Run an external Python script
- %who - List variables in the namespace
- %load - Load code from an external file
- %history - Show command history
Deliverable: Notebook with examples and explanations of each magic command.
Exercise 18: Notebook Best Practices
Difficulty: Intermediate
Refactor the following poorly structured notebook code into a well-organized notebook:
Original (messy) code:
import pandas as pd
import numpy as np
df = pd.read_csv('data.csv')
df.head()
x = df['points'].mean()
print(x)
import matplotlib.pyplot as plt
plt.plot(df['games'], df['points'])
df2 = df[df['points'] > 20]
df2.to_csv('output.csv')
Create a properly structured notebook with:
- Clear section headers
- Import statements grouped at the top
- Explanatory markdown cells
- Proper variable names
- Output cells showing results
Exercise 19: Interactive Widgets
Difficulty: Intermediate
Create a notebook with interactive widgets that allow users to:
- Select a player from a dropdown
- Choose a date range with sliders
- Toggle between different statistics with radio buttons
- Update a plot based on the selections
Use: ipywidgets library
import ipywidgets as widgets
from IPython.display import display
# Your code here
Exercise 20: Notebook as a Report
Difficulty: Intermediate
Create a professional-looking notebook that serves as an analysis report:
- Include a title page with author, date, and abstract
- Add a table of contents using markdown
- Include an executive summary
- Create publication-quality figures with proper labels
- Add a references section
- Export to HTML and PDF formats
Topic: "Analysis of Three-Point Shooting Trends Over 10 Seasons"
Exercise 21: Converting Notebooks to Scripts
Difficulty: Intermediate
- Create a notebook with reusable analysis functions
- Convert it to a Python script using jupyter nbconvert
- Refactor the script into a proper module with:
  - Docstrings
  - Type hints
  - Main function
  - Command-line interface
Commands:
jupyter nbconvert --to script notebook.ipynb
Exercise 22: Notebook Testing and CI/CD
Difficulty: Advanced
Set up a testing pipeline for Jupyter notebooks:
- Install nbval for notebook testing
- Create tests that verify notebook cells execute without errors
- Create a pytest configuration for notebook testing
- Write a GitHub Actions workflow that tests notebooks on push
Section D: Version Control with Git (Exercises 23-30)
Exercise 23: Git Basics
Difficulty: Beginner
Initialize a Git repository for a basketball analytics project:
- Create a new directory and initialize Git
- Create a .gitignore file appropriate for Python projects
- Create a simple Python script
- Stage and commit the files
- View the commit history
- Create a README.md and commit it
Document: Each command and its purpose.
Exercise 24: Branching and Merging
Difficulty: Beginner
Practice branching workflow:
- Create a new branch called feature/player-analysis
- Make changes to a Python file on this branch
- Commit the changes
- Switch back to main branch
- Merge the feature branch into main
- Delete the feature branch
Exercise 25: Handling Merge Conflicts
Difficulty: Intermediate
Intentionally create and resolve a merge conflict:
- Create two branches from main
- Modify the same line of the same file in both branches
- Merge one branch into main
- Attempt to merge the second branch
- Resolve the conflict manually
- Complete the merge
Document: The conflict markers and how you resolved them.
Exercise 26: Git History and Logs
Difficulty: Intermediate
Use Git log commands to explore a repository's history:
- View the last 5 commits in one-line format
- View commits by a specific author
- View commits that changed a specific file
- View commits between two dates
- Find a commit that contains a specific word in the message
- View a graphical representation of branches
Commands to explore:
git log --oneline -5
git log --author="Name"
git log -- path/to/file.py
git log --since="2023-01-01" --until="2023-12-31"
git log --grep="feature"
git log --graph --all --oneline
Exercise 27: Working with Remote Repositories
Difficulty: Intermediate
Practice working with GitHub:
- Create a new repository on GitHub
- Add the remote to your local repository
- Push your local commits to GitHub
- Make changes on GitHub (edit a file in the browser)
- Pull the changes to your local repository
- Create a feature branch and push it to GitHub
Exercise 28: Git Tags and Releases
Difficulty: Intermediate
Manage project versions with Git tags:
- Create an annotated tag for version 1.0.0
- Push tags to remote
- View all tags
- Checkout a specific tag
- Create a release on GitHub from a tag
Exercise 29: Git Stash and Recovery
Difficulty: Intermediate
Practice using Git stash for work-in-progress:
- Make changes to files without committing
- Stash the changes with a descriptive message
- Switch to another branch and do some work
- Return to original branch and apply the stash
- Clear the stash
Also practice:
- Recovering deleted files with git checkout
- Undoing the last commit with git reset
- Viewing the reflog to find lost commits
Exercise 30: Git Hooks and Automation
Difficulty: Advanced
Set up Git hooks for a Python project:
- Create a pre-commit hook that runs black (code formatter)
- Create a pre-commit hook that runs flake8 (linter)
- Create a pre-push hook that runs pytest
- Document how hooks work and where they are stored
Alternative: Use the pre-commit framework:
pip install pre-commit
Section E: Project Organization and Reproducibility (Exercises 31-37)
Exercise 31: Project Template Creation
Difficulty: Beginner
Create a complete project template for basketball analytics:
- Create the full directory structure (data, notebooks, src, tests, output)
- Create placeholder files in each directory
- Create README.md with project description
- Create .gitignore
- Create requirements.txt
- Initialize Git repository
Exercise 32: Documentation with Docstrings
Difficulty: Intermediate
Write comprehensive docstrings for the following functions:
def calculate_per(pts, reb, ast, stl, blk, fgm, fga, ftm, fta, to, pf, mins):
    # Calculate Player Efficiency Rating
    pass

def calculate_win_shares(player_stats, team_stats, league_stats):
    # Calculate Win Shares
    pass

def predict_game_outcome(home_team_stats, away_team_stats, model):
    # Predict game outcome
    pass
Include:
- Description
- Parameters with types
- Returns with types
- Examples
- Notes on formula sources
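To show the expected level of detail without giving away the three functions above, here is a sketch of a documented helper. The NumPy-style section layout is one common convention, not a requirement of the exercise, and the ValueError check is an assumption added for the example.

```python
def calculate_ppg(total_points, games_played):
    """Return a player's points per game.

    Parameters
    ----------
    total_points : float
        Points scored across the season.
    games_played : int
        Number of games played; must be positive.

    Returns
    -------
    float
        Average points per game.

    Raises
    ------
    ValueError
        If ``games_played`` is not positive.

    Examples
    --------
    >>> calculate_ppg(2000, 80)
    25.0

    Notes
    -----
    Simple arithmetic mean; no external formula source is needed.
    """
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return total_points / games_played


print(calculate_ppg(2000, 80))  # 25.0
```

For PER and Win Shares, the Notes section is where you would cite the source of the formula you implement.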
Exercise 33: Configuration Management
Difficulty: Intermediate
Implement configuration management for an analytics project:
- Create a config.py file with project settings
- Create a config.yaml file for environment-specific settings
- Write a function to load configuration from YAML
- Implement environment variable overrides
- Handle secrets securely (API keys, database passwords)
Do not commit secrets to version control!
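The environment-variable override step could look roughly like the sketch below. The DEFAULTS keys, the NBA_ prefix, and the load_config name are all assumptions chosen for illustration; YAML loading is left out because it needs a third-party parser.

```python
import os

DEFAULTS = {
    "data_dir": "data",
    "season": "2022-23",
    "api_key": None,  # never hard-code a real secret here
}


def load_config(environ=None, prefix="NBA_"):
    """Return DEFAULTS with any <prefix><KEY> environment variables applied."""
    environ = os.environ if environ is None else environ
    config = dict(DEFAULTS)
    for key in DEFAULTS:
        env_name = prefix + key.upper()
        if env_name in environ:
            config[key] = environ[env_name]
    return config


# Passing a plain dict stands in for the real environment in this demo.
config = load_config({"NBA_SEASON": "2023-24", "NBA_API_KEY": "dummy-key"})
print(config["season"])                              # 2023-24
print("API key set:", config["api_key"] is not None)  # API key set: True
```

Reading secrets from the environment keeps them out of config.py and config.yaml, so neither file needs to be excluded from version control.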
Exercise 34: Logging Implementation
Difficulty: Intermediate
Add logging to a basketball analytics script:
- Set up logging with different levels (DEBUG, INFO, WARNING, ERROR)
- Log to both console and file
- Include timestamps and function names in log messages
- Create rotating log files (max size, max files)
- Log the start and end of major operations
import logging
# Your logging configuration here
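One way to satisfy the first four requirements with the standard library is sketched below. The logger name, the 1 MB size limit, the three-file backup count, and the load_games function are placeholder assumptions; the log file is written to a temporary directory so the sketch leaves your project folder untouched.

```python
import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path


def configure_logging(log_path, level=logging.DEBUG):
    """Attach a console handler and a rotating file handler to one logger."""
    logger = logging.getLogger("nba_analysis")  # hypothetical logger name
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    formatter = logging.Formatter(
        "%(asctime)s %(levelname)s %(funcName)s: %(message)s"
    )
    console = logging.StreamHandler()
    console.setLevel(logging.INFO)  # console shows INFO and above
    # Rotate at roughly 1 MB and keep three old files (assumed limits).
    rotating = RotatingFileHandler(log_path, maxBytes=1_000_000, backupCount=3)
    for handler in (console, rotating):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


def load_games():
    """Stand-in for a major operation that logs its start and end."""
    logger.info("Loading games: start")
    logger.debug("This detail goes to the file only")
    logger.info("Loading games: done")


log_file = Path(tempfile.mkdtemp()) / "analysis.log"
logger = configure_logging(log_file)
load_games()
```

Because the file handler keeps the default level, it records DEBUG messages that the console handler filters out.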
Exercise 35: Testing Your Analysis Code
Difficulty: Intermediate
Write tests for basketball analytics functions:
- Install pytest
- Create a tests directory with test files
- Write tests for statistical calculation functions
- Write tests for data loading functions
- Run tests and generate a coverage report
Example functions to test:
def calculate_true_shooting_pct(points, fga, fta):
    if fga + fta == 0:
        return 0.0
    return points / (2 * (fga + 0.44 * fta))

def calculate_usage_rate(fga, fta, to, team_fga, team_fta, team_to, mins, team_mins):
    # Usage rate calculation
    pass
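Tests for the first function might start out like the sketch below. The test names and chosen inputs are illustrative assumptions; the function is repeated so the block runs on its own, whereas in a real project you would import it from your src package and let pytest discover the test_ functions.

```python
import math


def calculate_true_shooting_pct(points, fga, fta):
    """Copy of the function above so this sketch is self-contained."""
    if fga + fta == 0:
        return 0.0
    return points / (2 * (fga + 0.44 * fta))


def test_true_shooting_typical_game():
    # 28 points on 22 FGA and 8 FTA: 28 / (2 * (22 + 3.52))
    expected = 28 / (2 * 25.52)
    assert math.isclose(calculate_true_shooting_pct(28, 22, 8), expected)


def test_true_shooting_no_attempts_returns_zero():
    assert calculate_true_shooting_pct(0, 0, 0) == 0.0


def test_true_shooting_free_throws_only():
    # 2 points on 2 FTA and no field goal attempts
    expected = 2 / (2 * 0.88)
    assert math.isclose(calculate_true_shooting_pct(2, 0, 2), expected)


# pytest would collect these automatically; calling them directly also works.
for test in (
    test_true_shooting_typical_game,
    test_true_shooting_no_attempts_returns_zero,
    test_true_shooting_free_throws_only,
):
    test()
print("all checks passed")
```

Using math.isclose rather than == avoids spurious failures from floating-point rounding.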
Exercise 36: Reproducibility Checklist
Difficulty: Intermediate
Audit an analysis project for reproducibility. Create a checklist document that verifies:
- All dependencies are documented with versions
- Random seeds are set and documented
- Data sources are documented with access dates
- All preprocessing steps are scripted (not manual)
- Configuration is separate from code
- README includes complete setup instructions
- Tests verify key calculations
Apply this checklist to your own project.
Exercise 37: Makefile for Automation
Difficulty: Advanced
Create a Makefile that automates common project tasks:
.PHONY: install test clean lint format run

install:
	# Install dependencies

test:
	# Run tests

clean:
	# Clean generated files

lint:
	# Run linting

format:
	# Format code

run:
	# Run main analysis
Implement each target and document usage in README.
Section F: Integration Challenges (Exercises 38-40)
Exercise 38: Complete Environment Setup Challenge
Difficulty: Advanced
Starting from scratch on a new machine (or in a Docker container):
- Document the complete process to set up a basketball analytics environment
- Install Python, pip, and all necessary tools
- Create a virtual environment
- Install all required packages
- Verify the installation
- Run a test analysis script
Time yourself and aim for under 15 minutes.
Exercise 39: Debugging Environment Issues
Difficulty: Advanced
You receive the following error messages. Diagnose and fix each:
- ModuleNotFoundError: No module named 'pandas'
- ImportError: cannot import name 'RandomForestClassifier' from 'sklearn'
- RuntimeError: Python 3.10 is required, but Python 3.8.10 is installed
- ERROR: pip's dependency resolver does not currently take into account all packages
- jupyter: command not found
Document your debugging process and solutions.
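A useful first step for most of these errors is to establish which interpreter is running and what it can see. The sketch below gathers that information with the standard library; the diagnose function and its default arguments are assumptions for illustration, and it reports facts rather than fixing anything.

```python
import importlib.util
import shutil
import sys


def diagnose(required_python=(3, 10),
             modules=("pandas", "sklearn"),
             commands=("jupyter",)):
    """Collect facts that often explain import and command-not-found errors."""
    return {
        "executable": sys.executable,
        "python_ok": sys.version_info[:2] >= required_python,
        "missing_modules": [
            name for name in modules
            if importlib.util.find_spec(name) is None
        ],
        "missing_commands": [
            name for name in commands
            if shutil.which(name) is None
        ],
    }


for key, value in diagnose().items():
    print(f"{key}: {value}")
```

If the executable path points outside your virtual environment, the environment is probably not active, which would account for both a missing module and a missing command.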
Exercise 40: Full Project Setup
Difficulty: Advanced
Create a complete, production-ready project setup for analyzing NBA player efficiency:
- Initialize project with proper structure
- Set up virtual environment with all dependencies
- Create sample data files
- Write analysis code in src/ directory
- Create Jupyter notebook that imports from src/
- Write comprehensive tests
- Set up pre-commit hooks for code quality
- Initialize Git with proper .gitignore
- Write complete README with setup instructions
- Verify a fresh clone can be set up and run successfully
Deliverable: GitHub repository (can be private) with all components.
Answer Key for Selected Exercises
Exercise 4: Version Check Script Solution
"""Check installed package versions against requirements."""

import importlib

from packaging import version

REQUIREMENTS = {
    'pandas': '2.0.0',
    'numpy': '1.24.0',
    'matplotlib': '3.7.0',
    'seaborn': '0.12.0'
}


def check_version(package_name, min_version):
    """Check if installed version meets minimum requirement."""
    try:
        module = importlib.import_module(package_name)
        installed = module.__version__
        meets_requirement = version.parse(installed) >= version.parse(min_version)
        status = "PASS" if meets_requirement else "FAIL"
        return installed, status
    except ImportError:
        return "NOT INSTALLED", "FAIL"


def main():
    print("Package Version Check")
    print("=" * 30)
    all_pass = True
    for package, min_ver in REQUIREMENTS.items():
        installed, status = check_version(package, min_ver)
        print(f"{package}: {installed} [{status}]")
        if status == "FAIL":
            all_pass = False
    print("=" * 30)
    if all_pass:
        print("All requirements satisfied!")
    else:
        print("Some requirements not met.")


if __name__ == "__main__":
    main()
Exercise 9: Virtual Environment Commands
# 1. Create directory
mkdir nba_analysis
cd nba_analysis
# 2. Create virtual environment
python -m venv venv
# 3. Activate (Windows)
venv\Scripts\activate
# 3. Activate (macOS/Linux)
source venv/bin/activate
# 4. Verify
which python  # Should show a path inside venv (on Windows, use: where python)
python --version
# 5. Install packages
pip install pandas numpy
# 6. Deactivate
deactivate
Exercise 23: Git Basics Commands
# 1. Create and initialize
mkdir basketball_analytics
cd basketball_analytics
git init
# 2. Create .gitignore
cat > .gitignore << 'EOF'
__pycache__/
*.py[cod]
venv/
.ipynb_checkpoints/
*.csv
.env
EOF
# 3. Create Python script
cat > analysis.py << 'EOF'
"""Basic basketball analysis script."""
import pandas as pd
def calculate_ppg(total_points, games_played):
    return total_points / games_played
EOF
# 4. Stage and commit
git add .gitignore analysis.py
git commit -m "Initial commit: Add gitignore and analysis script"
# 5. View history
git log --oneline
# 6. Create and commit README
cat > README.md << 'EOF'
# Basketball Analytics
A Python project for analyzing basketball statistics.
EOF
git add README.md
git commit -m "Add README documentation"
Submission Guidelines
For each exercise:
1. Save your work in a clearly named folder (e.g., exercise_01/)
2. Include all code files, notebooks, and documentation
3. Add a brief reflection on what you learned
4. Note any challenges you encountered and how you solved them
Grading Rubric:
- Beginner exercises: Completion and correctness
- Intermediate exercises: Code quality, documentation, and best practices
- Advanced exercises: Comprehensive solution, professional quality, innovative approaches