Chapter 12: Modules, Packages, and the Python Ecosystem

Learning Objectives

Import and use modules from the Python standard library
Create your own modules and understand the module search path
Organize code into packages with __init__.py
Use if __name__ == '__main__' correctly
Explore the standard library: os, sys, datetime, random, collections, pathlib

In This Chapter

Chapter Overview
12.1 Why Modules?
12.2 Importing Modules
12.3 The Python Standard Library Tour
12.4 Creating Your Own Modules
12.5 The __name__ == "__main__" Guard
12.6 Packages: Organizing Multiple Modules
12.7 Installing Third-Party Packages with pip
12.8 The Module Search Path
12.9 Common Pitfalls
12.10 Project Checkpoint: TaskFlow v1.1
Chapter Summary

"Good programmers write good code. Great programmers reuse great code." — Paraphrased from a common software engineering maxim

Chapter Overview

You've been writing functions since Chapter 6. You know how to package a block of logic behind a name and call it whenever you need it. But here's a question that starts to gnaw at you once your programs grow beyond a few hundred lines: where do all these functions live?

Right now, everything sits in a single .py file. Your grade calculator, Elena's report script, Dr. Patel's DNA pipeline — they each exist as one growing file that you scroll through endlessly. That works when a program is 50 lines. It becomes painful at 200. It becomes unmanageable at 500. And professional codebases? Those run into hundreds of thousands of lines. Nobody puts all of that in one file.

This chapter teaches you how Python organizes code across multiple files and how to tap into the enormous library of code that other people have already written for you. You'll learn about modules (individual .py files you can import), packages (directories of related modules), and the Python standard library — a collection of pre-built tools that ships with every Python installation. You'll also learn how to install third-party packages from the broader Python ecosystem.

By the end of this chapter, you'll split TaskFlow into multiple well-organized files, and you'll understand why that's not just cosmetic — it's the difference between code that scales and code that collapses under its own weight.

In this chapter, you will learn to:

- Import modules using three different styles and understand their trade-offs
- Explore the Python standard library and use modules like datetime, random, collections, and pathlib
- Create your own modules by saving functions in separate .py files
- Use the if __name__ == "__main__" guard to write files that work as both importable modules and runnable scripts
- Organize related modules into packages with __init__.py
- Install third-party packages using pip
- Understand how Python finds modules (the module search path)

🏃 Fast Track: If you've used import before and understand the basics, skim sections 12.1-12.2 and jump to section 12.4 (Creating Your Own Modules) for the hands-on material.

🔬 Deep Dive: After this chapter, read Chapter 23 for a deeper look at virtual environments, requirements.txt, and managing dependencies for real projects.

12.1 Why Modules?

Let's start with a story that will sound familiar.

You've been working on a project for a few weeks. It started as a clean little script — maybe 80 lines. Then you added input validation. Then file I/O. Then a menu system. Then formatting. Now it's 500 lines, and every time you want to find a specific function, you're scrolling, scrolling, scrolling. You make a change to one function and accidentally break another one 300 lines away because they shared a variable name you forgot about.

🧩 Productive Struggle: Your project has 500 lines in one file. How would you organize it? Before reading further, take two minutes and sketch out a plan. What groups of functions belong together? What would you name the files? There's no single right answer — but thinking about it before seeing the "official" approach will make the solution stick.

This isn't a hypothetical. It's the exact problem that every programmer hits, and it's the problem that modules solve.

The Three Problems Modules Solve

1. Organization. Modules let you group related functions, classes, and variables into separate files. Your grade calculation logic goes in one file. Your display formatting goes in another. Your file I/O goes in a third. When you need to fix the display, you open the display file. You don't wade through 500 lines of unrelated code.

2. Reuse. Once you've written a useful function in a module, you can import it into any other program. Elena wrote a format_currency() function for her nonprofit report. She can now use it in her budget tracker, her donation analyzer, and her annual summary — without copying and pasting the code. This is the DRY principle ("Don't Repeat Yourself") at scale.

3. Namespace isolation. Each module has its own namespace — its own separate world of names. If your display.py has a function called format() and your storage.py also has a function called format(), they don't collide. Python keeps them separate as display.format() and storage.format(). Without modules, you'd be stuck inventing increasingly awkward names like format_display_output() and format_storage_data() to avoid conflicts.
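You can see namespace isolation at work in the standard library itself. As a small sketch, both math and cmath define a function named sqrt, and the module prefix decides which one runs:

```python
import cmath
import math

# Same function name, two different modules: the prefix picks which one runs.
real_root = math.sqrt(16)        # float result
complex_root = cmath.sqrt(-16)   # complex result; math.sqrt(-16) would raise ValueError

print(real_root)      # 4.0
print(complex_root)   # 4j
```

Neither sqrt overwrites the other, because each lives in its own module's namespace.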

🔗 Connection to Chapter 6: Remember when we discussed how functions create local scopes to prevent variable names from colliding? Modules extend that same principle to the file level. Functions isolate names within a function; modules isolate names within a file. It's abstraction all the way up.

What Is a Module, Exactly?

A module is simply a .py file. That's it. Every Python file you've written is already a module. The file grade_calculator.py is a module named grade_calculator. The file helpers.py is a module named helpers.

The magic happens when you import one module into another. That's how you share code between files.

12.2 Importing Modules

Python gives you three ways to import code, and each has its place. Let's learn all three, then compare them.

Style 1: `import module_name`

The simplest approach imports the entire module:

import math

print(math.sqrt(144))    # 12.0
print(math.pi)           # 3.141592653589793
print(math.ceil(4.2))    # 5

Output:

12.0
3.141592653589793
5

With this style, you access everything through the module name: math.sqrt(), math.pi, math.ceil(). The module name acts as a prefix — a namespace qualifier — that tells you exactly where each function comes from.

Style 2: `from module_name import specific_thing`

If you only need one or two items from a module, you can import them directly:

from math import sqrt, pi

print(sqrt(144))    # 12.0
print(pi)           # 3.141592653589793

Output:

12.0
3.141592653589793

Now sqrt and pi are available directly — no math. prefix needed. This is convenient but carries a risk: if you also have a variable called pi in your code, the import will silently overwrite it (or vice versa).
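The overwrite works in both directions, and Python raises no warning either way. A minimal illustration (the rough value 3 is just a stand-in for a name you defined yourself):

```python
pi = 3                   # your own rough approximation
from math import pi      # rebinds the name pi; no warning is raised
print(pi)                # 3.141592653589793

from math import sqrt
sqrt = "not a function anymore"   # the reverse: your assignment hides the import
print(callable(sqrt))             # False
```

Whichever statement runs last wins, which is why selective imports are safest when the imported names are unambiguous.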

Style 3: `import module_name as alias`

Sometimes module names are long or clash with your own names. An alias provides a shortcut:

import datetime as dt

now = dt.datetime.now()
print(now.strftime("%Y-%m-%d %H:%M"))

Output (will vary):

2025-09-15 14:30

You'll see this pattern constantly in the Python world. The data science ecosystem has well-known conventions: import numpy as np, import pandas as pd, import matplotlib.pyplot as plt. These aren't just personal preferences — they're community standards that make code instantly readable to other Python developers.

You can also combine from with as:

from collections import Counter as Tally

votes = ["yes", "no", "yes", "yes", "no", "abstain"]
result = Tally(votes)
print(result)

Output:

Counter({'yes': 3, 'no': 2, 'abstain': 1})

Comparison Table: Import Styles

Style	Syntax	Access Pattern	Best When
Full import	`import math`	`math.sqrt(x)`	You use many items from the module; want clear origin
Selective import	`from math import sqrt`	`sqrt(x)`	You use 1-3 items; names are unambiguous
Aliased import	`import datetime as dt`	`dt.datetime.now()`	Module name is long; community convention exists
Selective + alias	`from collections import Counter as C`	`C(data)`	Renaming for clarity or to avoid name clash
Avoid	`from math import *`	`sqrt(x)`, `pi`, ...	Almost never — pollutes namespace unpredictably

⚠️ Pitfall: The from module import * syntax imports everything from a module into your namespace. This is almost always a bad idea. You can't tell where names came from, and two import * statements can silently overwrite each other's names. Use it only in the interactive REPL for quick experiments — never in production code.
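To make the danger concrete, here is a sketch using os. The star import runs inside a throwaway dictionary (via exec) purely so the demonstration doesn't pollute the script's own names; that wrapper is an illustration device, not something you'd write in real code:

```python
import builtins
import os

# Run the star import in a throwaway namespace so this script's own names stay clean.
scratch = {}
exec("from os import *", scratch)

imported_names = [name for name in scratch if not name.startswith("__")]
print(f"Names imported: {len(imported_names)}")

# The name open now refers to the low-level os.open, not the built-in open().
print(scratch["open"] is os.open)         # True
print(scratch["open"] is builtins.open)   # False
```

In a real script, any later call to open("file.txt") would hit os.open and fail in a confusing way.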

What Happens When You Import?

When Python executes an import statement, three things happen:

1. Python finds the module file (we'll cover how it finds it in Section 12.8).
2. Python executes the entire file from top to bottom. Every assignment, function definition, and print() call in that file runs.
3. Python binds the resulting module object to the name you imported.

That second point is important. If a module has a print("Loading...") call at the top level, you'll see "Loading..." appear as a side effect of importing it. This is sometimes useful and sometimes a nasty surprise. We'll revisit this in Section 12.5 when we discuss the __name__ guard.

Also important: Python only executes a module once per session, no matter how many times you import it. The first import math runs the module; subsequent import math statements just reuse the already-loaded module object. This is efficient and prevents side effects from repeating.
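The cache behind this behavior is the dictionary sys.modules. The sketch below writes a tiny module to a temporary directory (the name noisy_demo is made up for this example), imports it twice, and counts how often its top-level print() ran:

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A hypothetical module whose only job is to announce that it is loading.
    Path(tmp, "noisy_demo.py").write_text('print("Loading noisy_demo...")\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            import noisy_demo   # first import: the file runs
            import noisy_demo   # second import: served from sys.modules
    finally:
        sys.path.remove(tmp)

load_messages = captured.getvalue().count("Loading")
print(load_messages)                  # 1
print("noisy_demo" in sys.modules)    # True
```

The second import statement finds the entry in sys.modules and skips execution entirely.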

🔄 Check Your Understanding

1. What is the difference between import random and from random import randint?

2. After from math import sqrt, can you still use math.pi? Why or why not?

3. Why is from os import * considered bad practice?

Verify

1. import random makes the entire random module available via random.randint(), random.choice(), etc. from random import randint makes only randint available directly: no prefix is needed, but other random functions aren't imported.

2. No. from math import sqrt imports only sqrt into your namespace. The name math itself is not defined. You'd need a separate import math to access math.pi.

3. It dumps every public name from os into your namespace, hundreds of them. You can't tell where a function came from, and it might silently overwrite your own variables or even built-ins (os.open replaces the built-in open()).

12.3 The Python Standard Library Tour

Python ships with a massive collection of pre-built modules called the standard library — often described with the phrase "batteries included." These modules are already installed. No pip install needed. You just import them.

The standard library contains over 200 modules. You won't need most of them, but a handful are so useful that you'll import them in almost every project. Let's tour the greatest hits.

`datetime` — Working with Dates and Times

Elena needs timestamps on her nonprofit reports. The datetime module handles all things date and time:

from datetime import datetime, timedelta

# Current date and time
now = datetime.now()
print(f"Report generated: {now.strftime('%B %d, %Y at %I:%M %p')}")

# Parsing a date from a string
deadline = datetime.strptime("2025-12-31", "%Y-%m-%d")
print(f"Deadline: {deadline.strftime('%A, %B %d, %Y')}")

# Date arithmetic
days_left = deadline - now
print(f"Days until deadline: {days_left.days}")

# Adding time
one_week_later = now + timedelta(weeks=1)
print(f"One week from now: {one_week_later.strftime('%Y-%m-%d')}")

Output (will vary based on current date):

Report generated: September 15, 2025 at 02:30 PM
Deadline: Wednesday, December 31, 2025
Days until deadline: 106
One week from now: 2025-09-22

Elena replaces her manual timestamp typing with datetime.now() — one line of code, always accurate, never a typo.

`random` — Generating Random Values

The random module is essential for simulations, games, and testing:

import random

# Random integer in a range (inclusive on both ends)
die_roll = random.randint(1, 6)
print(f"You rolled: {die_roll}")

# Random choice from a sequence
colors = ["red", "green", "blue", "yellow"]
pick = random.choice(colors)
print(f"Random color: {pick}")

# Shuffle a list in place
deck = list(range(1, 53))
random.shuffle(deck)
print(f"First 5 cards: {deck[:5]}")

# Random float between 0 and 1
probability = random.random()
print(f"Random probability: {probability:.4f}")

Output (will vary):

You rolled: 4
Random color: blue
First 5 cards: [37, 12, 48, 3, 21]
Random probability: 0.7234

`collections` — Specialized Data Structures

Here's where Dr. Patel's life changes. She's been counting nucleotides in DNA sequences with a manual loop:

# Dr. Patel's original approach: a hand-written counting loop
sequence = "ATCGATCGATCGAATTCCGG"
counts = {}
for nucleotide in sequence:
    if nucleotide in counts:
        counts[nucleotide] += 1
    else:
        counts[nucleotide] = 0
        counts[nucleotide] += 1
print(counts)

Output:

{'A': 5, 'T': 5, 'C': 5, 'G': 5}

Then a colleague mentions collections.Counter:

from collections import Counter

sequence = "ATCGATCGATCGAATTCCGG"
counts = Counter(sequence)
print(counts)
print(f"Most common: {counts.most_common(2)}")

Output:

Counter({'A': 5, 'T': 5, 'C': 5, 'G': 5})
Most common: [('A', 5), ('T', 5)]

Two lines replace the whole hand-written loop. Dr. Patel stares at her screen for a long moment, then quietly deletes a lot of code.

Counter isn't the only gem in collections. Here are two more:

from collections import defaultdict, namedtuple

# defaultdict — no more KeyError when accessing missing keys
word_positions = defaultdict(list)
sentence = "the cat sat on the mat"
for i, word in enumerate(sentence.split()):
    word_positions[word].append(i)
print(dict(word_positions))

# namedtuple — tuples with named fields (like lightweight classes)
Student = namedtuple("Student", ["name", "grade", "gpa"])
alice = Student("Alice", "A", 3.9)
print(f"{alice.name}: {alice.grade} ({alice.gpa})")

Output:

{'the': [0, 4], 'cat': [1], 'sat': [2], 'on': [3], 'mat': [5]}
Alice: A (3.9)

`pathlib` — Modern File Path Handling

🔗 Spaced Review (Ch 10): In Chapter 10, you learned to read and write files. The pathlib module gives you a cleaner way to work with file paths — no more messy string concatenation with slashes.

from pathlib import Path

# Create a path object
data_dir = Path("data")
report_file = data_dir / "reports" / "weekly.txt"
print(f"Path: {report_file}")
print(f"File name: {report_file.name}")
print(f"Extension: {report_file.suffix}")
print(f"Parent directory: {report_file.parent}")

# Check if a path exists
home = Path.home()
print(f"Home directory: {home}")
print(f"Home exists: {home.exists()}")

# List files in a directory
current = Path(".")
python_files = list(current.glob("*.py"))
print(f"Python files here: {python_files}")

Output (will vary):

Path: data/reports/weekly.txt
File name: weekly.txt
Extension: .txt
Parent directory: data/reports
Home directory: /home/username
Home exists: True
Python files here: [PosixPath('main.py'), PosixPath('helpers.py')]

Notice the / operator for joining paths — much cleaner than os.path.join("data", "reports", "weekly.txt").
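As a quick side-by-side sketch, both approaches describe the same location, but the Path object also carries helper methods that a plain string does not:

```python
import os.path
from pathlib import Path

joined_with_os = os.path.join("data", "reports", "weekly.txt")   # a plain string
joined_with_pathlib = Path("data") / "reports" / "weekly.txt"    # a Path object

# Both spell the same location; Path compares by path components.
print(Path(joined_with_os) == joined_with_pathlib)    # True
print(joined_with_pathlib.with_suffix(".csv").name)   # weekly.csv
print(joined_with_pathlib.parts)                      # ('data', 'reports', 'weekly.txt')
```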

`math` — Mathematical Functions

import math

print(f"Pi: {math.pi}")
print(f"e: {math.e}")
print(f"Square root of 2: {math.sqrt(2):.6f}")
print(f"Ceiling of 4.1: {math.ceil(4.1)}")
print(f"Floor of 4.9: {math.floor(4.9)}")
print(f"Factorial of 6: {math.factorial(6)}")
print(f"Log base 2 of 1024: {math.log2(1024):.1f}")

Output:

Pi: 3.141592653589793
e: 2.718281828459045
Square root of 2: 1.414214
Ceiling of 4.1: 5
Floor of 4.9: 4
Factorial of 6: 720
Log base 2 of 1024: 10.0

`os` and `sys` — System Interaction

These modules let your program interact with the operating system:

import os
import sys

# os: Working with the file system
print(f"Current directory: {os.getcwd()}")
print(f"Files here: {os.listdir('.')[:5]}")

# os.path: Older-style path manipulation (pathlib is preferred)
print(f"Join: {os.path.join('data', 'output.csv')}")
print(f"Exists: {os.path.exists('.')}")

# sys: Python runtime information
print(f"Python version: {sys.version}")
print(f"Platform: {sys.platform}")
print(f"Command-line args: {sys.argv}")

Output (will vary):

Current directory: /home/user/projects
Files here: ['main.py', 'data', 'README.md']
Join: data/output.csv
Exists: True
Python version: 3.12.4 (main, Jun  6 2024, 18:26:44) [GCC 11.4.0]
Platform: linux
Command-line args: ['example.py']

💡 Intuition: Think of the standard library as a well-stocked toolbox that came free with Python. Before you write a function to do something common — counting items, working with dates, handling file paths, generating random numbers — check whether the standard library already has it. Nine times out of ten, it does, and the standard library version is better tested and more reliable than what you'd write from scratch.

12.4 Creating Your Own Modules

You've been importing other people's modules. Now let's create your own.

Here's the grade calculator, currently living in one file. Let's split it into two modules: one for the calculation logic, and one for the display logic.

File: grading.py

"""Grading logic — calculations and letter grade conversion."""

def calculate_average(scores):
    """Return the average of a list of numeric scores."""
    if not scores:
        return 0.0
    return sum(scores) / len(scores)

def letter_grade(score):
    """Convert a numeric score (0-100) to a letter grade."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    else:
        return "F"

def weighted_average(scores, weights):
    """Return the weighted average of scores with corresponding weights."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(s * w for s, w in zip(scores, weights))
    return total / sum(weights)

File: display.py

"""Display formatting — how results appear on screen."""

def print_header(title):
    """Print a centered header with a border."""
    border = "=" * 40
    print(border)
    print(f"{title:^40}")
    print(border)

def print_student_report(name, scores, average, grade):
    """Print a formatted student report."""
    print(f"\nStudent: {name}")
    print(f"Scores:  {', '.join(str(s) for s in scores)}")
    print(f"Average: {average:.1f}")
    print(f"Grade:   {grade}")
    print("-" * 30)

File: main.py

"""Grade calculator — main program that ties everything together."""

import grading
import display

def main():
    display.print_header("Grade Calculator v2.0")

    students = {
        "Alice": [92, 88, 95, 91],
        "Bob": [78, 82, 75, 80],
        "Charlie": [65, 70, 68, 72],
    }

    for name, scores in students.items():
        avg = grading.calculate_average(scores)
        grade = grading.letter_grade(avg)
        display.print_student_report(name, scores, avg, grade)

    print("\nDone!")

main()

Output (when running main.py):

========================================
          Grade Calculator v2.0
========================================

Student: Alice
Scores:  92, 88, 95, 91
Average: 91.5
Grade:   A
------------------------------

Student: Bob
Scores:  78, 82, 75, 80
Average: 78.8
Grade:   C
------------------------------

Student: Charlie
Scores:  65, 70, 68, 72
Average: 68.8
Grade:   D
------------------------------

Done!

All three files must be in the same directory. When main.py says import grading, Python looks for grading.py in the same directory (among other places — see Section 12.8).

Notice how clean main.py is. It's just the orchestration layer — it calls functions from grading and display. If you need to change how letter grades are assigned, you edit grading.py. If you want to change the display format, you edit display.py. Neither change touches main.py.
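grading.py also defines weighted_average(), which main.py never calls. Here is a short sketch of how it behaves. The function body is repeated so the snippet runs on its own, and the 3-to-1 weighting is a made-up example:

```python
def weighted_average(scores, weights):
    """Same logic as weighted_average in grading.py, repeated so this runs alone."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(s * w for s, w in zip(scores, weights))
    return total / sum(weights)

# Hypothetical weighting: the exam counts three times as much as the quiz.
exam, quiz = 90, 80
result = weighted_average([exam, quiz], [3, 1])
print(result)   # 87.5

# Mismatched lengths are rejected instead of silently ignoring a score.
try:
    weighted_average([90, 80], [1])
except ValueError as error:
    print(f"Rejected: {error}")
```

In the multi-file version you would write grading.weighted_average(...) after import grading, exactly like the other two functions.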

✅ Best Practice: Keep your modules focused. Each module should have a clear, single responsibility. If you find yourself describing a module with "and" — "this module handles grading and display and file I/O" — it probably needs to be split.

12.5 The `__name__ == "__main__"` Guard

Here's a scenario that trips up every beginner at least once.

You write grading.py with some functions and add a few test calls at the bottom to verify they work:

"""Grading logic with test calls at the bottom."""

def calculate_average(scores):
    if not scores:
        return 0.0
    return sum(scores) / len(scores)

def letter_grade(score):
    if score >= 90: return "A"
    elif score >= 80: return "B"
    elif score >= 70: return "C"
    elif score >= 60: return "D"
    else: return "F"

# Quick test
print("Testing grading module...")
print(calculate_average([90, 80, 70]))   # Should be 80.0
print(letter_grade(85))                   # Should be B

This works great when you run grading.py directly. But what happens when main.py does import grading? Remember — Python executes the entire file on import. So you'll see:

Testing grading module...
80.0
B

...printed to the console every time someone imports your module. That's not what you want.

The solution is Python's __name__ guard:

"""Grading logic — with proper __name__ guard."""

def calculate_average(scores):
    if not scores:
        return 0.0
    return sum(scores) / len(scores)

def letter_grade(score):
    if score >= 90: return "A"
    elif score >= 80: return "B"
    elif score >= 70: return "C"
    elif score >= 60: return "D"
    else: return "F"

if __name__ == "__main__":
    # This only runs when grading.py is executed directly,
    # NOT when it's imported by another file.
    print("Testing grading module...")
    print(calculate_average([90, 80, 70]))   # 80.0
    print(letter_grade(85))                   # B

How It Works

Every Python module has a built-in variable called __name__. Its value depends on how the file is being used:

If you run the file directly (python grading.py), Python sets __name__ to the string "__main__".
If the file is imported by another file (import grading), Python sets __name__ to the module's name — in this case, the string "grading".

So if __name__ == "__main__": is asking: "Am I the file that was run directly, or was I imported?" If you were run directly, execute the code inside the if block. If you were imported, skip it.

# Demonstrate __name__ behavior
# Save this as demo_name.py

print(f"My __name__ is: {__name__}")

Running directly:

$ python demo_name.py
My __name__ is: __main__

Importing from another file:

import demo_name   # prints: My __name__ is: demo_name
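If you'd like to watch both cases from a single script, the standard library's runpy module can execute a file under a name you choose. This is only a sketch to observe the value: run_path simulates the two situations rather than performing a real import, and the file written here is a throwaway copy of demo_name.py.

```python
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, "demo_name.py")
    script.write_text("seen_name = __name__\n")   # record whatever __name__ is

    # run_path returns the globals of the executed file as a dictionary.
    as_script = runpy.run_path(str(script), run_name="__main__")
    as_module = runpy.run_path(str(script), run_name="demo_name")

print(as_script["seen_name"])   # __main__
print(as_module["seen_name"])   # demo_name
```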

💡 Intuition: Think of __name__ as a module's self-awareness. It knows whether it's the "main character" (run directly) or a "supporting actor" (imported by someone else). The guard lets the module behave differently in each role.

✅ Best Practice: Every Python file that could be either imported or run directly should have a __name__ guard. Put your "entry point" code — the stuff that should only run when the file is the main program — inside if __name__ == "__main__":. This is so standard that most Python developers consider it mandatory.

12.6 Packages: Organizing Multiple Modules

A module is a single .py file. A package is a directory containing multiple related modules.

Once you have three, four, five modules, they need structure. You don't just dump them all in the same directory with unrelated files. You organize them into a package.

Package Structure

Here's what a package looks like on disk:

my_project/
    main.py
    grading/
        __init__.py
        calculations.py
        display.py
        utils.py

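The grading/ directory is a package. The key ingredient is __init__.py, a file that tells Python "this directory is a package, not just a random folder." It can be empty, or it can contain initialization code that runs the first time the package is imported. (Since Python 3.3, a directory without __init__.py can still be imported as a "namespace package," but include the file unless you specifically need that behavior.)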

Creating a Package

Let's turn our grade calculator modules into a proper package.

File: grading/__init__.py

"""The grading package — tools for grade calculation and display."""

# You can leave this empty, or import key items for convenience:
from .calculations import calculate_average, letter_grade
from .display import print_header, print_student_report

File: grading/calculations.py

"""Grade calculation functions."""

def calculate_average(scores):
    """Return the average of a list of numeric scores."""
    if not scores:
        return 0.0
    return sum(scores) / len(scores)

def letter_grade(score):
    """Convert a numeric score (0-100) to a letter grade."""
    if score >= 90: return "A"
    elif score >= 80: return "B"
    elif score >= 70: return "C"
    elif score >= 60: return "D"
    else: return "F"

File: grading/display.py

"""Grade display and formatting functions."""

def print_header(title):
    """Print a centered header with a border."""
    border = "=" * 40
    print(border)
    print(f"{title:^40}")
    print(border)

def print_student_report(name, scores, average, grade):
    """Print a formatted student report."""
    print(f"\nStudent: {name}")
    print(f"Scores:  {', '.join(str(s) for s in scores)}")
    print(f"Average: {average:.1f}")
    print(f"Grade:   {grade}")
    print("-" * 30)

File: main.py (at the project root, outside the package)

"""Main program using the grading package."""

# Option A: Import the package (uses __init__.py convenience imports)
from grading import calculate_average, letter_grade, print_header, print_student_report

# Option B: Import specific modules
# from grading import calculations, display
# Then use: calculations.calculate_average(...), display.print_header(...)

# Option C: Import submodules directly
# from grading.calculations import calculate_average, letter_grade
# from grading.display import print_header, print_student_report

def main():
    print_header("Grade Calculator v3.0 (Package Edition)")

    students = {
        "Alice": [92, 88, 95, 91],
        "Bob": [78, 82, 75, 80],
    }

    for name, scores in students.items():
        avg = calculate_average(scores)
        grade = letter_grade(avg)
        print_student_report(name, scores, avg, grade)

if __name__ == "__main__":
    main()

Output:

========================================
Grade Calculator v3.0 (Package Edition)
========================================

Student: Alice
Scores:  92, 88, 95, 91
Average: 91.5
Grade:   A
------------------------------

Student: Bob
Scores:  78, 82, 75, 80
Average: 78.8
Grade:   C
------------------------------

Relative Imports

Notice the dots in __init__.py:

from .calculations import calculate_average, letter_grade

The . means "from this same package." It's a relative import — it says "import from calculations.py which is in the same directory as me." This is different from an absolute import like from grading.calculations import ..., which uses the full package path.

Relative imports only work inside packages. You'll see them in __init__.py and when one module in a package imports from another module in the same package.
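To try a relative import without creating files by hand, the sketch below builds a miniature package in a temporary directory and imports it. The package name grading_demo is invented for this example so that it can't clash with your real grading package:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp, "grading_demo")   # hypothetical package name
    package_dir.mkdir()
    (package_dir / "calculations.py").write_text(
        "def calculate_average(scores):\n"
        "    return sum(scores) / len(scores) if scores else 0.0\n"
    )
    # The leading dot means: look inside this same package.
    (package_dir / "__init__.py").write_text(
        "from .calculations import calculate_average\n"
    )

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        grading_demo = importlib.import_module("grading_demo")
        average = grading_demo.calculate_average([90, 80, 70])
    finally:
        sys.path.remove(tmp)

print(average)   # 80.0
```

Because __init__.py re-exports calculate_average, callers can reach it directly on the package without knowing which submodule defines it.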

🔄 Check Your Understanding

1. What is the purpose of __init__.py in a package directory?

2. What does the . mean in from .calculations import calculate_average?

3. If you have a package utils/ with modules math_helpers.py and string_helpers.py, how would you import the clean_text() function from string_helpers.py?

Verify

1. __init__.py tells Python that the directory is a package (not just a folder). It can be empty, or it can contain initialization code and convenience imports.

2. The . means "from this same package." It's a relative import.

3. from utils.string_helpers import clean_text (absolute import from outside the package) or from .string_helpers import clean_text (relative import from inside the package).

12.7 Installing Third-Party Packages with `pip`

The standard library is enormous, but it doesn't cover everything. The broader Python ecosystem — hosted on the Python Package Index (PyPI) at pypi.org — contains over 500,000 third-party packages. These cover everything from web frameworks (flask, django) to data science (pandas, numpy) to game development (pygame) to terminal formatting (rich).

You install third-party packages using pip, the package installer that ships with Python.

Basic `pip` Commands

# Install a package
pip install requests

# Install a specific version
pip install requests==2.31.0

# Upgrade a package
pip install --upgrade requests

# Uninstall a package
pip uninstall requests

# List installed packages
pip list

# Show details about a package
pip show requests

# Freeze current packages (for reproducibility)
pip freeze > requirements.txt

# Install all packages from a requirements file
pip install -r requirements.txt

A Quick Example: `requests`

The requests library makes HTTP requests clean and simple (compared to Python's built-in urllib):

# After: pip install requests
import requests

response = requests.get("https://api.github.com")
print(f"Status: {response.status_code}")
print(f"Content type: {response.headers['Content-Type']}")
data = response.json()
print(f"GitHub API URL: {data['current_user_url']}")

Output:

Status: 200
Content type: application/json; charset=utf-8
GitHub API URL: https://api.github.com/user

⚠️ Pitfall: On some systems, pip is called pip3 (to distinguish it from Python 2's pip). If pip install fails, try pip3 install. You can also use python -m pip install to be explicit about which Python installation you're using.
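When you aren't sure whether a package landed in the Python you're actually running, you can ask from inside Python. The sketch below uses importlib.metadata from the standard library. The helper name installed_version and the second package name are invented for illustration:

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(distribution_name):
    """Return the installed version string, or None if it isn't installed here."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None

# The second name is deliberately fake, to show the "not installed" branch.
for name in ("requests", "surely-not-a-real-package-name"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed in this Python'}")
```

If requests shows as not installed even though pip install succeeded, pip and python probably point at different installations, which is exactly the situation python -m pip install avoids.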

🔗 Bridge to Chapter 23: Chapter 23 covers virtual environments — isolated Python installations for each project. In real work, you'll create a virtual environment for every project to avoid version conflicts between different projects' dependencies. For now, pip install in your global Python installation is fine for learning.

12.8 The Module Search Path

When you write import grading, how does Python actually find grading.py? It follows a specific search order, stored in sys.path:

import sys

for path in sys.path:
    print(path)

Output (will vary):

/home/user/projects
/usr/lib/python312.zip
/usr/lib/python3.12
/usr/lib/python3.12/lib-dynload
/usr/local/lib/python3.12/dist-packages

Python searches these directories in order:

1. The directory containing the script being run (or the current directory in the REPL).
2. Directories in the PYTHONPATH environment variable (if set).
3. The standard library directories.
4. The site-packages directory (where pip installs third-party packages).

The first match wins. If you have a file called random.py in your project directory and you import random, Python will import your file instead of the standard library's random module — because your directory is checked first. This is a common and devastating bug. We'll cover it in Section 12.9.

💡 Intuition: Think of sys.path as a row of shelves. Python goes left to right, checking each one. The moment it finds a file matching the name you imported, it stops looking. If you accidentally placed a file with the same name as a standard library module on the leftmost shelf, Python grabs it and never reaches the real one.
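You can watch "first match wins" happen with two temporary directories that each contain a module of the same name. Everything here is made up for the demonstration, including the module name shelf_demo:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as first_dir, \
        tempfile.TemporaryDirectory() as second_dir:
    # Two different files with the same module name, on two "shelves".
    Path(first_dir, "shelf_demo.py").write_text('WHERE = "first"\n')
    Path(second_dir, "shelf_demo.py").write_text('WHERE = "second"\n')

    sys.path[:0] = [first_dir, second_dir]   # first_dir is searched before second_dir
    try:
        importlib.invalidate_caches()
        shelf_demo = importlib.import_module("shelf_demo")
        winner = shelf_demo.WHERE
    finally:
        sys.path.remove(first_dir)
        sys.path.remove(second_dir)

print(winner)   # first
```

The copy in second_dir is never opened, because the search stopped at the first directory that had a match.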

12.9 Common Pitfalls

Pitfall 1: Name Shadowing

This is the most common module-related bug. You create a file called random.py in your project directory, and suddenly import random breaks everything:

🐛 Debugging Walkthrough: Name Shadowing

The symptom:

import random
print(random.randint(1, 10))

AttributeError: module 'random' has no attribute 'randint'

What happened? You have a file called random.py in your project directory. Python found your random.py first (because the script's directory is first in sys.path) and imported it instead of the standard library's random module. Your file doesn't have a randint function, so the AttributeError appears.

The fix: Never name your files after standard library modules. Rename random.py to something that doesn't clash, such as dice_game.py or my_random.py. In Python 3, the cached bytecode in __pycache__ (named something like random.cpython-312.pyc) is ignored once its source file is gone, so deleting it is optional housekeeping.

Prevention: Common names to avoid: random.py, math.py, os.py, sys.py, collections.py, datetime.py, json.py, email.py, test.py, string.py.

Pitfall 2: Circular Imports

🐛 Debugging Walkthrough: Circular Imports

The scenario: You have two modules that import each other.

File: models.py

```python
from display import format_task  # imports display

class Task:
    def __init__(self, title):
        self.title = title

    def __str__(self):
        return format_task(self)
```

File: display.py

```python
from models import Task  # imports models: CIRCULAR!

def format_task(task):
    return f"[Task] {task.title}"

def show_all(tasks):
    for task in tasks:
        if isinstance(task, Task):
            print(format_task(task))
```

The symptom: ImportError: cannot import name 'Task' from partially initialized module 'models' (most likely due to a circular import)

What happened? Python tries to import models.py. The first line of models.py says from display import format_task, so Python pauses loading models.py and starts loading display.py. The first line of display.py says from models import Task, but models.py isn't finished loading yet — Python hasn't gotten to the class Task definition. So Task doesn't exist, and the import fails.

Three fixes:

1. Restructure to remove the cycle. Often the cleanest solution. Ask: does display.py really need to import models? Could the isinstance check be done differently?
2. Move the import inside a function. Delay the import until it's actually needed:

       def show_all(tasks):
           from models import Task  # imported at call time, not load time
           for task in tasks:
               if isinstance(task, Task):
                   print(format_task(task))

3. Use a third module. Move shared code into a separate module that both can import without cycles.
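To see the difference between the failing layout and the deferred-import fix without creating files by hand, the sketch below writes both versions of display.py into temporary directories and tries to import models each time. The helper `try_import` and the source strings are illustrative assumptions built from the walkthrough above; they are not part of TaskFlow.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODELS_SRC = '''\
from display import format_task

class Task:
    def __init__(self, title):
        self.title = title

    def __str__(self):
        return format_task(self)
'''

CIRCULAR_DISPLAY_SRC = '''\
from models import Task  # top-level import: completes the cycle

def format_task(task):
    return f"[Task] {task.title}"
'''

DEFERRED_DISPLAY_SRC = '''\
def format_task(task):
    return f"[Task] {task.title}"

def show_all(tasks):
    from models import Task  # deferred until show_all() is called
    for task in tasks:
        if isinstance(task, Task):
            print(format_task(task))
'''

def try_import(display_src):
    """Write models.py and display.py to a temp dir, import models, report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "models.py").write_text(MODELS_SRC, encoding="utf-8")
        Path(tmp, "display.py").write_text(display_src, encoding="utf-8")
        sys.path.insert(0, tmp)
        try:
            models = importlib.import_module("models")
            return str(models.Task("Write chapter 12"))
        except ImportError as exc:
            return f"ImportError: {exc}"
        finally:
            # Clean up so each attempt starts from a fresh import state.
            sys.path.remove(tmp)
            sys.modules.pop("models", None)
            sys.modules.pop("display", None)

circular_result = try_import(CIRCULAR_DISPLAY_SRC)
deferred_result = try_import(DEFERRED_DISPLAY_SRC)
print(circular_result)
print(deferred_result)
```

The first attempt reports the ImportError from the walkthrough; the second succeeds because display.py no longer needs models while it is being loaded. The import inside `show_all` runs only when the function is called, by which point models has finished loading.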

Pitfall 3: Import Side Effects

If a module runs code at the top level (outside any function), that code executes when the module is imported. This can cause unexpected behavior:

# bad_module.py
print("Initializing bad_module...")      # Runs on import!
data = open("config.txt").read()          # Runs on import!
connection = connect_to_database()        # Runs on import!

Anyone who writes import bad_module will trigger a print statement, a file read, and a database connection — even if they only wanted one function from the module.

The fix: Put side-effect code inside functions or behind a __name__ guard. Module-level code should be limited to function and class definitions, constants, and simple assignments.
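As a rough sketch of that fix, here is one way the file-reading part of bad_module.py could be reorganized. The names `good_module`, `load_config`, and `main` are illustrative assumptions; the database connection from the original is left out because `connect_to_database` is not defined anywhere in the chapter.

```python
"""good_module.py: importing this file performs no I/O and prints nothing."""

from pathlib import Path

CONFIG_PATH = Path("config.txt")  # a plain constant is fine at module level

def load_config(path=CONFIG_PATH):
    """Read the config file only when a caller explicitly asks for it."""
    return Path(path).read_text(encoding="utf-8")

def main():
    """Side effects live here, so they run only when the file is executed directly."""
    print("Initializing good_module...")
    if CONFIG_PATH.exists():
        print(load_config())
    else:
        print(f"No {CONFIG_PATH} found; nothing to load.")

if __name__ == "__main__":
    main()
```

With this layout, `import good_module` only defines a constant and two functions. A caller who wants the configuration calls `good_module.load_config()` at a moment of their choosing, and can wrap that call in try/except if the file might be missing.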

🔗 Spaced Review (Ch 11): Remember the EAFP philosophy from error handling? The same principle applies to imports. When a module might not be installed, wrapping the import in try/except ImportError provides a graceful fallback:

try:
    import rich
    HAS_RICH = True
except ImportError:
    HAS_RICH = False  # Fall back to plain print() if rich isn't installed
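The flag set by an optional import is only useful if later code consults it. A minimal sketch of that second half follows; the `show` function is a hypothetical helper, and the bold markup assumes rich's console markup syntax.

```python
try:
    from rich import print as rich_print  # optional third-party dependency
    HAS_RICH = True
except ImportError:
    HAS_RICH = False

def show(message):
    """Print with rich styling when rich is installed, plain text otherwise."""
    if HAS_RICH:
        rich_print(f"[bold]{message}[/bold]")
    else:
        print(message)

show("Tasks loaded.")
```

The rest of the program calls `show()` and never needs to know whether rich is present.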

12.10 Project Checkpoint: TaskFlow v1.1

🔗 Spaced Review (Ch 6): In Chapter 6, you learned that functions are the building blocks of organized code. In Chapter 10, you added file persistence. In Chapter 11, you added error handling. Now we take the next step: splitting TaskFlow into multiple modules, each with a clear responsibility.

It's time to split TaskFlow from one monolithic file into a well-organized multi-module project. Here's the new structure:

taskflow/
    main.py
    models.py
    storage.py
    display.py
    cli.py

Each file has a single, clear responsibility:

File	Responsibility
`models.py`	Task data creation and manipulation
`storage.py`	Loading and saving tasks to JSON
`display.py`	Formatting and printing task information
`cli.py`	Menu display, user input handling
`main.py`	Entry point — wires everything together

`models.py` — Task Data

"""TaskFlow models — task creation and manipulation."""

from datetime import datetime

def create_task(title, priority="medium", category="general"):
    """Create a new task dictionary with metadata."""
    return {
        "title": title,
        "priority": priority,
        "category": category,
        "created": datetime.now().isoformat(),
        "completed": False,
    }

def complete_task(task):
    """Mark a task as completed."""
    task["completed"] = True
    task["completed_at"] = datetime.now().isoformat()

def matches_search(task, keyword):
    """Check if a task matches a search keyword (case-insensitive)."""
    keyword_lower = keyword.lower()
    return (
        keyword_lower in task["title"].lower()
        or keyword_lower in task["category"].lower()
    )

if __name__ == "__main__":
    # Quick test
    t = create_task("Write chapter 12", priority="high", category="writing")
    print(f"Created: {t}")
    print(f"Matches 'chapter': {matches_search(t, 'chapter')}")
    print(f"Matches 'cooking': {matches_search(t, 'cooking')}")

`storage.py` — JSON Persistence

"""TaskFlow storage — load and save tasks to a JSON file."""

import json
from pathlib import Path

DEFAULT_FILE = Path("tasks.json")

def load_tasks(filepath=DEFAULT_FILE):
    """Load tasks from a JSON file. Return empty list if file doesn't exist."""
    filepath = Path(filepath)
    if not filepath.exists():
        return []
    try:
        with open(filepath, "r", encoding="utf-8") as f:
            return json.load(f)
    except (json.JSONDecodeError, OSError) as e:
        print(f"Warning: Could not load tasks from {filepath}: {e}")
        return []

def save_tasks(tasks, filepath=DEFAULT_FILE):
    """Save tasks to a JSON file."""
    filepath = Path(filepath)
    try:
        with open(filepath, "w", encoding="utf-8") as f:
            json.dump(tasks, f, indent=2, ensure_ascii=False)
    except OSError as e:
        print(f"Error: Could not save tasks to {filepath}: {e}")

if __name__ == "__main__":
    # Quick test: save and load
    test_tasks = [
        {"title": "Test task", "priority": "high", "completed": False}
    ]
    test_file = Path("test_tasks.json")
    save_tasks(test_tasks, test_file)
    loaded = load_tasks(test_file)
    print(f"Saved and loaded: {loaded}")
    test_file.unlink()  # Clean up test file
    print("Test file cleaned up.")

`display.py` — Formatting and Output

"""TaskFlow display — formatting and printing task information."""

def print_header():
    """Print the TaskFlow application header."""
    print("\n" + "=" * 44)
    print("        TaskFlow v1.1 — Task Manager")
    print("=" * 44)

def print_task(task, index):
    """Print a single task with its index number."""
    priority = task.get("priority", "medium")
    # Map each priority to a fixed-width marker so titles line up.
    marker = {"high": "!!!", "medium": " ! ", "low": "   "}.get(priority, "   ")
    check = "[x]" if task.get("completed") else "[ ]"
    print(f"  {index:>3}. {check} {marker} {task['title']}")
    if task.get("category", "general") != "general":
        print(f"            Category: {task['category']}")

def print_task_list(tasks):
    """Print all tasks in a formatted list."""
    if not tasks:
        print("\n  No tasks yet. Add one with option 1!")
        return
    print(f"\n  Your Tasks ({len(tasks)} total):")
    print("  " + "-" * 40)
    for i, task in enumerate(tasks, 1):
        print_task(task, i)
    print()

def print_message(message):
    """Print an informational message."""
    print(f"\n  >> {message}")

if __name__ == "__main__":
    # Quick visual test
    print_header()
    sample_tasks = [
        {"title": "Buy groceries", "priority": "medium", "completed": False, "category": "personal"},
        {"title": "Submit report", "priority": "high", "completed": True, "category": "work"},
        {"title": "Read chapter 12", "priority": "low", "completed": False, "category": "general"},
    ]
    print_task_list(sample_tasks)

`cli.py` — Menu and User Input

"""TaskFlow CLI — menu display and user input handling."""

MENU = """
  What would you like to do?
  1. Add a task
  2. List all tasks
  3. Complete a task
  4. Search tasks
  5. Delete a task
  6. Quit
"""

def show_menu():
    """Display the main menu and return the user's choice."""
    print(MENU)
    while True:
        choice = input("  Enter choice (1-6): ").strip()
        if choice in ("1", "2", "3", "4", "5", "6"):
            return choice
        print("  Invalid choice. Please enter 1-6.")

def get_task_input():
    """Prompt user for task details. Return a dict of inputs."""
    title = input("  Task title: ").strip()
    if not title:
        return None

    priority = input("  Priority (high/medium/low) [medium]: ").strip().lower()
    if priority not in ("high", "medium", "low"):
        priority = "medium"

    category = input("  Category [general]: ").strip().lower()
    if not category:
        category = "general"

    return {"title": title, "priority": priority, "category": category}

def get_task_number(max_num):
    """Prompt user for a task number. Return the number or None."""
    try:
        num = int(input(f"  Task number (1-{max_num}): "))
        if 1 <= num <= max_num:
            return num
        print(f"  Please enter a number between 1 and {max_num}.")
        return None
    except ValueError:
        print("  Please enter a valid number.")
        return None

def get_search_keyword():
    """Prompt user for a search keyword."""
    return input("  Search keyword: ").strip()

if __name__ == "__main__":
    # Quick test
    print("Testing CLI module...")
    choice = show_menu()
    print(f"You chose: {choice}")

`main.py` — Entry Point

"""
TaskFlow v1.1 — A multi-module command-line task manager.

This is the entry point. Run this file to start TaskFlow.
"""

import models
import storage
import display
import cli

def run():
    """Main application loop."""
    tasks = storage.load_tasks()
    display.print_header()
    display.print_message(f"Loaded {len(tasks)} task(s) from disk.")

    while True:
        choice = cli.show_menu()

        if choice == "1":
            # Add a task
            task_input = cli.get_task_input()
            if task_input:
                task = models.create_task(**task_input)
                tasks.append(task)
                storage.save_tasks(tasks)
                display.print_message(f"Added: '{task_input['title']}'")
            else:
                display.print_message("No title entered. Task not added.")

        elif choice == "2":
            # List all tasks
            display.print_task_list(tasks)

        elif choice == "3":
            # Complete a task
            display.print_task_list(tasks)
            if tasks:
                num = cli.get_task_number(len(tasks))
                if num:
                    models.complete_task(tasks[num - 1])
                    storage.save_tasks(tasks)
                    display.print_message(
                        f"Completed: '{tasks[num - 1]['title']}'"
                    )

        elif choice == "4":
            # Search tasks
            keyword = cli.get_search_keyword()
            if keyword:
                results = [t for t in tasks if models.matches_search(t, keyword)]
                display.print_message(
                    f"Found {len(results)} task(s) matching '{keyword}':"
                )
                display.print_task_list(results)
            else:
                display.print_message("No keyword entered.")

        elif choice == "5":
            # Delete a task
            display.print_task_list(tasks)
            if tasks:
                num = cli.get_task_number(len(tasks))
                if num:
                    removed = tasks.pop(num - 1)
                    storage.save_tasks(tasks)
                    display.print_message(f"Deleted: '{removed['title']}'")

        elif choice == "6":
            # Quit
            storage.save_tasks(tasks)
            display.print_message("Tasks saved. Goodbye!")
            break

if __name__ == "__main__":
    run()

Why This Structure Matters

Compare the single-file TaskFlow from Chapter 11 — all the logic in one big file — to this version. Each module can be:

Understood independently. You can read display.py without knowing anything about storage.py.
Tested independently. Run python models.py to test just the model functions.
Modified independently. Change how tasks are stored (switch from JSON to SQLite) by editing only storage.py. Nothing else in the project changes.
Reused. If you build a different project that needs JSON file storage, you can reuse storage.py with minimal changes.
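To make the "modified independently" point concrete, here is a hedged sketch of what a SQLite-backed storage.py might look like. It keeps the same two function names and signatures, so main.py would not need to change. The table layout (one JSON-encoded row per task) and the file name tasks.db are assumptions made for this illustration, and unlike the JSON version, the error handling is omitted for brevity.

```python
"""Hypothetical storage.py variant: same interface, SQLite instead of a JSON file."""

import json
import sqlite3
from pathlib import Path

DEFAULT_FILE = Path("tasks.db")  # assumed file name for this sketch

def _connect(filepath):
    """Open the database and make sure the tasks table exists."""
    conn = sqlite3.connect(filepath)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "position INTEGER PRIMARY KEY, data TEXT NOT NULL)"
    )
    return conn

def load_tasks(filepath=DEFAULT_FILE):
    """Load all tasks, in their saved order, as a list of dictionaries."""
    conn = _connect(filepath)
    try:
        rows = conn.execute("SELECT data FROM tasks ORDER BY position").fetchall()
        return [json.loads(data) for (data,) in rows]
    finally:
        conn.close()

def save_tasks(tasks, filepath=DEFAULT_FILE):
    """Replace the stored tasks with the given list."""
    conn = _connect(filepath)
    try:
        with conn:  # commits on success, rolls back if an error occurs
            conn.execute("DELETE FROM tasks")
            conn.executemany(
                "INSERT INTO tasks (position, data) VALUES (?, ?)",
                [(i, json.dumps(task)) for i, task in enumerate(tasks)],
            )
    finally:
        conn.close()
```

Because callers only ever use `storage.load_tasks()` and `storage.save_tasks(tasks)`, swapping this file in leaves models.py, display.py, cli.py, and main.py untouched. One behavioral difference worth noting: this version creates the database file on first load, whereas the JSON version returns an empty list without creating anything.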

This is the payoff of modular thinking. The same abstraction principle you learned with functions in Chapter 6 — hide complexity behind a clean interface — now applies at the file level. Each module is a black box with a clear API: import it, call its functions, don't worry about the internals.

🔄 Check Your Understanding

1. Why does main.py import four separate modules instead of putting all the code in one file?

2. What would happen if you renamed models.py to json.py? Why would this be a problem?

3. In the TaskFlow structure, which file would you modify to add a new display format? Which file stays unchanged?

Verify

1. Separation of concerns: each module handles one responsibility (data, storage, display, input). This makes the code easier to understand, test, modify, and reuse. Changes to one area don't ripple into others.

2. Renaming to json.py would shadow the standard library json module. When storage.py runs import json, Python would load your json.py instead of the standard library's json module, so calls such as json.load would fail with an AttributeError. main.py's import models would also fail, because no file named models.py would exist anymore.

3. You'd modify display.py for display changes. models.py, storage.py, and cli.py would stay unchanged. That's the benefit of separation of concerns.

Chapter Summary

You've leveled up from writing single-file scripts to organizing code across multiple files — the way professional Python developers do it. Here's what you've learned:

Modules are .py files. You import them to reuse code across files. Python gives you three import styles (import X, from X import Y, import X as Z), each with trade-offs in readability and namespace clarity.

The standard library is your free toolkit of 200+ modules. Before you write a function to count items, generate random numbers, work with dates, or handle file paths — check the standard library first. collections.Counter, datetime, random, pathlib, and math will cover an enormous range of common tasks.

Your own modules are just .py files with functions. Save them in the same directory (or in a package) and import them. Use the __name__ == "__main__" guard to write files that work as both importable modules and standalone scripts.

Packages are directories of related modules with an __init__.py file. They're the next level of organization when you have multiple modules that belong together.

pip installs third-party packages from PyPI, giving you access to over 500,000 community-built tools. In Chapter 23, you'll learn to manage these dependencies properly with virtual environments.

The module search path (sys.path) determines where Python looks for modules. The most important rule: never name your files after standard library modules.

And most importantly, you've applied all of this to TaskFlow — splitting a monolithic script into models.py, storage.py, display.py, cli.py, and main.py. Each module has a single responsibility, can be tested independently, and can be modified without touching the others. That's the power of modular design.

What's Next: In Chapter 13, you'll learn to write automated tests for your code — and that modular structure you just created will make testing dramatically easier. It's no coincidence that well-organized code is also testable code.
