Chapter 15 Exercises: Building Command-Line Tools and Scripts

DataField.Dev

Tier 1: Recall and Understanding (Exercises 1-6)

These exercises test your knowledge of CLI concepts and terminology.

Exercise 1: CLI Architecture Layers

List the five layers of a well-structured CLI application as described in this chapter, from outermost (user-facing) to innermost (core functionality). For each layer, provide one example of what belongs there.

Exercise 2: argparse vs. click Comparison

Fill in the following comparison table:

Feature	argparse	click
Installation requirement	?	?
Coding style	?	?
Testing support	?	?
Subcommand nesting	?	?

Exercise 3: Exit Code Meanings

Match each exit code with its conventional meaning:

Exit code 0
Exit code 1
Exit code 2
Exit code 130

Meanings: Usage error, General error, Success, Terminated by Ctrl+C

Exercise 4: Configuration Precedence

A CLI tool reads configuration from four sources. Rank them from lowest to highest priority:

A. Environment variables
B. Built-in defaults
C. Command-line flags
D. Configuration file

Exercise 5: Logging vs. Printing

For each of the following outputs, state whether it should use print() (stdout) or logging (stderr):

The list of files found by a search command
"Processing 42 files..."
"Warning: config file not found, using defaults"
A CSV export of processed data
"Error: permission denied on /etc/shadow"
Debug information about which regex pattern was selected

Exercise 6: Packaging Terminology

Define each of the following terms in one sentence:

- console_scripts entry point
- pyproject.toml
- Editable install (pip install -e .)
- Wheel (.whl file)
- Source distribution (.tar.gz)

Tier 2: Application (Exercises 7-12)

These exercises require you to apply concepts to concrete tasks.

Exercise 7: argparse Subcommand Parser

Write an argparse setup function that creates a parser for a tool called imgutil with:

- A global --verbose flag
- A resize subcommand that accepts --width (int, required), --height (int, required), and a positional input_file argument
- A convert subcommand that accepts --format (choices: png, jpg, webp), --quality (int, default 85), and a positional input_file argument
- A metadata subcommand that accepts a positional input_file and a --json flag for JSON output
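As a starting point, the subparser mechanism can be sketched on a different tool. This is a minimal sketch, not a solution: the notes tool, its add and list subcommands, and their options are hypothetical names chosen only to show how argparse wires a global flag and subcommands together.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical `notes` tool with two subcommands."""
    parser = argparse.ArgumentParser(prog="notes")
    # A flag on the top-level parser must appear before the subcommand name.
    parser.add_argument("--verbose", action="store_true")

    # dest records which subcommand was chosen; required=True rejects a bare `notes`.
    subparsers = parser.add_subparsers(dest="command", required=True)

    add = subparsers.add_parser("add", help="Add a note")
    add.add_argument("text")
    add.add_argument("--priority", type=int, default=3)

    show = subparsers.add_parser("list", help="List notes")
    show.add_argument("--limit", type=int, default=10)

    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--verbose", "add", "buy milk", "--priority", "1"])
print(args.command, args.text, args.priority, args.verbose)
```

The same pattern extends to imgutil by adding one add_parser call per subcommand and attaching that subcommand's own arguments to it.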

Exercise 8: Click Command Group

Rewrite the imgutil tool from Exercise 7 using click decorators. Use @click.group() for the main command and @cli.command() for each subcommand. Use click.Path(exists=True) for the input file argument.

Exercise 9: Configuration Loader

Write a function load_config(config_path: Path | None = None) -> dict that:

1. Starts with these defaults: {"output_format": "text", "max_files": 100, "verbose": False}
2. Loads and merges settings from a TOML file if config_path is provided and exists
3. Overrides with any environment variables that start with IMGUTIL_
4. Returns the merged configuration dictionary

Exercise 10: Logging Setup

Write a setup_logging function that accepts a verbosity integer (0, 1, or 2) and configures the logging module:

- Verbosity 0: Only show WARNING and above on the console
- Verbosity 1: Show INFO and above on the console
- Verbosity 2: Show DEBUG and above on the console with timestamps
- Always write DEBUG to a file called imgutil.log
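The mechanism behind this exercise is that one logger can feed several handlers, each with its own level and format. The following is a rough sketch of that mechanism only: the function name build_logger, the logger name "demo", and the use of in-memory streams instead of a real console and log file are all assumptions made to keep the example self-contained. Mapping the verbosity integer to levels is left to you.

```python
import io
import logging
from typing import TextIO


def build_logger(console_stream: TextIO, detail_stream: TextIO) -> logging.Logger:
    """Attach two handlers with different thresholds to one logger."""
    logger = logging.getLogger("demo")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)      # let every record through; handlers filter
    logger.handlers.clear()             # avoid duplicate handlers on repeated calls
    logger.propagate = False            # keep records away from the root logger

    console = logging.StreamHandler(console_stream)
    console.setLevel(logging.WARNING)
    console.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

    detail = logging.StreamHandler(detail_stream)
    detail.setLevel(logging.DEBUG)
    detail.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    logger.addHandler(console)
    logger.addHandler(detail)
    return logger


console_buf, detail_buf = io.StringIO(), io.StringIO()
log = build_logger(console_buf, detail_buf)
log.debug("selected pattern")
log.warning("config file not found")
print(console_buf.getvalue(), end="")  # only the warning reaches the console stream
```

In a real tool the second handler would typically be a logging.FileHandler and the console handler would write to sys.stderr.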

Exercise 11: Safe File Writer

Write a function safe_write_json(path: Path, data: dict) -> None that:

1. Serializes data to JSON with 2-space indentation
2. Writes to a temporary file in the same directory first
3. Renames the temporary file to the target path (atomic write)
4. Cleans up the temporary file if an error occurs during writing
5. Creates parent directories if they do not exist
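The write-then-rename idea in steps 2 to 4 can be sketched for plain text. This is a minimal sketch under stated assumptions: atomic_write_text is a hypothetical helper name, the parent directory is assumed to exist already, and JSON serialization and directory creation are deliberately left out.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Write text so that readers never observe a half-written target file."""
    # The temp file lives next to the target so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        # os.replace overwrites an existing target in a single step.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the leftover temp file, then let the original error propagate.
        Path(tmp_name).unlink(missing_ok=True)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "out.txt"
    atomic_write_text(target, "hello")
    print(target.read_text(encoding="utf-8"))
```

Creating the temporary file in the system temp directory instead would risk a cross-device rename, which is why the sketch passes dir=path.parent.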

Exercise 12: Progress Bar Integration

Write a function that takes a list of file paths and computes their SHA-256 checksums, displaying a rich progress bar during processing. Return a dictionary mapping file paths to their hex digest strings.
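The hashing half of this exercise can be sketched without the progress bar. This is an illustrative sketch for a single file: sha256_of and the 64 KiB chunk size are assumptions, and wrapping the loop over many files in a rich progress display is left to you.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    sample = Path(workdir) / "sample.bin"
    sample.write_bytes(b"abc")
    print(sha256_of(sample))
```

Reading in chunks keeps memory use flat even for large files, which matters once the function is applied to a whole list of paths.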

Tier 3: Analysis and Problem-Solving (Exercises 13-18)

These exercises require analyzing code, identifying issues, and designing solutions.

Exercise 13: Code Review -- Error Handling

Review the following code and identify at least five problems with its error handling:

import json
from pathlib import Path

def process_files(directory, pattern):
    files = list(Path(directory).glob(pattern))
    results = []
    for f in files:
        data = open(f).read()
        result = json.loads(data)
        results.append(result)
    print(f"Processed {len(results)} files")
    return results

Rewrite it with proper error handling, including specific exception types, resource cleanup, logging, and appropriate exit codes.

Exercise 14: Architecture Refactoring

The following code mixes all five CLI layers together. Identify which lines belong to which layer and refactor it into at least three separate functions with clear responsibilities:

import argparse
import os
import sys

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("input_file")
    parser.add_argument("--uppercase", action="store_true")
    args = parser.parse_args()

    if not os.path.exists(args.input_file):
        print(f"Error: {args.input_file} not found")
        sys.exit(1)

    with open(args.input_file) as f:
        content = f.read()

    if args.uppercase:
        content = content.upper()

    word_count = len(content.split())
    print(f"Words: {word_count}")
    print(content[:200])

Exercise 15: Configuration Conflict Resolution

A user has the following configuration:

~/.config/mytool/config.toml:

[output]
format = "json"
max_lines = 500

Environment:

export MYTOOL_OUTPUT_FORMAT=csv

Command line:

mytool process --max-lines 1000 data.txt

What should the final effective configuration be for format and max_lines? Explain the precedence at each level.

Exercise 16: Prompt Engineering for CLI Tools

Write the prompt you would give to an AI assistant to generate a CLI tool called csvtool that can:

- Merge multiple CSV files into one
- Filter rows by column value
- Convert between CSV and JSON
- Show summary statistics (row count, column count, unique values per column)

Your prompt should specify: the CLI framework, argument structure for each command, error handling expectations, output formatting, and any libraries to use. Follow the specification-driven prompting approach from Chapter 10.

Exercise 17: Testing Strategy

For a CLI tool with the following commands -- add, list, delete, and export -- design a testing strategy. List:

1. Five unit tests for the business logic layer (not the CLI layer)
2. Three integration tests using click's CliRunner
3. Two edge case tests
4. How you would mock file system operations

Exercise 18: Stdin/Stdout Pipeline Design

Design a CLI tool called textstat that computes statistics on text input. It should:

- Accept input from a file argument OR from stdin (piped input)
- Output results to stdout in text or JSON format
- Be chainable with other Unix tools (e.g., cat file.txt | textstat | jq .word_count)

Write the argument parsing code and the input-reading function. Explain how your design supports pipeline composition.
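One common way to handle the file-or-stdin requirement is sketched here as a hint. It assumes a hypothetical helper read_text and the widespread convention that a missing path or "-" means standard input; the stdin parameter exists only so the function can be exercised without a real pipe.

```python
import io
import sys
from typing import Optional, TextIO


def read_text(source: Optional[str], stdin: Optional[TextIO] = None) -> str:
    """Return text from a file path, or from stdin when no path (or "-") is given."""
    if source is None or source == "-":
        stream = stdin if stdin is not None else sys.stdin
        return stream.read()
    with open(source, encoding="utf-8") as handle:
        return handle.read()


# Simulate piped input with an in-memory stream.
print(read_text("-", io.StringIO("piped text\n")), end="")
```

Keeping results on stdout and diagnostics on stderr is what lets the tool sit in the middle of a pipeline without corrupting the data passed downstream.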

Tier 4: Synthesis and Creation (Exercises 19-24)

These exercises require building complete features or tools.

Exercise 19: Complete CLI Tool -- Directory Analyzer

Build a CLI tool called dirstat that analyzes a directory and reports:

- Total number of files and subdirectories
- File count by extension (e.g., .py: 45, .md: 12)
- Total size, average size, and largest file
- Files modified in the last N days (configurable)

Requirements:

- Use click for argument parsing
- Support --format text|json output
- Support --recursive flag
- Include a --verbose mode that logs each file as it is scanned
- Use rich for table output in text mode
- Handle permission errors gracefully (skip inaccessible files, report count at end)

Exercise 20: Configuration System

Build a reusable ConfigManager class that:

- Supports TOML, YAML, and JSON config file formats (auto-detected by extension)
- Implements the full precedence chain: defaults < config file < env vars < CLI args
- Provides .get(key, default) with dot-notation for nested keys
- Provides .set(key, value) for runtime overrides
- Can save the current configuration back to a TOML file
- Has a dump() method that shows all settings and their sources (useful for --dump-config flags)
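The dot-notation lookup is the least obvious piece, so here is one possible sketch of it as a standalone function. The name get_dotted and the choice to return the default when an intermediate value is not a dictionary are assumptions; the class design, precedence chain, and source tracking remain part of the exercise.

```python
from typing import Any


def get_dotted(settings: dict, key: str, default: Any = None) -> Any:
    """Look up a nested value using a dotted key such as "output.format"."""
    current: Any = settings
    for part in key.split("."):
        # Stop as soon as the path leaves the nested dictionaries.
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


config = {"output": {"format": "json", "max_lines": 500}}
print(get_dotted(config, "output.format"))
print(get_dotted(config, "output.color", default="auto"))
```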

Exercise 21: Interactive Setup Wizard

Build an interactive setup wizard using click.prompt() and click.confirm() that:

- Asks the user for a project name, description, and author
- Offers a choice of license (MIT, Apache 2.0, GPL 3.0)
- Asks which optional features to enable (linting, testing, CI)
- Confirms all choices before proceeding
- Creates a project directory with a pyproject.toml based on the choices
- Also supports a --non-interactive mode where all options are passed as flags

Exercise 22: File Processing Pipeline

Build a text file processing pipeline tool called textpipe that:

- Reads input files (or stdin)
- Applies a configurable chain of transformations: lowercase, remove blank lines, strip whitespace, deduplicate lines, sort lines
- Writes the result to an output file (or stdout)
- Shows a progress bar when processing multiple files
- Takes the transformations to apply as flags: --lowercase, --strip, --dedup, --sort, --no-blanks
- Applies transformations in a consistent order regardless of flag order

Exercise 23: Plugin System

Design and implement a simple plugin system for a CLI tool:

- The main tool defines a Plugin abstract base class with name, description, and execute(args) methods
- Plugins are Python files in a ~/.toolname/plugins/ directory
- The tool discovers and loads plugins at startup
- Each plugin registers itself as a new subcommand
- Write the plugin loader, a sample plugin, and the CLI integration code

Exercise 24: Complete Packaging

Take the dirstat tool from Exercise 19 and package it for distribution:

- Create the full project structure with src/ layout
- Write a complete pyproject.toml with all metadata
- Add a __main__.py for python -m dirstat support
- Create a CHANGELOG.md with one entry
- Write a Makefile (or justfile) with targets: install, dev, test, lint, build
- Explain the steps to upload to TestPyPI

Tier 5: Evaluation and Critical Thinking (Exercises 25-30)

These exercises require judgment, comparison, and critical analysis.

Exercise 25: Framework Comparison

You are choosing a CLI framework for a new tool that will:

- Have 15+ subcommands organized in 3 groups
- Be maintained by a team of 5 developers
- Need comprehensive test coverage
- Be distributed via pip to thousands of users

Evaluate argparse, click, and typer (a click-based framework that uses type hints). For each, list pros and cons for this specific scenario. Make a recommendation and justify it.

Exercise 26: Error Message Audit

Rate each of the following error messages on a scale of 1-5 for helpfulness, and rewrite any that score below 4:

Error: ENOENT
Error: Invalid input
FileNotFoundError: [Errno 2] No such file or directory: 'config.toml'
Error: Cannot open 'data.csv'. The file may be locked by another program or you may not have read permission. Try closing other programs that might be using this file, or check the file permissions with 'ls -la data.csv'.
Error: Expected integer for --port, got 'abc'. Valid range: 1-65535.

Exercise 27: Security Review

A CLI tool reads a config file and uses values from it to construct file paths and shell commands:

config = load_config("config.toml")
output_dir = config["output_directory"]
os.makedirs(output_dir, exist_ok=True)

command = f"pandoc {config['input_file']} -o {output_dir}/output.pdf"
os.system(command)

Identify all security vulnerabilities in this code. Explain each vulnerability, its potential impact, and how to fix it.

Exercise 28: Performance Analysis

A CLI tool processes 10,000 small JSON files (1-10 KB each). The current implementation reads all files into memory, processes them, and writes the results. It takes 45 seconds and uses 2 GB of RAM.

Explain why this approach is problematic
Propose a streaming alternative that reduces memory usage to under 100 MB
Propose a parallel processing approach using concurrent.futures to reduce wall-clock time
Discuss the trade-offs between the two approaches
Write the prompt you would give to an AI to implement your preferred solution

Exercise 29: Backward Compatibility

Your CLI tool v1.0 has this interface:

mytool convert <input> <output> --format json

In v2.0, you want to change it to:

mytool convert <input> --output <output> --format json

Design a migration strategy that:

- Keeps v1.0 commands working (with a deprecation warning)
- Introduces the new interface
- Provides a migration guide for users
- Includes a timeline for removing the old interface

Write the argument parsing code that supports both interfaces simultaneously.

Exercise 30: AI Code Review

Use an AI assistant to generate a CLI tool that converts Markdown to HTML. Then critically review the generated code by answering:

Does it handle all common Markdown syntax (headers, lists, code blocks, links, images)?
Is the error handling comprehensive? What edge cases are missed?
Is the code organized into proper layers, or is it monolithic?
Does it support stdin/stdout for pipeline use?
Would you be comfortable distributing this tool to other developers? What would you change first?
What security concerns exist if processing untrusted Markdown input?

Document your review with specific code references and improvement suggestions.