Chapter 22 Exercises: Scheduling and Task Automation

These exercises are organized into five tiers. Complete each tier before advancing. All exercises use the schedule library as the baseline; Tier 4 and 5 exercises extend to APScheduler and OS-level scheduling.


Tier 1: Comprehension (Understanding the Concepts)

Exercise 22.1 — Scheduling Vocabulary

Match each term to its correct definition:

Terms: scheduler loop, job, trigger, cron expression, heartbeat, task history, grace time, coalesce

Definitions:

a) A string format encoding when a task should run, using fields for minute, hour, day, month, and weekday
b) A file or database record showing the execution history and results of past scheduled runs
c) A small file or database record updated frequently to confirm a scheduler process is still alive and working
d) The maximum delay allowed past a job's scheduled time before it is considered "misfired"
e) The continuous loop that periodically checks whether any scheduled jobs are due to run
f) The behavior of merging multiple missed runs into a single execution when catching up after downtime
g) A unit of scheduled work: a Python function plus a rule defining when it should execute
h) The definition of when a job should run (a specific time, an interval, or a calendar rule)

Exercise 22.2 — Schedule Syntax Reading

For each schedule library statement, describe in plain English when the job will run:

a) schedule.every().day.at("09:00").do(send_daily_digest)
b) schedule.every(15).minutes.do(check_api_status)
c) schedule.every().monday.at("07:45").do(generate_weekly_report)
d) schedule.every().hour.at(":30").do(log_system_metrics)
e) schedule.every().friday.at("16:30").do(send_weekend_summary)
f) schedule.every(2).hours.do(backup_database)
g) schedule.every().weekday.at("08:00").do(morning_checklist)

Exercise 22.3 — Crontab Reading

Translate each crontab expression into plain English:

a) 0 8 * * 1-5
b) 45 7 * * 1
c) 0 0 1 * *
d) */30 9-17 * * 1-5
e) 0 6,12,18 * * *
f) 0 9 * * 1
g) 30 23 * * 5

Exercise 22.4 — Identifying Scheduling Problems

Review this scheduled job implementation and identify every problem:

import schedule
import time

def generate_report():
    data = open("sales_data.csv").read()
    report = process_data(data)
    send_email("sandra@acme.com", report)
    print("Report sent!")

schedule.every().monday.at("8:00").do(generate_report)

while True:
    schedule.run_pending()

List every problem and explain why each one could cause issues in production.

Exercise 22.5 — Error Handling Design

A scheduled job load_and_process() can fail in three different ways:

  1. The input CSV file is missing (FileNotFoundError)
  2. The CSV is present but malformed (csv.Error)
  3. The database connection for saving results times out (TimeoutError)

For each failure mode, describe:

  - What the error handling should do (log? alert? retry? skip?)
  - Whether the scheduler should continue running after this failure
  - What information the error log should contain
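A defensive structure for these three failure modes might look like the following sketch. The `load` and `save` callables are hypothetical stand-ins for the real CSV-reading and database-saving steps, injected so each failure can be simulated:

```python
import csv
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")

def load_and_process(load, save):
    """Run one pipeline pass; return True on success, False on a handled failure."""
    try:
        rows = load()
        save(rows)
        return True
    except FileNotFoundError as exc:
        # Missing input: alert loudly, skip this run, keep the scheduler alive.
        logger.error("Input file missing: %s", exc)
        return False
    except csv.Error as exc:
        # Malformed CSV: log enough detail to reproduce, skip this run.
        logger.error("Malformed CSV: %s", exc)
        return False
    except TimeoutError as exc:
        # Transient database timeout: a retry on the next run is usually enough.
        logger.warning("Database timeout, will retry next run: %s", exc)
        return False
```

In every branch the function returns instead of re-raising, so the scheduler loop keeps running; which branches should also send an alert is part of the design question above.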


Tier 2: Guided Practice

Exercise 22.6 — Your First Scheduled Job

Write a script that:

  1. Defines a function log_current_time() that appends the current timestamp to a file called time_log.txt
  2. Schedules it to run every 1 minute
  3. Runs the scheduler for 5 minutes total
  4. After 5 minutes, prints how many entries are in time_log.txt

Expected output: After 5 minutes, time_log.txt should contain approximately 5 timestamp entries (plus or minus one, depending on timing).
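With the schedule library this is `schedule.every(1).minutes.do(log_current_time)` plus a loop calling `schedule.run_pending()`. The dependency-free sketch below shows the key idea of this exercise: a scheduler loop bounded by a deadline rather than `while True` (intervals shortened here for illustration):

```python
import time
from datetime import datetime

LOG_FILE = "time_log.txt"

def log_current_time():
    """Append the current timestamp to the log file."""
    with open(LOG_FILE, "a", encoding="utf-8") as f:
        f.write(datetime.now().isoformat() + "\n")

def run_for(total_seconds, interval_seconds, job):
    """Run `job` every `interval_seconds` until `total_seconds` have elapsed."""
    deadline = time.monotonic() + total_seconds
    next_run = time.monotonic()
    while time.monotonic() < deadline:
        if time.monotonic() >= next_run:
            job()
            next_run += interval_seconds
        time.sleep(0.05)  # short sleep so the loop does not busy-wait

# For the exercise itself: run_for(5 * 60, 60, log_current_time)
```

Counting the lines in time_log.txt afterward is a plain file read: `len(open(LOG_FILE).readlines())`.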

Exercise 22.7 — Multiple Jobs on Different Schedules

Write a scheduler script that runs three jobs simultaneously:

Job 1: health_check_fast — runs every 2 minutes, prints "Fast check OK" with the current time

Job 2: health_check_medium — runs every 5 minutes, prints "Medium check OK" and counts how many times it has run

Job 3: health_check_slow — runs every 10 minutes, prints "Slow check OK" and writes to a file called slow_check.log

Run the scheduler for 15 minutes. Verify:

  - health_check_fast ran approximately 7-8 times
  - health_check_medium ran approximately 3 times
  - health_check_slow ran approximately 1-2 times

Exercise 22.8 — Error Handling in Scheduled Jobs

Write a scheduled job that:

  1. Simulates a random failure: generates a random number 1-10 and raises an exception if the number is 4 or less (a 40% failure rate)
  2. Is wrapped in proper error handling — the scheduler must continue even when the job fails
  3. Logs each success and each failure with the current timestamp
  4. After 10 runs, prints a summary: "X of 10 runs succeeded, Y failed"

Requirement: The scheduler must never crash due to this job's failures. Verify by running it for at least 10 executions.
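One common pattern is a decorator that traps every exception the job raises, so nothing propagates up to the scheduler loop. A sketch (the `results` counter supplies the summary numbers):

```python
import functools
import logging
import random
from datetime import datetime

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("jobs")

results = {"success": 0, "failure": 0}

def catch_errors(func):
    """Wrap a job so exceptions are logged and counted instead of crashing the scheduler."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            func(*args, **kwargs)
            results["success"] += 1
            logger.info("%s succeeded at %s", func.__name__, datetime.now())
        except Exception:
            results["failure"] += 1
            logger.exception("%s failed at %s", func.__name__, datetime.now())
    return wrapper

@catch_errors
def flaky_job():
    if random.randint(1, 10) <= 4:  # 40% failure rate
        raise RuntimeError("simulated failure")
```

The bare `except Exception` is deliberate here: for a scheduled job, any unhandled error should be logged and swallowed, never allowed to kill the loop.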

Exercise 22.9 — Logging to File

Take the health_check_fast job from Exercise 22.7 and upgrade it to:

  1. Log to both console AND a rotating file (health_checks.log)
  2. Use TimedRotatingFileHandler to rotate the file daily
  3. Console output: INFO level and above (timestamp + message)
  4. File output: DEBUG level and above (timestamp + level + function name + message)
  5. Log a DEBUG message with the check details and an INFO message with just the status

Run for 5 minutes and verify the log file was created and contains the expected entries.
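A dual-handler setup along these lines satisfies the requirements (a sketch; the latency and status code in the DEBUG message are invented placeholders):

```python
import logging
from logging.handlers import TimedRotatingFileHandler

logger = logging.getLogger("health")
logger.setLevel(logging.DEBUG)

# Console: INFO and above, timestamp + message only.
console = logging.StreamHandler()
console.setLevel(logging.INFO)
console.setFormatter(logging.Formatter("%(asctime)s %(message)s"))

# File: DEBUG and above, rotated daily at midnight, keeping a week of history.
file_handler = TimedRotatingFileHandler(
    "health_checks.log", when="midnight", backupCount=7
)
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(funcName)s %(message)s")
)

logger.addHandler(console)
logger.addHandler(file_handler)

def health_check_fast():
    logger.debug("check details: latency=12ms, status_code=200")  # placeholder details
    logger.info("Fast check OK")
```

Because the logger itself is set to DEBUG, each handler's own level decides what it keeps: the console drops the DEBUG line, the file keeps both.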


Tier 3: Applied

Exercise 22.10 — Scheduled File Monitor

Build a script that:

  1. Monitors a directory called watched_files/ every 5 minutes
  2. Maintains a dictionary tracking the last-modified time of each file in the directory
  3. When a file is modified or a new file appears, logs a message: "File changed: filename.csv at HH:MM:SS"
  4. When a file is deleted, logs: "File removed: filename.csv"
  5. After each check, saves the current file inventory to file_inventory.json

Testing: Run the monitor, then manually create, modify, and delete files in watched_files/ while the scheduler runs. Verify the log correctly reflects your actions.
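The core of the monitor is a snapshot diff: compare the current filename-to-mtime mapping against the previous one. A sketch of that check function (scheduling it every 5 minutes is left to the exercise):

```python
import json
from datetime import datetime
from pathlib import Path

WATCH_DIR = Path("watched_files")
WATCH_DIR.mkdir(exist_ok=True)

previous = {}  # filename -> last-modified time from the previous check

def check_directory():
    """Diff the current snapshot against the previous one and report changes."""
    current = {p.name: p.stat().st_mtime for p in WATCH_DIR.iterdir() if p.is_file()}
    now = datetime.now().strftime("%H:%M:%S")
    for name, mtime in current.items():
        if name not in previous or mtime != previous[name]:
            print(f"File changed: {name} at {now}")  # covers new and modified files
    for name in previous.keys() - current.keys():
        print(f"File removed: {name}")
    previous.clear()
    previous.update(current)
    # Persist the inventory after every check.
    Path("file_inventory.json").write_text(json.dumps(current, indent=2))
```

Storing mtimes as plain floats keeps the JSON inventory trivially serializable.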

Exercise 22.11 — Business Hours Scheduler

Write a scheduler that:

  1. Runs a process_incoming_orders() job every 30 minutes
  2. But ONLY during business hours: Monday–Friday, 9:00 AM to 5:00 PM
  3. If the job is triggered outside business hours (which shouldn't happen with proper scheduling, but handle it defensively), log a warning and skip
  4. At 5:00 PM each Friday, runs a separate end_of_week_summary() job
  5. Writes a daily log entry at 8:55 AM: "Business day starting — scheduler active"

Requirement: The scheduler must determine whether it's currently business hours and behave correctly for any time the script is started.
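The defensive guard reduces to one predicate. A sketch, assuming "business hours" means 9:00 inclusive to 5:00 PM exclusive (whether a 5:00 PM trigger counts is a design choice you should state in your solution):

```python
import logging
from datetime import datetime, time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("orders")

def is_business_hours(now=None):
    """True Monday through Friday, 09:00 (inclusive) to 17:00 (exclusive)."""
    now = now or datetime.now()
    return now.weekday() < 5 and time(9, 0) <= now.time() < time(17, 0)

def process_incoming_orders():
    if not is_business_hours():
        logger.warning("Triggered outside business hours, skipping")
        return
    logger.info("Processing incoming orders")  # real work would go here
```

Taking `now` as an optional parameter makes the predicate trivially testable with fixed datetimes.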

Exercise 22.12 — Scheduled Report with State Tracking

Build a script that generates a "daily status report" on a schedule:

  1. Runs every day at a time you specify (or every 2 minutes for testing)
  2. Each run calculates:
     - Current date and time
     - How many times the report has run since the script started
     - Days until the end of the current month
     - Days until the end of the current year
  3. Saves each report to a file: daily_status_YYYY-MM-DD.txt
  4. Maintains a run_history.csv with: run_number, timestamp, status (success/failure), report_filename

Requirement: The run history CSV must persist correctly — if you run the script multiple times, the run numbers should continue from where they left off, not restart at 1.
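The persistence requirement comes down to reading the last run number back from the CSV at startup. A sketch:

```python
import csv
from pathlib import Path

HISTORY = Path("run_history.csv")

def next_run_number():
    """Continue numbering across restarts by reading the last row of the history CSV."""
    if not HISTORY.exists():
        return 1
    with HISTORY.open(newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    data_rows = rows[1:]  # skip the header row
    if not data_rows:
        return 1
    return int(data_rows[-1][0]) + 1

def record_run(run_number, timestamp, status, filename):
    """Append one run record, writing the header first if the file is new."""
    new_file = not HISTORY.exists()
    with HISTORY.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["run_number", "timestamp", "status", "report_filename"])
        writer.writerow([run_number, timestamp, status, filename])
```

Appending with mode "a" and only writing the header on first creation is what lets run numbers survive restarts.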

Exercise 22.13 — APScheduler Basic Setup

Install APScheduler (pip install apscheduler) and rewrite the scheduler from Exercise 22.7 using APScheduler's BackgroundScheduler:

  1. Use IntervalTrigger for health_check_fast (every 2 minutes)
  2. Use IntervalTrigger for health_check_medium (every 5 minutes)
  3. Use CronTrigger for health_check_slow (at specific times: 00, 10, 20, 30, 40, 50 minutes past each hour)
  4. Print the next scheduled run time for each job after setup
  5. Handle KeyboardInterrupt to shut down the scheduler gracefully

Tier 4: Challenge

Exercise 22.14 — Complete Scheduled Pipeline

Design and implement a complete "daily business operations" scheduler for a fictional company. Requirements:

Jobs to schedule:

  1. morning_data_sync — 6:00 AM weekdays: "syncs" data by reading a CSV, transforming it, and writing a processed version
  2. hourly_metrics_snapshot — every hour 8 AM–6 PM weekdays: calculates and logs 5 business metrics from the CSV
  3. daily_summary_email — 5:30 PM weekdays: generates a summary report from the day's metric snapshots and "sends" it (log the email content to a file)
  4. weekly_trend_report — Monday 7:00 AM: reads the last 7 days of snapshots and generates a trend report

Non-negotiable requirements:

  - Every job must be wrapped in error handling (decorator pattern from the chapter)
  - Every job must log start, end, and duration
  - The scheduler must write a heartbeat file every 2 minutes
  - The scheduler must handle Ctrl+C gracefully

Testing: Run for 20 minutes and verify all jobs executed correctly and the logs are complete.
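Two of the non-negotiable pieces can be sketched together as follows (the heartbeat filename is a placeholder; wiring the decorator and the 2-minute heartbeat into your schedule calls is the exercise):

```python
import functools
import logging
import time
from datetime import datetime
from pathlib import Path

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ops")

HEARTBEAT_FILE = Path("scheduler_heartbeat.txt")  # placeholder filename

def timed_job(func):
    """Log start, end, and duration; never let an exception escape the job."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("START %s", func.__name__)
        started = time.monotonic()
        try:
            func(*args, **kwargs)
        except Exception:
            logger.exception("FAILED %s", func.__name__)
        finally:
            logger.info("END %s (%.2fs)", func.__name__, time.monotonic() - started)
    return wrapper

def write_heartbeat():
    """Overwrite the heartbeat file; a monitor alerts if its timestamp goes stale."""
    HEARTBEAT_FILE.write_text(datetime.now().isoformat())
```

The try/finally ensures the END line (and the duration) is logged even for failed runs.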

Exercise 22.15 — APScheduler with Persistent Job Store

Build a scheduler that:

  1. Uses APScheduler with a SQLite job store (jobs persist across restarts)
  2. Has 3 scheduled jobs
  3. Adds a job management interface that accepts keyboard commands:
     - list — print all current jobs and their next run time
     - pause JOB_ID — pause a specific job
     - resume JOB_ID — resume a paused job
     - run JOB_ID — run a job immediately
     - quit — shut down gracefully
  4. When you restart the script (after Ctrl+C), the jobs are loaded from the SQLite database — no need to re-add them

Exercise 22.16 — Windows Task Scheduler / Cron Setup

Create a scheduled task at the OS level for a Python script:

Windows users:

  1. Write a Python script that appends the current timestamp to scheduled_run_log.txt
  2. Create a Windows Task Scheduler task that runs this script every 5 minutes
  3. After 30 minutes, verify the log file has approximately 6 entries
  4. Export the task definition to XML (right-click → Export in Task Scheduler)
  5. Write instructions for how a colleague could import and configure this task

macOS/Linux users:

  1. Write the same Python script
  2. Create a crontab entry: */5 * * * * /path/to/python /path/to/script.py >> /path/to/log.txt 2>&1
  3. After 30 minutes, verify the log has approximately 6 entries
  4. Document the crontab entry with a comment explaining when it runs


Tier 5: Mastery

Exercise 22.17 — Extend the Scheduler Demo

The scheduler_demo.py file in this chapter demonstrates several patterns. Extend it with:

  1. Job dependencies: Add a generate_report() job that only runs if fetch_and_save_exchange_rates() has successfully run in the past 6 hours (check the output file's modification time)
  2. Dynamic scheduling: Add a job that reads a job_config.json file at startup and adds jobs based on the configuration (so you can add new jobs by editing the JSON without modifying Python code)
  3. Performance metrics: After each scheduler loop iteration, log how long run_pending() took. Alert if any single call takes more than 5 seconds (indicating a slow job is blocking the loop)
  4. Recovery: If the scheduler starts and detects that a job that was supposed to run in the past 2 hours has no successful run record, run it immediately before entering the main loop
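For the job-dependency requirement (item 1), the freshness check might look like this sketch; the output filename is a placeholder for whatever fetch_and_save_exchange_rates() actually writes:

```python
import time
from pathlib import Path

RATES_FILE = Path("exchange_rates.json")  # placeholder for the fetch job's output
MAX_AGE_SECONDS = 6 * 60 * 60             # 6 hours

def rates_are_fresh():
    """True if the fetch job's output file was modified within the last 6 hours."""
    if not RATES_FILE.exists():
        return False
    age = time.time() - RATES_FILE.stat().st_mtime
    return age <= MAX_AGE_SECONDS

def generate_report():
    if not rates_are_fresh():
        print("Skipping report: exchange rates are stale or missing")
        return
    print("Generating report from fresh rates")
```

Using the file's mtime as the dependency signal keeps the two jobs decoupled: neither needs to know the other's schedule, only the artifact they share.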

Exercise 22.18 — Extend the Report Pipeline

The report_pipeline.py generates a Monday Excel report. Extend it with:

  1. Incremental updates: Add a generate_daily_delta_report() function that generates a lightweight daily report showing only changes from the previous day (new customers, significant revenue changes, threshold breaches)
  2. Multiple recipients with custom content: Modify send_report_email() to accept a recipients list, where each recipient can have a different subset of the report data (Sandra gets the executive summary; regional managers get only their region's data)
  3. Report archiving: After generating the report, move it to an archive directory organized by year/month (reports/2024/11/acme_weekly_report_20241104.xlsx)
  4. Email retry: If send_report_email() fails, save the failed email to a pending_emails/ directory. Add a separate job that checks this directory every 15 minutes and retries any pending emails.

Exercise 22.19 — Build Maya's cron Setup

Maya wants to deploy her automation to a Linux cloud server instead of her laptop. Design and implement the full deployment:

  1. Write a setup_maya_cron.sh shell script that:
     - Creates the required directory structure
     - Installs Python dependencies
     - Sets up the .env file (with placeholder values)
     - Adds all crontab entries using crontab -e or direct file editing
     - Verifies the setup by testing each script with --dry-run

  2. Modify both pipeline scripts to accept a --dry-run flag that:
     - Goes through all steps (load data, calculate metrics) but does not send emails
     - Prints what it would have done instead

  3. Add a monitoring cron job that runs at 7:00 AM daily and:
     - Checks whether the invoice monitor ran yesterday (by checking the log file)
     - Checks whether it found any critical issues
     - Sends a one-line status email: "All systems go" or "CHECK LOGS: [issue]"
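The --dry-run flag from item 2 can be sketched with argparse; the pipeline body here is a stub standing in for the real load/calculate/send steps:

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Invoice pipeline")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="run all steps but do not send emails",
    )
    return parser

def run_pipeline(dry_run=False):
    # ... load data and calculate metrics as usual ...
    if dry_run:
        print("[dry-run] would have sent summary email")
        return "dry-run"
    # send_email(...)  # the real send, omitted in this sketch
    return "sent"

if __name__ == "__main__":
    args = build_parser().parse_args()
    run_pipeline(dry_run=args.dry_run)
```

argparse maps the flag --dry-run to the attribute args.dry_run automatically, which is what the setup script's verification step would invoke.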


Answer Key Notes

Exercise 22.1: a-cron expression, b-task history, c-heartbeat, d-grace time, e-scheduler loop, f-coalesce, g-job, h-trigger

Exercise 22.2:

a) Every day at 9:00 AM
b) Every 15 minutes, measured from when the scheduler starts (interval jobs are not aligned to the clock)
c) Every Monday at 7:45 AM
d) Every hour, at 30 minutes past the hour
e) Every Friday at 4:30 PM
f) Every 2 hours, measured from when the scheduler starts
g) Every weekday (Monday through Friday) at 8:00 AM

Exercise 22.3:

a) 8:00 AM every weekday (Monday–Friday)
b) 7:45 AM every Monday
c) Midnight on the first day of every month
d) Every 30 minutes (at :00 and :30) from 9 AM through 5 PM on weekdays (last run at 5:30 PM)
e) 6:00 AM, noon, and 6:00 PM every day
f) 9:00 AM every Monday
g) 11:30 PM every Friday

Exercise 22.4: Problems include: (1) no try/except around the job logic — any error crashes the job and kills the scheduler loop; (2) open("sales_data.csv") uses a relative path — it will fail unless the working directory is exactly right; (3) the file is opened but never explicitly closed (use a with statement); (4) no logging — there is no record of when the job ran or whether it succeeded; (5) the while True loop calls run_pending() with no time.sleep(), so it busy-waits and consumes a full CPU core; (6) the loop has no graceful shutdown handling (e.g., catching KeyboardInterrupt); (7) "8:00" should be "08:00" — the schedule library requires zero-padded HH:MM times.


Exercises are designed for Python 3.10+, schedule 1.1+, and APScheduler 3.10+.