Chapter 22 Exercises: Scheduling and Task Automation
These exercises are organized into five tiers. Complete each tier before advancing. All exercises use the schedule library as the baseline; Tier 4 and 5 exercises extend to APScheduler and OS-level scheduling.
Tier 1: Comprehension (Understanding the Concepts)
Exercise 22.1 — Scheduling Vocabulary
Match each term to its correct definition:
Terms: scheduler loop, job, trigger, cron expression, heartbeat, task history, grace time, coalesce
Definitions:
a) A string format encoding when a task should run, using fields for minute, hour, day, month, and weekday
b) A file or database record showing the execution history and results of past scheduled runs
c) A small file or database record updated frequently to confirm a scheduler process is still alive and working
d) The maximum delay allowed past a job's scheduled time before it is considered "misfired"
e) The continuous loop that periodically checks whether any scheduled jobs are due to run
f) The behavior of merging multiple missed runs into a single execution when catching up after downtime
g) A unit of scheduled work: a Python function plus a rule defining when it should execute
h) The definition of when a job should run (a specific time, an interval, or a calendar rule)
Exercise 22.2 — Schedule Syntax Reading
For each schedule library statement, describe in plain English when the job will run:
a) schedule.every().day.at("09:00").do(send_daily_digest)
b) schedule.every(15).minutes.do(check_api_status)
c) schedule.every().monday.at("07:45").do(generate_weekly_report)
d) schedule.every().hour.at(":30").do(log_system_metrics)
e) schedule.every().friday.at("16:30").do(send_weekend_summary)
f) schedule.every(2).hours.do(backup_database)
g) schedule.every().weekday.at("08:00").do(morning_checklist)
Exercise 22.3 — Crontab Reading
Translate each crontab expression into plain English:
a) 0 8 * * 1-5
b) 45 7 * * 1
c) 0 0 1 * *
d) */30 9-17 * * 1-5
e) 0 6,12,18 * * *
f) 0 9 * * 1
g) 30 23 * * 5
Exercise 22.4 — Identifying Scheduling Problems
Review this scheduled job implementation and identify every problem:
import schedule
import time

def generate_report():
    data = open("sales_data.csv").read()
    report = process_data(data)
    send_email("sandra@acme.com", report)
    print("Report sent!")

schedule.every().monday.at("8:00").do(generate_report)

while True:
    schedule.run_pending()
List every problem and explain why each one could cause issues in production.
Exercise 22.5 — Error Handling Design
A scheduled job load_and_process() can fail in three different ways:
- The input CSV file is missing (FileNotFoundError)
- The CSV is present but malformed (csv.Error)
- The database connection for saving results times out (TimeoutError)

For each failure mode, describe:
- What the error handling should do (log? alert? retry? skip?)
- Whether the scheduler should continue running after this failure
- What information the error log should contain
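As a hint for the shape of an answer, here is one possible mapping of failure modes to policies, sketched as a wrapper that takes the job function. The policy choices (skip vs. retry) and the returned status strings are illustrative, not the only correct answers:

```python
import csv
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")

def run_job_safely(job):
    """Run the job once, mapping each failure mode to a policy.

    Returns a short status string so a caller can see which branch ran."""
    try:
        job()
        return "ok"
    except FileNotFoundError as exc:
        # Missing input: alert and skip. Retrying immediately will not
        # help until the file appears; the scheduler keeps running.
        logger.error("Input file missing: %s", exc)
        return "skipped"
    except csv.Error as exc:
        # Malformed data: log the parse error for a human to inspect;
        # retrying the same bad file would fail identically.
        logger.error("Malformed CSV: %s", exc)
        return "skipped"
    except TimeoutError as exc:
        # Transient infrastructure failure: safe to retry on a later run.
        logger.warning("Database timeout, will retry next run: %s", exc)
        return "retry"
```

In every branch the exception is caught, so the scheduler loop that called the job keeps running.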
Tier 2: Guided Practice
Exercise 22.6 — Your First Scheduled Job
Write a script that:
- Defines a function log_current_time() that appends the current timestamp to a file called time_log.txt
- Schedules it to run every 1 minute
- Runs the scheduler for 5 minutes total
- After 5 minutes, prints how many entries are in time_log.txt
Expected output: After 5 minutes, time_log.txt should contain approximately 5 timestamp entries (plus or minus one, depending on timing).
Exercise 22.7 — Multiple Jobs on Different Schedules
Write a scheduler script that runs three jobs simultaneously:
Job 1: health_check_fast — runs every 2 minutes, prints "Fast check OK" with the current time
Job 2: health_check_medium — runs every 5 minutes, prints "Medium check OK" and counts how many times it has run
Job 3: health_check_slow — runs every 10 minutes, prints "Slow check OK" and writes to a file called slow_check.log
Run the scheduler for 15 minutes. Verify:
- health_check_fast ran approximately 7-8 times
- health_check_medium ran approximately 3 times
- health_check_slow ran approximately 1-2 times
Exercise 22.8 — Error Handling in Scheduled Jobs
Write a scheduled job that:
- Simulates a random failure: generates a random number 1-10, raises an exception if the number is 4 or less (a 40% failure rate)
- Is wrapped in proper error handling — the scheduler must continue even when the job fails
- Logs each success and each failure with the current timestamp
- After 10 runs, prints a summary: "X of 10 runs succeeded, Y failed"
Requirement: The scheduler must never crash due to this job's failures. Verify by running it for at least 10 executions.
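One common shape for the "never crash" requirement is a decorator that catches every exception, logs it, and counts outcomes. A minimal sketch, with an illustrative shared `stats` dict (your solution might track counts differently):

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("jobs")

stats = {"success": 0, "failure": 0}

def safe_job(func):
    """Wrap a job so its exceptions are logged instead of propagating
    up into the scheduler loop."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            result = func(*args, **kwargs)
            stats["success"] += 1
            logger.info("%s succeeded", func.__name__)
            return result
        except Exception:
            stats["failure"] += 1
            # logger.exception records the full traceback for debugging
            logger.exception("%s failed", func.__name__)
            return None
    return wrapper
```

Any job decorated with `@safe_job` returns None on failure rather than raising, so `schedule.run_pending()` never sees the exception.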
Exercise 22.9 — Logging to File
Take the health_check_fast job from Exercise 22.7 and upgrade it to:
- Log to both console AND a rotating file (health_checks.log)
- Use TimedRotatingFileHandler to rotate the file daily
- Console output: INFO level and above (timestamp + message)
- File output: DEBUG level and above (timestamp + level + function name + message)
- Log a DEBUG message with the check details and an INFO message with just the status
Run for 5 minutes and verify the log file was created and contains the expected entries.
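A sketch of the dual-handler setup the exercise asks for, using only the standard-library logging module; the handler levels and format strings follow the bullet list above, and the `backupCount` value is an illustrative choice:

```python
import logging
from logging.handlers import TimedRotatingFileHandler

def build_logger(name="health", logfile="health_checks.log"):
    """Console at INFO and above; daily-rotating file at DEBUG and above."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers on re-run

    console = logging.StreamHandler()
    console.setLevel(logging.INFO)
    console.setFormatter(logging.Formatter("%(asctime)s %(message)s"))

    file_handler = TimedRotatingFileHandler(
        logfile, when="midnight", backupCount=7)  # rotate daily, keep a week
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(funcName)s %(message)s"))

    logger.addHandler(console)
    logger.addHandler(file_handler)
    return logger
```

Inside the job you would then call `logger.debug(...)` with the check details and `logger.info(...)` with the status line.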
Tier 3: Applied
Exercise 22.10 — Scheduled File Monitor
Build a script that:
- Monitors a directory called watched_files/ every 5 minutes
- Maintains a dictionary tracking the last-modified time of each file in the directory
- When a file is modified or a new file appears, logs a message: "File changed: filename.csv at HH:MM:SS"
- When a file is deleted, logs: "File removed: filename.csv"
- After each check, saves the current file inventory to file_inventory.json
Testing: Run the monitor, then manually create, modify, and delete files in watched_files/ while the scheduler runs. Verify the log correctly reflects your actions.
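The core of the monitor is comparing two {filename: mtime} snapshots. A minimal sketch of just that diff logic (the scheduling, logging, and JSON saving are left to you):

```python
from pathlib import Path

def scan_directory(directory):
    """Return {filename: last-modified-time} for every file in directory."""
    return {p.name: p.stat().st_mtime
            for p in Path(directory).iterdir() if p.is_file()}

def diff_inventory(old, new):
    """Compare two snapshots; return (changed, removed) filename lists.

    A file counts as changed if it is new or its mtime differs."""
    changed = [name for name, mtime in new.items()
               if name not in old or old[name] != mtime]
    removed = [name for name in old if name not in new]
    return changed, removed
```

Each scheduled check would call `scan_directory()`, diff against the previous snapshot, log the differences, and then keep the new snapshot as the baseline.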
Exercise 22.11 — Business Hours Scheduler
Write a scheduler that:
- Runs a process_incoming_orders() job every 30 minutes
- But ONLY during business hours: Monday–Friday, 9:00 AM to 5:00 PM
- If the job is triggered outside business hours (which shouldn't happen with proper scheduling, but handle it defensively), log a warning and skip
- At 5:00 PM each Friday, runs a separate end_of_week_summary() job
- Writes a daily log entry at 8:55 AM: "Business day starting — scheduler active"
Requirement: The scheduler must determine whether it's currently business hours and behave correctly for any time the script is started.
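The defensive check is easiest to test if it lives in one small function. A sketch, where treating 9:00 as inclusive and 5:00 PM as exclusive is one reasonable interpretation of the boundary:

```python
import datetime

def is_business_hours(now=None):
    """True Monday-Friday between 09:00 (inclusive) and 17:00 (exclusive).

    Accepts an optional datetime so tests can pin the clock."""
    now = now or datetime.datetime.now()
    if now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return datetime.time(9, 0) <= now.time() < datetime.time(17, 0)
```

Each job then starts with `if not is_business_hours(): log a warning and return` to satisfy the defensive-skip requirement.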
Exercise 22.12 — Scheduled Report with State Tracking
Build a script that generates a "daily status report" on a schedule:
- Runs every day at a time you specify (or every 2 minutes for testing)
- Each run calculates:
  - Current date and time
  - How many times the report has run since the script started
  - Days until end of current month
  - Days until end of current year
- Saves each report to a file: daily_status_YYYY-MM-DD.txt
- Maintains a run_history.csv with: run_number, timestamp, status (success/failure), report_filename
Requirement: The run history CSV must persist correctly — if you run the script multiple times, the run numbers should continue from where they left off, not restart at 1.
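The tricky part is the persistent run counter. A sketch of reading it back from the history CSV so numbering survives restarts (the column name comes from the exercise):

```python
import csv
from pathlib import Path

def next_run_number(history_path="run_history.csv"):
    """Continue numbering from the last row of run_history.csv,
    or start at 1 if the file does not exist yet or has no data rows."""
    path = Path(history_path)
    if not path.exists():
        return 1
    with path.open(newline="", encoding="utf-8") as fh:
        rows = list(csv.DictReader(fh))
    return int(rows[-1]["run_number"]) + 1 if rows else 1
```

Each report run calls `next_run_number()` before appending its own row, so restarting the script picks up where the file left off.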
Exercise 22.13 — APScheduler Basic Setup
Install APScheduler (pip install apscheduler) and rewrite the scheduler from Exercise 22.7 using APScheduler's BackgroundScheduler:
- Use IntervalTrigger for health_check_fast (every 2 minutes)
- Use IntervalTrigger for health_check_medium (every 5 minutes)
- Use CronTrigger for health_check_slow (at specific times: 00, 10, 20, 30, 40, 50 minutes past each hour)
- Print the next scheduled run time for each job after setup
- Handle KeyboardInterrupt to shut down the scheduler gracefully
Tier 4: Challenge
Exercise 22.14 — Complete Scheduled Pipeline
Design and implement a complete "daily business operations" scheduler for a fictional company. Requirements:
Jobs to schedule:
1. morning_data_sync — 6:00 AM weekdays: "syncs" data by reading a CSV, transforming it, and writing a processed version
2. hourly_metrics_snapshot — every hour 8 AM–6 PM weekdays: calculates and logs 5 business metrics from the CSV
3. daily_summary_email — 5:30 PM weekdays: generates a summary report from the day's metric snapshots and "sends" it (log the email content to a file)
4. weekly_trend_report — Monday 7:00 AM: reads the last 7 days of snapshots and generates a trend report
Non-negotiable requirements:
- Every job must be wrapped in error handling (decorator pattern from the chapter)
- Every job must log start, end, and duration
- The scheduler must write a heartbeat file every 2 minutes
- The scheduler must handle Ctrl+C gracefully
Testing: Run for 20 minutes and verify all jobs executed correctly and the logs are complete.
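The heartbeat requirement is small enough to sketch in full; the filename is an illustrative choice:

```python
import datetime
from pathlib import Path

HEARTBEAT = Path("scheduler_heartbeat.txt")

def write_heartbeat():
    """Overwrite the heartbeat file with the current timestamp.
    Schedule this every 2 minutes alongside the business jobs."""
    HEARTBEAT.write_text(datetime.datetime.now().isoformat(), encoding="utf-8")

def heartbeat_age_seconds():
    """Seconds since the last heartbeat. External monitoring can alert
    when this exceeds a few scheduler intervals."""
    last = datetime.datetime.fromisoformat(HEARTBEAT.read_text(encoding="utf-8"))
    return (datetime.datetime.now() - last).total_seconds()
```

Overwriting (rather than appending) keeps the file tiny; the file's content, not its size, is what a watchdog inspects.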
Exercise 22.15 — APScheduler with Persistent Job Store
Build a scheduler that:
- Uses APScheduler with a SQLite job store (jobs persist across restarts)
- Has 3 scheduled jobs
- Adds a job management interface: accept keyboard input to:
  - list — print all current jobs and their next run time
  - pause JOB_ID — pause a specific job
  - resume JOB_ID — resume a paused job
  - run JOB_ID — run a job immediately
  - quit — shut down gracefully
- When you restart the script (after Ctrl+C), the jobs are loaded from the SQLite database — no need to re-add them
Exercise 22.16 — Windows Task Scheduler / Cron Setup
Create a scheduled task at the OS level for a Python script:
Windows users:
1. Write a Python script that appends the current timestamp to scheduled_run_log.txt
2. Create a Windows Task Scheduler task that runs this script every 5 minutes
3. After 30 minutes, verify the log file has approximately 6 entries
4. Export the task definition to XML (right-click → Export in Task Scheduler)
5. Write instructions for how a colleague could import and configure this task
macOS/Linux users:
1. Write the same Python script
2. Create a crontab entry: */5 * * * * /path/to/python /path/to/script.py >> /path/to/log.txt 2>&1
3. After 30 minutes, verify the log has approximately 6 entries
4. Document the crontab entry with a comment explaining when it runs
Tier 5: Mastery
Exercise 22.17 — Extend the Scheduler Demo
The scheduler_demo.py file in this chapter demonstrates several patterns. Extend it with:
- Job dependencies: Add a generate_report() job that only runs if fetch_and_save_exchange_rates() has successfully run in the past 6 hours (check the output file's modification time)
- Dynamic scheduling: Add a job that reads a job_config.json file at startup and adds jobs based on the configuration (so you can add new jobs by editing the JSON without modifying Python code)
- Performance metrics: After each scheduler loop iteration, log how long run_pending() took. Alert if any single call takes more than 5 seconds (indicating a slow job is blocking the loop)
- Recovery: If the scheduler starts and detects that a job that was supposed to run in the past 2 hours has no successful run record, run it immediately before entering the main loop
Exercise 22.18 — Extend the Report Pipeline
The report_pipeline.py generates a Monday Excel report. Extend it with:
- Incremental updates: Add a generate_daily_delta_report() function that generates a lightweight daily report showing only changes from the previous day (new customers, significant revenue changes, threshold breaches)
- Multiple recipients with custom content: Modify send_report_email() to accept a recipients list, where each recipient can have a different subset of the report data (Sandra gets the executive summary; regional managers get only their region's data)
- Report archiving: After generating the report, move it to an archive directory organized by year/month (reports/2024/11/acme_weekly_report_20241104.xlsx)
- Email retry: If send_report_email() fails, save the failed email to a pending_emails/ directory. Add a separate job that checks this directory every 15 minutes and retries any pending emails.
Exercise 22.19 — Build Maya's cron Setup
Maya wants to deploy her automation to a Linux cloud server instead of her laptop. Design and implement the full deployment:
- Write a setup_maya_cron.sh shell script that:
  - Creates the required directory structure
  - Installs Python dependencies
  - Sets up the .env file (with placeholder values)
  - Adds all crontab entries using crontab -e or direct file editing
  - Verifies the setup by testing each script with --dry-run
- Modify both pipeline scripts to accept a --dry-run flag that:
  - Goes through all steps (load data, calculate metrics) but does not send emails
  - Prints what it would have done instead
- Add a monitoring cron job that runs at 7:00 AM daily and checks:
  - Whether the invoice monitor ran yesterday (by checking the log file)
  - Whether it found any critical issues
  - Sends a one-line status email: "All systems go" or "CHECK LOGS: [issue]"
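One way the monitoring check could be sketched. This assumes the invoice monitor touches its log file on every run and writes lines containing "CRITICAL" when it finds critical issues; both are assumptions about the log format, and the exact status strings are illustrative:

```python
import datetime
from pathlib import Path

def monitor_status(log_path, max_age_hours=24):
    """Return the one-line status for the 7:00 AM monitoring email."""
    path = Path(log_path)
    if not path.exists():
        return "CHECK LOGS: no log file found"
    # Freshness check: was the log touched within the last day?
    mtime = datetime.datetime.fromtimestamp(path.stat().st_mtime)
    if datetime.datetime.now() - mtime > datetime.timedelta(hours=max_age_hours):
        return "CHECK LOGS: monitor has not run in the last 24 hours"
    # Content check: did any run record a critical issue?
    if "CRITICAL" in path.read_text(encoding="utf-8"):
        return "CHECK LOGS: critical issues in yesterday's run"
    return "All systems go"
```

The cron job would call this, then hand the returned line to whatever email-sending helper the pipelines already use.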
Answer Key Notes
Exercise 22.1: a-cron expression, b-task history, c-heartbeat, d-grace time, e-scheduler loop, f-coalesce, g-job, h-trigger
Exercise 22.2: a) Every day at 9:00 AM b) Every 15 minutes, measured from when the scheduler starts (not necessarily aligned to :00, :15, :30, :45 on the clock) c) Every Monday at 7:45 AM d) Every hour at 30 minutes past the hour (at :30) e) Every Friday at 4:30 PM f) Every 2 hours, measured from when the scheduler starts g) Every weekday (Monday through Friday) at 8:00 AM
Exercise 22.3: a) 8:00 AM every weekday (Monday–Friday) b) 7:45 AM every Monday c) Midnight on the first day of every month d) Every 30 minutes between 9 AM and 5 PM on weekdays e) 6:00 AM, noon, and 6:00 PM every day f) 9:00 AM every Monday g) 11:30 PM every Friday
Exercise 22.4: Problems include: (1) no try/except around job logic — any error crashes the job and may propagate; (2) open("sales_data.csv") uses a relative path — will fail unless the working directory is exactly right; (3) the file is opened but never explicitly closed (use a with statement); (4) no logging — there is no record of when the job ran or whether it succeeded; (5) the while True loop has no graceful shutdown handling; (6) the loop calls run_pending() with no time.sleep() between iterations, so it busy-waits and pins a CPU core; (7) "8:00" should be "08:00" — the schedule library requires zero-padded hours.
Exercises are designed for Python 3.10+, schedule library 1.1+, and APScheduler 3.10+.