Case Study 1: Priya Automates Marcus's Morning File Ritual

Background

Every weekday at 7:15 AM, Marcus Webb runs the same twenty-five-minute routine: check the exports folder for four regional CSV files, move them to the processing directory, archive yesterday's files, and update a shared log. He has done this 1,250 times over five years without ever missing a day — including the morning he had a head cold and the morning there was a two-hour power outage.

Priya has watched this routine for four months. Her manager has asked her to document and, where possible, automate any processes that rely on a single person's manual attention. Marcus's morning routine is exhibit one.

Her goal: write a script that performs the entire routine automatically at 7:00 AM, before Marcus even arrives at his desk, and alerts him only when something actually needs his attention.


What the Script Needs to Do

After a conversation with Marcus, Priya documents the exact rules:

Step 1 — Archive yesterday's files: Any file in the exports folder with a modification date earlier than today should be moved to an archive directory, organized into subfolders by date (archive/2024-01-14/, archive/2024-01-15/, etc.).

Step 2 — Check for today's regional files: The overnight ETL process should have deposited exactly four CSV files in the exports folder:
- chicago_sales.csv
- nashville_sales.csv
- cincinnati_sales.csv
- stlouis_sales.csv

Step 3 — Validate and move present files: Each present file should be moved to processing/<today's date>/. If a file already exists at the destination (a rerun scenario), the newer copy should be timestamped to avoid overwriting.

Step 4 — Log everything: Every action should be logged to a daily log file at logs/organizer_YYYY-MM-DD.log. The log format should be readable by Marcus without any Python knowledge.

Step 5 — Alert on missing files: If any of the four regional files are missing, the script should exit with a non-zero exit code so the Windows Task Scheduler marks the run as failed and triggers the configured alert email to the regional IT contacts.


The Script

"""
acme_morning_organizer.py

Automated morning file organization for Marcus's data export folder.
Runs daily at 7:00 AM via Windows Task Scheduler.

Exit codes:
    0 — Success: all 4 regional files found, moved, nothing missing
    1 — Warning: some regional files missing (Task Scheduler alerts on exit code != 0)
    2 — Error: script could not run (permissions, missing directory, etc.)
"""

import argparse
import datetime
import logging
import shutil
import sys
from pathlib import Path


# ── CONFIGURATION ─────────────────────────────────────────────────────────────
# These paths match the actual Acme Corp file server structure.
# Update if the server layout changes.

EXPORTS_DIR = Path(r"\\acme-server\shared\data_exports")
PROCESSING_DIR = Path(r"\\acme-server\shared\processing")
ARCHIVE_DIR = Path(r"\\acme-server\shared\archive")
LOG_DIR = Path(r"\\acme-server\shared\logs")

EXPECTED_REGIONAL_FILES = [
    "chicago_sales.csv",
    "nashville_sales.csv",
    "cincinnati_sales.csv",
    "stlouis_sales.csv",
]


# ── LOGGING ────────────────────────────────────────────────────────────────────

def setup_logging(log_dir: Path, today_str: str) -> logging.Logger:
    """Set up logging to file and console."""
    try:
        log_dir.mkdir(parents=True, exist_ok=True)
    except OSError as error:
        # If we can't write logs, print to stderr but continue
        print(f"WARNING: Cannot create log directory: {error}", file=sys.stderr)

    log_file = log_dir / f"organizer_{today_str}.log"

    handlers = [logging.StreamHandler(sys.stdout)]
    try:
        handlers.append(logging.FileHandler(log_file))
    except OSError as error:
        print(f"WARNING: Cannot open log file: {error}", file=sys.stderr)

    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s  %(levelname)-8s  %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
        handlers=handlers,
    )
    return logging.getLogger("acme_organizer")


# ── STEP 1: ARCHIVE OLD FILES ─────────────────────────────────────────────────

def archive_old_exports(
    exports_dir: Path,
    archive_dir: Path,
    today: datetime.date,
    logger: logging.Logger,
    dry_run: bool = False,
) -> list[str]:
    """
    Move files from exports_dir whose modification date is before today.

    Organizes archives into date-named subfolders.
    Returns a list of filenames that were archived.
    """
    archived_filenames = []
    today_start = datetime.datetime.combine(today, datetime.time.min)

    for file_path in exports_dir.iterdir():
        if file_path.is_dir():
            continue
        if file_path.name.startswith("."):
            continue

        modified = datetime.datetime.fromtimestamp(file_path.stat().st_mtime)

        if modified < today_start:
            date_folder = modified.strftime("%Y-%m-%d")
            dest_dir = archive_dir / date_folder
            dest_file = dest_dir / file_path.name

            if dest_file.exists():
                # File was already archived in a previous run — skip, but say so
                logger.info(f"  Skipped (already archived): {file_path.name}")
                continue

            if not dry_run:
                dest_dir.mkdir(parents=True, exist_ok=True)
                shutil.move(str(file_path), str(dest_file))
                logger.info(f"  Archived: {file_path.name} -> archive/{date_folder}/")
            else:
                logger.info(
                    f"  [DRY RUN] Would archive: {file_path.name} -> archive/{date_folder}/"
                )

            archived_filenames.append(file_path.name)

    return archived_filenames


# ── STEP 2: CHECK EXPECTED FILES ──────────────────────────────────────────────

def check_expected_files(
    exports_dir: Path,
    expected_files: list[str],
    logger: logging.Logger,
) -> tuple[list[str], list[str]]:
    """
    Check which expected files are present in exports_dir.

    Returns (present_files, missing_files).
    """
    present = []
    missing = []

    for filename in expected_files:
        if (exports_dir / filename).exists():
            present.append(filename)
            logger.info(f"  PRESENT:  {filename}")
        else:
            missing.append(filename)
            logger.warning(f"  MISSING:  {filename}")

    return present, missing


# ── STEP 3: MOVE FILES TO PROCESSING ──────────────────────────────────────────

def move_to_processing(
    exports_dir: Path,
    processing_dir: Path,
    filenames: list[str],
    today_str: str,
    logger: logging.Logger,
    dry_run: bool = False,
) -> list[str]:
    """
    Move listed files from exports_dir to processing_dir/<today>.

    Appends a timestamp to filenames if a conflict exists.
    Returns a list of destination filenames (useful for the log).
    """
    today_dir = processing_dir / today_str
    moved = []

    if not dry_run:
        today_dir.mkdir(parents=True, exist_ok=True)

    for filename in filenames:
        source = exports_dir / filename
        destination = today_dir / filename

        if not source.exists():
            logger.warning(f"  Skipping (file disappeared): {filename}")
            continue

        if destination.exists() and not dry_run:
            # Rerun scenario: add a time-based suffix
            timestamp = datetime.datetime.now().strftime("%H%M%S")
            destination = today_dir / f"{source.stem}_{timestamp}{source.suffix}"
            logger.info(
                f"  Conflict resolved: {filename} -> {destination.name}"
            )

        if not dry_run:
            shutil.move(str(source), str(destination))
            logger.info(f"  Moved: {filename} -> processing/{today_str}/{destination.name}")
        else:
            logger.info(
                f"  [DRY RUN] Would move: {filename} -> processing/{today_str}/{filename}"
            )

        moved.append(destination.name)

    return moved


# ── STEP 4 & 5: LOG SUMMARY AND EXIT ─────────────────────────────────────────

def write_log_summary(
    logger: logging.Logger,
    today_str: str,
    archived: list[str],
    present: list[str],
    missing: list[str],
    moved: list[str],
    dry_run: bool,
) -> None:
    """Write the end-of-run summary to the log."""
    prefix = "[DRY RUN] " if dry_run else ""

    logger.info("")
    logger.info("=" * 60)
    logger.info(f"{prefix}RUN SUMMARY — {today_str}")
    logger.info("=" * 60)
    logger.info(f"  Files archived:  {len(archived)}")
    logger.info(f"  Files present:   {len(present)} / {len(EXPECTED_REGIONAL_FILES)}")
    logger.info(f"  Files moved:     {len(moved)}")

    if missing:
        logger.warning(f"  Files MISSING:   {len(missing)}")
        for f in missing:
            logger.warning(f"    - {f}")
        logger.warning("")
        logger.warning("ACTION REQUIRED:")
        logger.warning("  Contact regional IT contacts for the missing regions.")
    else:
        logger.info("  All regional files received — pipeline ready.")

    logger.info("=" * 60)


# ── MAIN ORCHESTRATION ────────────────────────────────────────────────────────

def run(dry_run: bool = False) -> int:
    """
    Execute the full morning file organization routine.

    Returns an exit code: 0 for success, 1 for missing files, 2 for errors.
    """
    today = datetime.date.today()
    today_str = today.strftime("%Y-%m-%d")

    logger = setup_logging(LOG_DIR, today_str)

    logger.info("=" * 60)
    logger.info(f"Acme Corp Morning File Organizer — {today_str}")
    if dry_run:
        logger.info("DRY RUN: no files will be moved or archived")
    logger.info("=" * 60)

    # Verify that the exports directory exists before doing anything
    if not EXPORTS_DIR.exists():
        logger.error(f"Exports directory not found: {EXPORTS_DIR}")
        logger.error("Cannot continue — check network connection and path.")
        return 2

    # Step 1: Archive old files
    logger.info(f"\n[1/3] Archiving files from before {today_str}...")
    archived = archive_old_exports(
        EXPORTS_DIR, ARCHIVE_DIR, today, logger, dry_run
    )
    logger.info(f"      Archived {len(archived)} file(s).")

    # Step 2: Check for expected files
    logger.info(f"\n[2/3] Checking for expected regional files...")
    present, missing = check_expected_files(EXPORTS_DIR, EXPECTED_REGIONAL_FILES, logger)
    logger.info(f"      {len(present)}/{len(EXPECTED_REGIONAL_FILES)} expected files present.")

    # Step 3: Move present files to processing
    logger.info(f"\n[3/3] Moving present files to processing/{today_str}/...")
    if present:
        moved = move_to_processing(
            EXPORTS_DIR, PROCESSING_DIR, present, today_str, logger, dry_run
        )
        logger.info(f"      Moved {len(moved)} file(s).")
    else:
        moved = []
        logger.warning("      No files to move.")

    # Summary
    write_log_summary(logger, today_str, archived, present, missing, moved, dry_run)

    # Exit code drives Task Scheduler alerting
    if missing:
        return 1   # Missing files — scheduler marks as failed and alerts
    return 0       # All clear


def main() -> None:
    parser = argparse.ArgumentParser(
        description="Acme Corp morning file organization routine"
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Show what would happen without moving any files",
    )
    args = parser.parse_args()
    exit_code = run(dry_run=args.dry_run)
    sys.exit(exit_code)


if __name__ == "__main__":
    main()

Results

Priya tests the script on Friday afternoon in dry-run mode against a copy of the actual exports folder. The preview output reveals one wrong path; she adjusts it, reruns, and the second dry run shows exactly the expected behavior.

She schedules it in Windows Task Scheduler to run at 7:00 AM Monday through Friday and configures a "send email" action when the task exits with a non-zero code.
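Creating the task itself can also be scripted. A sketch using the built-in schtasks command, assuming the script lives at C:\scripts\ and the task name is illustrative (the email-on-failure action still has to be configured separately in the Task Scheduler UI, since schtasks has no flag for alert actions):

```shell
schtasks /Create /TN "AcmeMorningOrganizer" ^
    /TR "py C:\scripts\acme_morning_organizer.py" ^
    /SC WEEKLY /D MON,TUE,WED,THU,FRI /ST 07:00
```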

Marcus arrives on Monday morning to find the routine already done. The log file shows it ran at exactly 7:00:04 AM — four seconds after the trigger — and all four regional files were present and moved.

He watches the log for the next two weeks. On one Thursday, the Nashville file is missing. The log notes it at 7:00 AM, the exit code triggers the alert email, and the Nashville IT contact receives a message before 7:05. By 7:30, the file is manually deposited and processing is complete — thirty minutes faster than the old routine because Marcus no longer had to discover the problem at 7:25.


What Priya Learned

Log verbosely during the build phase. Priya initially made the logging terse to keep the output clean. After the first week she wished she had logged more detail — she added the conflict resolution messages and the "already archived" skip notes based on real questions Marcus asked.

Dry-run mode saved her twice. Once when she had the archive path slightly wrong and would have moved files to the wrong server share, and once when she discovered the network path required authentication that wasn't available at 7 AM under the scheduler's service account.

Exit codes are the interface between your script and the outside world. The sys.exit(1) for missing files is what makes the whole alerting chain work. Without that, the scheduler sees a successful run even when files are missing.
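The point generalizes beyond Task Scheduler: any caller sees a script only through its exit code. A minimal sketch of that contract (the inline one-liners stand in for a real script and are illustrative only):

```python
import subprocess
import sys

# Simulate the "all clear" and "missing files" outcomes of a script that
# signals its result via sys.exit, the way acme_morning_organizer.py does.
ok = subprocess.run([sys.executable, "-c", "import sys; sys.exit(0)"])
missing = subprocess.run([sys.executable, "-c", "import sys; sys.exit(1)"])

print(ok.returncode)       # 0: the scheduler treats the run as successful
print(missing.returncode)  # 1: the scheduler marks the run failed and alerts
```

A scheduler, CI system, or wrapper script can branch on `returncode` exactly the way Task Scheduler's failure condition does.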

The before-automation state is worth documenting. Priya wrote a one-page before/after comparison for her manager. The 25-minutes-daily × 250 days = 104 hours/year figure surprised everyone, including Marcus.
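The back-of-the-envelope figure checks out:

```python
minutes_per_day = 25
workdays_per_year = 250  # 5 days/week, ~50 working weeks
hours_per_year = minutes_per_day * workdays_per_year / 60
print(round(hours_per_year))  # 104
```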