Not all tedious tasks are worth automating. Apply the Time × Frequency × Consistency matrix before writing a line of code: only invest in automation when the task takes meaningful time, occurs frequently, and follows consistent rules every time.
Calculate ROI explicitly: (minutes saved × annual occurrences) / 60 = annual hours recovered. A 4-hour script that saves 25 minutes per day pays back in 10 working days and then generates pure savings every subsequent year.
Automation is not always the answer. Tasks requiring judgment, approval, or context-sensitivity belong to humans. Automating them creates bugs masquerading as process.
File System Modules
The Python standard library provides everything needed for file automation: os for low-level system interactions, shutil for copying, moving, and archiving, and pathlib for the full modern API.
pathlib.Path is the preferred approach. The / operator builds paths readably: base / "subdir" / "file.csv" is clearer than os.path.join(base, "subdir", "file.csv").
shutil.move() overwrites the destination without warning. Always check for conflicts before moving.
shutil.make_archive() creates ZIP or TAR archives from entire directory trees in a single call.
Modern Path Handling with pathlib
Path.mkdir(parents=True, exist_ok=True) creates a directory and all missing ancestors without raising an error if the directory already exists. This is the safe, idiomatic pattern for ensuring a path exists.
Path.glob() finds files matching a pattern in one directory. Path.rglob() searches recursively through all subdirectories.
Path.stat().st_mtime returns the last modification timestamp. Wrap it with datetime.fromtimestamp() for a usable datetime object.
Path.suffix returns the file extension including the dot (.csv), preserving case. Use .suffix.lower() for case-insensitive matching.
Conflict Handling
Naming conflicts are not edge cases — they are expected. Any script that moves files in a shared or automated environment will eventually encounter a file that already exists at the destination. Every file-moving function should have an explicit conflict resolution strategy: overwrite, rename, skip, or raise.
The numeric suffix pattern (report_1.csv, report_2.csv) is the most user-friendly conflict resolution for business contexts where you want to keep all copies.
Dry-Run Mode
Always implement a --dry-run flag in file-moving scripts. A dry run previews every planned action without changing anything. This is how you verify the script before it touches real data.
Two separate bugs were caught in the case studies because of dry-run testing. Skipping this step is how you move 1,200 files to the wrong directory on a production server.
Folder Watching with watchdog
watchdog uses OS-level file system notification APIs (inotify on Linux, FSEvents on macOS, ReadDirectoryChangesW on Windows). It activates only on actual changes, unlike polling loops that waste CPU checking repeatedly.
Implement on_created and on_moved event handlers to catch files both written locally and moved in from other locations.
Add a settle delay (time.sleep(SETTLE_TIME_SECONDS)) before processing newly arrived files to ensure the writing process has finished.
Scheduling
A script that requires manual triggering is only half-automated. Use Windows Task Scheduler or cron to run scripts on a schedule without human involvement.
Exit codes are the communication channel between your script and the scheduler. sys.exit(0) signals success; sys.exit(1) signals failure and can trigger alert emails, retries, or escalation in both Task Scheduler and cron.
Logging
File automation scripts must log everything. When a script runs at 7 AM while you are asleep, the log file is the only record of what happened.
Log at the right level: logger.info() for normal operations, logger.warning() for expected problems that need attention, logger.error() for failures that prevent the script from completing.
Daily rotating log files (named organizer_2024-01-15.log) make it easy to check any specific day without wading through months of combined output.
Code Patterns to Remember
# Build paths safely
output = Path("/data") / "exports" / "2024-01" / "file.csv"
# Create directories safely
output.parent.mkdir(parents=True, exist_ok=True)
# Iterate files (skip hidden and directories)
for f in source_dir.iterdir():
if f.is_file() and not f.name.startswith("."):
process(f)
# Get modification date
mtime = datetime.fromtimestamp(f.stat().st_mtime)
# Move with conflict handling
destination = dest_dir / f.name
if destination.exists():
destination = resolve_naming_conflict(destination)
shutil.move(str(f), str(destination))
# Archive a folder tree
shutil.make_archive("/output/archive", "zip", "/source_dir")
We use cookies to improve your experience and show relevant ads. Privacy Policy