Chapter 16 Further Reading: Excel and CSV Integration

Official Documentation

openpyxl - openpyxl documentation: https://openpyxl.readthedocs.io/ The official tutorial is essential reading. The "Working with styles" section covers fonts, fills, borders, and alignments with complete examples. The "Charts" section documents every supported chart type. - openpyxl API reference: https://openpyxl.readthedocs.io/en/stable/api/openpyxl.html Complete class and method reference. The Cell, Worksheet, Font, PatternFill, and Alignment pages are the most-referenced in production work. - openpyxl defined names and named ranges: https://openpyxl.readthedocs.io/en/stable/defined_names.html When you need to reference a cell or range by name (as Excel users do with named ranges), this section explains how to create and read named ranges with openpyxl.

xlwings - xlwings documentation: https://docs.xlwings.org/ Comprehensive docs covering the Excel object model, UDFs (User Defined Functions), and the xlwings add-in for Excel. The "QuickStart" and "Syntax overview" sections are most useful for beginners. - xlwings UDF guide: https://docs.xlwings.org/en/stable/udfs.html If you want Python functions callable directly from Excel cells (like =CALCULATE_COMMISSION(C2, D2)), this is the reference. Requires the xlwings add-in installed in Excel.

pandas Excel I/O - pandas I/O documentation: https://pandas.pydata.org/docs/user_guide/io.html The "Excel files" and "CSV and text files" sections of the pandas I/O guide. Covers all parameters for read_excel, to_excel, read_csv, and to_csv with complete examples.


Books

"Python for Excel" by Felix Zumstein Zumstein created xlwings. This book covers the full spectrum of Python-Excel integration: openpyxl for file creation, xlwings for live Excel interaction, and Jupyter notebooks for data analysis workflows. Recommended if your work is deeply Excel-centric and you need Python to fit into existing Excel-heavy environments.

"Automate the Boring Stuff with Python" by Al Sweigart (2nd ed.) The Excel chapters (15-16) in this book cover openpyxl with practical automation examples. The book is available free online at https://automatetheboringstuff.com/. Sweigart's approach is more task-oriented and less business-context-oriented than this book, but the openpyxl examples are solid.


Online Resources

openpyxl Working with Styles (community cookbook) https://openpyxl.readthedocs.io/en/stable/styles.html The styles documentation is the most practical part of the openpyxl docs for formatting work. Every Font, PatternFill, Alignment, and Border parameter is documented here.

Excel Number Format String Reference openpyxl uses Excel's native number format syntax. Since the official documentation is sparse on format string syntax, Excel's own documentation is more useful: - Microsoft: "Number format codes" — search for "Excel custom number format codes" on Microsoft Support - Key rules: 0 = mandatory digit, # = optional digit, , = thousands separator, . = decimal, % multiplies by 100 and appends %, quoted text in "..." is literal

pandas ExcelWriter documentation https://pandas.pydata.org/docs/reference/api/pandas.ExcelWriter.html Complete parameter reference for ExcelWriter, including the mode parameter ("w" for create, "a" for append to existing file) and the engine parameter.

The CSV standard (RFC 4180) https://www.ietf.org/rfc/rfc4180.txt If you ever need to understand what is and isn't valid in a CSV file — quoting rules, newlines within fields, encoding — the RFC is the authoritative source. Short enough to read in full (about 2 pages of rules).


Alternative Excel Libraries

Two other libraries are worth knowing about for specific situations:

xlsxwriter https://xlsxwriter.readthedocs.io/ An alternative to openpyxl for writing (not reading) Excel files. Supports all the same formatting options plus some additional chart types and has better column auto-fit support. Cannot modify existing files — write-only. Use openpyxl when you need to read and modify; use xlsxwriter when you're creating new files and want more chart options.

pip install xlsxwriter

calamine https://github.com/tafia/calamine A Rust-based Excel reader with Python bindings. Much faster than openpyxl for reading large files. Does not support writing. Useful when pd.read_excel is too slow for a large workbook:

pip install python-calamine
df = pd.read_excel("huge_file.xlsx", engine="calamine")

CSV Resources

Python csv Module Documentation https://docs.python.org/3/library/csv.html Python's built-in CSV reader/writer. Useful when you need fine-grained control over quoting, escaping, and line endings — more control than pandas' read_csv but much more verbose.

chardet — Universal Character Encoding Detector https://chardet.readthedocs.io/ When you receive a CSV and don't know its encoding, chardet can detect it:

pip install chardet
import chardet
with open("mystery.csv", "rb") as f:
    raw_bytes = f.read(10000)
result = chardet.detect(raw_bytes)
print(result)  # {'encoding': 'UTF-8', 'confidence': 0.99, 'language': ''}

charset-normalizer https://charset-normalizer.readthedocs.io/ A more modern alternative to chardet for encoding detection. Included in the requests library by default. Use the same pattern as chardet but with from charset_normalizer import from_bytes.


Workflow Tools

Pandas Profiling (ydata-profiling) https://ydata-profiling.ydata.ai/ Before you clean and format data, it helps to understand what you have. pandas-profiling generates a comprehensive HTML report from a DataFrame: missing values, distributions, correlations, and data types. Useful when a CSV arrives from an unfamiliar source.

pip install ydata-profiling
from ydata_profiling import ProfileReport
report = ProfileReport(df, title="Data Quality Report")
report.to_file("report.html")

Next Chapter

Chapter 17 covers automating repetitive office tasks — renaming files in bulk, monitoring folders for new files, moving and organizing data files, and building the kind of small utilities that save twenty minutes every day. The skills from Chapters 9 (file I/O), 16 (Excel/CSV), and 17 (task automation) combine to form the foundation of the workflow automation toolkit that Acme Corp and Maya both use through the rest of the book.