Chapter 16 Further Reading: Excel and CSV Integration
Official Documentation
openpyxl
- openpyxl documentation: https://openpyxl.readthedocs.io/
The official tutorial is essential reading. The "Working with styles" section covers fonts, fills, borders, and alignments with complete examples. The "Charts" section documents every supported chart type.
- openpyxl API reference: https://openpyxl.readthedocs.io/en/stable/api/openpyxl.html
Complete class and method reference. The Cell, Worksheet, Font, PatternFill, and Alignment pages are the most-referenced in production work.
- openpyxl defined names and named ranges: https://openpyxl.readthedocs.io/en/stable/defined_names.html
When you need to reference a cell or range by name (as Excel users do with named ranges), this section explains how to create and read named ranges with openpyxl.
xlwings
- xlwings documentation: https://docs.xlwings.org/
Comprehensive docs covering the Excel object model, UDFs (User Defined Functions), and the xlwings add-in for Excel. The "QuickStart" and "Syntax overview" sections are most useful for beginners.
- xlwings UDF guide: https://docs.xlwings.org/en/stable/udfs.html
If you want Python functions callable directly from Excel cells (like =CALCULATE_COMMISSION(C2, D2)), this is the reference. Requires the xlwings add-in installed in Excel.
pandas Excel I/O
- pandas I/O documentation: https://pandas.pydata.org/docs/user_guide/io.html
The "Excel files" and "CSV and text files" sections of the pandas I/O guide. Covers all parameters for read_excel, to_excel, read_csv, and to_csv with complete examples.
Books
"Python for Excel" by Felix Zumstein Zumstein created xlwings. This book covers the full spectrum of Python-Excel integration: openpyxl for file creation, xlwings for live Excel interaction, and Jupyter notebooks for data analysis workflows. Recommended if your work is deeply Excel-centric and you need Python to fit into existing Excel-heavy environments.
"Automate the Boring Stuff with Python" by Al Sweigart (2nd ed.) The Excel chapters (15-16) in this book cover openpyxl with practical automation examples. The book is available free online at https://automatetheboringstuff.com/. Sweigart's approach is more task-oriented and less business-context-oriented than this book, but the openpyxl examples are solid.
Online Resources
openpyxl Working with Styles (community cookbook)
https://openpyxl.readthedocs.io/en/stable/styles.html
The styles documentation is the most practical part of the openpyxl docs for formatting work. Every Font, PatternFill, Alignment, and Border parameter is documented here.
Excel Number Format String Reference
openpyxl uses Excel's native number format syntax. Since the official documentation is sparse on format string syntax, Excel's own documentation is more useful:
- Microsoft: "Number format codes" — search for "Excel custom number format codes" on Microsoft Support
- Key rules: 0 = mandatory digit, # = optional digit, , = thousands separator, . = decimal, % multiplies by 100 and appends %, quoted text in "..." is literal
pandas ExcelWriter documentation
https://pandas.pydata.org/docs/reference/api/pandas.ExcelWriter.html
Complete parameter reference for ExcelWriter, including the mode parameter ("w" for create, "a" for append to existing file) and the engine parameter.
The CSV standard (RFC 4180) https://www.ietf.org/rfc/rfc4180.txt If you ever need to understand what is and isn't valid in a CSV file — quoting rules, newlines within fields, encoding — the RFC is the authoritative source. Short enough to read in full (about 2 pages of rules).
Alternative Excel Libraries
Two other libraries are worth knowing about for specific situations:
xlsxwriter https://xlsxwriter.readthedocs.io/ An alternative to openpyxl for writing (not reading) Excel files. Supports all the same formatting options plus some additional chart types and has better column auto-fit support. Cannot modify existing files — write-only. Use openpyxl when you need to read and modify; use xlsxwriter when you're creating new files and want more chart options.
pip install xlsxwriter
calamine
https://github.com/tafia/calamine
A Rust-based Excel reader with Python bindings. Much faster than openpyxl for reading large files. Does not support writing. Useful when pd.read_excel is too slow for a large workbook:
pip install python-calamine
df = pd.read_excel("huge_file.xlsx", engine="calamine")
CSV Resources
Python csv Module Documentation
https://docs.python.org/3/library/csv.html
Python's built-in CSV reader/writer. Useful when you need fine-grained control over quoting, escaping, and line endings — more control than pandas' read_csv but much more verbose.
chardet — Universal Character Encoding Detector
https://chardet.readthedocs.io/
When you receive a CSV and don't know its encoding, chardet can detect it:
pip install chardet
import chardet
with open("mystery.csv", "rb") as f:
raw_bytes = f.read(10000)
result = chardet.detect(raw_bytes)
print(result) # {'encoding': 'UTF-8', 'confidence': 0.99, 'language': ''}
charset-normalizer
https://charset-normalizer.readthedocs.io/
A more modern alternative to chardet for encoding detection. Included in the requests library by default. Use the same pattern as chardet but with from charset_normalizer import from_bytes.
Workflow Tools
Pandas Profiling (ydata-profiling) https://ydata-profiling.ydata.ai/ Before you clean and format data, it helps to understand what you have. pandas-profiling generates a comprehensive HTML report from a DataFrame: missing values, distributions, correlations, and data types. Useful when a CSV arrives from an unfamiliar source.
pip install ydata-profiling
from ydata_profiling import ProfileReport
report = ProfileReport(df, title="Data Quality Report")
report.to_file("report.html")
Next Chapter
Chapter 17 covers automating repetitive office tasks — renaming files in bulk, monitoring folders for new files, moving and organizing data files, and building the kind of small utilities that save twenty minutes every day. The skills from Chapters 9 (file I/O), 16 (Excel/CSV), and 17 (task automation) combine to form the foundation of the workflow automation toolkit that Acme Corp and Maya both use through the rest of the book.