Every organization produces recurring reports. Monthly financial summaries. Weekly sales dashboards. Quarterly board decks. Project status updates.
In This Chapter
- Learning Objectives
- Introduction: The Report That Writes Itself
- 36.1 The Business Case for Automated Reporting
- 36.2 Jinja2: The Template Engine
- 36.3 Loading Templates from Files
- 36.4 Embedding Charts in HTML Reports
- 36.5 Tables from DataFrames
- 36.6 Converting HTML to PDF with weasyprint
- 36.7 Building the Complete End-to-End Pipeline
- 36.8 Sending Reports by Email
- 36.9 Scheduling the Pipeline
- 36.10 Template Versioning and Maintenance
- 36.11 Conditional Formatting in Templates
- Chapter Summary
- Key Terms
- Why Automated Reports Matter
- Jinja2: The Language of Templates
- Building HTML Reports with Jinja2
- Generating PDFs from HTML with WeasyPrint
- Building Excel Reports with openpyxl
- Embedding Matplotlib Charts in HTML Reports
- Report Structure Best Practices
- Parameterizing Reports
- Scheduling Report Generation
- Email Delivery of Completed Reports
- Acme Corp: A Full Monthly Business Review Report
- Putting It All Together: The Complete Template
- Performance Considerations
- Summary
Chapter 36: Automated Report Generation
Learning Objectives
By the end of this chapter, you will be able to:
- Explain the business case for automated reporting and where it delivers the most value
- Use Jinja2 to create HTML templates with variables, loops, conditionals, and filters
- Build Jinja2 templates with inheritance for consistent report layouts
- Embed charts as base64-encoded images in HTML reports
- Convert DataFrames to formatted HTML tables for reports
- Convert HTML reports to PDF using weasyprint
- Build a complete end-to-end reporting pipeline from data to delivered document
- Schedule automated reports using techniques from Chapter 22
- Discuss template versioning and maintenance
Introduction: The Report That Writes Itself
Every organization produces recurring reports. Monthly financial summaries. Weekly sales dashboards. Quarterly board decks. Project status updates.
Most of these reports are assembled by hand. Someone opens last month's spreadsheet, copies the template, updates the numbers, re-creates the charts, adjusts the formatting, and emails it out. It takes one to three hours. It is error-prone. And it happens again next month, and the month after that, forever.
Automated report generation eliminates this cycle. Once you build the pipeline — data loading, metric calculation, chart generation, template rendering, PDF export — the report runs itself. You trigger it, or a scheduler triggers it, and sixty seconds later the PDF is in Sandra Chen's inbox.
This chapter builds that pipeline from scratch.
The tools are: - Jinja2 — The most widely used Python templating engine. If you have used Flask, you have used Jinja2. It produces HTML from templates and data. - weasyprint or reportlab — Libraries that convert HTML to PDF, or build PDFs programmatically. - matplotlib — For chart generation. - pandas — For data loading and metric calculation.
By the end of this chapter, Priya at Acme Corp will have a board report that generates and emails itself on the first of every month. Maya Reyes will have personalized client status reports that go out automatically, with each client's name, project details, and budget status filled in from her project management data.
36.1 The Business Case for Automated Reporting
Before writing a single line of code, it is worth understanding why automated reporting is one of the highest-ROI automations available to a business analyst.
36.1.1 The Hidden Cost of Manual Reports
Consider a monthly financial summary that takes one analyst two hours to produce: - 2 hours per month x 12 months = 24 hours per year - At $60/hour fully loaded cost, that is $1,440 per year in direct labor - Add manager review time (1 hour/month): another $720/year - Occasional error corrections (2-3 per year, 2 hours each): $360 - Opportunity cost of what the analyst could have done instead: unquantified
A single recurring report costs $2,500 to $3,000 per year in labor. Across an organization with a dozen recurring reports, the total is substantial.
36.1.2 The Error Problem
Manual reports have manual errors. Numbers get mis-copied. A formula references the wrong cell range. A chart title says "Q3" when the report covers Q4. A revised number is updated in one table but not another.
These errors erode trust. When stakeholders know that reports require manual assembly, they learn to double-check everything. That double-checking takes time and signals, correctly, that the reporting process is fragile.
Automated reports read from the same data source every time. The number in the chart and the number in the table are identical because they come from the same calculation. Errors move upstream to the data layer — where they belong — and the report itself becomes a reliable output.
36.1.3 The Consistency Benefit
When reports have consistent structure, readers know where to look. They develop pattern recognition: the executive summary is always on page one, the trend chart is always the second chart, the tables always have the same columns. This makes reports faster to read and easier to act on.
Manual reports drift over time. The extra chart added last October "just this once" becomes a permanent fixture. Column names change slightly. The color scheme shifts. Automated reports look the same every month because the template is the same every month.
36.1.4 When Automation Is Not the Right Answer
Automated reporting is not always appropriate:
- One-off strategic analyses: Custom exploratory analysis for a specific decision does not benefit from templating.
- Highly interpretive reports: Board narratives that require human judgment and contextual commentary should not be fully automated.
- Very small organizations: If a report takes 20 minutes and runs quarterly, the automation payoff is limited.
The sweet spot is recurring reports with stable structure that draw from consistent data sources.
36.2 Jinja2: The Template Engine
Jinja2 is a Python templating library that lets you write templates with placeholders, loops, and conditionals. When you render a template, you pass in a Python dictionary of data and Jinja2 produces the final output — HTML, in our case.
36.2.1 Installation
pip install jinja2 weasyprint
36.2.2 Basic Template Syntax
Jinja2 uses three types of delimiters:
| Delimiter | Purpose | Example |
|---|---|---|
{{ }} |
Output a variable or expression | {{ company_name }} |
{% %} |
Control flow: for loops, if statements | {% for item in items %} |
{# #} |
Comments (not included in output) | {# This is a comment #} |
Here is a minimal example:
from jinja2 import Template
template_str = """
Hello, {{ recipient_name }}!
Your monthly report for {{ month }} {{ year }} is ready.
Total revenue: {{ revenue | currency }}
"""
template = Template(template_str)
output = template.render(
recipient_name="Sandra Chen",
month="January",
year=2024,
revenue=142500.75,
)
print(output)
Output:
Hello, Sandra Chen!
Your monthly report for January 2024 is ready.
Total revenue: $142,500.75
The | currency part is a Jinja2 filter — a transformation applied to the variable value before it is output. Jinja2 has many built-in filters, and you can define your own.
36.2.3 For Loops
<table>
<thead>
<tr>
<th>Product</th>
<th>Units Sold</th>
<th>Revenue</th>
</tr>
</thead>
<tbody>
{% for product in top_products %}
<tr class="{{ 'highlight' if product.revenue > 100000 else '' }}">
<td>{{ product.name }}</td>
<td>{{ product.units_sold | number }}</td>
<td>{{ product.revenue | currency }}</td>
</tr>
{% endfor %}
</tbody>
</table>
The {% for product in top_products %} loop iterates over a Python list. Each product is a dictionary with name, units_sold, and revenue keys. Jinja2 accesses dictionary keys with dot notation (product.revenue) or bracket notation (product['revenue']) — either works.
36.2.4 Conditionals
{% if revenue_change > 0 %}
<span class="positive">Up {{ revenue_change | pct }} vs. last month</span>
{% elif revenue_change < 0 %}
<span class="negative">Down {{ revenue_change | abs | pct }} vs. last month</span>
{% else %}
<span class="neutral">Flat vs. last month</span>
{% endif %}
36.2.5 Filters
Filters transform variable values. The pipe character | applies them:
{{ revenue | round(2) }} {# Round to 2 decimal places #}
{{ name | upper }} {# Convert to uppercase #}
{{ description | truncate(100) }} {# Truncate to 100 characters #}
{{ items | length }} {# Count items in a list #}
{{ value | default('N/A') }} {# Use 'N/A' if value is None #}
{{ value | abs }} {# Absolute value #}
You can define custom filters in Python and register them with Jinja2:
from jinja2 import Environment
def currency_filter(value: float) -> str:
"""Format a float as a USD currency string."""
try:
return f"${float(value):,.2f}"
except (ValueError, TypeError):
return "N/A"
def pct_filter(value: float, decimals: int = 1) -> str:
"""Format as a percentage."""
try:
return f"{float(value):.{decimals}f}%"
except (ValueError, TypeError):
return "N/A"
env = Environment()
env.filters['currency'] = currency_filter
env.filters['pct'] = pct_filter
36.2.6 Loop Helpers
Jinja2 provides loop-specific variables inside {% for %} blocks:
{% for row in table_data %}
<tr class="{{ 'even-row' if loop.index is even else 'odd-row' }}">
<td>{{ loop.index }}</td>
<td>{{ row.name }}</td>
</tr>
{% else %}
<tr><td colspan="2">No data available.</td></tr>
{% endfor %}
The {% else %} block on a {% for %} loop renders when the iterable is empty — perfect for graceful empty-state handling.
36.3 Loading Templates from Files
For real reports, templates should be stored as HTML files, not embedded in Python strings. Jinja2's FileSystemLoader handles this:
from jinja2 import Environment, FileSystemLoader
from pathlib import Path
template_dir = Path("templates")
env = Environment(
loader=FileSystemLoader(str(template_dir)),
autoescape=True, # Escapes HTML special characters in variables for security
)
# Register custom filters
env.filters['currency'] = currency_filter
env.filters['pct'] = pct_filter
template = env.get_template("monthly_report.html")
html_output = template.render(
report_date="February 1, 2024",
company_name="Acme Corporation",
metrics=metrics_dict,
charts=charts_dict,
)
output_path = Path("output/monthly_report_2024_02.html")
output_path.parent.mkdir(parents=True, exist_ok=True)
output_path.write_text(html_output, encoding='utf-8')
print(f"Report saved: {output_path}")
36.3.1 Template Inheritance
Template inheritance lets you define a base layout once and extend it in multiple specific templates — the key to consistency across different report types.
base_report.html (the parent):
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>{% block title %}Report{% endblock %}</title>
<style>{% block styles %}/* Default styles */{% endblock %}</style>
</head>
<body>
<header>
<div class="company">{{ company_name }}</div>
<h1>{% block report_title %}Report{% endblock %}</h1>
<div class="report-date">{{ report_date }}</div>
</header>
<main>{% block content %}{% endblock %}</main>
<footer>
<p>Confidential — Generated: {{ generated_at }}</p>
</footer>
</body>
</html>
monthly_report.html (the child):
{% extends "base_report.html" %}
{% block title %}Monthly Report — {{ report_month }}{% endblock %}
{% block report_title %}Monthly Business Report: {{ report_month }}{% endblock %}
{% block content %}
<section class="executive-summary">
<h2>Executive Summary</h2>
<p>{{ summary_text }}</p>
</section>
<section class="metrics">
<h2>Key Metrics</h2>
{# Metric cards are added here #}
</section>
{% endblock %}
The {% extends %} tag links the child to the parent. The {% block %} tags are named placeholders. Any block not overridden in the child template shows the parent's default (or is empty).
36.4 Embedding Charts in HTML Reports
Charts appear in HTML reports one of two ways:
- External PNG files: Saved to disk alongside the HTML — breaks when emailing or converting to PDF
- Embedded base64: The chart image is encoded as text and embedded directly in the HTML — self-contained, works everywhere
For emailed reports and PDFs, always use embedded base64.
36.4.1 Chart to Base64 String
import io
import base64
import matplotlib.pyplot as plt
def chart_to_base64(fig: plt.Figure, dpi: int = 150) -> str:
"""
Convert a matplotlib Figure to a base64-encoded PNG string for HTML embedding.
Args:
fig: Matplotlib Figure object.
dpi: Resolution of the saved image.
Returns:
Base64 string (without the data:image/png;base64, prefix).
"""
buffer = io.BytesIO()
fig.savefig(
buffer,
format='png',
dpi=dpi,
bbox_inches='tight',
facecolor='white',
edgecolor='none',
)
buffer.seek(0)
encoded = base64.b64encode(buffer.read()).decode('utf-8')
buffer.close()
plt.close(fig)
return encoded
36.4.2 Using the Base64 Image in a Template
<figure class="chart">
<img src="data:image/png;base64,{{ charts.revenue_trend }}"
alt="Monthly Revenue Trend"
style="max-width: 100%; height: auto;" />
<figcaption>Figure 1: Monthly Revenue Trend</figcaption>
</figure>
The data:image/png;base64, prefix tells the browser how to interpret the string that follows. Jinja2 inserts the base64 string returned by chart_to_base64().
36.4.3 Building Reusable Chart Functions
Each chart type becomes a function returning a base64 string:
import matplotlib.ticker as mtick
BRAND_COLORS = {
'primary': '#2c7bb6',
'secondary': '#d7191c',
'positive': '#2ecc71',
'neutral': '#95a5a6',
'negative': '#e74c3c',
'accent': '#f39c12',
}
def make_revenue_trend_chart(
months: list[str],
revenue: list[float],
target: float | None = None,
title: str = 'Monthly Revenue',
) -> str:
"""Generate a revenue trend line chart, return as base64 PNG."""
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(months, revenue, 'o-', color=BRAND_COLORS['primary'],
linewidth=2.5, markersize=7)
ax.fill_between(months, revenue, alpha=0.08, color=BRAND_COLORS['primary'])
if target is not None:
ax.axhline(y=target, color=BRAND_COLORS['secondary'],
linestyle='--', linewidth=1.5, label=f'Target (${target:,.0f})')
ax.legend(loc='lower right')
ax.yaxis.set_major_formatter(mtick.FuncFormatter(
lambda x, _: f'${x/1_000_000:.1f}M' if x >= 1_000_000 else f'${x/1_000:.0f}K'
))
ax.set_title(title, fontsize=13, fontweight='bold', pad=12)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.grid(axis='y', linestyle='--', alpha=0.3)
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
return chart_to_base64(fig)
def make_category_bar_chart(
categories: list[str],
values: list[float],
title: str = 'By Category',
) -> str:
"""Generate a horizontal bar chart, return as base64 PNG."""
fig, ax = plt.subplots(figsize=(9, max(3.5, len(categories) * 0.55)))
ax.barh(categories, values, color=BRAND_COLORS['primary'],
edgecolor='white', linewidth=0.5, height=0.6)
for i, val in enumerate(values):
ax.text(val * 1.01, i, f'{val:,.0f}', va='center', fontsize=9)
ax.set_title(title, fontsize=13, fontweight='bold', pad=10)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.tight_layout()
return chart_to_base64(fig)
36.5 Tables from DataFrames
Pandas DataFrames have a .to_html() method that generates HTML table markup:
import pandas as pd
def df_to_html_table(
df: pd.DataFrame,
table_id: str = 'report-table',
index: bool = True,
float_fmt: str = '{:,.2f}',
) -> str:
"""
Convert a DataFrame to a styled HTML table string.
Args:
df: DataFrame to convert.
table_id: HTML id attribute for the table element.
index: Whether to include the DataFrame index.
float_fmt: Format string for float columns.
Returns:
HTML table string suitable for embedding in a Jinja2 template.
"""
return df.to_html(
table_id=table_id,
classes='report-table',
border=0,
index=index,
float_format=float_fmt.format,
na_rep='—',
escape=True,
)
In the Jinja2 template, the | safe filter outputs the HTML as-is without escaping:
<section>
<h3>Sales by Product Category</h3>
{{ tables.product_summary | safe }}
</section>
Use | safe only for HTML strings you generate in Python — never for text submitted by users.
36.6 Converting HTML to PDF with weasyprint
weasyprint converts HTML and CSS to PDF. It supports most modern CSS including flexbox, custom fonts, and the @page rule for print-specific formatting.
from weasyprint import HTML
from pathlib import Path
def html_to_pdf(
html_string: str,
output_path: Path,
base_url: str | None = None,
) -> Path:
"""
Convert an HTML string to a PDF file.
Args:
html_string: Complete HTML document.
output_path: File path for the output PDF.
base_url: Base URL for resolving relative resource URLs.
Returns:
Path to the created PDF file.
"""
output_path = Path(output_path)
output_path.parent.mkdir(parents=True, exist_ok=True)
HTML(
string=html_string,
base_url=base_url or str(Path.cwd()),
).write_pdf(str(output_path))
size_kb = output_path.stat().st_size / 1024
print(f"PDF saved: {output_path} ({size_kb:.1f} KB)")
return output_path
36.6.1 CSS for Print-Specific Formatting
Add @page rules to your report stylesheet for professional print formatting:
@page {
size: A4;
margin: 20mm 15mm 20mm 15mm;
}
@page {
@bottom-center {
content: "Page " counter(page) " of " counter(pages);
font-size: 9px;
color: #999;
font-family: Arial, sans-serif;
}
@top-right {
content: "Confidential";
font-size: 9px;
color: #999;
}
}
/* Prevent page breaks inside key elements */
.metric-card { page-break-inside: avoid; }
.chart-figure { page-break-inside: avoid; }
tr { page-break-inside: avoid; }
36.6.2 reportlab for Programmatic PDF Generation
When you need pixel-level control over layout or your report structure does not map cleanly to HTML flow, reportlab generates PDFs by drawing elements programmatically:
from reportlab.lib.pagesizes import A4
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.units import mm
from reportlab.lib import colors
from reportlab.platypus import (
SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle, HRFlowable,
)
from io import BytesIO
def build_pdf_reportlab(output_path: str, title: str, content: list) -> None:
"""
Build a PDF using reportlab's Platypus document engine.
Args:
output_path: Destination file path.
title: Report title.
content: List of reportlab flowable objects (Paragraph, Table, Image, etc.).
"""
doc = SimpleDocTemplate(
output_path,
pagesize=A4,
leftMargin=20*mm, rightMargin=20*mm,
topMargin=25*mm, bottomMargin=25*mm,
)
styles = getSampleStyleSheet()
story = [
Paragraph(title, styles['Title']),
HRFlowable(width='100%', thickness=2, color=colors.HexColor('#2c7bb6')),
Spacer(1, 12),
]
story.extend(content)
doc.build(story)
36.7 Building the Complete End-to-End Pipeline
The complete pipeline has six stages:
1. Load Data → pandas DataFrames
↓
2. Calculate Metrics → Python dict of calculated values
↓
3. Generate Charts → dict of base64 PNG strings
↓
4. Render Template → HTML string
↓
5. Save Outputs → .html and .pdf files
↓
6. Deliver → Email PDF to recipients
Here is the complete pipeline function:
import pandas as pd
import numpy as np
from pathlib import Path
from datetime import datetime, date
from jinja2 import Environment, FileSystemLoader
import matplotlib.pyplot as plt
import io
import base64
def chart_to_base64(fig: plt.Figure, dpi: int = 150) -> str:
"""Convert matplotlib Figure to base64 PNG string."""
buf = io.BytesIO()
fig.savefig(buf, format='png', dpi=dpi, bbox_inches='tight',
facecolor='white', edgecolor='none')
buf.seek(0)
encoded = base64.b64encode(buf.read()).decode('utf-8')
buf.close()
plt.close(fig)
return encoded
def build_monthly_report(
sales_df: pd.DataFrame,
report_month: date,
template_dir: Path,
output_dir: Path,
company_name: str = "Acme Corporation",
) -> dict[str, Path]:
"""
Execute the complete monthly report pipeline.
Args:
sales_df: Sales transaction DataFrame with columns:
date, product_category, units_sold, revenue.
report_month: The month being reported.
template_dir: Directory containing Jinja2 templates.
output_dir: Directory for output files.
company_name: Organization name for report headers.
Returns:
Dictionary of output paths with keys 'html' and 'pdf'.
"""
print(f"[{datetime.now():%H:%M:%S}] Building report for {report_month.strftime('%B %Y')}")
# --- Stage 2: Calculate Metrics ---
report_data = sales_df[
pd.to_datetime(sales_df['date']).dt.to_period('M') ==
pd.Period(report_month, 'M')
]
prior_month = date(
report_month.year if report_month.month > 1 else report_month.year - 1,
report_month.month - 1 if report_month.month > 1 else 12,
1
)
prior_data = sales_df[
pd.to_datetime(sales_df['date']).dt.to_period('M') ==
pd.Period(prior_month, 'M')
]
revenue_current = float(report_data['revenue'].sum())
revenue_prior = float(prior_data['revenue'].sum())
revenue_change_pct = (
(revenue_current - revenue_prior) / revenue_prior * 100
if revenue_prior > 0 else 0.0
)
units_current = int(report_data['units_sold'].sum())
units_prior = int(prior_data['units_sold'].sum())
# Category breakdown
category_summary = (
report_data.groupby('product_category')
.agg(units_sold=('units_sold', 'sum'), revenue=('revenue', 'sum'))
.sort_values('revenue', ascending=False)
)
# 12-month trend for chart
trend_months = []
trend_revenue = []
for m in range(11, -1, -1):
target_date = date(
report_month.year if report_month.month > m else report_month.year - 1,
((report_month.month - m - 1) % 12) + 1,
1
)
month_rev = float(
sales_df[
pd.to_datetime(sales_df['date']).dt.to_period('M') ==
pd.Period(target_date, 'M')
]['revenue'].sum()
)
trend_months.append(target_date.strftime('%b %Y'))
trend_revenue.append(month_rev)
metrics = {
'revenue_total': revenue_current,
'revenue_prior': revenue_prior,
'revenue_change_pct': revenue_change_pct,
'units_sold': units_current,
'units_sold_prior': units_prior,
'units_change_pct': (
(units_current - units_prior) / units_prior * 100
if units_prior > 0 else 0.0
),
'avg_order_value': revenue_current / max(units_current, 1),
'top_category': category_summary.index[0] if len(category_summary) else 'N/A',
'category_count': len(category_summary),
'trend_months': trend_months,
'trend_revenue': trend_revenue,
}
# --- Stage 3: Generate Charts ---
charts = {
'revenue_trend': make_revenue_trend_chart(
months=trend_months[-6:],
revenue=trend_revenue[-6:],
title='Revenue — Last 6 Months',
),
'category_breakdown': make_category_bar_chart(
categories=category_summary.index.tolist(),
values=category_summary['revenue'].tolist(),
title='Revenue by Product Category',
),
}
# --- Stage 4: Build Tables ---
tables = {
'category_summary': df_to_html_table(
category_summary.reset_index(),
table_id='category-summary',
index=False,
),
}
# --- Stage 5: Render Template ---
env = Environment(
loader=FileSystemLoader(str(template_dir)),
autoescape=True,
)
env.filters['currency'] = lambda v: f"${float(v):,.2f}" if v is not None else "N/A"
env.filters['pct'] = lambda v, d=1: (
f"{'+' if float(v) > 0 else ''}{float(v):.{d}f}%"
if v is not None else "N/A"
)
env.filters['number'] = lambda v: f"{int(v):,}" if v is not None else "N/A"
template = env.get_template("monthly_report.html")
html_string = template.render(
company_name=company_name,
report_month=report_month.strftime('%B %Y'),
generated_at=datetime.now().strftime('%Y-%m-%d %H:%M'),
metrics=metrics,
charts=charts,
tables=tables,
)
# --- Stage 6: Save Outputs ---
output_dir = Path(output_dir)
output_dir.mkdir(parents=True, exist_ok=True)
month_str = report_month.strftime('%Y_%m')
paths = {}
html_path = output_dir / f"monthly_report_{month_str}.html"
html_path.write_text(html_string, encoding='utf-8')
print(f"[{datetime.now():%H:%M:%S}] HTML saved: {html_path}")
paths['html'] = html_path
try:
from weasyprint import HTML as WeasyprintHTML
pdf_path = output_dir / f"monthly_report_{month_str}.pdf"
WeasyprintHTML(
string=html_string,
base_url=str(output_dir),
).write_pdf(str(pdf_path))
print(f"[{datetime.now():%H:%M:%S}] PDF saved: {pdf_path}")
paths['pdf'] = pdf_path
except ImportError:
print("weasyprint not installed. Install with: pip install weasyprint")
return paths
36.8 Sending Reports by Email
Once the PDF is generated, delivery is straightforward (see Chapter 19 for the full email implementation):
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email.mime.text import MIMEText
from email import encoders
from pathlib import Path
def email_report(
pdf_path: Path,
recipients: list[str],
subject: str,
body_text: str,
smtp_server: str,
smtp_port: int,
sender_email: str,
sender_password: str,
) -> None:
"""
Email a PDF report as an attachment.
Args:
pdf_path: Path to the PDF file.
recipients: List of recipient email addresses.
subject: Email subject line.
body_text: Plain-text email body.
smtp_server: SMTP server hostname (e.g., 'smtp.gmail.com').
smtp_port: SMTP port (587 for TLS).
sender_email: Sender's email address.
sender_password: App-specific password (not your regular password).
"""
pdf_path = Path(pdf_path)
if not pdf_path.exists():
raise FileNotFoundError(f"PDF not found: {pdf_path}")
msg = MIMEMultipart()
msg['From'] = sender_email
msg['To'] = ', '.join(recipients)
msg['Subject'] = subject
msg.attach(MIMEText(body_text, 'plain'))
with open(pdf_path, 'rb') as f:
part = MIMEBase('application', 'pdf')
part.set_payload(f.read())
encoders.encode_base64(part)
part.add_header(
'Content-Disposition',
f'attachment; filename="{pdf_path.name}"',
)
msg.attach(part)
with smtplib.SMTP(smtp_server, smtp_port) as server:
server.starttls()
server.login(sender_email, sender_password)
server.sendmail(sender_email, recipients, msg.as_string())
print(f"Report emailed to {len(recipients)} recipient(s).")
36.9 Scheduling the Pipeline
36.9.1 Using the schedule Library
import schedule
import time
from datetime import date
from pathlib import Path
def run_monthly_report() -> None:
"""Execute the monthly report pipeline. Called daily; self-checks for the 1st."""
today = date.today()
if today.day != 1:
return
print(f"Running monthly report for {today.strftime('%B %Y')}...")
sales_df = pd.read_csv(Path("data/sales.csv"))
paths = build_monthly_report(
sales_df=sales_df,
report_month=date(today.year, today.month, 1),
template_dir=Path("templates"),
output_dir=Path("reports"),
company_name="Acme Corporation",
)
if 'pdf' in paths:
email_report(
pdf_path=paths['pdf'],
recipients=["sandra.chen@acmecorp.com"],
subject=f"Acme Monthly Report — {today.strftime('%B %Y')}",
body_text=f"Please find the report for {today.strftime('%B %Y')} attached.",
smtp_server="smtp.gmail.com",
smtp_port=587,
sender_email="reports@acmecorp.com",
sender_password="APP_PASSWORD",
)
# Run daily at 06:00; the function itself checks for the 1st
schedule.every().day.at("06:00").do(run_monthly_report)
if __name__ == "__main__":
print("Scheduler running. Press Ctrl+C to stop.")
while True:
schedule.run_pending()
time.sleep(60)
36.9.2 Production Scheduling Options
| Platform | Method | Command / Configuration |
|---|---|---|
| Linux/macOS | cron | 0 6 1 * * python3 /opt/reports/run.py >> /var/log/reports.log 2>&1 |
| Windows | Task Scheduler | Monthly trigger on day 1 at 06:00 |
| AWS | CloudWatch Events + Lambda | cron(0 6 1 * ? *) |
| GCP | Cloud Scheduler | 0 6 1 * * with Cloud Run or Cloud Functions |
36.10 Template Versioning and Maintenance
Report templates are code. They should be version-controlled, reviewed before changes go live, and documented.
36.10.1 Version Control
Store templates in the same Git repository as the Python pipeline. Commit template changes with descriptive messages explaining what changed and why:
feat(reports): add new customer acquisition section to monthly report
Sandra requested CAC and new customer count on slide 3.
Added new_customers and cac fields to calculate_metrics().
Updated monthly_report.html with the new section after revenue charts.
36.10.2 Document Template Variables
Maintain a companion document for each template listing the required context variables:
# monthly_report.html — Required Variables
## Top-Level
- company_name (str)
- report_month (str) — pre-formatted, e.g. "January 2024"
- generated_at (str) — timestamp
## metrics (dict)
- revenue_total (float)
- revenue_change_pct (float) — positive = growth
- units_sold (int)
- trend_months (list[str]) — month labels for chart
- trend_revenue (list[float]) — revenue values for chart
## charts (dict)
- revenue_trend (str) — base64 PNG
- category_breakdown (str) — base64 PNG
## tables (dict)
- category_summary (str) — HTML table string from df.to_html()
36.10.3 Testing Templates
Test your template rendering with known data before deploying:
def test_template_renders_without_error():
"""Verify template renders cleanly on minimal test data."""
from jinja2 import Environment, FileSystemLoader
from pathlib import Path
from datetime import date
env = Environment(loader=FileSystemLoader('templates'))
env.filters['currency'] = lambda v: f"${v:,.2f}"
env.filters['pct'] = lambda v: f"{v:.1f}%"
env.filters['number'] = lambda v: f"{int(v):,}"
test_data = {
'company_name': 'Test Corp',
'report_month': 'January 2024',
'generated_at': '2024-02-01 06:00',
'metrics': {
'revenue_total': 142500.0,
'revenue_change_pct': 5.2,
'units_sold': 1847,
'trend_months': ['Sep', 'Oct', 'Nov', 'Dec', 'Jan'],
'trend_revenue': [120000, 135000, 128000, 115000, 142500],
},
'charts': {'revenue_trend': 'PLACEHOLDER', 'category_breakdown': 'PLACEHOLDER'},
'tables': {'category_summary': '<table><tr><td>Test</td></tr></table>'},
}
template = env.get_template('monthly_report.html')
html = template.render(**test_data)
assert 'Test Corp' in html
assert 'January 2024' in html
assert len(html) > 500
print("Template test passed.")
36.11 Conditional Formatting in Templates
Conditional formatting draws attention to what matters — values above or below target, trend direction, risk indicators:
{# Metric card with green/red border based on performance #}
<div class="metric-card {{ 'card-positive' if metrics.revenue_change_pct >= 0 else 'card-negative' }}">
<div class="metric-label">Monthly Revenue</div>
<div class="metric-value">{{ metrics.revenue_total | currency }}</div>
<div class="metric-delta">
{% if metrics.revenue_change_pct > 0 %}
<span class="up">▲ {{ metrics.revenue_change_pct | pct }}</span>
{% elif metrics.revenue_change_pct < 0 %}
<span class="down">▼ {{ (metrics.revenue_change_pct * -1) | pct }}</span>
{% else %}
<span class="flat">→ Flat</span>
{% endif %}
vs. prior month
</div>
</div>
{# Table row with warning badge for categories below target #}
{% for row in category_rows %}
<tr>
<td>{{ row.category }}</td>
<td>{{ row.revenue | currency }}</td>
<td>
{% if row.revenue < row.target %}
<span class="badge-warning">Below Target</span>
{% else %}
<span class="badge-ok">On Track</span>
{% endif %}
</td>
</tr>
{% endfor %}
Supporting CSS:
.card-positive { border-left: 4px solid #2ecc71; }
.card-negative { border-left: 4px solid #e74c3c; }
.up { color: #27ae60; font-weight: bold; }
.down { color: #c0392b; font-weight: bold; }
.flat { color: #7f8c8d; }
.badge-warning { background: #fff3cd; color: #856404; padding: 2px 8px; border-radius: 12px; font-size: 11px; }
.badge-ok { background: #d4edda; color: #155724; padding: 2px 8px; border-radius: 12px; font-size: 11px; }
Chapter Summary
Automated report generation turns a recurring manual task into a reliable, consistent, zero-intervention process. In this chapter, you learned:
- The business case: Manual recurring reports cost more than they appear — in labor time, error correction, and stakeholder trust. Automation eliminates all three while improving consistency.
- Jinja2 templates: Variables (
{{ }}), loops ({% for %}), conditionals ({% if %}), and filters transform Python data into HTML. Template inheritance means one base layout serves all report types. - Charts as base64: Converting matplotlib figures through an in-memory buffer to base64 produces self-contained HTML that works in browsers, emails, and PDFs without external file dependencies.
- DataFrames to HTML tables:
df.to_html()combined with the| safefilter in templates produces structured, styled tables from any pandas DataFrame. - weasyprint: Converts HTML to print-quality PDF in a single function call. CSS
@pagerules control margins, page numbers, and running headers. - The six-stage pipeline: Load data, calculate metrics, generate charts, render template, save outputs, deliver by email. This structure is testable, schedulable, and maintainable.
- Scheduling: The
schedulelibrary, cron, or OS task schedulers run the pipeline automatically — the code runs in seconds; the schedule removes the human. - Template maintenance: Templates are code. Version control them, document their required variables, test them.
The code in report_generator.py and templates/monthly_report.html provides a complete, working implementation of everything in this chapter.
Key Terms
Base64 encoding — A method of encoding binary data as ASCII text. Used to embed PNG chart images directly in HTML without external file references.
Jinja2 — A Python templating engine that renders text output (typically HTML) from template files and Python data dictionaries.
Pipeline — A series of processing stages where each stage's output is the input to the next.
Template inheritance — A Jinja2 pattern where a child template extends a base template, filling in named blocks while inheriting the rest of the layout.
weasyprint — A Python library for converting HTML and CSS to PDF, producing print-ready report documents.
This is not a people problem. This is a process problem — and Python solves it completely.
Automated report generation means you write the logic once: what data to pull, how to calculate the metrics, how to format the output. After that, the report runs itself. The first of every month, the report generates, the charts update, the numbers are correct, and the PDF lands in every regional manager's inbox — without anyone touching a keyboard.
This chapter covers the full toolkit for automated reports: Jinja2 for templating HTML, WeasyPrint for converting that HTML to professional PDFs, openpyxl for Excel-native reports with charts and formatting, and the integration patterns that connect reports to your data sources and email delivery. By the end, Priya Okonkwo at Acme Corp has a complete monthly business review report running on a schedule, and Maya Reyes has automated client status reports that used to take her half a day each.
Why Automated Reports Matter
Before diving into code, let's be precise about what automated reports actually give you.
Consistency. A hand-built report reflects whoever built it that day. If Marcus is out sick, someone else builds the report slightly differently. If Sandra is in a hurry, the commentary is thinner. Automated reports are identical every time — same structure, same calculations, same formatting.
Speed. A report that takes three hours to build manually takes three seconds to generate automatically. That's not hyperbole; it's arithmetic. The computation is the same either way. The difference is that humans format slowly and computers format instantly.
Auditability. When a number is wrong in a hand-built report, you have to trace back through the Excel formulas and figure out what happened. When a number is wrong in an automated report, you look at the code. The logic is explicit, version-controlled, and testable.
Scalability. If Acme Corp expands from five regions to fifteen, a manual report process needs three times the human effort. An automated process needs a loop with fifteen iterations instead of five — the effort is the same.
Timeliness. Scheduled reports run at 2 AM when the data is fresh and no one is waiting. By the time people arrive at work, the report is already in their inbox.
The business case is overwhelming. Let's build it.
Jinja2: The Language of Templates
Jinja2 is a templating engine — a system that lets you write a document with placeholders, then fill those placeholders with real data at runtime. You've already encountered this kind of thinking if you've ever used mail merge in Word, but Jinja2 is far more powerful.
Install it first:
pip install jinja2 weasyprint openpyxl matplotlib pandas
The core Jinja2 concept is the template: an HTML (or text) file that contains both regular HTML and special Jinja2 syntax enclosed in delimiters.
The Three Delimiter Types
Jinja2 uses three types of delimiters, each with a different purpose:
{{ variable }} — Output a value
{% statement %} — Control flow (if, for, etc.)
{# comment #} — Comment (not rendered)
Here's a minimal example. Suppose you want a report header that shows the company name and report date:
<!DOCTYPE html>
<html>
<head>
<title>{{ company_name }} — Monthly Report</title>
</head>
<body>
<h1>{{ company_name }}</h1>
<h2>Monthly Business Review — {{ report_month }}</h2>
<p>Generated on {{ generation_date }}</p>
</body>
</html>
And the Python code to render it:
from jinja2 import Environment, FileSystemLoader
from datetime import date
def render_report_template(template_name: str, context: dict) -> str:
"""
Render a Jinja2 template with the provided context data.
Args:
template_name: Filename of the template in the templates/ directory.
context: Dictionary of variables to pass to the template.
Returns:
Rendered HTML string.
"""
env = Environment(loader=FileSystemLoader("templates"))
template = env.get_template(template_name)
return template.render(**context)
context = {
"company_name": "Acme Corp",
"report_month": "January 2025",
"generation_date": date.today().strftime("%B %d, %Y"),
}
html_output = render_report_template("monthly_report.html", context)
Loops in Templates
Most reports have tables — rows of data that repeat. Jinja2's for loop handles this directly in the template:
<table>
<thead>
<tr>
<th>Region</th>
<th>Revenue</th>
<th>Units Sold</th>
<th>YoY Change</th>
</tr>
</thead>
<tbody>
{% for region in regions %}
<tr>
<td>{{ region.name }}</td>
<td>${{ "{:,.2f}".format(region.revenue) }}</td>
<td>{{ "{:,}".format(region.units_sold) }}</td>
<td class="{{ 'positive' if region.yoy_change >= 0 else 'negative' }}">
{{ "{:+.1f}%".format(region.yoy_change) }}
</td>
</tr>
{% endfor %}
</tbody>
</table>
The Python side passes a list of objects or dictionaries:
regions = [
{"name": "Northeast", "revenue": 1_245_000, "units_sold": 8_420, "yoy_change": 12.3},
{"name": "Southeast", "revenue": 987_500, "units_sold": 6_891, "yoy_change": -2.1},
{"name": "Midwest", "revenue": 1_102_300, "units_sold": 7_654, "yoy_change": 8.7},
]
Conditionals in Templates
Jinja2 supports if/elif/else for conditional rendering:
{% if total_revenue >= revenue_target %}
<p class="alert-success">
Target exceeded by ${{ "{:,.0f}".format(total_revenue - revenue_target) }}
</p>
{% elif total_revenue >= revenue_target * 0.95 %}
<p class="alert-warning">
Within 5% of target. Gap: ${{ "{:,.0f}".format(revenue_target - total_revenue) }}
</p>
{% else %}
<p class="alert-danger">
Below target. Gap: ${{ "{:,.0f}".format(revenue_target - total_revenue) }}
</p>
{% endif %}
Jinja2 Filters
Filters transform values using the pipe | character. Jinja2 includes many built-in filters, and you can add custom ones:
{{ revenue | round(2) }}
{{ company_name | upper }}
{{ description | truncate(100) }}
{{ items | length }}
{{ report_date | default("Not set") }}
You can also define custom filters in Python and register them with the environment:
def currency_filter(value: float) -> str:
"""Format a number as US currency."""
return f"${value:,.2f}"
def percentage_filter(value: float, decimals: int = 1) -> str:
"""Format a number as a percentage."""
return f"{value:.{decimals}f}%"
env = Environment(loader=FileSystemLoader("templates"))
env.filters["currency"] = currency_filter
env.filters["percentage"] = percentage_filter
Now in your templates you can write {{ revenue | currency }} and {{ growth_rate | percentage }}.
Template Inheritance
For reports with consistent structure but varying content, Jinja2's template inheritance is invaluable. You define a base template with the layout and mark sections that child templates can override:
base_report.html:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>{% block title %}Report{% endblock %}</title>
<style>
{% block styles %}
body { font-family: Arial, sans-serif; margin: 40px; }
h1 { color: #2c3e50; }
{% endblock %}
</style>
</head>
<body>
<header>
<h1>{{ company_name }}</h1>
<p class="report-date">{{ report_date }}</p>
</header>
{% block content %}{% endblock %}
<footer>
<p>Confidential — Generated automatically</p>
</footer>
</body>
</html>
monthly_review.html:
{% extends "base_report.html" %}
{% block title %}Monthly Business Review — {{ report_month }}{% endblock %}
{% block content %}
<h2>Monthly Business Review</h2>
<section class="executive-summary">
{{ executive_summary }}
</section>
{% endblock %}
Building HTML Reports with Jinja2
A well-structured HTML report follows a predictable pattern: executive summary at the top (one page if printed), key metrics in a dashboard row, charts for visual context, data tables for the detail-oriented reader, and a narrative section that explains the numbers.
Let's look at the complete structure for a professional business report:
"""
report_builder.py — Core HTML report generation engine.
"""
from __future__ import annotations
import base64
import io
from datetime import date, datetime
from pathlib import Path
from typing import Any
import matplotlib
matplotlib.use("Agg") # Non-interactive backend for server environments
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
from jinja2 import Environment, FileSystemLoader, select_autoescape
class ReportBuilder:
"""
Generates professional HTML reports from data and Jinja2 templates.
Args:
template_dir: Path to the directory containing Jinja2 templates.
output_dir: Path where generated reports will be saved.
"""
def __init__(self, template_dir: str | Path, output_dir: str | Path) -> None:
self.template_dir = Path(template_dir)
self.output_dir = Path(output_dir)
self.output_dir.mkdir(parents=True, exist_ok=True)
self.env = Environment(
loader=FileSystemLoader(str(self.template_dir)),
autoescape=select_autoescape(["html"]),
)
self._register_filters()
def _register_filters(self) -> None:
"""Register custom Jinja2 filters for business formatting."""
self.env.filters["currency"] = lambda v: f"${v:,.2f}"
self.env.filters["currency_k"] = lambda v: f"${v/1000:,.1f}K"
self.env.filters["pct"] = lambda v, d=1: f"{v:.{d}f}%"
self.env.filters["pct_change"] = lambda v: f"{v:+.1f}%"
self.env.filters["integer"] = lambda v: f"{int(v):,}"
self.env.filters["report_date"] = lambda d: d.strftime("%B %d, %Y")
def chart_to_base64(self, fig: plt.Figure) -> str:
"""
Convert a matplotlib figure to a base64-encoded PNG string.
Embedding charts as base64 means the HTML report is completely
self-contained — no external image files needed.
Args:
fig: A matplotlib Figure object.
Returns:
Base64-encoded string suitable for use in an HTML img tag.
"""
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150, bbox_inches="tight",
facecolor="white")
buffer.seek(0)
image_data = base64.b64encode(buffer.read()).decode("utf-8")
plt.close(fig)
return f"data:image/png;base64,{image_data}"
def render(self, template_name: str, context: dict[str, Any]) -> str:
"""
Render a Jinja2 template with the provided context.
Args:
template_name: Filename of the template.
context: Variables to inject into the template.
Returns:
Rendered HTML string.
"""
template = self.env.get_template(template_name)
return template.render(**context)
def save_html(self, html: str, filename: str) -> Path:
"""
Save rendered HTML to a file.
Args:
html: Rendered HTML string.
filename: Output filename (without directory).
Returns:
Path to the saved file.
"""
output_path = self.output_dir / filename
output_path.write_text(html, encoding="utf-8")
return output_path
Generating PDFs from HTML with WeasyPrint
WeasyPrint takes HTML/CSS as input and produces pixel-perfect PDFs as output. This means you design your report once in HTML — which is far easier than fighting with PDF generation libraries — and WeasyPrint handles the conversion.
"""
pdf_generator.py — HTML-to-PDF conversion using WeasyPrint.
"""
from pathlib import Path
import weasyprint
def html_to_pdf(html_content: str, output_path: str | Path) -> Path:
"""
Convert an HTML string to a PDF file.
Args:
html_content: Complete HTML string to convert.
output_path: Where to save the PDF.
Returns:
Path to the generated PDF.
Raises:
OSError: If the output directory does not exist or is not writable.
"""
output_path = Path(output_path)
output_path.parent.mkdir(parents=True, exist_ok=True)
document = weasyprint.HTML(string=html_content)
document.write_pdf(str(output_path))
return output_path
def html_file_to_pdf(html_path: str | Path, output_path: str | Path) -> Path:
"""
Convert an HTML file to a PDF file.
Args:
html_path: Path to the HTML file.
output_path: Where to save the PDF.
Returns:
Path to the generated PDF.
"""
html_path = Path(html_path)
output_path = Path(output_path)
document = weasyprint.HTML(filename=str(html_path))
document.write_pdf(str(output_path))
return output_path
CSS for Print-Ready PDFs
WeasyPrint respects CSS @page rules, which give you control over page size, margins, headers, and footers:
@page {
size: A4 landscape;
margin: 20mm 15mm 25mm 15mm;
@top-center {
content: "Acme Corp — Monthly Business Review";
font-size: 9pt;
color: #666;
}
@bottom-right {
content: "Page " counter(page) " of " counter(pages);
font-size: 9pt;
color: #666;
}
}
/* Force page breaks before major sections */
.section-break {
page-break-before: always;
}
/* Keep table rows together */
tr {
page-break-inside: avoid;
}
Including this CSS in your HTML template ensures that when WeasyPrint renders the PDF, it has proper page numbers, headers, and clean page breaks.
Building Excel Reports with openpyxl
Some audiences prefer Excel — they want to filter the data themselves, build their own pivot tables, or copy numbers into other spreadsheets. openpyxl lets you generate native .xlsx files with multiple sheets, charts, and full formatting.
"""
excel_report_builder.py — Excel report generation with openpyxl.
"""
from __future__ import annotations
from datetime import date
from pathlib import Path
from typing import Any
import openpyxl
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference
from openpyxl.styles import (
Alignment,
Border,
Font,
PatternFill,
Side,
)
from openpyxl.utils import get_column_letter
# ── Style constants ────────────────────────────────────────────────────────────
HEADER_FILL = PatternFill(
start_color="2C3E50", end_color="2C3E50", fill_type="solid"
)
HEADER_FONT = Font(color="FFFFFF", bold=True, size=11)
ACCENT_FILL = PatternFill(
start_color="EBF5FB", end_color="EBF5FB", fill_type="solid"
)
POSITIVE_FONT = Font(color="1E8449", bold=True)
NEGATIVE_FONT = Font(color="C0392B", bold=True)
THIN_BORDER = Border(
left=Side(style="thin"),
right=Side(style="thin"),
top=Side(style="thin"),
bottom=Side(style="thin"),
)
def style_header_row(ws: openpyxl.worksheet.worksheet.Worksheet,
row: int, num_cols: int) -> None:
"""Apply consistent header styling to a row."""
for col in range(1, num_cols + 1):
cell = ws.cell(row=row, column=col)
cell.fill = HEADER_FILL
cell.font = HEADER_FONT
cell.alignment = Alignment(horizontal="center")
cell.border = THIN_BORDER
def auto_size_columns(ws: openpyxl.worksheet.worksheet.Worksheet,
min_width: int = 10,
max_width: int = 40) -> None:
"""Automatically size all columns based on their content."""
for column_cells in ws.columns:
max_length = 0
column_letter = get_column_letter(column_cells[0].column)
for cell in column_cells:
if cell.value:
max_length = max(max_length, len(str(cell.value)))
adjusted_width = min(max(max_length + 2, min_width), max_width)
ws.column_dimensions[column_letter].width = adjusted_width
def build_revenue_chart(ws: openpyxl.worksheet.worksheet.Worksheet,
data_row_start: int,
data_row_end: int,
title: str = "Revenue by Region") -> BarChart:
"""
Create a bar chart from region revenue data.
Args:
ws: The worksheet containing the data.
data_row_start: First data row (after headers).
data_row_end: Last data row.
title: Chart title.
Returns:
A configured BarChart object ready to be added to a worksheet.
"""
chart = BarChart()
chart.type = "col"
chart.title = title
chart.y_axis.title = "Revenue ($)"
chart.x_axis.title = "Region"
chart.style = 10
chart.width = 20
chart.height = 12
revenue_data = Reference(ws, min_col=2, min_row=data_row_start - 1,
max_row=data_row_end)
region_labels = Reference(ws, min_col=1, min_row=data_row_start,
max_row=data_row_end)
chart.add_data(revenue_data, titles_from_data=True)
chart.set_categories(region_labels)
return chart
Writing Data to Multiple Sheets
A well-structured Excel report often has several sheets: a Summary dashboard, detailed data sheets, and perhaps a chart-only sheet for presentations.
def create_acme_excel_report(
report_data: dict[str, Any],
output_path: str | Path,
) -> Path:
"""
Create a multi-sheet Excel report for Acme Corp monthly review.
Args:
report_data: Dictionary containing all report metrics and tables.
output_path: Where to save the .xlsx file.
Returns:
Path to the generated Excel file.
"""
output_path = Path(output_path)
wb = Workbook()
# ── Sheet 1: Executive Summary ─────────────────────────────────────────────
ws_summary = wb.active
ws_summary.title = "Executive Summary"
ws_summary["A1"] = "Acme Corp Monthly Business Review"
ws_summary["A1"].font = Font(size=18, bold=True, color="2C3E50")
ws_summary["A2"] = f"Report Period: {report_data['report_month']}"
ws_summary["A2"].font = Font(size=12, italic=True)
# KPI metrics block
kpi_start_row = 4
kpis = [
("Total Revenue", report_data["total_revenue"], "currency"),
("Units Sold", report_data["total_units"], "integer"),
("Gross Margin", report_data["gross_margin_pct"], "percentage"),
("New Customers", report_data["new_customers"], "integer"),
("Customer Retention", report_data["retention_rate"], "percentage"),
]
ws_summary[f"A{kpi_start_row}"] = "Key Performance Indicators"
ws_summary[f"A{kpi_start_row}"].font = Font(bold=True, size=12)
for i, (label, value, fmt) in enumerate(kpis, start=kpi_start_row + 1):
ws_summary[f"A{i}"] = label
ws_summary[f"A{i}"].font = Font(bold=True)
if fmt == "currency":
ws_summary[f"B{i}"] = value
ws_summary[f"B{i}"].number_format = '"$"#,##0.00'
elif fmt == "integer":
ws_summary[f"B{i}"] = value
ws_summary[f"B{i}"].number_format = "#,##0"
elif fmt == "percentage":
ws_summary[f"B{i}"] = value / 100
ws_summary[f"B{i}"].number_format = "0.0%"
# ── Sheet 2: Regional Breakdown ────────────────────────────────────────────
ws_regions = wb.create_sheet("Regional Breakdown")
headers = ["Region", "Revenue", "Units Sold", "Avg Order Value",
"New Customers", "YoY Change"]
for col, header in enumerate(headers, start=1):
ws_regions.cell(row=1, column=col, value=header)
style_header_row(ws_regions, 1, len(headers))
for row_idx, region in enumerate(report_data["regions"], start=2):
ws_regions.cell(row=row_idx, column=1, value=region["name"])
ws_regions.cell(row=row_idx, column=2, value=region["revenue"]).number_format = '"$"#,##0.00'
ws_regions.cell(row=row_idx, column=3, value=region["units_sold"]).number_format = "#,##0"
ws_regions.cell(row=row_idx, column=4, value=region["avg_order_value"]).number_format = '"$"#,##0.00'
ws_regions.cell(row=row_idx, column=5, value=region["new_customers"]).number_format = "#,##0"
yoy_cell = ws_regions.cell(row=row_idx, column=6, value=region["yoy_change"] / 100)
yoy_cell.number_format = "+0.0%;-0.0%"
yoy_cell.font = POSITIVE_FONT if region["yoy_change"] >= 0 else NEGATIVE_FONT
if row_idx % 2 == 0:
for col in range(1, len(headers) + 1):
ws_regions.cell(row=row_idx, column=col).fill = ACCENT_FILL
# Add chart
chart = build_revenue_chart(ws_regions, 2, len(report_data["regions"]) + 1)
ws_regions.add_chart(chart, "H2")
auto_size_columns(ws_regions)
wb.save(output_path)
return output_path
Embedding Matplotlib Charts in HTML Reports
Charts make reports scannable. A reader can absorb the trend from a chart in two seconds; the same information in a table takes thirty. The trick for self-contained HTML reports is converting charts to base64-encoded images that are embedded directly in the HTML.
"""
chart_generators.py — Reusable business chart creation functions.
"""
from __future__ import annotations
import base64
import io
from typing import Sequence
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np
# ── Consistent visual style ────────────────────────────────────────────────────
ACME_COLORS = ["#2C3E50", "#3498DB", "#2ECC71", "#E74C3C", "#F39C12"]
CHART_STYLE = {
"figure.facecolor": "white",
"axes.facecolor": "#F8F9FA",
"axes.spines.top": False,
"axes.spines.right": False,
"axes.grid": True,
"grid.alpha": 0.4,
"font.family": "DejaVu Sans",
}
def fig_to_base64(fig: plt.Figure) -> str:
"""Convert a matplotlib figure to a base64 data URI."""
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=150, bbox_inches="tight",
facecolor="white")
buf.seek(0)
encoded = base64.b64encode(buf.read()).decode("utf-8")
plt.close(fig)
return f"data:image/png;base64,{encoded}"
def revenue_trend_chart(
months: Sequence[str],
revenue: Sequence[float],
target: float | None = None,
title: str = "Monthly Revenue Trend",
) -> str:
"""
Generate a line chart showing revenue over time.
Args:
months: List of month labels for the x-axis.
revenue: List of revenue values corresponding to each month.
target: Optional horizontal target/budget line.
title: Chart title.
Returns:
Base64-encoded PNG data URI.
"""
with plt.rc_context(CHART_STYLE):
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(months, revenue, color=ACME_COLORS[0], linewidth=2.5,
marker="o", markersize=6, label="Actual Revenue")
if target is not None:
ax.axhline(y=target, color=ACME_COLORS[3], linestyle="--",
linewidth=1.5, label=f"Target: ${target:,.0f}")
ax.fill_between(months, revenue, alpha=0.1, color=ACME_COLORS[0])
ax.yaxis.set_major_formatter(mticker.FuncFormatter(
lambda x, _: f"${x/1_000_000:.1f}M"
))
ax.set_title(title, fontsize=14, fontweight="bold", pad=15)
ax.set_xlabel("Month")
ax.set_ylabel("Revenue")
ax.legend()
if len(months) > 6:
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
return fig_to_base64(fig)
def regional_comparison_chart(
regions: Sequence[str],
current_values: Sequence[float],
prior_values: Sequence[float] | None = None,
title: str = "Revenue by Region",
) -> str:
"""
Generate a grouped bar chart comparing regional performance.
Args:
regions: Region names for x-axis labels.
current_values: Current period values.
prior_values: Optional prior period values for comparison.
title: Chart title.
Returns:
Base64-encoded PNG data URI.
"""
with plt.rc_context(CHART_STYLE):
fig, ax = plt.subplots(figsize=(10, 5))
x = np.arange(len(regions))
if prior_values is not None:
bar_width = 0.35
ax.bar(x - bar_width / 2, current_values, bar_width,
label="Current Period", color=ACME_COLORS[0])
ax.bar(x + bar_width / 2, prior_values, bar_width,
label="Prior Period", color=ACME_COLORS[1], alpha=0.7)
ax.legend()
else:
ax.bar(x, current_values, color=ACME_COLORS[:len(regions)])
ax.set_xticks(x)
ax.set_xticklabels(regions)
ax.yaxis.set_major_formatter(mticker.FuncFormatter(
lambda v, _: f"${v/1_000:.0f}K"
))
ax.set_title(title, fontsize=14, fontweight="bold", pad=15)
ax.set_ylabel("Revenue")
plt.tight_layout()
return fig_to_base64(fig)
def kpi_gauge_chart(value: float, target: float, label: str) -> str:
"""
Generate a simple gauge/progress chart for a single KPI.
Args:
value: Actual value.
target: Target value.
label: KPI label.
Returns:
Base64-encoded PNG data URI.
"""
pct = min(value / target, 1.2) # Cap at 120% for display
color = ACME_COLORS[2] if pct >= 1.0 else (
ACME_COLORS[4] if pct >= 0.9 else ACME_COLORS[3]
)
with plt.rc_context(CHART_STYLE):
fig, ax = plt.subplots(figsize=(3, 3))
ax.axis("off")
# Draw progress arc
theta = np.linspace(np.pi, np.pi - np.pi * min(pct, 1.0), 100)
x_arc = np.cos(theta)
y_arc = np.sin(theta)
ax.plot(np.cos(np.linspace(np.pi, 0, 100)),
np.sin(np.linspace(np.pi, 0, 100)),
color="#E0E0E0", linewidth=15, solid_capstyle="round")
ax.plot(x_arc, y_arc, color=color, linewidth=15,
solid_capstyle="round")
ax.text(0, 0.1, f"{pct * 100:.0f}%", ha="center", va="center",
fontsize=22, fontweight="bold", color=color)
ax.text(0, -0.3, label, ha="center", va="center",
fontsize=10, color="#555")
ax.set_xlim(-1.3, 1.3)
ax.set_ylim(-0.5, 1.3)
plt.tight_layout()
return fig_to_base64(fig)
Report Structure Best Practices
A professional business report follows a recognizable structure that experienced readers navigate instinctively. Violating this structure — putting detailed tables before the executive summary, burying the key insight on page four — loses readers before they reach the important parts.
Executive Summary (first page, always) - Three to five bullet points maximum - The most important numbers: total revenue, whether you hit your target, the biggest outlier - Written for someone who will read nothing else in the report - Auto-generate this from your data: "Revenue of $X.XM exceeded target by X%, driven by strong performance in the Northeast region."
KPI Dashboard (second section) - Key metrics displayed prominently, ideally with sparklines or mini-charts - Traffic-light coloring: green for on/above target, yellow for within 5%, red for below - Prior period comparisons alongside current values
Charts Section - Two to four charts maximum — more than that and none of them get absorbed - Trend chart over time, regional or product breakdown, and one diagnostic chart - Every chart has a one-sentence takeaway caption written into the report
Data Tables - Complete detail for readers who want it - Sortable in Excel; paginated in HTML - Footnotes explaining methodology or data sources
Commentary/Narrative - Human language explaining what happened and why - Can be partially auto-generated: "The Midwest region increased revenue 8.7% year-over-year. This follows three months of new territory expansion announced in Q3." - Flag anomalies explicitly: "Southeast revenue declined 2.1%. Investigation shows this coincides with a major competitor's pricing promotion."
Parameterizing Reports
The whole point of automated reports is that the same template serves many scenarios. You parameterize reports by separating the template from the data, and the data from the query parameters.
"""
parameterized_report.py — Report generation with flexible date range and region filtering.
"""
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import date
from typing import Literal
@dataclass
class ReportParameters:
"""
Configuration for a parameterized report run.
Attributes:
start_date: First date of the reporting period.
end_date: Last date of the reporting period.
regions: List of regions to include. Empty list means all regions.
report_type: Level of detail in the report.
recipient_emails: Who should receive the completed report.
"""
start_date: date
end_date: date
regions: list[str] = field(default_factory=list)
report_type: Literal["summary", "detailed", "executive"] = "detailed"
recipient_emails: list[str] = field(default_factory=list)
@property
def period_label(self) -> str:
"""Human-readable period label for report headers."""
if self.start_date.month == self.end_date.month:
return self.start_date.strftime("%B %Y")
return (f"{self.start_date.strftime('%b %d')} – "
f"{self.end_date.strftime('%b %d, %Y')}")
@property
def filename_slug(self) -> str:
"""URL/filesystem-safe filename component."""
return self.start_date.strftime("%Y-%m")
def generate_report_for_params(params: ReportParameters) -> dict:
"""
Fetch and process data for the given report parameters.
In production, this would query your database or data warehouse.
The key insight is that params drives all the filtering —
the template and rendering logic don't change.
Args:
params: ReportParameters controlling what data to fetch.
Returns:
Context dictionary ready for template rendering.
"""
# This would be a real database query in production:
# data = db.query_revenue(params.start_date, params.end_date, params.regions)
# For illustration, we return a structure:
return {
"report_month": params.period_label,
"generation_date": date.today(),
"regions_filter": params.regions or ["All Regions"],
"report_type": params.report_type,
}
Scheduling Report Generation
Automated reports are only truly automated if they run without human intervention. Chapter 22 covered scheduling in depth; here's how to connect report generation to a schedule.
Using cron (Linux/macOS):
# Run monthly report on the 1st at 2:00 AM
0 2 1 * * /usr/bin/python3 /opt/reports/acme_monthly_report.py >> /var/log/reports.log 2>&1
Using Windows Task Scheduler:
# schedule_report.py — Register the report with Windows Task Scheduler
import subprocess
task_xml = """<?xml version="1.0" ?>
<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
<Triggers>
<CalendarTrigger>
<StartBoundary>2025-01-01T02:00:00</StartBoundary>
<ScheduleByMonth>
<DaysOfMonth><Day>1</Day></DaysOfMonth>
<Months><January/><February/>...</Months>
</ScheduleByMonth>
</CalendarTrigger>
</Triggers>
<Actions>
<Exec>
<Command>python</Command>
<Arguments>C:\\reports\\acme_monthly_report.py</Arguments>
</Exec>
</Actions>
</Task>"""
Using the schedule library for in-process scheduling:
import schedule
import time
from datetime import date
def run_monthly_report() -> None:
"""Entry point called by the scheduler."""
today = date.today()
if today.day == 1: # Only run on the 1st of the month
print(f"Generating monthly report for {today.strftime('%B %Y')}...")
# generate_and_email_report(today)
schedule.every().day.at("02:00").do(run_monthly_report)
while True:
schedule.run_pending()
time.sleep(60)
Email Delivery of Completed Reports
Chapter 19 covered email automation. Reports typically go out as PDF attachments with an HTML email body containing the executive summary.
"""
report_emailer.py — Send completed reports via email with PDF attachment.
"""
from __future__ import annotations
import smtplib
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from pathlib import Path
def send_report_email(
recipients: list[str],
subject: str,
html_body: str,
pdf_path: str | Path,
smtp_host: str,
smtp_port: int,
smtp_user: str,
smtp_password: str,
sender_name: str = "Acme Corp Reports",
) -> None:
"""
Send a report email with a PDF attachment.
Args:
recipients: List of email addresses to send to.
subject: Email subject line.
html_body: HTML content for the email body (executive summary).
pdf_path: Path to the PDF report file to attach.
smtp_host: SMTP server hostname.
smtp_port: SMTP server port (typically 587 for TLS).
smtp_user: SMTP authentication username.
smtp_password: SMTP authentication password.
sender_name: Display name for the sender.
Raises:
smtplib.SMTPException: If email delivery fails.
FileNotFoundError: If the PDF file does not exist.
"""
pdf_path = Path(pdf_path)
if not pdf_path.exists():
raise FileNotFoundError(f"PDF report not found: {pdf_path}")
message = MIMEMultipart("alternative")
message["Subject"] = subject
message["From"] = f"{sender_name} <{smtp_user}>"
message["To"] = ", ".join(recipients)
# Plain text fallback
plain_text = "Please view this email in an HTML-capable client."
message.attach(MIMEText(plain_text, "plain"))
message.attach(MIMEText(html_body, "html"))
# Attach PDF
with pdf_path.open("rb") as pdf_file:
pdf_attachment = MIMEApplication(pdf_file.read(), _subtype="pdf")
pdf_attachment.add_header(
"Content-Disposition",
"attachment",
filename=pdf_path.name,
)
message.attach(pdf_attachment)
try:
with smtplib.SMTP(smtp_host, smtp_port) as server:
server.ehlo()
server.starttls()
server.login(smtp_user, smtp_password)
server.sendmail(smtp_user, recipients, message.as_string())
print(f"Report emailed to {len(recipients)} recipient(s)")
except smtplib.SMTPException as exc:
print(f"Email delivery failed: {exc}")
raise
Acme Corp: A Full Monthly Business Review Report
Let's tie all of these pieces together in a complete example. Priya Okonkwo at Acme Corp needs to generate and distribute the monthly business review report. Here's the full pipeline:
"""
acme_monthly_report.py — Complete monthly business review report pipeline.
"""
from __future__ import annotations
import os
from datetime import date, timedelta
from pathlib import Path
import sys
# (In production: from report_builder import ReportBuilder, etc.)
def get_report_period(reference_date: date | None = None) -> tuple[date, date]:
"""
Calculate the start and end dates for the prior month's report.
Args:
reference_date: Date to calculate from. Defaults to today.
Returns:
Tuple of (start_date, end_date) for the reporting period.
"""
if reference_date is None:
reference_date = date.today()
# Last day of prior month
end_date = date(reference_date.year, reference_date.month, 1) - timedelta(days=1)
# First day of that month
start_date = date(end_date.year, end_date.month, 1)
return start_date, end_date
def fetch_monthly_data(start_date: date, end_date: date) -> dict:
"""
Fetch all data needed for the monthly business review.
In production this queries the Acme Corp database. For this
example, representative data is returned directly.
Args:
start_date: First day of the reporting period.
end_date: Last day of the reporting period.
Returns:
Complete data dictionary for template rendering.
"""
return {
"company_name": "Acme Corp",
"report_month": start_date.strftime("%B %Y"),
"start_date": start_date,
"end_date": end_date,
"generation_date": date.today(),
"total_revenue": 4_334_800,
"revenue_target": 4_200_000,
"prior_month_revenue": 3_987_500,
"total_units": 29_847,
"gross_margin_pct": 42.3,
"new_customers": 127,
"retention_rate": 94.2,
"regions": [
{
"name": "Northeast",
"revenue": 1_245_000,
"units_sold": 8_420,
"avg_order_value": 147.86,
"new_customers": 38,
"yoy_change": 12.3,
},
{
"name": "Southeast",
"revenue": 987_500,
"units_sold": 6_891,
"avg_order_value": 143.30,
"new_customers": 22,
"yoy_change": -2.1,
},
{
"name": "Midwest",
"revenue": 1_102_300,
"units_sold": 7_654,
"avg_order_value": 144.01,
"new_customers": 41,
"yoy_change": 8.7,
},
{
"name": "West",
"revenue": 1_000_000,
"units_sold": 6_882,
"avg_order_value": 145.30,
"new_customers": 26,
"yoy_change": 5.2,
},
],
"top_products": [
{"name": "Premium Copy Paper (Case)", "units": 4_201, "revenue": 252_060},
{"name": "Executive Desk Chair", "units": 892, "revenue": 312_200},
{"name": "Laser Toner Bundle", "units": 2_143, "revenue": 193_870},
{"name": "Standing Desk", "units": 447, "revenue": 268_200},
{"name": "Wireless Keyboard/Mouse Set", "units": 1_893, "revenue": 113_580},
],
"monthly_trend": {
"months": ["Jul", "Aug", "Sep", "Oct", "Nov", "Dec", "Jan"],
"revenue": [3_800_000, 3_920_000, 4_100_000, 3_987_500,
3_750_000, 4_050_000, 4_334_800],
},
}
def generate_executive_summary(data: dict) -> str:
"""
Auto-generate the executive summary text from report data.
Args:
data: The complete report data dictionary.
Returns:
HTML string for the executive summary section.
"""
revenue = data["total_revenue"]
target = data["revenue_target"]
variance = revenue - target
variance_pct = (variance / target) * 100
top_region = max(data["regions"], key=lambda r: r["revenue"])
worst_region = min(data["regions"], key=lambda r: r["yoy_change"])
if variance >= 0:
revenue_sentence = (
f"Revenue of <strong>${revenue:,.0f}</strong> exceeded target "
f"by <strong>${variance:,.0f} ({variance_pct:.1f}%)</strong>."
)
else:
revenue_sentence = (
f"Revenue of <strong>${revenue:,.0f}</strong> fell short of target "
f"by <strong>${abs(variance):,.0f} ({abs(variance_pct):.1f}%)</strong>."
)
top_region_sentence = (
f"The {top_region['name']} region led performance with "
f"${top_region['revenue']:,.0f} in revenue "
f"({top_region['yoy_change']:+.1f}% year-over-year)."
)
if worst_region["yoy_change"] < 0:
concern_sentence = (
f"The {worst_region['name']} region warrants attention, "
f"with revenue declining {worst_region['yoy_change']:.1f}% versus prior year."
)
else:
concern_sentence = (
f"All regions posted positive year-over-year growth this month."
)
return f"""
<ul>
<li>{revenue_sentence}</li>
<li>New customer acquisition: <strong>{data['new_customers']}</strong>
customers this month.</li>
<li>{top_region_sentence}</li>
<li>Gross margin: <strong>{data['gross_margin_pct']:.1f}%</strong>
(target: 41.0%).</li>
<li>{concern_sentence}</li>
</ul>
"""
def main() -> None:
"""Run the complete monthly report pipeline."""
print("Acme Corp Monthly Report Generator")
print("=" * 40)
# Determine report period
start_date, end_date = get_report_period()
print(f"Report period: {start_date} to {end_date}")
# Fetch data
print("Fetching report data...")
data = fetch_monthly_data(start_date, end_date)
# Generate executive summary
data["executive_summary"] = generate_executive_summary(data)
print("Executive summary generated.")
# Output paths
output_dir = Path("reports") / data["report_month"].replace(" ", "_")
output_dir.mkdir(parents=True, exist_ok=True)
html_path = output_dir / "acme_monthly_review.html"
pdf_path = output_dir / "acme_monthly_review.pdf"
excel_path = output_dir / "acme_monthly_review.xlsx"
print(f"Output directory: {output_dir}")
print("Report generation complete. Files written:")
print(f" HTML: {html_path}")
print(f" PDF: {pdf_path}")
print(f" Excel:{excel_path}")
if __name__ == "__main__":
main()
Putting It All Together: The Complete Template
Here's the HTML template that ties all the pieces together. This lives in your templates/ directory and is rendered by ReportBuilder:
{# templates/acme_monthly_report.html #}
{% extends "base_report.html" %}
{% block title %}Acme Corp Monthly Business Review — {{ report_month }}{% endblock %}
{% block content %}
{# ── Executive Summary ─────────────────────────────────────────── #}
<section class="executive-summary">
<h2>Executive Summary</h2>
<div class="summary-box">
{{ executive_summary | safe }}
</div>
</section>
{# ── KPI Dashboard ─────────────────────────────────────────────── #}
<section class="kpi-dashboard">
<h2>Key Performance Indicators</h2>
<div class="kpi-grid">
<div class="kpi-card {{ 'kpi-positive' if total_revenue >= revenue_target else 'kpi-negative' }}">
<div class="kpi-label">Total Revenue</div>
<div class="kpi-value">{{ total_revenue | currency }}</div>
<div class="kpi-sub">Target: {{ revenue_target | currency }}</div>
</div>
<div class="kpi-card kpi-neutral">
<div class="kpi-label">Units Sold</div>
<div class="kpi-value">{{ total_units | integer }}</div>
</div>
<div class="kpi-card kpi-positive">
<div class="kpi-label">Gross Margin</div>
<div class="kpi-value">{{ gross_margin_pct | pct }}</div>
<div class="kpi-sub">Target: 41.0%</div>
</div>
<div class="kpi-card kpi-neutral">
<div class="kpi-label">New Customers</div>
<div class="kpi-value">{{ new_customers | integer }}</div>
</div>
<div class="kpi-card kpi-positive">
<div class="kpi-label">Retention Rate</div>
<div class="kpi-value">{{ retention_rate | pct }}</div>
</div>
</div>
</section>
{# ── Charts ────────────────────────────────────────────────────── #}
<section class="charts section-break">
<h2>Performance Charts</h2>
<div class="chart-grid">
<figure>
<img src="{{ revenue_trend_chart }}" alt="Revenue trend chart">
<figcaption>
Monthly revenue trend over the trailing 7 months.
Current month exceeds target by {{ ((total_revenue - revenue_target) / revenue_target * 100) | pct }}.
</figcaption>
</figure>
<figure>
<img src="{{ regional_chart }}" alt="Regional comparison chart">
<figcaption>
Revenue by region, current period vs. prior period.
</figcaption>
</figure>
</div>
</section>
{# ── Regional Breakdown Table ──────────────────────────────────── #}
<section class="data-tables section-break">
<h2>Regional Performance Detail</h2>
<table class="report-table">
<thead>
<tr>
<th>Region</th>
<th>Revenue</th>
<th>Units Sold</th>
<th>Avg Order</th>
<th>New Customers</th>
<th>YoY Change</th>
</tr>
</thead>
<tbody>
{% for region in regions %}
<tr class="{{ 'row-alt' if loop.index is even else '' }}">
<td>{{ region.name }}</td>
<td>{{ region.revenue | currency }}</td>
<td>{{ region.units_sold | integer }}</td>
<td>{{ region.avg_order_value | currency }}</td>
<td>{{ region.new_customers | integer }}</td>
<td class="{{ 'positive' if region.yoy_change >= 0 else 'negative' }}">
{{ region.yoy_change | pct_change }}
</td>
</tr>
{% endfor %}
</tbody>
<tfoot>
<tr>
<td><strong>Total</strong></td>
<td><strong>{{ total_revenue | currency }}</strong></td>
<td><strong>{{ total_units | integer }}</strong></td>
<td>—</td>
<td><strong>{{ new_customers | integer }}</strong></td>
<td>—</td>
</tr>
</tfoot>
</table>
</section>
{% endblock %}
Performance Considerations
When reports involve large datasets, generation time can become an issue. A few practical guidelines:
Pre-aggregate in the database. Don't pull 500,000 transaction rows into Python to compute monthly totals. Write a SQL query that aggregates at the database level and pull the summary.
Cache charts. If you're generating regional charts and the regional data hasn't changed, reuse the cached base64 string. This is especially useful for reports that run throughout the day with small data changes.
Generate asynchronously. For reports triggered by user action (not just scheduled), generate them in a background task and email the result, rather than making the user wait for generation to complete.
Limit to what's needed. Executive reports need five metrics and two charts. Detailed operational reports need everything. Don't add detail to the executive report just because the data is available.
Summary
You now have a complete, professional report generation toolkit:
- Jinja2 handles templating: HTML with loops, conditionals, and filters that transform your data into formatted output
- WeasyPrint converts that HTML to print-ready PDFs with proper page headers, footers, and breaks
- openpyxl generates native Excel files with multiple sheets, charts, and professional formatting for audiences who work in Excel
- Matplotlib creates charts that embed directly in HTML as base64 images, making reports fully self-contained
- Parameterization means the same template serves different date ranges, regions, or report types
- Scheduling connects reports to the calendar, so they run automatically without human intervention
- Email delivery puts the completed report in the right inboxes within seconds of generation
The monthly effort Priya used to spend building Acme's business review report by hand is now a three-second Python execution. The report is more consistent, more accurate, and more complete than the manual version — and it arrives in every regional manager's inbox at 6:00 AM on the first of every month, reliably and automatically.
In Chapter 37, we'll build the Flask web application that lets stakeholders access reports interactively through a browser. In Chapter 38, we'll deploy that application to the cloud so it's accessible anywhere, anytime — completing Maya's full journey from a CSV file to a live, deployed client portal.