In This Chapter
- The Milestone Chapter: When Python Becomes Essential
- 32.1 Why Supply Chain Analytics Is Different From Sales Analytics
- 32.2 Core Supply Chain Metrics
- 32.3 Economic Order Quantity: How Much to Order
- 32.4 ABC Analysis: Focus Where It Matters
- 32.5 Lead Time Analysis
- 32.6 Supplier Performance Scorecards
- 32.7 Demand Forecasting for Inventory
- 32.8 Connecting to acme_inventory.db
- 32.9 Building Reorder Alerts
- 32.10 The Complete Acme Supply Chain Dashboard
- 32.11 The Milestone: Priya Presents to Leadership
- 32.12 Putting It All Together: The Complete Workflow
- 32.13 What You Have Built, and What Comes Next
- Chapter Summary
- Key Terms
Chapter 32: Inventory and Supply Chain Analytics
The Milestone Chapter: When Python Becomes Essential
Three years ago, Acme Corp's inventory process looked like this: Marcus Webb's team exported data from the ERP system into spreadsheets, Priya Okonkwo manually consolidated four regional files every Monday morning, and the purchasing team made reorder decisions based on a combination of gut feel and a printed report that was already five days old by the time it reached anyone's desk.
Today, you are reading the chapter where that story ends — and a new one begins.
By the time you finish this chapter, you will understand how to build the kind of system Priya built: one that pulls live inventory data from a database, joins it with sales history, calculates the metrics that actually drive purchasing decisions, and surfaces the right information to the right people automatically. You will know what inventory turnover means and why it matters. You will know how to calculate a reorder point that accounts for lead time variability. You will know how ABC analysis can tell you where to focus your attention in a thousand-SKU catalog.
And you will see Priya present her complete supply chain analytics system to Acme's leadership team — the moment that earns her a promotion and changes how the company thinks about data.
Let's build it.
32.1 Why Supply Chain Analytics Is Different From Sales Analytics
In Chapter 11, you worked with Acme's sales data and helped answer the question: "What have we sold?" In Chapter 26, you built forecasting models to answer: "What will we sell?" Supply chain analytics asks a different set of questions entirely:
- Do we have enough of the right products, in the right place, at the right time?
- Are we carrying too much of things that are not selling?
- Which suppliers are reliable, and which are putting our fill rates at risk?
- When should we order, and how much should we order?
These questions sit at the intersection of historical data, current state, and forward-looking decisions. Getting them wrong is expensive in both directions. Too much inventory means cash tied up in product sitting on shelves, warehouse space consumed, and the risk of obsolescence. Too little inventory means stockouts, lost sales, and unhappy customers who may not come back.
Python is exceptionally well-suited to this work because supply chain analytics requires joining multiple data sources (sales history, current stock levels, supplier records, purchase orders), applying formulas that update as data changes, and surfacing results in a format that non-technical stakeholders can act on. All of that is exactly what Python does well.
The Data Sources You Will Work With
In this chapter, you will connect three data sources that have appeared throughout the book:
- acme_inventory.db — The SQLite database from Chapter 23, containing current stock levels, reorder parameters, and supplier information for Acme's product catalog
- acme_sales_2023.csv — The sales transaction file from Chapter 11, which tells you how fast products are actually moving
- Supplier lead time records embedded in the inventory database
By joining these sources, you can calculate metrics that neither source could provide alone.
32.2 Core Supply Chain Metrics
Before you write a single line of code, you need to understand what you are measuring. These metrics are the vocabulary of supply chain management, and every purchasing manager, operations director, and CFO knows them. When Priya presented her dashboard, she did not have to explain what inventory turnover was — she just had to show that she could calculate it accurately and automatically.
Inventory Turnover
Inventory turnover measures how many times a company sells through its entire inventory in a given period. A higher number generally means you are managing inventory efficiently; a lower number may mean you are carrying too much stock.
Formula: Inventory Turnover = Cost of Goods Sold / Average Inventory Value
For Acme, this calculation is done at the product level, the category level, and the company level. A product with turnover of 24 is selling through its average stock twice a month. A product with turnover of 2 is sitting on the shelf for six months between restocks.
Industry benchmarks vary significantly:
- Grocery: 20-30x (perishables move fast)
- Apparel retail: 4-6x
- Industrial/office distributor (like Acme): 6-10x
- Auto dealer: 8-12x
Higher turnover is generally better — it means less working capital tied up in stock — but not infinitely better. Extremely high turnover often accompanies stockout risk.
def calculate_inventory_turnover(
cost_of_goods_sold: float,
opening_inventory_value: float,
closing_inventory_value: float,
) -> float:
"""
Calculate inventory turnover ratio for a given period.
Higher values indicate faster-moving inventory. Industry benchmarks
vary widely: grocery might target 20-30x annually, while industrial
distributors typically run 6-10x.
Args:
cost_of_goods_sold: Total COGS for the period.
opening_inventory_value: Inventory value at period start (at cost).
closing_inventory_value: Inventory value at period end (at cost).
Returns:
Inventory turnover ratio. Returns 0.0 if average inventory is zero.
"""
average_inventory_value = (opening_inventory_value + closing_inventory_value) / 2
if average_inventory_value <= 0:
return 0.0
return cost_of_goods_sold / average_inventory_value
Days Inventory Outstanding (DIO)
DIO converts turnover into a number that most people find more intuitive: how many days, on average, does a unit of inventory sit on the shelf before being sold?
Formula: DIO = 365 / Inventory Turnover
Or equivalently: DIO = (Average Inventory Value / COGS) × 365
A DIO of 30 means you are holding, on average, about a month of inventory. Whether that is good or bad depends entirely on your industry, your supplier lead times, and your service level targets.
def calculate_days_inventory_outstanding(
inventory_turnover: float,
) -> float:
"""
Convert inventory turnover ratio to days inventory outstanding (DIO).
DIO represents the average number of days inventory is held before sale.
Lower values indicate more efficient inventory management, but DIO must
be evaluated in the context of industry norms and lead times.
Args:
inventory_turnover: Inventory turnover ratio (from calculate_inventory_turnover).
Returns:
Days inventory outstanding. Returns float('inf') if turnover is zero.
"""
if inventory_turnover <= 0:
return float("inf")
return 365.0 / inventory_turnover
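To see both metrics together, here is a quick worked example; the dollar figures are illustrative, not Acme's actual numbers:

```python
# Hypothetical annual figures for a single product category
cost_of_goods_sold = 1_200_000.0
opening_inventory_value = 180_000.0
closing_inventory_value = 220_000.0

average_inventory_value = (opening_inventory_value + closing_inventory_value) / 2
turnover = cost_of_goods_sold / average_inventory_value  # 1,200,000 / 200,000 = 6.0
dio = 365.0 / turnover                                   # about 60.8 days on the shelf

print(f"Turnover: {turnover:.1f}x, DIO: {dio:.1f} days")
```

A turnover of 6 sits at the low end of the 6-10x benchmark range cited above for industrial distributors, so a result like this would prompt a closer look at the slower categories.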
Stockout Rate
A stockout occurs when a customer wants to buy something and you do not have it. The stockout rate measures how often this happens — expressed as a percentage of order lines, SKUs, or time periods, depending on how you define it.
def calculate_stockout_rate(
total_sku_days: int,
stockout_sku_days: int,
) -> float:
"""
Calculate the percentage of SKU-days where stock level was zero.
A SKU-day is one product on one day. If you have 100 SKUs and track
30 days, you have 3,000 SKU-days. If 60 of those had zero stock,
your stockout rate is 2%.
Args:
total_sku_days: Total number of SKU-day observations.
stockout_sku_days: Number of observations where stock was zero.
Returns:
Stockout rate as a percentage (0-100).
"""
if total_sku_days <= 0:
return 0.0
return (stockout_sku_days / total_sku_days) * 100.0
Reorder Point
The reorder point is the stock level at which you should place a new purchase order. It needs to account for two things: how much you will sell while waiting for the order to arrive (demand during lead time), and a safety buffer for when things do not go as planned.
Formula: Reorder Point = (Average Daily Demand × Average Lead Time) + Safety Stock
This is one of the most actionable calculations in supply chain analytics because it tells you exactly when to act. When stock drops to or below the reorder point, it is time to order.
def calculate_reorder_point(
average_daily_demand: float,
average_lead_time_days: float,
safety_stock: float,
) -> float:
"""
Calculate the reorder point for a product.
When current stock falls to or below this level, a purchase order
should be placed to avoid a stockout under normal demand and lead
time conditions.
Args:
average_daily_demand: Mean units sold per day over the analysis period.
average_lead_time_days: Mean days from purchase order placement to receipt.
safety_stock: Buffer stock to cover demand and lead time variability.
Returns:
Reorder point in units (the stock level that triggers a new order).
"""
demand_during_lead_time = average_daily_demand * average_lead_time_days
return demand_during_lead_time + safety_stock
Safety Stock
Safety stock is the buffer inventory you hold to absorb variability. If demand is always exactly predictable and suppliers always deliver on time, you theoretically need zero safety stock. In the real world, neither of those things is true.
The standard formula uses the z-score corresponding to your desired service level (the probability of not stocking out during a replenishment cycle). For a 95% service level, z = 1.645.
Formula: Safety Stock = Z × sqrt(Avg Lead Time × (Demand Std Dev)² + (Avg Daily Demand)² × (Lead Time Std Dev)²)
This looks intimidating, but Python makes it straightforward:
import math
from scipy import stats
def calculate_safety_stock(
average_daily_demand: float,
demand_std_dev: float,
average_lead_time_days: float,
lead_time_std_dev: float,
service_level: float = 0.95,
) -> dict:
"""
Calculate safety stock using the combined variability formula.
Accounts for both demand variability and lead time variability,
which together drive the risk of a stockout during replenishment.
Args:
average_daily_demand: Mean daily demand in units.
demand_std_dev: Standard deviation of daily demand.
average_lead_time_days: Mean supplier lead time in days.
lead_time_std_dev: Standard deviation of lead time in days.
service_level: Target service level (e.g., 0.95 for 95%).
Common values: 0.90 (z=1.28), 0.95 (z=1.645),
0.98 (z=2.05), 0.99 (z=2.33)
Returns:
Dictionary with safety_stock, z_score, and component details.
"""
z_score = stats.norm.ppf(service_level)
demand_variance_component = average_lead_time_days * (demand_std_dev ** 2)
lead_time_variance_component = (average_daily_demand ** 2) * (lead_time_std_dev ** 2)
combined_std_dev = math.sqrt(
demand_variance_component + lead_time_variance_component
)
safety_stock = z_score * combined_std_dev
return {
"safety_stock": safety_stock,
"z_score": round(z_score, 3),
"service_level": service_level,
"combined_std_dev": round(combined_std_dev, 2),
}
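Putting the last two formulas together for a single hypothetical SKU shows how safety stock feeds directly into the reorder point. The demand and lead time figures below are illustrative, and the z-score for a 95% service level is hardcoded to keep the example dependency-free:

```python
import math

# Hypothetical SKU: 12 units/day demand (std dev 3), 14-day lead time (std dev 2)
z = 1.645  # 95% service level
avg_daily_demand = 12.0
demand_std = 3.0
avg_lead_time = 14.0
lead_time_std = 2.0

safety_stock = z * math.sqrt(
    avg_lead_time * demand_std**2             # demand variability component: 126
    + avg_daily_demand**2 * lead_time_std**2  # lead time variability component: 576
)
reorder_point = avg_daily_demand * avg_lead_time + safety_stock

print(f"Safety stock: {safety_stock:.1f} units, reorder point: {reorder_point:.1f} units")
```

Notice that the lead time variability term (576) dominates the demand term (126) here: for this SKU, an unreliable supplier drives most of the buffer.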
32.3 Economic Order Quantity: How Much to Order
Once you know when to order, you need to know how much to order. The Economic Order Quantity (EOQ) formula gives you the mathematically optimal order size — the quantity that minimizes the total cost of ordering and holding inventory.
The intuition is this: ordering in large quantities means fewer orders (lower ordering cost) but more inventory sitting on shelves (higher holding cost). Ordering in small quantities means you order more frequently (higher ordering cost) but carry less inventory at any time (lower holding cost). The EOQ is the sweet spot.
Formula: Q* = sqrt(2DS / H)
Where:
- D = Annual demand in units
- S = Cost per order (the cost of placing one purchase order, including administrative time, shipping minimums, etc.)
- H = Annual holding cost per unit (warehouse space, insurance, cost of capital — typically 20-30% of unit cost per year)
import math
def calculate_economic_order_quantity(
annual_demand_units: float,
order_cost_dollars: float,
unit_cost: float,
holding_cost_rate: float = 0.20,
) -> dict:
"""
Calculate the Economic Order Quantity (EOQ) and supporting metrics.
The EOQ minimizes total inventory costs by balancing the cost of placing
orders against the cost of holding inventory.
Args:
annual_demand_units: Expected annual unit demand.
order_cost_dollars: Cost to place one purchase order
(admin time, freight minimums, receiving labor, etc.).
unit_cost: Cost per unit (landed cost at your facility).
holding_cost_rate: Annual holding cost as fraction of unit cost.
Default 0.20 means 20% of unit cost per year to hold one unit.
Returns:
Dictionary with:
- eoq: Optimal order quantity in units
- orders_per_year: How often to order at EOQ
- cycle_days: Average days between orders
- avg_inventory_units: Average inventory held (EOQ / 2)
- total_annual_cost: Total ordering + holding costs at EOQ
- annual_holding_cost: Just the holding component
- annual_ordering_cost: Just the ordering component
Raises:
ValueError: If any input is zero or negative.
"""
if annual_demand_units <= 0 or order_cost_dollars <= 0 or unit_cost <= 0:
raise ValueError(
"All EOQ inputs must be positive. "
f"Got demand={annual_demand_units}, "
f"order_cost={order_cost_dollars}, "
f"unit_cost={unit_cost}"
)
holding_cost_per_unit = unit_cost * holding_cost_rate
eoq = math.sqrt(
(2 * annual_demand_units * order_cost_dollars) / holding_cost_per_unit
)
orders_per_year = annual_demand_units / eoq
cycle_days = 365 / orders_per_year
avg_inventory = eoq / 2
annual_holding_cost = avg_inventory * holding_cost_per_unit
annual_ordering_cost = orders_per_year * order_cost_dollars
total_annual_cost = annual_holding_cost + annual_ordering_cost
return {
"eoq": round(eoq, 1),
"orders_per_year": round(orders_per_year, 1),
"cycle_days": round(cycle_days, 1),
"avg_inventory_units": round(avg_inventory, 1),
"annual_holding_cost": round(annual_holding_cost, 2),
"annual_ordering_cost": round(annual_ordering_cost, 2),
"total_annual_cost": round(total_annual_cost, 2),
}
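As a quick worked check of the formula itself, with all inputs hypothetical:

```python
import math

# Hypothetical SKU: 2,400 units/year, $50 per order, $10 unit cost, 20% holding rate
annual_demand = 2400.0
order_cost = 50.0
holding_cost_per_unit = 10.0 * 0.20  # $2.00 to hold one unit for a year

eoq = math.sqrt(2 * annual_demand * order_cost / holding_cost_per_unit)
orders_per_year = annual_demand / eoq

print(f"EOQ: {eoq:.1f} units, about {orders_per_year:.1f} orders per year")
```

At the EOQ of roughly 346 units, this SKU would be ordered about seven times a year, or roughly every seven to eight weeks.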
A Note on EOQ in Practice
EOQ assumes constant demand, known and stable costs, and immediate replenishment. Real businesses never perfectly match these assumptions. But EOQ is still extremely useful as a starting point and sanity check.
If your current order quantity is five times the EOQ, you are probably carrying too much inventory and tying up working capital. If it is one-fifth of the EOQ, you are probably ordering too frequently and incurring unnecessary ordering costs. The formula does not give you the final answer — it gives you the right question to ask.
32.4 ABC Analysis: Focus Where It Matters
If you have 1,000 SKUs in your catalog, you cannot give equal attention to all of them. ABC analysis — named for the three categories it produces — helps you prioritize.
The principle is simple: in almost every inventory, a small percentage of SKUs account for a large percentage of total value consumed. These are your "A" items. A somewhat larger group accounts for a moderate share. These are "B" items. The majority of SKUs account for a small share of total value. These are "C" items.
A typical split:
- A items: Top 10-20% of SKUs by value, accounting for 70-80% of total spend
- B items: Next 30% of SKUs, accounting for 15-20% of total spend
- C items: Bottom 50-60% of SKUs, accounting for 5-10% of total spend
The business implication is direct: A items deserve tight controls, frequent cycle counts, and careful reorder management. C items can be managed more loosely, perhaps with simpler min/max rules.
This is the Pareto principle (the 80/20 rule) applied to inventory, and it is remarkably consistent across industries. It does not matter whether you are selling office supplies, pharmaceuticals, automotive parts, or fashion apparel — a small number of products drive most of the value.
import pandas as pd
def perform_abc_analysis(
inventory_df: pd.DataFrame,
sku_column: str = "sku_id",
annual_value_column: str = "annual_consumption_value",
a_threshold: float = 0.80,
b_threshold: float = 0.95,
) -> pd.DataFrame:
"""
Perform ABC inventory classification based on annual consumption value.
Classifies SKUs into A, B, and C categories based on their cumulative
contribution to total annual inventory consumption value.
Args:
inventory_df: DataFrame with at least SKU identifier and annual value columns.
sku_column: Column name for the SKU/product identifier.
annual_value_column: Column name for annual consumption value
(units consumed * unit cost per year).
a_threshold: Cumulative value share at which A category ends
(default 0.80 = top 80% of total value).
b_threshold: Cumulative value share at which B category ends
(default 0.95 = next 15% of total value).
Returns:
Original DataFrame sorted by value descending, with added columns:
- value_share: This SKU's fractional share of total annual value
- cumulative_share: Running cumulative share from top
- abc_category: 'A', 'B', or 'C'
"""
result = inventory_df.copy()
# Sort highest-value SKUs first
result = result.sort_values(annual_value_column, ascending=False).reset_index(drop=True)
total_value = result[annual_value_column].sum()
if total_value <= 0:
result["value_share"] = 0.0
result["cumulative_share"] = 0.0
result["abc_category"] = "C"
return result
result["value_share"] = result[annual_value_column] / total_value
result["cumulative_share"] = result["value_share"].cumsum()
def assign_category(cumulative_share: float) -> str:
if cumulative_share <= a_threshold:
return "A"
elif cumulative_share <= b_threshold:
return "B"
else:
return "C"
result["abc_category"] = result["cumulative_share"].apply(assign_category)
return result
def summarize_abc_results(abc_df: pd.DataFrame) -> pd.DataFrame:
"""
Produce a summary table from ABC analysis results.
Args:
abc_df: Output from perform_abc_analysis().
Returns:
Summary DataFrame with one row per category showing SKU count,
total value, and percentage shares.
"""
summary = abc_df.groupby("abc_category").agg(
sku_count=("abc_category", "count"),
total_annual_value=("annual_consumption_value", "sum"),
).reset_index()
total_skus = summary["sku_count"].sum()
total_value = summary["total_annual_value"].sum()
summary["sku_pct"] = (summary["sku_count"] / total_skus * 100).round(1)
summary["value_pct"] = (summary["total_annual_value"] / total_value * 100).round(1)
summary["avg_value_per_sku"] = (
summary["total_annual_value"] / summary["sku_count"]
).round(2)
return summary.sort_values("abc_category")
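To see the classification mechanics on a toy catalog, here is a compact sketch of the same cumulative-share logic, using pd.cut() in place of the explicit assignment function above. The five SKUs and their values are invented to show the classic skew:

```python
import pandas as pd

catalog = pd.DataFrame({
    "sku_id": ["P1", "P2", "P3", "P4", "P5"],
    "annual_consumption_value": [800.0, 120.0, 50.0, 20.0, 10.0],
})
catalog = catalog.sort_values(
    "annual_consumption_value", ascending=False
).reset_index(drop=True)
catalog["cumulative_share"] = (
    catalog["annual_consumption_value"] / catalog["annual_consumption_value"].sum()
).cumsum()
# Bin cumulative share: (0, 0.80] -> A, (0.80, 0.95] -> B, above -> C
catalog["abc_category"] = pd.cut(
    catalog["cumulative_share"],
    bins=[0.0, 0.80, 0.95, float("inf")],
    labels=["A", "B", "C"],
)
print(catalog[["sku_id", "cumulative_share", "abc_category"]])
```

P1 alone carries 80% of the value and becomes the sole A item, while the bottom three SKUs together contribute only 8% and all land in C.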
Interpreting ABC Results at Acme
When Priya ran ABC analysis on Acme's full catalog of 623 active SKUs, the results aligned almost exactly with the Pareto principle:
- 47 A items (7.5% of SKUs) accounted for 80% of annual consumption value — primarily copy paper, toner cartridges, and high-velocity office furniture
- 138 B items (22% of SKUs) accounted for the next 15% of value
- 438 C items (70% of SKUs) accounted for only 5% of value — mostly specialty items, slow-moving accessories, and products that had been in the catalog for years without formal review
David Nakamura, Acme's VP of Operations, had never seen this breakdown before. "We've been treating all 600 products the same way," he said. That was the beginning of a significant shift in how Acme managed purchasing priorities.
32.5 Lead Time Analysis
Lead time is the number of days between placing a purchase order and receiving the goods. It sounds simple, but lead time variability is one of the most common causes of supply chain problems. A supplier who averages 14 days but sometimes takes 25 days is much harder to manage than one who consistently delivers in 16 days.
The coefficient of variation (standard deviation divided by mean) gives you a dimensionless measure of consistency that is comparable across suppliers regardless of their average lead times.
import pandas as pd
def analyze_lead_times(
purchase_orders_df: pd.DataFrame,
supplier_column: str = "supplier_id",
order_date_column: str = "order_date",
receipt_date_column: str = "receipt_date",
) -> pd.DataFrame:
"""
Analyze lead time statistics by supplier.
Calculates mean, standard deviation, minimum, maximum, and coefficient
of variation for each supplier based on historical purchase order data.
Args:
purchase_orders_df: DataFrame with purchase order history.
supplier_column: Column identifying the supplier.
order_date_column: Column with purchase order placement date.
receipt_date_column: Column with goods receipt date.
Returns:
DataFrame with one row per supplier containing:
- mean_lead_time_days: Average lead time
- std_lead_time_days: Standard deviation (variability)
- min_lead_time_days: Fastest delivery seen
- max_lead_time_days: Slowest delivery seen
- cv_lead_time: Coefficient of variation (lower = more consistent)
- order_count: Number of purchase orders analyzed
"""
df = purchase_orders_df.copy()
df[order_date_column] = pd.to_datetime(df[order_date_column])
df[receipt_date_column] = pd.to_datetime(df[receipt_date_column])
df["lead_time_days"] = (
df[receipt_date_column] - df[order_date_column]
).dt.days
lead_time_stats = df.groupby(supplier_column)["lead_time_days"].agg(
mean_lead_time_days="mean",
std_lead_time_days="std",
min_lead_time_days="min",
max_lead_time_days="max",
order_count="count",
).round(2)
# Coefficient of variation: std / mean
# Lower CV means more predictable delivery times, all else equal
lead_time_stats["cv_lead_time"] = (
lead_time_stats["std_lead_time_days"] / lead_time_stats["mean_lead_time_days"]
).round(3)
return lead_time_stats.reset_index()
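A tiny worked example shows the mechanics on invented purchase order data, with two orders per supplier:

```python
import pandas as pd

orders = pd.DataFrame({
    "supplier_id": ["S1", "S1", "S2", "S2"],
    "order_date": ["2023-01-02", "2023-02-01", "2023-01-05", "2023-02-03"],
    "receipt_date": ["2023-01-12", "2023-02-12", "2023-01-13", "2023-02-19"],
})
orders["lead_time_days"] = (
    pd.to_datetime(orders["receipt_date"]) - pd.to_datetime(orders["order_date"])
).dt.days  # S1: 10, 11 days; S2: 8, 16 days

stats = orders.groupby("supplier_id")["lead_time_days"].agg(["mean", "std"])
stats["cv"] = stats["std"] / stats["mean"]
print(stats.round(3))
```

S2 is slower on average and far less consistent; its CV comes out roughly seven times S1's.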
Why Lead Time Variability Matters More Than Average Lead Time
Imagine two suppliers for the same product:
- Supplier A: averages 10 days, standard deviation of 1 day (CV = 0.10)
- Supplier B: averages 8 days, standard deviation of 4 days (CV = 0.50)
Supplier B looks better on average lead time. But Supplier B's 95th-percentile delivery time is roughly 15 days (8 + 1.645×4 ≈ 14.6), while Supplier A's is about 12 days (10 + 1.645×1 ≈ 11.6). More importantly, Supplier A is predictable — you can set your reorder point with confidence. Supplier B requires substantially more safety stock to maintain the same service level, which translates directly to higher working capital costs.
Your Python analysis will surface these differences automatically.
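The arithmetic above can be reproduced in a few lines and extended to show the safety stock consequence. The demand of 12 units/day is an illustrative assumption:

```python
import math

z = 1.645  # 95% service level
daily_demand = 12.0

# Supplier A: mean 10 days, std dev 1; Supplier B: mean 8 days, std dev 4
p95_a = 10 + z * 1   # about 11.6 days
p95_b = 8 + z * 4    # about 14.6 days

# Safety stock from lead time variability alone (demand held steady)
ss_a = z * math.sqrt(daily_demand**2 * 1**2)
ss_b = z * math.sqrt(daily_demand**2 * 4**2)
print(f"A: p95={p95_a:.1f}d, SS={ss_a:.1f}; B: p95={p95_b:.1f}d, SS={ss_b:.1f}")
```

Supplier B demands about four times the safety stock of Supplier A for this SKU, despite its faster average delivery.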
32.6 Supplier Performance Scorecards
Two metrics define supplier performance for most businesses:
On-Time Delivery Rate (OTDR): What percentage of purchase orders arrived on or before the promised delivery date?
Fill Rate: What percentage of the ordered quantity was actually delivered? A supplier who promises 1,000 units but ships 800 has an 80% fill rate.
def calculate_supplier_scorecard(
purchase_orders_df: pd.DataFrame,
supplier_column: str = "supplier_id",
supplier_name_column: str = "supplier_name",
promised_date_column: str = "promised_delivery_date",
actual_date_column: str = "receipt_date",
ordered_qty_column: str = "ordered_quantity",
received_qty_column: str = "received_quantity",
) -> pd.DataFrame:
"""
Generate a supplier performance scorecard with OTDR and fill rate.
Args:
purchase_orders_df: DataFrame with purchase order and receipt data.
supplier_column: Column identifying the supplier ID.
supplier_name_column: Column with human-readable supplier name.
promised_date_column: Column with the supplier's promised delivery date.
actual_date_column: Column with the actual receipt date.
ordered_qty_column: Column with quantity ordered.
received_qty_column: Column with quantity actually received.
Returns:
DataFrame with one row per supplier containing:
- order_count: Total purchase orders analyzed
- on_time_delivery_rate: Percentage of orders on time or early
- fill_rate: Percentage of ordered units actually received
- performance_grade: A (>=95%), B (>=85%), C (>=75%), D (<75%)
"""
df = purchase_orders_df.copy()
df[promised_date_column] = pd.to_datetime(df[promised_date_column])
df[actual_date_column] = pd.to_datetime(df[actual_date_column])
df["on_time"] = df[actual_date_column] <= df[promised_date_column]
# Cap fill ratio at 1.0 — occasional over-shipments should not inflate the metric
df["fill_ratio"] = (
df[received_qty_column] / df[ordered_qty_column].replace(0, pd.NA)
).clip(upper=1.0)
group_cols = [col for col in [supplier_column, supplier_name_column]
if col in df.columns]
scorecard = df.groupby(group_cols).agg(
order_count=(supplier_column, "count"),
on_time_delivery_rate=("on_time", lambda x: round(x.mean() * 100, 1)),
fill_rate=("fill_ratio", lambda x: round(x.mean() * 100, 1)),
).reset_index()
def grade_supplier(row: pd.Series) -> str:
combined_score = (row["on_time_delivery_rate"] + row["fill_rate"]) / 2
if combined_score >= 95:
return "A"
elif combined_score >= 85:
return "B"
elif combined_score >= 75:
return "C"
else:
return "D"
scorecard["performance_grade"] = scorecard.apply(grade_supplier, axis=1)
return (
scorecard
.sort_values("on_time_delivery_rate", ascending=False)
.reset_index(drop=True)
)
32.7 Demand Forecasting for Inventory
In Chapter 26, you built forecasting models for revenue planning. The same techniques apply to inventory management, but the goal shifts: instead of forecasting total revenue, you are forecasting demand for individual SKUs to set appropriate reorder quantities and safety stock levels.
The key challenge with inventory forecasting is that you often have hundreds or thousands of SKUs, and many of them have noisy, irregular demand. For practical inventory forecasting, a tiered approach works well:
Tier 1: High-volume A items — Use exponential smoothing or seasonal decomposition where you have sufficient data (at least 6-12 months). Forecasting errors for A items are costly; more sophisticated models are justified.
Tier 2: Medium-volume B items — Use simple moving averages with a 4-8 week window. The added complexity of more sophisticated models rarely pays off for B items.
Tier 3: Low-volume C items — Use the annual average or simply set fixed min/max levels. C items often lack sufficient history for statistical forecasting to add value.
import pandas as pd
def forecast_sku_demand(
daily_sales_df: pd.DataFrame,
sku_id: str,
sku_column: str = "sku_id",
date_column: str = "sale_date",
quantity_column: str = "quantity_sold",
forecast_days: int = 30,
smoothing_span: int = 14,
) -> dict:
"""
Forecast daily demand for a single SKU using exponential smoothing.
Uses exponentially weighted moving average (EWMA) appropriate for
stable demand patterns. For SKUs with strong trend or clear seasonality,
consider Holt-Winters or other methods (see Chapter 26).
Args:
daily_sales_df: DataFrame with daily sales data for all SKUs.
sku_id: The specific SKU to forecast.
sku_column: Column containing SKU identifiers.
date_column: Column containing sale dates.
quantity_column: Column containing units sold.
forecast_days: Number of days to forecast forward.
smoothing_span: Span for EWMA (higher = smoother, slower to react to changes).
Returns:
Dictionary with:
- sku_id: The SKU that was forecast
- average_daily_demand: EWMA-smoothed daily demand estimate
- demand_std_dev: Standard deviation of daily demand (used in safety stock)
- forecast_period_total: Forecasted total units for the forecast window
- history_days: Number of days of sales history used
"""
sku_data = daily_sales_df[daily_sales_df[sku_column] == sku_id].copy()
sku_data[date_column] = pd.to_datetime(sku_data[date_column])
sku_data = sku_data.sort_values(date_column)
# Aggregate to daily totals in case of multiple records per day
daily_demand = sku_data.groupby(date_column)[quantity_column].sum()
if len(daily_demand) < 7:
# Insufficient history — use simple mean
avg_demand = float(daily_demand.mean()) if len(daily_demand) > 0 else 0.0
std_demand = float(daily_demand.std()) if len(daily_demand) > 1 else avg_demand * 0.30
else:
avg_demand = float(daily_demand.ewm(span=smoothing_span).mean().iloc[-1])
std_demand = float(daily_demand.std())
return {
"sku_id": sku_id,
"average_daily_demand": round(avg_demand, 2),
"demand_std_dev": round(max(std_demand, 0.0), 2),
"forecast_period_total": round(avg_demand * forecast_days, 0),
"history_days": len(daily_demand),
}
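The heart of the function is the ewm() call. A minimal illustration with two weeks of invented daily demand shows how the EWMA compares to a plain mean:

```python
import pandas as pd

daily_demand = pd.Series(
    [10, 12, 9, 11, 10, 13, 12, 11, 10, 9, 14, 12, 11, 10], dtype=float
)
# EWMA weights recent days more heavily than a plain average does
ewma_estimate = daily_demand.ewm(span=14).mean().iloc[-1]
plain_mean = daily_demand.mean()  # 11.0
print(f"EWMA: {ewma_estimate:.2f}, plain mean: {plain_mean:.2f}")
```

With stable demand like this, the two estimates are close; they diverge when demand shifts, which is exactly when the faster-reacting EWMA earns its keep.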
32.8 Connecting to acme_inventory.db
In Chapter 23, you learned to work with SQLite databases. Acme's inventory database stores the current state of every SKU across all four regional warehouses. Now you will query it and combine it with the sales data you have worked with throughout the book.
The key insight is that the database gives you the current state (what do we have right now?) while the sales CSV gives you historical flow (how fast are we consuming it?). Together, they answer the operational question: given how fast this product moves, how much runway does the current stock provide?
import sqlite3
import pandas as pd
from pathlib import Path
from typing import Optional
def load_inventory_from_database(
db_path: str,
region: Optional[str] = None,
) -> pd.DataFrame:
"""
Load current inventory levels from the Acme inventory SQLite database.
Joins the inventory table with the suppliers table to enrich each
SKU record with supplier name and lead time parameters.
Args:
db_path: Path to the acme_inventory.db SQLite database file.
region: If provided, filter to a specific region
('North', 'South', 'East', or 'West').
If None, returns inventory for all four regions.
Returns:
DataFrame with columns: sku_id, product_name, category, region,
current_stock, unit_cost, reorder_point, reorder_quantity,
supplier_id, supplier_name, average_lead_time_days,
lead_time_std_dev, last_count_date.
Raises:
FileNotFoundError: If the database file does not exist at db_path.
sqlite3.Error: If the database query fails.
"""
db_file = Path(db_path)
if not db_file.exists():
raise FileNotFoundError(f"Inventory database not found at: {db_path}")
base_query = """
SELECT
i.sku_id,
i.product_name,
i.category,
i.region,
i.current_stock,
i.unit_cost,
i.reorder_point,
i.reorder_quantity,
i.supplier_id,
i.last_count_date,
s.supplier_name,
s.average_lead_time_days,
s.lead_time_std_dev
FROM inventory i
LEFT JOIN suppliers s ON i.supplier_id = s.supplier_id
"""
params: tuple = ()
if region is not None:
base_query += " WHERE i.region = ?"
params = (region,)
try:
with sqlite3.connect(db_path) as conn:
inventory_df = pd.read_sql_query(base_query, conn, params=params)
except sqlite3.Error as e:
raise sqlite3.Error(f"Failed to query inventory database: {e}") from e
inventory_df["last_count_date"] = pd.to_datetime(
inventory_df["last_count_date"], errors="coerce"
)
return inventory_df
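If you want to experiment with the query pattern without the real acme_inventory.db file, you can stand up a miniature in-memory database with the same join shape. The schema below is a hypothetical subset, not the full Acme schema:

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (
        supplier_id TEXT, supplier_name TEXT, average_lead_time_days REAL
    );
    CREATE TABLE inventory (
        sku_id TEXT, region TEXT, current_stock INTEGER, supplier_id TEXT
    );
    INSERT INTO suppliers VALUES ('S1', 'Apex Paper Co', 12.0);
    INSERT INTO inventory VALUES ('P100', 'North', 240, 'S1');
""")
df = pd.read_sql_query(
    """
    SELECT i.sku_id, i.current_stock, s.supplier_name, s.average_lead_time_days
    FROM inventory i
    LEFT JOIN suppliers s ON i.supplier_id = s.supplier_id
    WHERE i.region = ?
    """,
    conn,
    params=("North",),
)
conn.close()
print(df)
```

The "?" placeholder with a params tuple is the same parameterized-query pattern used in load_inventory_from_database() above; it keeps region filters safe from SQL injection.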
Joining Inventory Data With Sales Data
Once you have current stock levels from the database and historical sales from the CSV, you can calculate consumption rates and identify items at risk:
def join_inventory_with_sales(
    inventory_df: pd.DataFrame,
    sales_df: pd.DataFrame,
    analysis_days: int = 90,
) -> pd.DataFrame:
    """
    Enrich inventory data with sales velocity from transaction history.

    Calculates average daily demand and annualized consumption value for
    each SKU-region combination by joining current inventory with sales
    history over a recent analysis window.

    Args:
        inventory_df: Current inventory from load_inventory_from_database().
        sales_df: Sales transactions DataFrame with columns: sku_id, region,
            sale_date, quantity, and any price columns.
        analysis_days: Number of recent days to use for velocity calculation.
            90 days (one quarter) works well for most businesses.

    Returns:
        Enriched DataFrame with additional columns:
        - units_sold_period: Total units sold in the analysis window
        - avg_daily_demand: Average daily unit demand
        - days_of_supply: Current stock / avg_daily_demand
        - annual_consumption_value: avg_daily_demand * 365 * unit_cost
        - below_reorder_point: True if current_stock <= reorder_point
    """
    sales_df = sales_df.copy()
    sales_df["sale_date"] = pd.to_datetime(sales_df["sale_date"])
    cutoff_date = sales_df["sale_date"].max() - pd.Timedelta(days=analysis_days)
    recent_sales = sales_df[sales_df["sale_date"] >= cutoff_date]

    velocity = (
        recent_sales.groupby(["sku_id", "region"])["quantity"]
        .sum()
        .reset_index()
        .rename(columns={"quantity": "units_sold_period"})
    )
    velocity["avg_daily_demand"] = velocity["units_sold_period"] / analysis_days

    enriched = inventory_df.merge(
        velocity,
        on=["sku_id", "region"],
        how="left",
    )
    enriched["units_sold_period"] = enriched["units_sold_period"].fillna(0)
    enriched["avg_daily_demand"] = enriched["avg_daily_demand"].fillna(0)

    # Days of supply: how long will current stock last at the current demand rate?
    enriched["days_of_supply"] = enriched.apply(
        lambda row: (
            row["current_stock"] / row["avg_daily_demand"]
            if row["avg_daily_demand"] > 0
            else float("inf")
        ),
        axis=1,
    )

    # Annual consumption value: the input to ABC analysis
    enriched["annual_consumption_value"] = (
        enriched["avg_daily_demand"] * 365 * enriched["unit_cost"]
    )
    enriched["below_reorder_point"] = (
        enriched["current_stock"] <= enriched["reorder_point"]
    )
    return enriched
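To see the arithmetic in isolation, here is a self-contained toy run of the same steps on two invented SKUs. It re-derives the core logic inline rather than calling the function above, so you can paste it into a REPL and inspect each intermediate frame:

```python
import pandas as pd

# Invented inventory snapshot: two SKUs in one region.
inventory = pd.DataFrame({
    "sku_id": ["SKU-100", "SKU-200"],
    "region": ["West", "West"],
    "current_stock": [120, 40],
    "unit_cost": [4.25, 12.00],
    "reorder_point": [200, 30],
})

# Invented sales history: SKU-100 sold 360 units; SKU-200 sold nothing.
sales = pd.DataFrame({
    "sku_id": ["SKU-100"] * 3,
    "region": ["West"] * 3,
    "sale_date": pd.to_datetime(["2023-09-01", "2023-09-15", "2023-09-29"]),
    "quantity": [90, 150, 120],
})

analysis_days = 90

# Velocity: total units per SKU-region, then average per day.
velocity = (
    sales.groupby(["sku_id", "region"])["quantity"].sum()
    .reset_index()
    .rename(columns={"quantity": "units_sold_period"})
)
velocity["avg_daily_demand"] = velocity["units_sold_period"] / analysis_days

# Left-merge so SKUs with no sales stay in the frame, then treat
# their missing velocity as zero demand.
enriched = inventory.merge(velocity, on=["sku_id", "region"], how="left")
enriched[["units_sold_period", "avg_daily_demand"]] = (
    enriched[["units_sold_period", "avg_daily_demand"]].fillna(0)
)

# SKU-100: 360 units / 90 days = 4 per day -> 120 / 4 = 30 days of supply.
# SKU-200: zero demand -> days of supply is effectively infinite.
enriched["days_of_supply"] = [
    stock / d if d > 0 else float("inf")
    for stock, d in zip(enriched["current_stock"], enriched["avg_daily_demand"])
]
enriched["below_reorder_point"] = (
    enriched["current_stock"] <= enriched["reorder_point"]
)
print(enriched[["sku_id", "avg_daily_demand", "days_of_supply"]])
```

The zero-demand branch matters: without it, a SKU that has not sold in the window would trigger a divide-by-zero, and treating it as infinite days of supply correctly pushes it to the bottom of any urgency ranking.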
32.9 Building Reorder Alerts
With the enriched inventory data in hand, generating reorder alerts is straightforward. The goal is a clear, actionable output that a purchasing manager can work from without needing to understand the underlying analysis.
The urgency classification uses pd.cut() to create categorical bins from the continuous days_of_supply variable — transforming a number into a business action:
- CRITICAL (3 days of supply or fewer): Order immediately
- HIGH (more than 3, up to 7 days): Order today
- MEDIUM (more than 7, up to 14 days): Order this week
- LOW (at or below reorder point, but more than 14 days of supply): Schedule for the next ordering cycle
def generate_reorder_alerts(
    enriched_inventory_df: pd.DataFrame,
    include_critical_only: bool = False,
) -> pd.DataFrame:
    """
    Generate prioritized reorder alerts for items at or below reorder point.

    Flags items that need replenishment and assigns urgency levels based
    on remaining days of supply.

    Args:
        enriched_inventory_df: Output from join_inventory_with_sales().
        include_critical_only: If True, only include items with fewer than
            7 days of supply. If False, include all items at or below
            their reorder point.

    Returns:
        DataFrame sorted by urgency (days_of_supply ascending), containing:
        sku_id, product_name, region, current_stock, reorder_point,
        days_of_supply, suggested_order_qty, supplier_id, supplier_name,
        urgency. Returns an empty DataFrame with the same columns if no
        alerts are triggered.
    """
    df = enriched_inventory_df.copy()
    if include_critical_only:
        alerts = df[df["days_of_supply"] < 7].copy()
    else:
        alerts = df[df["below_reorder_point"]].copy()

    output_columns = [
        "sku_id", "product_name", "region", "current_stock",
        "reorder_point", "days_of_supply", "suggested_order_qty",
        "supplier_id", "supplier_name", "urgency",
    ]
    if alerts.empty:
        # Return the same schema as the populated case so downstream code
        # can rely on a consistent set of columns.
        return pd.DataFrame(columns=output_columns)

    alerts["suggested_order_qty"] = alerts["reorder_quantity"]
    alerts["urgency"] = pd.cut(
        alerts["days_of_supply"],
        bins=[-float("inf"), 3, 7, 14, float("inf")],
        labels=["CRITICAL", "HIGH", "MEDIUM", "LOW"],
    )
    available_columns = [col for col in output_columns if col in alerts.columns]
    return (
        alerts[available_columns]
        .sort_values("days_of_supply")
        .reset_index(drop=True)
    )
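The binning itself is easy to verify on a handful of invented values. Note that an infinite days-of-supply — the zero-demand case from the enrichment step — lands in the open-ended LOW bucket rather than producing a missing value:

```python
import pandas as pd

# Invented days-of-supply values, one per urgency band plus the
# infinite (no recent demand) case.
days_of_supply = pd.Series([1.5, 5.0, 10.0, 45.0, float("inf")])

# Same bins and labels as generate_reorder_alerts(): intervals are
# right-closed, so 3.0 would be CRITICAL and 7.0 would be HIGH.
urgency = pd.cut(
    days_of_supply,
    bins=[-float("inf"), 3, 7, 14, float("inf")],
    labels=["CRITICAL", "HIGH", "MEDIUM", "LOW"],
)
print(list(urgency))
```

Because `pd.cut` returns a categorical, downstream code can sort or group by urgency without string comparisons, and the label order is preserved.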
32.10 The Complete Acme Supply Chain Dashboard
This is what all the pieces have been building toward: a dashboard that pulls together inventory health, reorder alerts, ABC classification, and supplier scorecards into a single, coherent view of Acme's supply chain.
When Priya ran this for the first time against live data, she found:
- 23 SKUs across the four regions were below their reorder point, including three items flagged as CRITICAL
- Three suppliers had on-time delivery rates below 80%
- The paper products category was showing a DIO of 47 days against an industry benchmark of 30, indicating excess inventory
That was actionable information. That was the difference between a data analyst and a strategic partner.
def build_supply_chain_dashboard(
    db_path: str,
    sales_csv_path: str,
    purchase_orders_csv_path: str,
    output_path: Optional[str] = None,
) -> dict:
    """
    Build the complete Acme Corp supply chain analytics dashboard.

    Integrates inventory database, sales history, and purchase order records
    to produce a comprehensive supply chain health report.

    Args:
        db_path: Path to acme_inventory.db.
        sales_csv_path: Path to acme_sales_2023.csv.
        purchase_orders_csv_path: Path to purchase order history CSV.
        output_path: If provided, saves the complete dashboard to an Excel
            file with separate sheets for each analysis component.

    Returns:
        Dictionary with keys:
        - inventory_health: Enriched inventory DataFrame
        - reorder_alerts: Items needing immediate purchasing action
        - abc_classification: A/B/C category assignments
        - supplier_scorecard: Supplier performance grades
        - summary_stats: Headline KPIs as a plain dictionary
    """
    print("Loading inventory database...")
    inventory_df = load_inventory_from_database(db_path)

    print("Loading sales data...")
    sales_df = pd.read_csv(sales_csv_path, parse_dates=["sale_date"])

    print("Loading purchase order history...")
    po_df = pd.read_csv(purchase_orders_csv_path)

    print("Calculating demand velocity...")
    enriched_inventory = join_inventory_with_sales(inventory_df, sales_df)

    print("Running ABC analysis...")
    abc_results = perform_abc_analysis(
        enriched_inventory,
        annual_value_column="annual_consumption_value",
    )

    print("Generating reorder alerts...")
    reorder_alerts = generate_reorder_alerts(enriched_inventory)

    print("Building supplier scorecard...")
    supplier_scorecard = calculate_supplier_scorecard(po_df)

    total_inventory_value = (
        enriched_inventory["current_stock"] * enriched_inventory["unit_cost"]
    ).sum()
    summary_stats = {
        "total_sku_count": len(enriched_inventory),
        "total_inventory_value_usd": round(total_inventory_value, 2),
        "items_below_reorder_point": int(
            enriched_inventory["below_reorder_point"].sum()
        ),
        "critical_items_under_7_days": int(
            (enriched_inventory["days_of_supply"] < 7).sum()
        ),
        "a_item_count": int((abc_results["abc_category"] == "A").sum()),
        "b_item_count": int((abc_results["abc_category"] == "B").sum()),
        "c_item_count": int((abc_results["abc_category"] == "C").sum()),
        "suppliers_graded_a": int(
            (supplier_scorecard["performance_grade"] == "A").sum()
        ),
        "suppliers_graded_d": int(
            (supplier_scorecard["performance_grade"] == "D").sum()
        ),
    }

    dashboard = {
        "inventory_health": enriched_inventory,
        "reorder_alerts": reorder_alerts,
        "abc_classification": abc_results,
        "supplier_scorecard": supplier_scorecard,
        "summary_stats": summary_stats,
    }

    if output_path:
        print(f"Saving dashboard to {output_path}...")
        with pd.ExcelWriter(output_path, engine="openpyxl") as writer:
            enriched_inventory.to_excel(
                writer, sheet_name="Inventory Health", index=False
            )
            reorder_alerts.to_excel(
                writer, sheet_name="Reorder Alerts", index=False
            )
            abc_results.to_excel(
                writer, sheet_name="ABC Analysis", index=False
            )
            supplier_scorecard.to_excel(
                writer, sheet_name="Supplier Scorecard", index=False
            )
            pd.DataFrame(
                list(summary_stats.items()), columns=["metric", "value"]
            ).to_excel(writer, sheet_name="Summary", index=False)
        print("Dashboard saved successfully.")

    print("\n=== SUPPLY CHAIN DASHBOARD COMPLETE ===")
    print(f"Total SKUs analyzed: {summary_stats['total_sku_count']:,}")
    print(f"Total inventory value: ${summary_stats['total_inventory_value_usd']:,.2f}")
    print(f"Items below reorder point: {summary_stats['items_below_reorder_point']}")
    print(f"Critical items (< 7 days): {summary_stats['critical_items_under_7_days']}")
    print(
        f"ABC split: "
        f"{summary_stats['a_item_count']} A / "
        f"{summary_stats['b_item_count']} B / "
        f"{summary_stats['c_item_count']} C"
    )
    return dashboard
32.11 The Milestone: Priya Presents to Leadership
It was a Tuesday morning in November, the kind of grey mid-autumn day that makes fluorescent conference room lighting seem comparatively cheerful. The quarterly business review had been on the calendar for weeks. Sandra Chen had asked Priya to take fifteen minutes — "no more, no less" — to walk the leadership team through the new supply chain analytics system.
Not a dry run. Not a demo environment. Live data, live system, the real thing.
Marcus Webb had stayed late the previous Friday to make sure the database connections were stable and that Priya's scripts would run cleanly from the conference room laptop. He had tested six different scenarios, including deliberately corrupting a data file to make sure the error handling worked gracefully. "The system is solid," he told Priya that afternoon. "You've built something genuinely good here."
At 9:47 AM, Priya opened her laptop. She typed a single command:
python acme_supply_chain.py --mode dashboard --output quarterly_review.xlsx
The script ran in 23 seconds. Loading data from three sources, calculating velocity metrics for all 623 active SKUs, running ABC analysis, generating reorder alerts, scoring all twelve active suppliers, and writing the output to a formatted Excel workbook — all of it, in 23 seconds.
The Excel file appeared on the shared drive. Priya opened it on the projector.
What the leadership team saw was not a collection of numbers. It was a story.
The summary sheet showed, at a glance: $1.4 million in current inventory across four regions. 23 SKUs below their reorder point. 3 items classified as CRITICAL with fewer than 7 days of supply remaining.
The ABC analysis tab drew a long silence from David Nakamura, VP of Operations. Forty-seven SKUs — just 7.5% of Acme's catalog — accounted for 80% of annual inventory spend. "We've been treating all 600 products the same way," he said quietly. "That's not right."
Sandra Chen leaned forward when Priya pulled up the reorder alerts tab and filtered for the West region. "Three critical items," Sandra said. "I got a complaint from a customer out there last week about a back-order situation on copy paper. This is why." She looked directly at the purchasing manager, who was already writing in her notebook.
Priya walked through the supplier scorecard without hesitation. One supplier — responsible for a significant portion of Acme's paper products — had a 71% on-time delivery rate. The data was based on 47 purchase orders over twelve months. It was clear and objective, and no one in the room challenged it.
When the fifteen minutes ended, David Nakamura asked how long the system had taken to build.
Priya thought carefully about the right way to answer. "The first version of this specific module took about two months," she said. "But it built on work I had already done — the sales analytics system from Q1, the forecasting models we deployed in Q3. Each piece connected to the ones before it. The data infrastructure that Marcus's team put in place made all of it possible."
Marcus Webb caught Priya's eye across the table and gave a single nod.
Sandra Chen said, "This is exactly what I hired an analyst to do."
The meeting moved on. But something had shifted.
Three weeks later, on a Wednesday morning in early December, Priya received a formal offer letter for promotion to Senior Analyst, effective January 1st. The offer included a salary increase and expanded scope covering supply chain, forecasting, and customer analytics across all four regions.
The letter cited, specifically, "the development and deployment of Acme's supply chain analytics capability."
That afternoon, Marcus Webb sent her a message: "Congratulations. For what it's worth — Python is now essential to how this company operates. We're not going back."
32.12 Putting It All Together: The Complete Workflow
Here is the end-to-end architecture in summary form, showing how each piece connects:
DATA SOURCES
    acme_inventory.db    ----> load_inventory_from_database()
    acme_sales_2023.csv  ----> pd.read_csv()
    purchase_orders.csv  ----> pd.read_csv()
            |
            v
ENRICHMENT LAYER
    join_inventory_with_sales()
        Adds:  avg_daily_demand per SKU-region
        Adds:  days_of_supply (current_stock / avg_daily_demand)
        Adds:  annual_consumption_value (for ABC analysis)
        Flags: below_reorder_point
            |
            v
ANALYTICS MODULES (each operates on the enriched DataFrame)
    perform_abc_analysis()         --> A/B/C priority classification
    generate_reorder_alerts()      --> Sorted, urgency-labeled action items
    analyze_lead_times()           --> Supplier consistency metrics
    calculate_supplier_scorecard() --> Supplier A/B/C/D grades
            |
            v
OUTPUT ASSEMBLY
    build_supply_chain_dashboard()
        Sheet 1: Inventory Health   (full enriched dataset)
        Sheet 2: Reorder Alerts     (sorted by urgency, CRITICAL first)
        Sheet 3: ABC Analysis       (sorted by annual consumption value)
        Sheet 4: Supplier Scorecard (sorted by on-time delivery rate)
        Sheet 5: Summary            (headline KPIs for leadership view)
The beauty of this architecture is that each function is independent and testable. You can update the reorder alert logic without touching the ABC analysis. You can add a new data source by enriching the DataFrame at the enrichment layer. You can swap the output from Excel to a web dashboard (Chapter 33) without changing any calculation logic.
This separation of concerns is not just good programming style. It is what makes a system maintainable when business requirements change — and they always change.
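The one-line command Priya ran in Section 32.11 implies a thin command-line wrapper sitting on top of this architecture. The sketch below shows one plausible shape for it — the `--mode` and `--output` flag names come from that command, but everything else (the function name `parse_cli`, the defaults) is an assumption for illustration:

```python
import argparse
from typing import Optional, Sequence

def parse_cli(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the flags used in the acme_supply_chain.py command.

    The flag names mirror the command shown in Section 32.11; the
    wrapper itself is a hypothetical sketch, not the book's script.
    """
    parser = argparse.ArgumentParser(description="Acme supply chain analytics")
    parser.add_argument("--mode", choices=["dashboard"], default="dashboard",
                        help="Which pipeline to run")
    parser.add_argument("--output", default=None,
                        help="Optional path for the Excel dashboard")
    return parser.parse_args(argv)

# Demo with an explicit argument list (mirrors the command from 32.11).
args = parse_cli(["--mode", "dashboard", "--output", "quarterly_review.xlsx"])
# In a real script, this is where build_supply_chain_dashboard() would be
# called with the configured data-source paths and args.output.
print(f"mode={args.mode} output={args.output}")
```

Keeping argument parsing in its own function, separate from the calculation layer, follows the same separation-of-concerns principle: the CLI can change without touching any analytics code, and the parser can be unit-tested with an explicit argument list.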
32.13 What You Have Built, and What Comes Next
Take a moment to consider what you now know how to do:
- Pull inventory data from a SQLite database and join it with transactional sales data from a CSV file
- Calculate the core metrics that drive purchasing and operations decisions: inventory turnover, DIO, reorder point, safety stock, and EOQ
- Classify inventory using ABC analysis to focus effort and capital where it matters most
- Evaluate supplier performance with objective, data-driven scorecards based on actual purchase order history
- Generate prioritized, actionable reorder alerts categorized by urgency
- Assemble all of this into a multi-sheet dashboard that can be run on demand by anyone in the organization
This is not a toy project. This is the kind of work that earns promotions. Priya Okonkwo is the proof.
In Chapter 33, we will look at building interactive web dashboards using Streamlit — taking the Excel outputs you have built here and turning them into live, browser-accessible views that update automatically when the underlying data changes.
In Chapter 34, we move into predictive modeling: using machine learning to forecast not just what will happen, but which customers, products, and risks deserve attention right now.
Chapter Summary
Supply chain analytics is about answering operational questions with data: when to reorder, how much to order, which suppliers are reliable, and where to focus inventory management effort.
The core metrics — inventory turnover, DIO, stockout rate, reorder point, safety stock, and EOQ — form a vocabulary that connects your Python code to the business decisions of purchasing managers, operations directors, and CFOs.
ABC analysis acknowledges that not all inventory deserves equal attention. Roughly 20% of SKUs drive 80% of inventory value. Managing those 20% carefully is where the leverage is.
Demand forecasting for inventory uses the same exponential smoothing and moving average techniques from Chapter 26, applied to individual SKUs to set accurate reorder parameters and safety stock levels.
Lead time analysis reveals which suppliers are genuinely reliable and which require additional safety stock buffers — a quantified way to evaluate something that was previously managed on reputation and intuition.
Supplier scorecards — built from purchase order history — give objective, data-driven assessments of the partners your business depends on.
When you assemble all of these pieces into a single automated dashboard connected to live data sources, you have built something that genuinely changes how a business operates. That is what Python for business is for.
Key Terms
| Term | Definition |
|---|---|
| Inventory Turnover | Number of times inventory is fully sold and replaced in a given period; higher values indicate more efficient use of working capital |
| Days Inventory Outstanding (DIO) | Average number of days a unit is held before sale; 365 / inventory turnover |
| Safety Stock | Buffer inventory held to protect against demand spikes and supplier delays |
| Reorder Point (ROP) | The stock level that triggers a new purchase order to be placed |
| Economic Order Quantity (EOQ) | The order size that minimizes total inventory cost (ordering cost + holding cost) |
| ABC Analysis | Classification of inventory into three tiers by value contribution: A (top ~80%), B (next ~15%), C (remaining ~5%) |
| Lead Time | Calendar days between placing a purchase order and receiving the goods |
| On-Time Delivery Rate (OTDR) | Percentage of purchase orders received on or before the promised delivery date |
| Fill Rate | Percentage of the ordered quantity actually delivered by a supplier |
| Stockout Rate | Percentage of time (or SKU-days) where available inventory was zero |
| Coefficient of Variation (CV) | Standard deviation / mean; a dimensionless measure of variability useful for comparing supplier consistency |
| Days of Supply | Current stock level / average daily demand; how many days the current inventory will last |