Case Study: Building a Configuration Counter with Stars, Bars, and Generating Functions

DataField.Dev

Case Study: Building a Configuration Counter with Stars, Bars, and Generating Functions

"Before you enumerate a space, count it. The count tells you whether enumeration is a loop or a lifetime."

Executive Summary

In this build-focused case study you will construct a small, reusable configcount module that answers the question every capacity planner and combinatorial tester eventually asks: how many distinct ways are there to allocate $n$ identical units across $k$ slots — possibly with per-slot minimums, caps, and parity rules? You will build it in three layers: first an exact closed-form counter using stars and bars (§17.3) for the unconstrained and lower-bounded cases; then a generating-function engine (§17.5) that multiplies polynomial factors to handle arbitrary per-slot constraints (at most $c$, an even number, a fixed menu of allowed amounts); and finally a small validation harness that brute-force enumerates tiny cases so you can check the fast counters against ground truth. The point is the engineering judgment the count buys you: a closed form returns instantly for astronomically large spaces, while the generating function trades a little speed for the flexibility to encode any rule you can describe as "which amounts this slot may take."

This is stars and bars and ordinary generating functions used as a library, with the computation-and-proof partnership of theme four built into the tests.

Skills applied

Implementing the stars-and-bars formula and its lower-bound variant via substitution.
Translating per-slot constraints into generating-function factors and multiplying them as coefficient lists (polynomial convolution).
Reading off a coefficient $[x^n]$ as the answer to a constrained-distribution problem.
Cross-checking a fast closed-form / generating-function count against a brute-force enumerator on small inputs — testing a counter the way you test any other code.

Background

The scenario

DataCrate (a hypothetical infrastructure company — Tier 3, constructed for teaching) is writing a capacity-planning tool. Operators repeatedly ask questions of the same shape:

"How many ways can we allocate 100 identical CPU credits across our 6 service tiers?"
"…if the premium tier must get at least 10 credits?"
"…if each tier accepts at most 30 credits, the batch tier must take an even number, and the cache tier must take one of $\{0, 8, 16\}$?"

Each is a distribution of identical objects into distinct boxes, sometimes with constraints. The team wants a single module that answers all of them — fast for the simple cases, flexible for the messy ones — and a test harness they trust. They will not enumerate millions of allocations by hand; they will count.

💡 Intuition: Every one of these questions is "non-negative integer solutions of $x_1 + \cdots + x_k = n$," possibly with each $x_i$ restricted to a set of allowed values. Stars and bars solves the unrestricted and lower-bounded versions in one binomial coefficient. When the restrictions get arbitrary (caps, parity, menus), a generating function — one factor per slot, listing that slot's allowed amounts as powers of $x$ — counts them by a polynomial multiplication. Same problem, two engines, and we build both.

Why this matters

Counting the size of a configuration space before committing to enumerate it is a core engineering reflex: a space of $\binom{105}{5} \approx 9.6 \times 10^7$ is borderline enumerable; a space of $\binom{200}{99}$ is not, and you had better know that before writing the for loop. Stars and bars and generating functions turn "is this tractable?" into a one-line computation. And building the brute-force checker alongside the fast counter is theme four in code: the enumerator settles small cases by listing; the formula settles all cases by proof; agreement on the overlap is what earns the formula your trust.

Phase 1: The exact counter (stars and bars)

Start with the closed form. The number of ways to distribute $n$ identical units into $k$ distinct slots (a slot may get zero) is $\binom{n+k-1}{k-1}$ (§17.3). Add the common variant: per-slot minimums. If slot $i$ must receive at least $\ell_i$, place those $\sum \ell_i$ units first and distribute the remaining $n - \sum \ell_i$ freely — the substitution $y_i = x_i - \ell_i$ from §17.3.

from math import comb

def count_basic(n, k):
    """Ways to distribute n identical units into k distinct slots (zeros ok)."""
    return comb(n + k - 1, k - 1)

def count_with_minimums(n, mins):
    """Distribute n units into len(mins) slots where slot i needs >= mins[i].
    Substitute y_i = x_i - mins[i], then count the leftover with stars & bars."""
    leftover = n - sum(mins)
    if leftover < 0:
        return 0                       # demands exceed supply: no allocations
    return count_basic(leftover, len(mins))

print(count_basic(100, 6))             # 100 credits, 6 tiers, no constraints
print(count_with_minimums(100, [10, 0, 0, 0, 0, 0]))   # premium tier >= 10
print(count_with_minimums(5, [2, 2, 2]))               # demand 6 > supply 5
# Expected output:
# 96560646
# 57940519
# 0

Hand-derive each so the formula stays honest:

count_basic(100, 6) $= \binom{100 + 6 - 1}{6 - 1} = \binom{105}{5}$. Compute $\binom{105}{5} = \frac{105\cdot104\cdot103\cdot102\cdot101}{120}$. Numerator step by step: $105\cdot104 = 10{,}920$; $\cdot103 = 1{,}124{,}760$; $\cdot102 = 114{,}725{,}520$; $\cdot101 = 11{,}587{,}277{,}520$. Divide by $120$: $11{,}587{,}277{,}520 / 120 = 96{,}560{,}646$. ✓
count_with_minimums(100, [10,0,0,0,0,0]): leftover $= 100 - 10 = 90$, so $\binom{90 + 6 - 1}{5} = \binom{95}{5}$. Numerator step by step: $95\cdot94 = 8{,}930$; $\cdot93 = 830{,}490$; $\cdot92 = 76{,}405{,}080$; $\cdot91 = 6{,}952{,}862{,}280$. Divide by $120$: $6{,}952{,}862{,}280 / 120 = 57{,}940{,}519$. ✓
count_with_minimums(5, [2,2,2]): leftover $= 5 - 6 = -1 < 0$, so there is no allocation; return 0. ✓ (The guard encodes "you cannot satisfy minimums that already exceed the supply.")

⚠️ Common Pitfall: minimums vs. the formula. It is tempting to "adjust" the binomial coefficient directly for minimums. Don't — substitute first. The only correct move is to spend the minimums up front ($n \to n - \sum \ell_i$) and then apply the unconstrained formula on the leftover. Forgetting the leftover-could-be-negative guard silently returns a nonsense positive count when demands exceed supply.

🔄 Check Your Understanding An operator asks for the number of ways to split 50 credits across 4 tiers with every tier getting at least 1. Compute it two ways: via count_with_minimums(50, [1,1,1,1]), and via the closed form $\binom{n-1}{k-1}$ for all-positive distributions (§17.3). Do they agree?

Answer

count_with_minimums(50, [1,1,1,1]): leftover $= 50 - 4 = 46$, giving $\binom{46+4-1}{3} = \binom{49}{3} = \frac{49\cdot48\cdot47}{6} = 18{,}424$. The all-positive closed form is $\binom{n-1}{k-1} = \binom{49}{3} = 18{,}424$. They agree — count_with_minimums(n, [1]*k) is the non-empty stars-and-bars count.

Phase 2: The generating-function engine

Minimums are the easy constraint. Operators also want caps ("at most 30 per tier"), parity ("an even number"), and menus ("one of $\{0, 8, 16\}$"). Stars and bars has no single formula for these, but generating functions do: each slot becomes a polynomial factor whose powers are exactly that slot's allowed amounts, and the answer is $[x^n]$ of the product (§17.5). Build the engine from two primitives — polynomial multiplication (the convolution from §17.5) and a coefficient reader.

def poly_mult(p, q):
    """Multiply two polynomials given as coefficient lists (index = power)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b          # x^i * x^j -> x^(i+j)
    return result

def count_from_factors(n, factors):
    """[x^n] of the product of the given slot factors (coefficient lists).
    Each factor lists a slot's allowed amounts as powers present in the list."""
    product = [1]                            # the constant polynomial 1
    for f in factors:
        product = poly_mult(product, f)
    return product[n] if n < len(product) else 0

# Slot factors, each truncated to degree 6 (enough to read [x^6]):
any_amount = [1, 1, 1, 1, 1, 1, 1]          # 1/(1-x): 0..6 of this item
at_most_2  = [1, 1, 1]                       # 0,1, or 2 of this item
even_only  = [1, 0, 1, 0, 1, 0, 1]          # 0,2,4,6 of this item

# How many ways to total 6 units across three slots with these rules?
print(count_from_factors(6, [any_amount, at_most_2, even_only]))
# Expected output:
# 10

Hand-derive count_from_factors(6, [any_amount, at_most_2, even_only]) to keep the engine honest. Each legal triple is $(a, b, c)$ with $a$ unrestricted, $b \in \{0,1,2\}$, and $c \in \{0,2,4,6\}$, summing to $6$. Because $a$ is free, every choice of $(b, c)$ with $b + c \le 6$ determines a unique $a = 6 - b - c \ge 0$ — so we just count valid $(b, c)$ pairs, organized by $c$:

$c = 0$: $b \in \{0,1,2\}$ (all satisfy $b + 0 \le 6$) → 3 pairs.
$c = 2$: $b \in \{0,1,2\}$ ($b + 2 \le 6$) → 3 pairs.
$c = 4$: $b \in \{0,1,2\}$ ($b + 4 \le 6$) → 3 pairs.
$c = 6$: need $b + 6 \le 6$, so only $b = 0$ → 1 pair.

Total $3 + 3 + 3 + 1 = \mathbf{10}$, matching the engine. The polynomial multiplication performed exactly this enumeration: it formed every product of one term from each factor and summed the contributions whose exponents total 6.

⚠️ Common Pitfall: truncate carefully. To read $[x^6]$, every factor needs its terms up to $x^6$ — even_only must include the $x^6$ term, at_most_2 legitimately stops at $x^2$ (it has no higher terms by its rule, not by truncation). Dropping a needed low-order term silently undercounts; keeping extra high terms is harmless. When in doubt, extend each list to length $n+1$.

🔗 Connection: count_from_factors with every factor any_amount (i.e. $\frac{1}{1-x}$ for each of $k$ slots) recomputes stars and bars, because $\frac{1}{(1-x)^k} = \sum \binom{n+k-1}{k-1}x^n$ (§17.5). The generating-function engine is a strict generalization of Phase 1 — it just pays a polynomial-multiplication cost for the freedom to use any per-slot rule.

🔄 Check Your Understanding Write the factor list for a slot that must take a positive even number ($2, 4, 6, \dots$) of units, truncated to degree 6. How does it differ from even_only?

Answer

[0, 0, 1, 0, 1, 0, 1] — the coefficient of $x^0$ is now 0 (zero is not allowed), with 1's at $x^2, x^4, x^6$. It is even_only with its constant term zeroed out, encoding "even and at least 2."

Phase 3: The brute-force validator (test, then trust)

A counter you cannot check is a liability. Build a tiny enumerator that lists every allocation of $n$ units into $k$ slots (feasible only for small $n, k$) and counts those satisfying a predicate. Then make the fast counters agree with it on small inputs — the theme-four partnership.

def enumerate_allocations(n, k):
    """Yield every (x_1,...,x_k) of non-negative ints summing to n. Small n,k only!"""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in enumerate_allocations(n - first, k - 1):
            yield (first,) + rest

def brute_count(n, allowed):
    """Count allocations of n units into len(allowed) slots where x_i is in
    allowed[i] (a set of permitted amounts for slot i)."""
    k = len(allowed)
    return sum(1 for a in enumerate_allocations(n, k)
               if all(a[i] in allowed[i] for i in range(k)))

# Ground truth for the Phase 2 question, with the same three rules at n = 6:
rules = [set(range(7)),          # any amount 0..6
         {0, 1, 2},              # at most 2
         {0, 2, 4, 6}]           # even
print(brute_count(6, rules))
print(count_basic(5, 3))                              # cross-check Phase 1
print(brute_count(5, [set(range(6))] * 3))            # ...against brute force
# Expected output:
# 10
# 21
# 21

The validator confirms Phase 2: brute_count(6, rules) enumerates all allocations of 6 units into 3 slots and keeps those obeying the three rules, yielding 10 — agreeing with both the generating function and our hand count. The Phase 1 cross-check is just as clean: count_basic(5, 3) $= \binom{7}{2} = 21$, and brute_count over all unrestricted allocations of 5 units into 3 slots also yields 21. The closed form, the generating-function engine, and the enumerator all agree where they overlap — strong evidence (not proof, but strong evidence) that the fast counters are transcribed correctly.

Now watch the harness earn its keep. Suppose a teammate truncates the even-slot factor one term too short — a real and common bug from §17.5's "keep enough terms" pitfall — dropping the $x^6$ term:

even_too_short = [1, 0, 1, 0, 1]      # BUG: stops at x^4, missing the x^6 term
print(count_from_factors(6, [any_amount, at_most_2, even_too_short]))
# Expected output:
# 9

The buggy engine returns 9, not 10: it silently lost the single allocation with $c = 6$ (namely $(0, 0, 6)$) because the truncated factor had no $x^6$ term to contribute. A casual eyeball would never catch a count that is off by one — but brute_count(6, rules) returns 10, so the disagreement is immediate and loud. The enumerator did exactly what tests are for: it flagged a wrong-but-plausible answer that the proof-backed correct factor would never produce.

🚪 Threshold Concept: a counter is code, and code gets tested. It is easy to treat a combinatorics formula as unimpeachable and skip testing it. But you can transcribe $\binom{n+k-1}{k-1}$ wrongly, mix up $k$ and $k-1$, or truncate a factor — and a wrong counter is worse than no counter, because it looks authoritative. The discipline that scales: enumerate small cases by brute force, demand the fast counter match, and only then run it on the large inputs where enumeration is impossible. The brute force cannot run on $n = 100$ — but it does not need to; it certifies the logic on cases you can exhaust, and the proof behind stars and bars certifies the rest. Computation and proof, each doing what the other cannot.

🔄 Check Your Understanding Why can enumerate_allocations(100, 6) not be used to answer the operator's original question, even though it is "correct"? Tie your answer to the value of count_basic(100, 6) from Phase 1.

Answer

It would have to yield $\binom{105}{5} = 96{,}560{,}646$ tuples — about $10^8$ allocations — which is far too many to enumerate in reasonable time or memory. The closed form returns the same number instantly. The enumerator's role is to validate the counter on small inputs, not to answer the production question; that is exactly the "count before you enumerate" reflex.

Phase 4: Putting it together — a capacity-planning answer

Assemble the module into the answer DataCrate wanted for its hardest question: "100 credits across 6 tiers, premium $\ge 10$, every tier $\le 30$, batch tier even." Because there is both a global total and per-slot caps/parity, the generating-function engine is the right tool — build one factor per tier, truncated to degree 100.

def slot_factor(n, allowed_amounts):
    """A coefficient list of length n+1 with 1's at the allowed amounts <= n."""
    f = [0] * (n + 1)
    for a in allowed_amounts:
        if a <= n:
            f[a] = 1
    return f

n = 100
premium = slot_factor(n, range(10, 31))         # 10..30 inclusive
capped  = slot_factor(n, range(0, 31))          # 0..30, ordinary tier
batch   = slot_factor(n, range(0, 31, 2))       # 0,2,...,30, even only
factors = [premium, capped, capped, capped, capped, batch]
print(count_from_factors(n, factors) >= 0)      # the count is well-defined
# Expected output:
# True

The exact value of count_from_factors(100, factors) is a specific (large) integer the engine computes by five polynomial multiplications; we print only that it is well-defined (a non-negative integer) because the precise figure is not something to hand-derive in this margin — the method is the deliverable. The operator now has one module that answers the unconstrained question by closed form (Phase 1, instant) and any constrained question by generating function (Phase 2, flexible), each validated against brute force on small cases (Phase 3). That is a tool you can ship.

Discussion Questions

Phase 1's count_basic returns in constant time regardless of $n$; Phase 2's count_from_factors does work proportional to $k$ polynomial multiplications, each up to degree $n$. For the unconstrained problem, both are correct — when is the generating-function engine nonetheless worth its extra cost, and when should you prefer the closed form?
The brute-force validator in Phase 3 could not run on $n = 100$. Articulate precisely what role it plays if it can only ever check small cases. How is this the same relationship as "tests vs. proof" (theme four)?
In Phase 3 a truncated factor made count_from_factors return 9 instead of 10, and the brute-force enumerator caught it. If instead the enumerator and the generating function disagreed, how would you adjudicate which is wrong? What makes the small-case enumerator a trustworthy referee — and what are its limits as a referee?
The generating-function factor for a slot encodes "which amounts are allowed" as which powers are present. Express, as a factor truncated to degree 8: "this slot takes a multiple of 3, up to 6." Then explain how count_from_factors would combine it with other slots.

Your Turn: Extensions

Option A (closed form for caps). Caps can also be handled without generating functions, by inclusion–exclusion (§17.4) on "slots that exceed the cap." Derive the count of solutions to $x_1 + \cdots + x_k = n$ with every $x_i \le c$ as an alternating sum of stars-and-bars terms, and implement count_with_caps(n, k, c). Validate it against count_from_factors with each factor [1]*(c+1) and against brute_count on small inputs.
Option B (combinations with repetition). Add count_multisets(n, types) that counts multisets of size $n$ drawn from types kinds, repetition allowed — and note (with a one-line comment citing §17.3) that it is identical to count_basic(n, types). Demonstrate the equality on n=4, types=3 against brute_count.
Option C (the make-change classic). Specialize the engine to coin systems: write ways_to_make_change(amount, coins) where each coin denomination $d$ contributes the factor $1 + x^d + x^{2d} + \cdots$ (truncated to amount). Confirm it reproduces the chapter's pennies-and- nickels result (amount=7, coins=[1,5] → 2) before trying a full $\{1, 5, 10, 25\}$ system.

Key Takeaways

Build the counter in layers: closed form first, generating function for flexibility. Stars and bars (and its lower-bound substitution) answers the simple and minimum-constrained cases in one binomial coefficient; the generating-function engine handles caps, parity, and menus by multiplying one factor per slot and reading $[x^n]$.
A generating-function factor is a slot's allowed-amounts list. Constraints become "which powers the factor may contribute"; multiplication is the convolution that forms all combinations and totals them — exactly the distribution being counted.
Validate fast counters against brute force on small inputs. The enumerator is ground truth where it can run; the formula's proof covers the rest. When the code and a careful enumerator agree, trust them over a hand count — that is theme four in practice.
Count before you enumerate. $\binom{105}{5} \approx 9.7 \times 10^7$ is the instant answer to a space far too large to list; knowing the count is what tells you whether enumeration is a loop or a lifetime.