Chapter 28 Quiz: Sales and Revenue Analytics

Instructions: Answer all 20 questions. For multiple-choice questions, choose the single best answer. For True/False, write True or False. For short-answer questions, write 2–4 complete sentences. Attempt all questions before consulting the answer key.

Section A: Multiple Choice (Questions 1–10)

Question 1

When calculating average order value (AOV) from a line-item sales DataFrame, which approach is correct?

A) df["revenue"].mean()
B) df.groupby("order_id")["revenue"].sum().mean()
C) df["revenue"].sum() / df.shape[0]
D) df.groupby("customer_id")["revenue"].mean().mean()

Question 2

You want to compare December 2023 revenue to December 2022 revenue to understand whether your business is growing. Which growth metric is most appropriate?

A) Month-over-month (MoM) growth
B) Year-over-year (YoY) growth
C) Quarter-over-quarter (QoQ) growth
D) Rolling 3-month average

Question 3

A Pareto analysis of your customer base shows that 15 customers generate 80% of revenue, out of 200 total customers. What is the approximate percentage of customers generating 80% of revenue?

A) 5%
B) 7.5%
C) 10%
D) 15%

Question 4

In the monthly_revenue_trend() function from the chapter, what does monthly["revenue"].shift(12) compute?

A) Revenue from 12 days ago
B) Revenue from the same month one year ago
C) Revenue 12 months in the future (projected)
D) The 12-month rolling average

Question 5

A Herfindahl-Hirschman Index (HHI) of 3,200 for a company's customer revenue distribution indicates:

A) Low concentration: revenue is spread evenly across customers
B) Moderate concentration: some dependence on large customers
C) High concentration: a small number of customers dominate revenue
D) The company is in a monopoly market

Question 6

Product A generates 25% of total revenue with a 30% gross margin. Product B generates 12% of total revenue with a 55% gross margin. Which statement is most accurate?

A) Product A is definitely the more valuable product to the business
B) Product B is worth less strategically because it generates less revenue
C) Product B may contribute a higher share of total gross margin than its revenue share suggests
D) Both products should receive equal marketing investment

Question 7

In a sales cohort analysis, what is "period 0"?

A) The month before a customer first purchased
B) The month a customer first purchased (their acquisition month)
C) The month with zero revenue from a given cohort
D) The baseline period for the entire analysis

Question 8

What does an RFM score of R=1, F=1, M=1 (total score: 3) indicate about a customer?

A) They are a brand new customer with their first purchase
B) They bought a long time ago, infrequently, and for small amounts (likely lost)
C) They are in the top quartile for all three dimensions
D) The scoring system encountered an error

Question 9

Priya calculated that the West region's average order value was $2,180 compared to $2,200 for other regions, but West had only 15 customers vs. 40+ in other regions. What does this combination of facts most strongly suggest?

A) West customers are price-sensitive and require more discounting
B) West region has a product mix problem
C) West region has a sales capacity problem, not a deal quality problem
D) West region's salesperson is underperforming per account

Question 10

Which pandas method is used in monthly_revenue_trend() to compute the percentage change from the prior period automatically?

A) .diff()
B) .rolling()
C) .pct_change()
D) .cumsum()

Section B: True or False (Questions 11–15)

Question 11

df["revenue"].mean() and df.groupby("order_id")["revenue"].sum().mean() will always give the same result for a well-formed sales DataFrame.

Question 12

In a Pareto analysis, the cumulative percentage column will always reach exactly 100% at the last row if you include all items (not just top_n).

Question 13

Year-over-year growth is generally more useful than month-over-month growth for evaluating growth in seasonal businesses because YoY compares equivalent periods.

Question 14

A customer with an RFM recency score of 4 is among the most recent purchasers (the top quartile for recency), while a customer with a recency score of 1 has not purchased in a long time.

Question 15

Revenue concentration risk is only relevant for companies with fewer than 100 customers. Large companies with hundreds of customers never face meaningful concentration risk.

Section C: Short Answer (Questions 16–20)

Question 16

Explain the difference between revenue_pct and margin_mix_pct in a product mix analysis. Give a concrete example of when these two metrics would tell different stories about the same product, and explain what strategic action that divergence might suggest.

Question 17

A company's cohort analysis shows strong revenue in period 0 (acquisition month) but near-zero revenue in periods 3 and beyond for every cohort. What business problem does this reveal, and what are two possible causes?

Question 18

Describe the four RFM customer segments introduced in the chapter (Champions, Loyal Customers, At Risk, Lost) and give one specific marketing or sales action appropriate for each segment. Why would you apply different strategies to customers in different segments rather than using a single approach for everyone?

Question 19

The chapter shows that the West region's underperformance was a capacity problem rather than a market problem. Explain in your own words what data points distinguish these two hypotheses, and describe how Priya used Python to test them. What would the data have looked like if it were a market problem instead?

Question 20

Sandra Chen asks you to produce a one-number summary of sales performance for the board meeting: "Just tell me how we did in 2023 compared to 2022." What are the limitations of using only a single metric (such as total revenue growth rate) to answer this question? Name at least three additional metrics you would include in a more complete answer, and explain what each adds.

Answer Key

Section A: Multiple Choice

Answer 1: B df.groupby("order_id")["revenue"].sum().mean() is correct. This sums revenue within each order to get one total per order, then takes the mean of those order totals. Option A gives mean revenue per line item (not per order), which understates AOV when orders contain multiple products. Option C divides total revenue by the row count, so it is equivalent to A whenever no revenue values are missing. Option D averages line-item revenue within each customer and then averages those customer means, which measures neither revenue per order nor total revenue per customer.
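The difference between options A and B can be checked on a tiny example. This is a minimal sketch using made-up line-item data (the order IDs and amounts are illustrative assumptions, not figures from the chapter):

```python
import pandas as pd

# Hypothetical line-item data: order 1001 has two products, order 1002 has one.
df = pd.DataFrame({
    "order_id": [1001, 1001, 1002],
    "revenue": [100.0, 50.0, 60.0],
})

# Option A: total revenue divided by the number of rows (line items).
per_line_item = df["revenue"].mean()

# Option B: one total per order, then the mean of those totals.
aov = df.groupby("order_id")["revenue"].sum().mean()

print(f"Mean revenue per line item: {per_line_item:.2f}")
print(f"Average order value:        {aov:.2f}")
```

Both calculations use the same total revenue (210); they differ only in the denominator, 3 line items versus 2 orders.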

Answer 2: B Year-over-year (YoY) growth compares December 2023 to December 2022, which controls for seasonal patterns. Month-over-month (Option A) would compare December to November. In many businesses December is seasonally stronger than November, so MoM growth would give a misleadingly positive picture.

Answer 3: B 15 customers / 200 total customers = 7.5%. This is well under the 20% implied by the classic 80/20 rule, meaning revenue is considerably more concentrated than that rule of thumb suggests. The finding is worth noting in a risk analysis.

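Answer 4: B shift(12) moves each value 12 rows down, so every row receives the value from 12 rows earlier. When the data has one row per month, is sorted chronologically, and has no missing months, that value is the revenue for the same month one year prior. The first 12 rows have no prior-year value and become NaN. This enables year-over-year comparison without any date manipulation.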
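A short sketch of the shift(12) behavior, assuming 24 consecutive months of synthetic revenue. The column names revenue_prior_year and yoy_growth_pct are hypothetical and may differ from those used in monthly_revenue_trend():

```python
import pandas as pd

# Synthetic data: 24 consecutive months, one row per month, sorted by date.
monthly = pd.DataFrame({
    "month": pd.date_range("2022-01-01", periods=24, freq="MS"),
    "revenue": [100.0 + i for i in range(24)],
})

# Each row receives the revenue from 12 rows earlier (same month, prior year).
monthly["revenue_prior_year"] = monthly["revenue"].shift(12)

# Year-over-year growth; NaN for the first 12 months, which have no prior year.
monthly["yoy_growth_pct"] = (
    monthly["revenue"] / monthly["revenue_prior_year"] - 1
) * 100

print(monthly.iloc[[0, 12, 23]])
```

If a month were missing from the data, row positions would no longer line up with calendar months, so it is worth confirming there are no gaps before relying on shift(12).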

Answer 5: C An HHI of 3,200 exceeds the 2,500 threshold noted in the chapter as indicating high concentration. This means a small number of customers dominate revenue, so losing even one of them would materially reduce revenue.
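For intuition, here is one way an HHI of 3,200 can arise. The sketch assumes the common definition of HHI as the sum of squared percentage shares (a 0 to 10,000 scale); the hhi() helper and the customer figures are illustrative assumptions, and the chapter's own implementation may differ:

```python
import pandas as pd

def hhi(revenue_by_customer: pd.Series) -> float:
    """Sum of squared percentage shares (0 to 10,000 scale)."""
    shares_pct = revenue_by_customer / revenue_by_customer.sum() * 100
    return float((shares_pct ** 2).sum())

# Hypothetical company where one customer holds half of all revenue.
concentrated = pd.Series([50.0, 20.0, 10.0, 10.0, 10.0])

# Hypothetical company with ten equally sized customers.
even = pd.Series([10.0] * 10)

print(f"Concentrated: {hhi(concentrated):.0f}")
print(f"Even:         {hhi(even):.0f}")
```

Squaring the shares is what makes large customers dominate the index: the single 50% customer contributes 2,500 of the 3,200 points on its own.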

Answer 6: C Product B, with its 55% gross margin vs. Product A's 30%, contributes disproportionately to gross profit relative to its revenue share. For example, if total revenue is $1M: Product A contributes $250k revenue × 30% = $75k gross margin. Product B contributes $120k revenue × 55% = $66k gross margin. Despite generating about half the revenue, Product B generates 88% as much gross margin as Product A. Option A is wrong — revenue share alone does not determine strategic value.
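The same arithmetic can be extended to shares of total gross margin. This sketch uses the $1M example above; the "Other" row and its 35% margin rate are assumptions added only so that the shares have a total to divide by:

```python
import pandas as pd

products = pd.DataFrame({
    "product": ["A", "B", "Other"],
    "revenue": [250_000.0, 120_000.0, 630_000.0],
    # A and B use the rates from the question; the "Other" rate is assumed.
    "margin_rate": [0.30, 0.55, 0.35],
}).set_index("product")

products["gross_margin"] = products["revenue"] * products["margin_rate"]
products["revenue_pct"] = products["revenue"] / products["revenue"].sum() * 100
products["margin_mix_pct"] = (
    products["gross_margin"] / products["gross_margin"].sum() * 100
)

print(products.round(1))
```

Under these assumptions Product B's share of gross margin is higher than its 12% revenue share, while Product A's is lower than its 25% revenue share.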

Answer 7: B Period 0 is the acquisition month — the month the customer made their first purchase and "joined" the cohort. Period 1 is the next month, period 2 the month after that, and so on.
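One possible way to compute the period index is sketched below. The orders DataFrame and its column names are hypothetical, and the chapter's cohort function may derive the period differently:

```python
import pandas as pd

# Hypothetical order history for two customers.
orders = pd.DataFrame({
    "customer_id": ["c1", "c1", "c1", "c2", "c2"],
    "order_date": pd.to_datetime([
        "2023-01-15", "2023-02-03", "2023-04-20",
        "2023-02-10", "2023-03-05",
    ]),
})

# Each customer's first purchase date defines their cohort.
first_order = orders.groupby("customer_id")["order_date"].transform("min")

# Period = whole calendar months elapsed since the acquisition month.
orders["period"] = (
    (orders["order_date"].dt.year - first_order.dt.year) * 12
    + (orders["order_date"].dt.month - first_order.dt.month)
)

print(orders)
```

Customer c1 was acquired in January, so the April order falls in period 3; customer c2 was acquired in February, so the March order falls in period 1.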

Answer 8: B R=1 means the customer has not purchased recently (in the bottom quartile for recency — longest time since last purchase). F=1 means they buy very infrequently. M=1 means they spend very little. This customer profile represents the "Lost" segment. Note: Option A is wrong — a brand new customer would have a high recency score (4), not a low one.

Answer 9: C The near-identical average order values ($2,180 vs. $2,200) show that when the West region does make sales, the deals are comparable in size. The gap is in the number of customers: 15 vs. 40+. That pattern points to a capacity problem rather than a deal quality problem. Option D is wrong because Dave Nguyen's per-account revenue actually compares favorably to other reps.

Answer 10: C .pct_change() automatically computes (current - previous) / previous for each element. .diff() (Option A) computes the absolute difference, not the percentage. .rolling() (Option B) computes rolling window calculations. .cumsum() (Option D) computes the running total.
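A quick comparison of .diff() and .pct_change() on three made-up monthly values, with the percentage change also computed by hand from the formula above:

```python
import pandas as pd

# Illustrative monthly revenue values.
revenue = pd.Series([100.0, 110.0, 99.0])

absolute_change = revenue.diff()        # current - previous
pct_change = revenue.pct_change()       # (current - previous) / previous

# The same percentage change written out explicitly with shift(1).
manual = (revenue - revenue.shift(1)) / revenue.shift(1)

print(pd.DataFrame({
    "revenue": revenue,
    "diff": absolute_change,
    "pct_change": pct_change,
    "manual": manual,
}))
```

The first row is NaN in every derived column because there is no previous period to compare against. Note that .pct_change() returns a fraction (0.10), not a percentage (10).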

Section B: True or False

Answer 11: False They will give the same result only when every order contains exactly one line item (one product). In most real sales data, orders contain multiple products and thus multiple rows. df["revenue"].mean() divides total revenue by the number of rows (line items), while the groupby approach divides total revenue by the number of unique orders. For a multi-product order, the latter is the correct AOV.

Answer 12: True The cumulative percentage is the running sum of the individual percentages. Each percentage is the item's value divided by the total, so across all items they sum to 100%, and the cumulative column reaches 100% at the last row. In pandas the last value can differ from 100 by a tiny floating-point rounding error, so compare with a tolerance rather than testing for exact equality.
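A minimal Pareto table built from four hypothetical customers shows the cumulative column ending at 100%. The column names pct and cumulative_pct are assumptions and may not match the chapter's function:

```python
import pandas as pd

# Hypothetical revenue per customer, sorted largest first.
revenue = pd.Series(
    {"cust_a": 600.0, "cust_b": 250.0, "cust_c": 100.0, "cust_d": 50.0}
).sort_values(ascending=False)

pareto = pd.DataFrame({"revenue": revenue})
pareto["pct"] = pareto["revenue"] / pareto["revenue"].sum() * 100
pareto["cumulative_pct"] = pareto["pct"].cumsum()

# Number of customers needed to reach at least 80% of revenue.
n_to_80 = int((pareto["cumulative_pct"] < 80).sum()) + 1

print(pareto)
print(f"Customers needed for 80% of revenue: {n_to_80}")
```

When checking the final value in code, a tolerance-based comparison is safer than strict equality because floating-point sums can be off by a tiny amount.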

Answer 13: True YoY growth compares equivalent calendar periods (same month, same quarter, same season). A business seeing 50% MoM growth from November to December is likely experiencing normal holiday seasonality, not 50% business growth. YoY growth for December vs. the prior December removes that seasonal effect and shows whether the business is actually improving.

Answer 14: True This is correct as implemented in the chapter. The recency score is inverted — fewer days since last purchase means more recent, which means higher score. A score of 4 means the customer is in the top quartile for recency (bought most recently), and a score of 1 means they are in the bottom quartile (bought longest ago).
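The inversion can be sketched with pd.qcut by reversing the labels for recency only. This assumes quartile-based scoring as described above; the column names and customer values are hypothetical, and the chapter's code may be organized differently:

```python
import pandas as pd

# Hypothetical per-customer summary, ordered from most to least engaged.
rfm = pd.DataFrame({
    "customer_id": [f"c{i}" for i in range(1, 9)],
    "days_since_last_purchase": [5, 12, 30, 45, 90, 120, 200, 365],
    "order_count": [20, 15, 12, 9, 6, 4, 2, 1],
    "total_revenue": [5000, 4200, 3000, 2500, 1200, 800, 300, 100],
})

# Recency: fewer days is better, so the labels run 4 -> 1.
rfm["r_score"] = pd.qcut(
    rfm["days_since_last_purchase"], q=4, labels=[4, 3, 2, 1]
).astype(int)

# Frequency and monetary: larger is better, so the labels run 1 -> 4.
rfm["f_score"] = pd.qcut(rfm["order_count"], q=4, labels=[1, 2, 3, 4]).astype(int)
rfm["m_score"] = pd.qcut(rfm["total_revenue"], q=4, labels=[1, 2, 3, 4]).astype(int)

rfm["rfm_score"] = rfm["r_score"] + rfm["f_score"] + rfm["m_score"]
print(rfm[["customer_id", "r_score", "f_score", "m_score", "rfm_score"]])
```

With real data, pd.qcut can raise an error when many customers share the same value (for example, one order each), because the quartile edges are no longer unique; that case needs separate handling.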

Answer 15: False Revenue concentration risk is relevant at any company size. A company with 1,000 customers where the top 10 account for 80% of revenue faces significant concentration risk despite having many customers. Conversely, a company with 50 customers but very even revenue distribution has low concentration risk. The number of customers is less relevant than the distribution of revenue across them.
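The two companies described above can be compared directly. This is a sketch with synthetic revenue figures chosen to match the description; top_n_share() is a hypothetical helper, not a chapter function:

```python
import pandas as pd

def top_n_share(revenue: pd.Series, n: int = 10) -> float:
    """Percentage of total revenue coming from the n largest customers."""
    return float(revenue.nlargest(n).sum() / revenue.sum() * 100)

# 1,000 customers: the top 10 hold 80,000 of 100,000 in total revenue.
large_company = pd.Series([8_000.0] * 10 + [20_000.0 / 990] * 990)

# 50 customers with identical revenue.
small_company = pd.Series([1_000.0] * 50)

print(f"Large company top-10 share: {top_n_share(large_company):.1f}%")
print(f"Small company top-10 share: {top_n_share(small_company):.1f}%")
```

The company with twenty times as many customers is the more concentrated one, which is why the distribution matters more than the headcount.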

Section C: Short Answer

Answer 16 revenue_pct is a product's share of total revenue (dollars sold). margin_mix_pct is a product's share of total gross margin (profit generated). These diverge when products have different margin rates.

Example (illustrative figures, assuming a blended gross margin rate of about 34% across all products): A printer paper product generating 23% of revenue at a 26% gross margin contributes roughly 17.5% of total gross margin (23 × 26 / 34), so it under-indexes on margin relative to revenue. A shredder generating 8% of revenue at a 42% margin contributes roughly 10% of total gross margin (8 × 42 / 34), so it over-indexes on margin.

Strategic implication: the shredder is more profitable per dollar of revenue. If the company has limited marketing budget, investing it in promoting shredders over printer paper would likely improve overall profitability even if it does not maximize total revenue.

Answer 17 This pattern reveals a customer retention problem: customers are being acquired (they make a first purchase) but are not returning. The company appears to have strong sales and marketing attracting first-time buyers, but something is preventing repeat purchases.

Two possible causes: (1) Product or service quality issues — customers are disappointed with what they receive and do not return. (2) Competitive alternatives — customers find a better supplier or product after their first purchase and switch. A third possibility is that the company has no re-engagement or follow-up process after the first sale.

Answer 18 Champions (score 10–12): Bought recently, buy often, spend a lot. Action: offer loyalty rewards, early access to new products, and ask for referrals. You want to maintain the relationship and leverage their advocacy.

Loyal Customers (8–9): Strong customers who are not yet at Champion level. Action: offer cross-sell and upsell opportunities, check in regularly, and make them feel valued before they drift. You want to deepen the relationship.

At Risk (4–5): Used to buy regularly but have not recently. Action: personal outreach from account manager, win-back offers, or a survey to understand what changed. Acting before they leave completely is more cost-effective than acquiring a replacement.

Lost (3): Bought a long time ago, infrequently, for small amounts. Action: minimal investment — at most a low-cost automated re-engagement email. Resources are better spent acquiring new customers than trying to revive customers who were never highly engaged.

Different strategies are appropriate because customer behavior signals different relationship stages. Sending a "win-back" discount to a Champion is unnecessary and potentially devalues the relationship. Sending a loyalty reward to a Lost customer wastes budget on someone unlikely to respond.

Answer 19 A market problem would show lower deal sizes or different product preferences — indicators that West region customers are inherently different from customers in other regions. The data would show average order values significantly below other regions, perhaps $1,200 vs. $2,200, suggesting customers in the West market simply do not buy as much or have different needs.

The actual data showed the opposite: average order values of $2,180 (West) vs. $2,200 (other regions) were nearly identical. This ruled out a market problem. The only meaningful differences were customer count (15 vs. 40+ per region) and sales headcount (1 vs. 2–3 per region). Priya verified this by running revenue_by_dimension() on region, then computing calculate_revenue_metrics() separately for West and non-West, and comparing average order values directly.

If it had been a market problem, the recommended action would have been different — perhaps pricing adjustments, different product offerings for the West region, or targeting different customer segments. Because it was a capacity problem, the correct recommendation was simply to add salespeople.
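The comparison can be approximated with plain pandas. The sketch below does not reproduce revenue_by_dimension() or calculate_revenue_metrics(), whose signatures are not shown here; the data is synthetic and sized only to mirror the pattern described (similar order values, fewer customers in West):

```python
import pandas as pd

# Synthetic sales data: one row per order for simplicity.
sales = pd.DataFrame({
    "region": ["West"] * 3 + ["East"] * 6,
    "customer_id": ["w1", "w1", "w2", "e1", "e2", "e3", "e4", "e5", "e5"],
    "order_id": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "revenue": [2100.0, 2200.0, 2240.0,
                2000.0, 2400.0, 2100.0, 2300.0, 2250.0, 2150.0],
})

summary = sales.groupby("region").agg(
    customers=("customer_id", "nunique"),
    orders=("order_id", "nunique"),
    revenue=("revenue", "sum"),
)
summary["avg_order_value"] = summary["revenue"] / summary["orders"]

print(summary)
```

A capacity problem shows up as a gap in the customers column with similar avg_order_value across regions. A market problem would show the reverse: a clearly lower avg_order_value in the underperforming region.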

Answer 20 Total revenue growth rate is a useful single number but hides important complexity. A company could show 10% revenue growth while simultaneously becoming less profitable (if margins are shrinking), more dependent on fewer customers (concentration risk is rising), or growing only because of one record quarter masking a declining trend elsewhere.

Three additional metrics that would add substance:

(1) Gross margin growth rate — Revenue growing while margins compress could mean the business is discounting more aggressively or product mix is shifting toward lower-margin items. If revenue grew 10% but gross margin grew only 3%, the business is actually less financially healthy despite the top-line growth.

(2) Customer count change (net adds) — Did the company grow revenue by winning more customers or by selling more to existing customers? Both are valuable, but new customer acquisition signals market health while expansion of existing accounts signals product depth. If revenue grew 10% but customer count declined, the business may be increasingly vulnerable.

(3) Revenue concentration change — If the top 10 customers now represent 80% of revenue compared to 65% last year, risk is increasing even if total revenue is up. This is a strategic vulnerability that the board needs to understand alongside the growth rate.