Chapter 32 Exercises
Work these as a model-era underwriter. For every model, ask: what is it optimizing, what can it not see, and could I defend its number, or my override of it, to a regulator and an auditor? Items marked with a dagger (†) have worked solutions in Appendix: Answers to Selected Exercises; the rest are for discussion or self-test. All figures are illustrative teaching examples. Section references like (§32.2) point back to the chapter.
A. Recall and definitions
- † Define a generalized linear model (GLM) in one sentence, and name the two component models an insurance GLM typically uses for pricing (the distributions and what each predicts). (§32.2)
- State the single statistical advantage a predictive model has over a classical one-way rate table, and explain in one sentence why it matters when rating factors are correlated. (§32.1)
- Define a gradient boosting machine (GBM) in plain language. What does it find automatically that a GLM must be told about by hand? (§32.3)
- † Define feature engineering, and explain why the chapter calls it the underwriter's natural seat at the modeling table. (§32.5)
- Distinguish lift from the Gini coefficient. What does each measure, and what do both fail to prove about a model? (§32.6)
- State the cardinal rule of model validation in one sentence. Why is a model's accuracy on its training data nearly meaningless? (§32.6)
- Name the three corners of the actuary–underwriter–data-scientist triangle and the part of the truth each owns. (§32.7)
- Define the pricing-model lifecycle and name the two points in it where the underwriter contributes. (§32.7)
B. GLMs and the frequency–severity split
- † A personal-auto frequency GLM (log link) returns these relativities: base 1.00; driver-age-22 ×1.85; dense-urban territory ×1.30; high-performance vehicle ×1.20; continuous-prior-coverage ×0.92. (a) Compute the predicted relative frequency for a 22-year-old in a dense-urban territory driving a high-performance vehicle with continuous prior coverage. (b) Explain what the 1.85 relativity means in words, being precise about what is "held constant." (§32.2)
- Explain why a GLM for insurance uses a Poisson distribution for the frequency model and a gamma distribution for the severity model. What property of claim counts, and what property of claim amounts, drives each choice? (§32.2)
- † The log link makes a GLM "feel like a rate table." Explain why predicting the log of expected loss makes the individual factors multiply on the normal scale, and why that is convenient for an underwriter reading the output. (§32.2)
- A severity GLM returns a relativity of 0.65 for a rating variable that underwriting intuition says should make the risk worse, not cheaper. Give two distinct explanations for this "wrong sign," and state what you would do before trusting it. (§32.2)
- Why does a GLM not, on its own, discover that the effect of a sports car depends on the driver's age? What must the modeler do, and what is this kind of effect called? (§32.2)
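Optional arithmetic check for Section B. The log-link exercises above turn on one identity: adding coefficients on the log scale is the same as multiplying relativities on the normal scale. The Python below is a minimal sketch of that identity, not the chapter's own method. It reuses the illustrative relativities from the personal-auto exercise; the dictionary keys and both function names are hypothetical labels chosen for this sketch.

```python
import math

# Illustrative relativities from the personal-auto frequency exercise above.
# The key names are hypothetical labels chosen for this sketch.
relativities = {
    "base": 1.00,
    "driver_age_22": 1.85,
    "dense_urban_territory": 1.30,
    "high_performance_vehicle": 1.20,
    "continuous_prior_coverage": 0.92,
}


def combined_relativity(factors):
    """Multiply the relativities directly (the 'rate table' view)."""
    return math.prod(factors.values())


def combined_via_log_link(factors):
    """Sum the log-scale coefficients, then exponentiate (the GLM view)."""
    linear_predictor = sum(math.log(r) for r in factors.values())
    return math.exp(linear_predictor)


product_view = combined_relativity(relativities)
glm_view = combined_via_log_link(relativities)

# The two views agree up to floating-point rounding.
print(f"multiplied relativities:      {product_view:.4f}")
print(f"exp(sum of log coefficients): {glm_view:.4f}")
```

Run it after attempting part (a) by hand. The two printed numbers should match each other and your own answer.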
C. GBMs, neural networks, and the accuracy–transparency trade
- † Your analytics team offers two models for the same task: a GLM with a Gini of 0.31 and a GBM with a Gini of 0.37, both measured out-of-sample. For each of the following uses, say which model you'd choose and why: (a) setting the filed rate; (b) triaging which submissions a human reviews first. (§32.3, §32.6)
- Explain overfitting in one or two sentences, and explain why the losses from an overfit pricing model resemble the losses from soft-market underpricing — including when they show up. (§32.3, Ch.11)
- A vendor says their GBM is "98% accurate on our historical book." Write the one-sentence question you must ask in response, and explain why the answer decides whether the number is impressive or worthless. (§32.3, §32.6)
- † Find the red flag. An image model returns: "roof material = metal; condition = excellent; confidence = high" for a building you have separately been told has a tarped, patched, end-of-life built-up roof. List three distinct reasons the image model could be confidently wrong, and state the posture an underwriter should take toward any high-confidence image score. (§32.4)
- Why are neural networks rarely the tool of choice for tabular insurance pricing, yet transformative for image-based underwriting? What changes between the two cases? (§32.4)
D. Feature engineering and fairness
- † For each raw field, propose one engineered feature that would carry more predictive signal, and say what underwriting pattern it captures: (a) year built + today's date; (b) a list of prior claim dates; (c) a street address; (d) an industry/class code for a welding shop. (§32.5)
- The chapter says "the model is rarely what decides the result; the inputs are." Argue this claim using the idea that a mediocre algorithm on excellent features beats a brilliant algorithm on raw ones. (§32.5)
- Ethics dilemma. Your team proposes adding to a personal-auto model a feature that is strongly predictive but that, on inspection, is largely a proxy for a protected class by way of ZIP code. The Gini improves measurably. Write the argument you would make to keep it out of the model, and name what would have to be true for a correlated variable to be defensible. (§32.5; Ch.4, Ch.35 preview)
- † Explain the difference between "garbage in, garbage out" and the sharper modern version, "bias in, bias out." Why does feature selection, more than algorithm choice, determine whether a model is fair? (§32.5)
E. Validation, lift, and the Gini
- † You are shown a lift chart whose ten deciles run, best-to-worst: 45%, 58%, 66%, 74%, 82%, 90%, 104%, 120%, 145%, 190% loss ratio. (a) Does this model have good lift? How can you tell? (b) Does this chart prove the model's prices are adequate? Explain. (§32.6, Ch.11)
- A model has strong lift overall, but when the book is cut by state, the lift is flat in your largest coastal state. Explain precisely what this means and why deploying the model statewide on the strength of the overall number is dangerous. (§32.6)
- Price this risk (model triage). A commercial-property model sorts your book into deciles. The best three deciles run at a ~55% loss ratio; the worst two run at ~170%. Your target combined ratio implies a ~62% loss ratio. In plain terms, what underwriting actions does this lift chart suggest for the best three deciles and for the worst two, and what does the chart not tell you about whether your overall rate is high enough? (§32.6, Ch.3, Ch.11)
- † List the four questions the chapter says to ask whenever a model is put in front of you, and for each, name the failure it is designed to catch. (§32.6)
- Explain why a Gini of 0.34 might be excellent for one line and mediocre for another. Why is there no universal "good" Gini? (§32.6)
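Optional arithmetic check for Section E. Reading a lift chart comes down to two mechanical checks on the decile loss ratios: do they rise steadily from best to worst, and how wide is the gap between the ends? The sketch below runs both on the illustrative deciles from the lift-chart exercise above. The function names are hypothetical, and the equal-weight average rests on an assumption the exercise does not state: that every decile carries the same earned premium.

```python
# Decile loss ratios (percent), best to worst, from the lift-chart exercise above.
decile_loss_ratios = [45, 58, 66, 74, 82, 90, 104, 120, 145, 190]


def is_monotonic_increasing(values):
    """True if each decile is strictly worse than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def lift_spread(values):
    """Ratio of the worst decile's loss ratio to the best decile's."""
    return values[-1] / values[0]


def equal_weight_average(values):
    """Simple mean of the decile loss ratios.

    ASSUMPTION: every decile carries the same earned premium. The chart
    itself does not say so; with unequal premium this is not the book's
    overall loss ratio.
    """
    return sum(values) / len(values)


print("monotonic best-to-worst:", is_monotonic_increasing(decile_loss_ratios))
print(f"worst/best spread: {lift_spread(decile_loss_ratios):.2f}x")
print(f"equal-weight average: {equal_weight_average(decile_loss_ratios):.1f}%")
```

The first two checks look only at how well the model orders risks. The average depends on premium weights the chart does not show, which is one way to see why ranking and rate level have to be judged separately.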
F. The triangle, the override, and the memo
- † Underwrite this submission. A GBM scores a mid-market commercial risk an 8/10 (decline-leaning). On reading the file you find two things: the score was computed before the broker supplied updated information on a post-loss sprinkler upgrade and a new safety-management contract, and the risk is in an industry for which your model has very few historical examples. Decide whether to override, and write three or four sentences justifying your decision in the language of §32.7 (name the override justification(s) you're relying on). (§32.7)
- Name the three situations that justify overriding a model. What single thing do all three have in common, and what justification — common in practice — is not on the list? (§32.7)
- † Write the memo. In 120–180 words, write the override note you would put in the file for the mid-market commercial risk in Exercise 28 (the "Underwrite this submission" exercise above). This is the artifact that must survive an audit and a regulator's question. Include: the model's score, the specific facts the model lacked, why those facts change the grade, the terms or subjectivities you're attaching, and the new grade. (§32.7)
- The chapter says a model "no one overrides is a model no one is watching." Explain how logged overrides feed the pricing-model lifecycle, and why an underwriter who overrides well is the model's complement rather than its adversary. (§32.7)
- Ethics dilemma. An underwriter on your team overrides the model's accept to a decline on a particular kind of risk and cannot articulate a specific fact the model lacked — only that the risk "feels wrong." Why is this override a problem even if the risk does turn out to be a loss? Answer for the auditor, the regulator, and the integrity of the model. (§32.7)
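Optional sketch for Section F. The memo exercise lists what an override note must contain, and the ethics dilemma describes a note that is missing the key element. One way to picture a logged override is as a small record with a completeness check. The sketch below is an illustration only: the class `OverrideNote`, its field names, and the `problems` method are hypothetical, and the new grades in the examples are placeholders, not answers to the exercises.

```python
from dataclasses import dataclass, field


@dataclass
class OverrideNote:
    """Hypothetical record of one logged override (field names are assumptions)."""
    model_score: int
    new_grade: int
    facts_model_lacked: list = field(default_factory=list)
    rationale: str = ""
    terms_or_subjectivities: list = field(default_factory=list)

    def problems(self):
        """Return the reasons this note would be hard to defend, if any."""
        issues = []
        if not self.facts_model_lacked:
            issues.append("no specific fact the model lacked is named")
        if not self.rationale.strip():
            issues.append("no explanation of why the facts change the grade")
        if self.new_grade == self.model_score:
            issues.append("grade unchanged, so this is not an override")
        return issues


# Facts taken from the 'Underwrite this submission' exercise; the new grade
# of 6 is a PLACEHOLDER for illustration, not the exercise's answer.
documented = OverrideNote(
    model_score=8,
    new_grade=6,
    facts_model_lacked=[
        "post-loss sprinkler upgrade",
        "new safety-management contract",
        "industry with very few historical examples in the model",
    ],
    rationale="Score predates the updated information from the broker.",
)

# The 'feels wrong' override from the ethics dilemma: no named fact.
# Both scores here are placeholders.
gut_feel = OverrideNote(model_score=8, new_grade=9, rationale="Feels wrong.")

print("documented:", documented.problems())
print("gut feel:  ", gut_feel.problems())
```

A record like this is also what a future model team could mine for new features, since each entry in `facts_model_lacked` names something the model could not see.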
G. The Underwriting File
- † Underwriting-File extension. The model scored Harbor Steel a 7/10 and you overrode to a 6. (a) List the specific facts the model could see that drove the 7. (b) List the specific facts the model could not see that justify the 6. (c) State which of the three §32.7 override justifications you are relying on, and why. (The Underwriting File, §32.7)
- Write, in your own words, the single sentence you would log alongside the Harbor Steel override so that a future model team could turn it into a new feature. (What is the model currently blind to?) (The Underwriting File, §32.5, §32.7)
- The Harbor Steel override moves the grade from 7 to 6 — not from 7 to 2. Explain why a disciplined override is a modest, defensible adjustment grounded in named facts, and why an override that swung the grade dramatically on the same evidence would itself be a red flag. (The Underwriting File, §32.7)
- Connect the chapters. In Chapter 33 you will learn that the Harbor Steel application understated the 2023 fire's cause. If that disclosure gap were fed to the model as a corrected feature, would you expect the model's score to move up or down, and what does that tell you about the limits of overriding a model that was given incomplete or inaccurate inputs? (The Underwriting File, §32.5, Ch.33 preview)