Chapter 31 Exercises

Work these with the chapter's central habit of mind: a data-driven risk picture is a first draft written by a machine, one that is fast, useful, and to be checked. For every pre-filled field ask "says who, and how do they know?"; for every score ask "what could it not see?" Items marked with a dagger (†) have worked solutions in Appendix: Answers to Selected Exercises; the rest are for discussion or self-test. Section references like (§31.3) point you back to the relevant part of the chapter. All names, figures, and submissions are constructed teaching examples.

A. Recall and definitions

† Define pre-fill (data enrichment) in one sentence, and name the three failure modes the chapter identifies (wrong match, stale data, false precision). (§31.3)
Define real-time risk scoring and state, in one sentence, the central psychological problem created by the score arriving before the underwriter reads the file. (§31.4)
What is the underwriting workstation, and how does it differ from the "fax-and-folder" workflow it replaced? (§31.4)
† List the four families of alternative data sources in the chapter. For each, give one thing it tells you well and one thing it cannot tell you. (§31.2)
Name the six dimensions of data quality the chapter lists (accuracy, currency, completeness, consistency, provenance, relevance) and give a one-line example of a failure in each. (§31.7)
In one sentence each, state what the data revolution changed about underwriting and what it did not change. (§31.1)
What is the referral logic in a straight-through-processing system, and why does the chapter call a good referral rule "an honest statement of where the data and the model stop being trustworthy"? (§31.5)

B. Alternative data and what it can and cannot show

† An aerial image flags a commercial roof as "poor / likely end-of-life." (a) Name two things this reasonably confirms. (b) Name three things the image cannot establish that still bear on the price. (c) What single follow-up step would resolve the most important of the three? (§31.2, §31.3)
A personal-auto applicant submits six months of telematics data with a near-perfect score. Explain why this is strong evidence and yet not proof of a low-risk driver. Which dimension of risk (recall Chapter 6) does telematics observe well, and which can it miss? (§31.2)
Two third-party aggregators return different square footages for the same commercial building. The chapter says "a data feed is a claim, not a fact." Explain what you should do, and why averaging the two figures is the wrong instinct. (§31.2, §31.7)
For each source, name its single most important limitation: (a) satellite/aerial imagery; (b) public assessor records; (c) IoT/telematics; (d) a third-party aggregator feed. (§31.2)
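
For the exercise on two aggregators reporting different square footages, the "claim, not a fact" idea can be sketched in code. This is a minimal illustration, not the chapter's method: the function name, the source names, and the 5% tolerance are all hypothetical. The point is that a disagreement is surfaced for verification with both claims kept, rather than blended into an average that neither source reported.

```python
def check_feeds(reports, tolerance=0.05):
    """Compare reported values for one field across data sources.

    reports: dict mapping a (hypothetical) source name to its reported value.
    tolerance: assumed relative spread above which a human should verify.
    Returns the spread, a verify flag, and the original claims untouched.
    """
    if not reports:
        raise ValueError("need at least one report")
    lo = min(reports.values())
    hi = max(reports.values())
    if hi <= 0:
        raise ValueError("reported values must be positive")
    spread = (hi - lo) / hi
    # Keep every claim with its source; do not collapse them into one number.
    return {"spread": spread, "verify": spread > tolerance, "reports": dict(reports)}


result = check_feeds({"aggregator_a": 10_000, "aggregator_b": 14_000})
print(result["verify"])  # the two claims disagree well beyond the tolerance
```

Keeping both claims attached to their sources preserves provenance, so whoever resolves the conflict can see who said what.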

C. Pre-fill: speed, honesty, and its traps

† The chapter argues pre-fill can make a submission more honest, not just faster. Explain the mechanism, and connect it explicitly to adverse selection (recall Chapter 1). (§31.3)
Define automation bias on a pre-filled field. Why does a false roof age slide through more easily when a system pre-filled it than when a producer typed it? (§31.3)
A pre-filled file is fully populated, every field green, the model confident — and the address resolved to the parcel next door. Walk through how this single error propagates if the risk is then straight-through bound. Which data-quality dimension failed first? (§31.3, §31.7)
The chapter calls a pre-filled price-driving field "a hypothesis to be confirmed." Give two examples of price-driving fields you would always verify on a commercial property risk, and one low-stakes field you would reasonably let stand on the pre-fill alone. (§31.3)

D. The score as an input, not the verdict

† A workstation returns a 7-out-of-10, decline-leaning score on a commercial submission before you have read anything. State the three habits §31.4 prescribes for using the score well, and explain why "read the file as if the score did not exist, then compare" is more than a formality. (§31.4)
Explain the difference between ignoring the score, deferring to the score, and reading the score as one voice in the room. Why does the chapter reject the first two? (§31.4)
"The system said so" is described as "not a defense an underwriter can give." To which three audiences must an underwriter be able to defend a price instead, and what does each require of the file? (§31.4; recall Chapter 13)

E. Straight-through processing and the automation frontier

† Underwrite this submission (route it). For each of the following risks, decide BIND via STP or REFER to a human, and name the single feature that decides it: (a) a personal-auto renewal, clean record, no changes; (b) a $20M-limit coastal commercial property with a loss flag; (c) a small low-hazard office BOP, clean pre-fill; (d) a brand-new exposure class with no loss history; (e) a mid-size commercial auto fleet with one conflicting MVR field. (§31.5, §31.6)
The chapter says that for the simplest risks, "technology genuinely does replace the underwriter, and that is fine." Reconcile this with theme five ("technology augments underwriters; it does not replace them"). Where exactly does the apparent contradiction dissolve? (§31.5, §31.6)
† Describe what happens to the mix of risks that reach a human underwriter as STP's frontier advances. Why does the chapter call this "a concentration, not a demotion," and what does it imply for the skills a new underwriter should build? (§31.1, §31.5)
A referral grid refers every risk above a premium threshold and nothing below it. Critique this design using the chapter's failure-mode logic, and propose three referral triggers that better track where data and models actually break down. (§31.5)
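
One possible shape for referral logic of the kind this section discusses, as a minimal sketch. The `Submission` fields, the `MAX_STP_LIMIT` threshold, and the trigger wording are all hypothetical and are not taken from the chapter. The sketch only illustrates the design idea that each trigger names a place where the data or the model may stop being trustworthy, and that a referral should carry its reasons with it.

```python
from dataclasses import dataclass, field

# Assumed authority limit for illustration only; a real carrier sets its own.
MAX_STP_LIMIT = 1_000_000


@dataclass
class Submission:
    """Hypothetical submission record; field names are illustrative."""
    line: str
    limit: float
    has_loss_flag: bool = False
    new_exposure_class: bool = False
    conflicting_fields: list = field(default_factory=list)
    missing_price_fields: list = field(default_factory=list)


def route(sub):
    """Return ("REFER", reasons) if any trigger fires, else ("BIND", [])."""
    reasons = []
    if sub.limit > MAX_STP_LIMIT:
        reasons.append("limit above STP authority")
    if sub.has_loss_flag:
        reasons.append("loss flag present")
    if sub.new_exposure_class:
        reasons.append("no loss history for this class")
    if sub.conflicting_fields:
        reasons.append("conflicting data: " + ", ".join(sub.conflicting_fields))
    if sub.missing_price_fields:
        reasons.append("missing price-driving fields: " + ", ".join(sub.missing_price_fields))
    return ("REFER", reasons) if reasons else ("BIND", [])


decision, reasons = route(
    Submission(line="commercial_property", limit=250_000, missing_price_fields=["flood_zone"])
)
print(decision, reasons)
```

Note that the example submission is small by limit and still refers, because the trigger is the missing price-driving field rather than the size of the account.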

F. Data quality and governance

† Find the red flag. A submission arrives fully pre-filled and scored "green / bind." Buried in it: the building's flood-zone code is blank but the model priced it as "low hazard"; the aerial image is dated two years ago; and the prior-loss field shows "0" though the producer mentioned a fire in a phone call. Identify each data-quality problem by its dimension (§31.7), say which is most dangerous and why, and state what should have stopped the bind. (§31.7)
Explain the silent default failure mode. Why is a missing price-driving field that gets filled with a class average more dangerous than a field left visibly blank? (§31.7)
The chapter says a carrier that buys a data feed has "outsourced the data but kept the risk." Explain what this means for liability and for the controls a data-driven carrier must run on the feed itself, not just on its decisions. (§31.7)
Why is "garbage in, garbage out" described as more dangerous in the data age than before, rather than less? Use the idea of "the friction that used to catch errors." (§31.7)
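
The silent-default exercise in this section can be made concrete with a small contrast. This is a sketch under stated assumptions: the field (roof age), the class-average value, and both function names are hypothetical. The first function hides the gap by substituting a class average; the second returns the gap as a visible flag that a referral rule or a reader can act on.

```python
# Hypothetical class average used only for illustration.
CLASS_AVERAGE_ROOF_AGE = 15.0


def silent_default(roof_age):
    """Fill a missing value with the class average, hiding that it was missing."""
    return CLASS_AVERAGE_ROOF_AGE if roof_age is None else roof_age


def visible_blank(roof_age):
    """Keep the gap visible: return (value, is_missing) so it can be referred."""
    if roof_age is None:
        return None, True
    return roof_age, False


# Downstream, the silently defaulted value is indistinguishable from a real one.
print(silent_default(None))   # looks like an observed roof age
print(visible_blank(None))    # the gap survives and can stop a bind
```

After `silent_default`, nothing downstream can tell an observed 15-year-old roof from an unknown one, which is why the gap never reaches a human.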

G. Memo, ethics, and the combined ratio

† Write the memo. In 150–200 words, write a short note to your underwriting manager recommending against straight-through-binding a particular class of small commercial risk you believe is being bound too loosely. Cite at least two specific data-quality or referral concerns and tie your recommendation to the combined ratio. (§31.5, §31.6, §31.7)
Ethics dilemma. A vendor offers a new "alternative data" attribute that materially improves your model's predictive accuracy, but you suspect it may correlate with a protected class. Sales assures you it is "just a data signal, not a protected characteristic." Using the chapter's compliance rule — if you could not use a fact when a human collected it, you cannot use it because a vendor collected it for you — lay out how you would decide, and what you would need to know first. (§31.2; preview of Chapter 35)
The chapter warns that an automation initiative can show a lower expense ratio in year one and a worse combined ratio later. Explain the mechanism and the time lag, and state the single number against which any automation project must ultimately be judged. (§31.6; recall Chapter 3)
Ethics / fairness. IoT and telematics let an insurer observe an insured's actual behavior continuously. Name one genuine underwriting benefit and one genuine privacy or fairness concern, and explain why the chapter routes both back to the FCRA/state-privacy themes of Chapter 8 rather than treating "new data" as a clean slate. (§31.2)
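
For the expense-ratio exercise in this section, a small arithmetic sketch may help. It assumes the simple definition of the combined ratio as (incurred losses + expenses) / premium, and every figure is invented for illustration rather than drawn from the chapter.

```python
def combined_ratio(losses, expenses, premium):
    """Simple combined ratio: (losses + expenses) / premium."""
    if premium <= 0:
        raise ValueError("premium must be positive")
    return (losses + expenses) / premium


# Hypothetical book, all figures per 100 of premium.
before = combined_ratio(losses=60, expenses=32, premium=100)
# Year one: automation cuts expenses; losses on the new business have not yet emerged.
year_one = combined_ratio(losses=60, expenses=28, premium=100)
# Later: losses on loosely bound risks develop, and the expense saving is outweighed.
later = combined_ratio(losses=68, expenses=28, premium=100)

print(f"before automation:    {before:.2f}")
print(f"year one:             {year_one:.2f}")
print(f"after losses develop: {later:.2f}")
```

With these made-up numbers, year one looks better (0.88 against 0.92) and the later figure is worse (0.96), because the expense saving shows up immediately while the losses take time to develop.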

H. The Underwriting File

† Underwriting-File extension. The Harbor Steel pre-fill returns a satellite roof flag that agrees with the loss runs, the inspection, and the broker's note. Write, in your own words, (a) why this chapter says the data corroborates rather than changes the disposition, (b) the one hypothesis the quarter-old image still cannot settle, and (c) the one open thread this chapter hands forward to Chapter 32. (The Underwriting File)
The real-time score recommends declining Harbor Steel, and your file does not. This chapter only logs the divergence. Explain why the chapter deliberately stops short of the override here, and what Chapter 32 must supply before the override is a defensible judgment rather than a stubborn one. (§31.4, The Underwriting File)
Tindall Stores (the post-breach cyber submission) is enriched in parallel with Harbor Steel. Explain the chapter's claim that "the enriched feed tells you the breach happened; only judgment tells you whether the company actually fixed it." What kind of evidence — and from where — would move that judgment, and why can't a data feed supply it? (The Underwriting File; recall Chapter 24)
Underwriting-File extension (design). Propose three referral triggers that should guarantee a risk like Harbor Steel is never straight-through bound, drawing each trigger directly from a frozen fact in the file (the limits, the catastrophe exposure, the loss flags, the multi-line program). (§31.5, The Underwriting File)