Chapter 25 Quiz
Machine Learning in Fraud Detection
15 multiple-choice questions. The answer key follows the questions.
1. Card payment fraud data typically has a fraud rate of 0.1%–1%. A model that predicts "legitimate" for every transaction will achieve approximately 99% accuracy. This illustrates why:
A) Accuracy is the best metric for fraud detection models B) Accuracy is misleading for imbalanced classification; precision and recall are the operative metrics C) 99% accuracy is the regulatory minimum for fraud detection systems D) A model with 99% accuracy should be approved for production
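The arithmetic behind question 1 can be made concrete with a quick sketch (hypothetical volumes, 1% fraud rate assumed):

```python
# Hypothetical dataset: 100,000 transactions at a 1% fraud rate.
n_total = 100_000
n_fraud = 1_000
n_legit = n_total - n_fraud

# Trivial classifier: predict "legitimate" for every transaction.
true_negatives = n_legit   # every legitimate transaction passes
false_negatives = n_fraud  # every fraud is missed
true_positives = 0         # no fraud is ever flagged

accuracy = true_negatives / n_total   # 0.99 -- looks impressive
recall = true_positives / n_fraud     # 0.0  -- catches no fraud at all

print(f"accuracy = {accuracy:.0%}, recall = {recall:.0%}")
```

The 99% accuracy figure is entirely an artifact of the class imbalance; recall exposes the model as useless.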
2. A fraud detection model trained on data from January to December 2023 is deployed in March 2024. By September 2024, fraudsters have adopted a new attack pattern that did not exist in 2023. This scenario illustrates:
A) The model was poorly trained and should be retrained from scratch B) Temporal dynamics in fraud — patterns evolve continuously, requiring ongoing retraining C) The model should have been trained on synthetic data to prevent this D) GDPR requires that models be retrained every 6 months
3. Which of the following is the best description of the "feedback loop" in fraud detection?
A) The feedback from regulators about model performance following enforcement actions B) The process by which customer complaints are used to retrain the model C) The process by which model predictions → cases for investigation → investigation labels → model retraining creates an ongoing improvement cycle D) The technical process by which the model's output is fed back as an input for the next scoring run
4. Sample selection bias in fraud detection refers to:
A) The bias introduced when training data contains more data from large banks than small ones B) The risk that transactions not reviewed by investigators (because the model scored them low) generate no training labels, causing the model's blind spots to compound C) The tendency for fraud detection models to be biased against certain demographic groups D) The over-representation of online transactions in training data relative to in-person transactions
5. SMOTE (Synthetic Minority Over-sampling Technique) addresses which fraud detection challenge?
A) Adversarial adaptation by fraudsters B) Class imbalance — generating synthetic fraud examples to balance the training dataset C) Label noise from inaccurate investigation dispositions D) Latency requirements for real-time scoring
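The core of SMOTE (question 5) is interpolation between minority-class neighbours. A minimal sketch of that idea — not the full imbalanced-learn implementation, and the fraud feature vectors below are hypothetical:

```python
import random

def smote_sample(minority_points, k=3):
    """Generate one synthetic minority (fraud) example by interpolating
    between a random minority point and one of its k nearest minority
    neighbours. Minimal sketch: real SMOTE also handles scaling, k-NN
    efficiency, and categorical features."""
    base = random.choice(minority_points)
    # Rank the other minority points by squared Euclidean distance to base.
    others = sorted(
        (p for p in minority_points if p is not base),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
    )
    neighbour = random.choice(others[:k])
    gap = random.random()  # interpolation factor in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))

# Hypothetical z-scored fraud feature vectors (amount_z, velocity_z):
fraud = [(2.1, 3.0), (2.4, 2.8), (1.9, 3.3), (2.6, 3.1)]
synthetic = smote_sample(fraud)
```

Because each synthetic point lies on a line segment between two real fraud examples, it stays inside the minority-class region rather than being random noise.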
6. A fraud detection model achieves 45% precision and 92% recall at its current threshold. Which of the following correctly interprets these results?
A) The model is poorly performing — both metrics should exceed 90% B) The model catches most fraud (92% recall) but generates many false positives — 55% of flagged transactions are legitimate C) The model is operating correctly — recall matters more than precision in all fraud scenarios D) The model should be retrained because precision below 50% is not permitted under GDPR
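Question 6's percentages translate into concrete case volumes. A sketch assuming a hypothetical 1,000 actual fraud cases:

```python
# Hypothetical volumes consistent with 45% precision and 92% recall.
actual_fraud = 1_000
recall = 0.92
precision = 0.45

true_positives = actual_fraud * recall         # 920 frauds caught
flagged = true_positives / precision           # total alerts raised (~2,044)
false_positives = flagged - true_positives     # legitimate transactions flagged

print(f"alerts: {flagged:.0f}, false positives: {false_positives:.0f}")
```

Over half the alert queue (1 − precision = 55%) is legitimate customers, which is the investigator workload cost of the high recall.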
7. An isolation forest is an unsupervised anomaly detection algorithm. In fraud detection, it is primarily useful for:
A) Detecting known fraud patterns from labeled historical data B) Detecting novel fraud patterns that have never appeared in labeled training data C) Computing SHAP values for explainability purposes D) Meeting GDPR Article 22 requirements for automated decision-making
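The intuition behind question 7 — anomalies isolate at shallow depth under random splits — can be sketched in one dimension. This is a toy illustration of the principle, not scikit-learn's IsolationForest, and the data values are hypothetical:

```python
import random

def isolation_depth(point, data, depth=0, max_depth=10):
    """Depth at which `point` is separated from the rest of `data` by
    repeated random axis-aligned splits (1-D toy isolation tree)."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    lo, hi = min(data), max(data)
    if lo == hi:
        return depth
    split = random.uniform(lo, hi)
    # Keep only the side of the split that contains `point`.
    side = [x for x in data if (x < split) == (point < split)]
    return isolation_depth(point, side, depth + 1, max_depth)

def anomaly_depth(point, data, n_trees=200):
    """Average isolation depth over many random trees.
    Anomalies isolate quickly, so a LOW average depth = more anomalous."""
    return sum(isolation_depth(point, data) for _ in range(n_trees)) / n_trees

# Hypothetical z-scored transaction amounts: a tight cluster plus one outlier.
random.seed(0)
normal = [0.1, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5]
outlier = 10.0
```

No labels are used anywhere: the outlier scores as anomalous purely because random splits separate it from the cluster almost immediately, which is why the technique can surface fraud patterns absent from training labels.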
8. In a real-time card payment fraud architecture, the feature store (e.g., Redis-based) serves what primary purpose?
A) Storing the trained model for fast access B) Archiving all transactions permanently for regulatory reporting C) Maintaining pre-computed behavioral features (velocity, baselines) that can be retrieved in milliseconds, enabling sub-200ms scoring latency D) Storing customer complaints and dispute records
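Question 8's pattern can be sketched with a plain dict standing in for Redis. The names and feature choices below are illustrative assumptions; the point is that behavioural features are maintained at ingest time so the scoring path is a single fast read:

```python
import time

# Stand-in for a Redis feature store (production would use Redis structures
# with TTLs rather than an in-process dict).
feature_store = {}

def record_transaction(card_id, amount, now=None):
    """Write path: update pre-computed behavioural state as events arrive."""
    now = time.time() if now is None else now
    feats = feature_store.setdefault(card_id, {"txn_times": [], "amounts": []})
    feats["txn_times"].append(now)
    feats["amounts"].append(amount)

def get_features(card_id, now=None):
    """Read path used by the real-time scorer: no historical database scan,
    just a lookup plus trivial arithmetic, keeping latency in milliseconds."""
    now = time.time() if now is None else now
    feats = feature_store.get(card_id, {"txn_times": [], "amounts": []})
    recent = [t for t in feats["txn_times"] if now - t <= 3600]
    return {
        "velocity_1h": len(recent),  # transaction count in the last hour
        "avg_amount": (sum(feats["amounts"]) / len(feats["amounts"])
                       if feats["amounts"] else 0.0),
    }
```

Computing velocity or baseline features from raw transaction history at score time would blow the sub-200ms budget; pre-computation moves that cost to the write path.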
9. A Cornerstone fraud analyst receives a fraud alert and accesses the SHAP waterfall chart for the flagged transaction. The chart shows: new_country (+0.18), velocity_1h (+0.14), new_device (+0.11). The analyst calls the customer, who confirms they are traveling and used a new laptop. What should the analyst do?
A) Confirm as true positive — the model score is above the threshold B) Disposition as false positive and update the training label; consider the context (travel + new device is consistent with legitimate behavior) C) Escalate to the model development team — the model is malfunctioning D) Block the card permanently — the presence of three risk factors is automatically fraud
10. Why does sharing SHAP feature importance details externally with customers or regulators present a risk in fraud detection contexts that it does not in credit decision contexts?
A) SHAP values are too complex for non-technical audiences to understand B) SHAP values in fraud detection reveal which behaviors the model flags, giving fraudsters a roadmap to defeat the model — an adversarial risk not present in credit scoring C) SHAP values are not legally admissible as evidence in fraud proceedings D) Sharing SHAP values would require firms to retrain the model after each disclosure
11. Under GDPR, processing personal data for fraud prevention is generally permissible without the customer's consent. Which provision of GDPR most directly supports this?
A) Article 9 (special category data) B) Recital 47 (legitimate interests — which specifically names fraud prevention) C) Article 6(1)(c) (legal obligation) D) Article 17 (right to erasure)
12. A fraud model has a PSI (Population Stability Index) of 0.28 after 8 months in production. What does this indicate and what action is appropriate?
A) PSI of 0.28 is within the acceptable range (< 0.30); no action required B) PSI > 0.25 indicates significant population shift from training data; the model should be retrained as a matter of urgency C) PSI of 0.28 indicates the model is performing better than at training; no action required D) PSI > 0.25 requires immediate regulatory notification under GDPR
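The PSI in question 12 is computed bin-by-bin as (actual% − expected%) · ln(actual%/expected%), summed over score bins. A minimal sketch with hypothetical score distributions:

```python
import math

def psi(expected_pct, actual_pct):
    """Population Stability Index across score bins.
    Each argument is the fraction of the population per bin (sums to 1).
    Minimal sketch: empty bins are skipped rather than smoothed."""
    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(expected_pct, actual_pct)
        if e > 0 and a > 0
    )

# Hypothetical 5-bin score distributions: training-time vs. current production.
training = [0.30, 0.25, 0.20, 0.15, 0.10]
production = [0.15, 0.20, 0.22, 0.23, 0.20]

value = psi(training, production)
```

Identical distributions give a PSI of 0; the larger the drift of production scores away from the training distribution, the larger the index.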
13. Verdant Bank's fraud investigation team frequently resolves alerts as "customer confirmed legitimate" based on brief customer service calls. A subsequent review finds that some of these calls were from social engineering impersonators. What is the most significant compliance/operational risk of this practice?
A) The investigation team may face GDPR enforcement for processing customer voice data without consent B) Fraudulent transactions labeled as "legitimate" corrupt the training dataset, causing the model to learn that certain fraud patterns are legitimate — degrading performance over time C) The call recordings must be retained for 7 years, creating data storage costs D) The FCA's Consumer Duty requires that all calls be reviewed by a senior manager
14. The FCA's Consumer Duty (2023) is relevant to fraud detection systems because:
A) It requires all fraud detection decisions to be approved by a qualified compliance officer B) It requires firms to monitor customer outcomes — a high false positive rate that blocks many legitimate transactions causes customer harm that firms must address C) It prohibits the use of machine learning in consumer-facing decisions D) It requires all fraud models to be disclosed to customers before deployment
15. Priya's review of Verdant Bank's fraud system found that the root cause of rising fraud losses was not model quality but investigation process quality. Which of the following best describes the lesson?
A) Machine learning fraud models are generally ineffective and should be replaced with rules-based systems B) The model is only as good as its feedback loop — corrupt investigation labels corrupt the training data, degrading future model performance regardless of model sophistication C) Verdant Bank needed to retrain its model on a longer historical window D) Third-party fraud models should not be trusted; firms should only use internally developed models
Answer Key
| Q | A | Explanation |
|---|---|---|
| 1 | B | With 99% legitimate transactions, a trivial "always predict legitimate" classifier achieves 99% accuracy but 0% recall. Accuracy is useless for imbalanced classification. Precision and recall measure what matters. |
| 2 | B | Temporal dynamics: fraud evolves constantly. A model trained on 2023 data cannot recognize novel attack vectors first appearing in 2024. Ongoing retraining is the solution, not a design defect. |
| 3 | C | The feedback loop: model scores → cases reviewed → dispositions (labels) → model retrained on labels. Clean, accurate labels are essential for the loop to function correctly. |
| 4 | B | Sample selection bias: unreviewed (low-scored) transactions generate no labels. If the model systematically misses a fraud type, that type never appears in training data, creating a compounding blind spot. |
| 5 | B | SMOTE addresses class imbalance by generating synthetic minority-class (fraud) examples. It does not address adversarial adaptation, label noise, or latency. |
| 6 | B | 92% recall means the model catches 92% of fraud. 45% precision means 55% of flagged transactions are false positives — a high false positive burden for investigators. Neither is universally "right" — the tradeoff depends on business priorities. |
| 7 | B | Isolation forest is unsupervised — it doesn't use fraud labels. It detects anomalies (unusual behavior) which may indicate novel fraud. Supervised models detect known patterns; unsupervised models detect unusual patterns. |
| 8 | C | Feature store enables real-time scoring by pre-computing behavioral features (velocity, baselines) that cannot be computed from scratch within latency constraints. Redis provides sub-millisecond reads. |
| 9 | B | The SHAP explanation, combined with context (customer confirmed travel + new laptop), indicates a false positive. The analyst should disposition accordingly and update the training label. Human judgment in the investigation loop is essential. |
| 10 | B | In fraud detection, the model's decision logic is adversarially sensitive. Revealing which features trigger alerts gives fraudsters a playbook. In credit scoring, the borrower cannot manufacture a higher income or longer credit history on demand. The adversarial dynamic is what makes fraud explanation uniquely sensitive. |
| 11 | B | GDPR Recital 47 explicitly names fraud prevention as a legitimate interest of data controllers, supporting processing without explicit consent where necessary for fraud prevention. |
| 12 | B | PSI > 0.25 is the widely used rule-of-thumb threshold for "significant shift" requiring model retraining. PSI 0.1–0.25 indicates "minor shift — monitor"; PSI < 0.1 indicates "stable." Prompt retraining is the appropriate response at PSI 0.28. |
| 13 | B | The investigation label quality problem: fraudulent transactions labeled as "legitimate" by the investigation team (due to social engineering verification failure) teach the model that those fraud patterns are legitimate. This is the feedback loop failure Priya identified at Verdant. |
| 14 | B | Consumer Duty requires monitoring customer outcomes. A fraud detection system with excessive false positives causes customer harm (declined legitimate transactions, payment delays, customer friction). Firms must balance fraud prevention against customer impact. |
| 15 | B | The key insight: a sophisticated model deployed with a broken feedback loop will degrade. The model learns from labels; corrupt labels produce a corrupt model. Process quality (investigation accuracy, label quality) is as important as model quality. |