European Banking Authority (2020). Report on Big Data and Advanced Analytics. The EBA's analysis of how financial institutions use machine learning and advanced analytics in credit risk, fraud detection, AML, and customer segmentation. Covers the governance expectations and risk management considerations for AI/ML in financial services. Free at eba.europa.eu. Essential for understanding the regulatory expectations that frame the use of ML in fraud detection.

Federal Reserve (2011). SR 11-7: Guidance on Model Risk Management. The foundational US regulatory guidance on model risk management, covering model development, validation, and ongoing monitoring for models used in risk management and financial decision-making. Although it is a US document, its principles have been widely adopted elsewhere and are a common benchmark against which fraud detection model governance is assessed. Free at federalreserve.gov.

FCA (2022). AI and Machine Learning — Discussion Paper DP22/4. The FCA's discussion of AI and ML use in financial services — covering explainability, fairness, governance, and regulatory expectations. Essential for understanding the UK regulatory direction for ML-based fraud detection. Free at fca.org.uk.

For Practitioners

Dal Pozzolo, A., Caelen, O., Johnson, R.A., & Bontempi, G. (2015). Calibrating Probability with Undersampling for Unbalanced Classification. IEEE Symposium on Computational Intelligence and Data Mining. A seminal paper on class imbalance handling in fraud detection. Demonstrates that undersampling during training requires probability recalibration for correct threshold setting — a practical finding that affects production fraud model deployment.

Lundberg, S.M., & Lee, S.I. (2017). A Unified Approach to Interpreting Model Predictions. NeurIPS. The original SHAP paper. Provides the theoretical foundation for Shapley-value-based model explanations. Available at arxiv.org. Essential reading for anyone implementing explainability in fraud detection.

Phua, C., Lee, V., Smith, K., & Gayler, R. (2010). A Comprehensive Survey of Data Mining-Based Fraud Detection Research. arXiv:1009.6119. Comprehensive survey of fraud detection approaches across payment fraud, insurance fraud, healthcare fraud, and telecommunications fraud. Provides historical context for the evolution from rules-based to ML-based detection.

Bolton, R.J., & Hand, D.J. (2002). Statistical Fraud Detection: A Review. Statistical Science, 17(3), 235–255. Classic academic treatment of statistical approaches to fraud detection. Establishes the formal framework for class imbalance, behavioral profiling, and network-based detection that underpins modern ML approaches.
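
The recalibration step reported by Dal Pozzolo et al. (listed above) can be sketched in a few lines. This is a minimal illustration, assuming every fraud case is kept and each legitimate transaction is kept with probability beta during undersampling; the function names recalibrate and undersampled_threshold are hypothetical and do not come from the paper or from any library.

```python
def recalibrate(p_s: float, beta: float) -> float:
    """Map a score from a model trained on undersampled data back to a
    probability on the original class distribution.

    p_s  : fraud probability predicted by the undersampled model
    beta : fraction of legitimate (majority-class) rows kept, 0 < beta <= 1
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must be in (0, 1]")
    if not 0.0 <= p_s <= 1.0:
        raise ValueError("p_s must be in [0, 1]")
    # p = beta * p_s / (beta * p_s - p_s + 1)
    return beta * p_s / (beta * p_s - p_s + 1.0)


def undersampled_threshold(tau: float, beta: float) -> float:
    """Inverse mapping: the cut-off on the undersampled score scale that
    corresponds to a threshold tau on the true probability scale."""
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must be in (0, 1]")
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must be in [0, 1]")
    return tau / (tau + beta * (1.0 - tau))


if __name__ == "__main__":
    # Illustrative numbers only: keep 1% of legitimate transactions.
    print(recalibrate(0.5, beta=0.01))
    print(undersampled_threshold(0.01, beta=0.01))
```

With beta = 0.01, a score of 0.5 from the undersampled model corresponds to a fraud probability of roughly 0.0099 on the original distribution, which is why a threshold chosen on raw undersampled scores can be far from the intended operating point.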

Technical References

XGBoost Documentation: xgboost.readthedocs.io — XGBoost is one of the most widely used gradient boosting implementations in production fraud detection. The documentation includes worked examples with tabular data.

LightGBM Documentation: lightgbm.readthedocs.io — Microsoft's LightGBM often trains faster than XGBoost on large datasets, which makes it efficient for fraud detection at scale. Includes native categorical feature handling, useful for merchant category codes (MCCs).

SHAP Python Library: shap.readthedocs.io — The standard library for SHAP-based model explanation. TreeSHAP computes exact Shapley values for gradient boosted models in O(T × L × D²) time, where T is the number of trees, L the maximum number of leaves per tree, and D the maximum tree depth, which is usually fast enough for production use. Includes SHAP waterfall plots, summary plots, and force plots for fraud alert investigation dashboards.

imbalanced-learn: imbalanced-learn.org — Python library implementing SMOTE and other class imbalance handling techniques. Integrates with scikit-learn pipelines.

Scikit-learn: scikit-learn.org — Standard Python ML library with Isolation Forest implementation (sklearn.ensemble.IsolationForest), logistic regression, and model evaluation utilities including precision_recall_curve for threshold selection.

Feast (Feature Store): docs.feast.dev — Open-source feature store for ML. Relevant for the architectural pattern of pre-computing and serving behavioral features at real-time fraud detection latency.
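
The threshold-selection use of precision_recall_curve mentioned in the scikit-learn entry above might look like the following sketch. The helper pick_threshold, the toy labels and scores, and the 0.75 precision floor are illustrative assumptions; in practice the threshold would be chosen on held-out data and weighed against actual investigation and fraud-loss costs.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve


def pick_threshold(y_true, scores, min_precision):
    """Return (threshold, precision, recall) for the cut-off that maximises
    recall while keeping precision at or above min_precision.
    Returns None if no threshold reaches the precision floor."""
    precision, recall, thresholds = precision_recall_curve(y_true, scores)
    # precision and recall carry one extra trailing point (1.0, 0.0) that has
    # no matching threshold, so drop it to align the three arrays.
    precision, recall = precision[:-1], recall[:-1]
    ok = precision >= min_precision
    if not ok.any():
        return None
    # Among qualifying thresholds, take the one with the highest recall.
    best = int(np.argmax(np.where(ok, recall, -1.0)))
    return float(thresholds[best]), float(precision[best]), float(recall[best])


# Toy data (hypothetical): 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.65, 0.80, 0.90]

result = pick_threshold(y_true, scores, min_precision=0.75)
print(result)
```

On this toy data the selected cut-off is 0.60, where three of the four fraud cases are caught and three of the four alerts are genuine. Fixing a precision floor and maximising recall is only one possible policy; a cost-weighted objective is an equally reasonable choice.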

Regulatory Primary Sources

Document	Jurisdiction	Key Relevance
GDPR Recital 47	EU	Legitimate interest for fraud prevention
GDPR Article 22	EU	Right not to be subject to solely automated decisions
Data Protection Act 2018 Schedule 2 Para 14	UK	Financial crime processing exemption from subject access
FCA Consumer Duty PS22/9	UK	Good outcomes requirement; false positive harm
ECOA / Regulation B	US	Adverse action reasons for credit decisions
SR 11-7	US	Model risk management for all models
EBA Guidelines on Internal Governance	EU	Model risk governance for EBA-supervised firms
PRA SS1/23	UK	Model risk management principles for banks

For the Curious

Provost, F., & Fawcett, T. (2013). Data Science for Business. O'Reilly Media. Accessible treatment of machine learning for business applications, with extensive discussion of classifier evaluation, precision-recall tradeoffs, and the cost-sensitive learning framework that underlies fraud detection threshold calibration. Chapter 8 (Visualizing Model Performance) is particularly relevant.

Baesens, B., Van Vlasselaer, V., & Verbeke, W. (2015). Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques. Wiley. The definitive practitioner text on fraud analytics. Covers rules-based, supervised, unsupervised, and network-based detection with a financial services focus. The chapters on card fraud are specifically relevant to this chapter's content.

Freund, Y., & Schapire, R.E. (1999). A Short Introduction to Boosting. Journal of Japanese Society for Artificial Intelligence, 14(5), 771–780. Accessible introduction to boosting algorithms, the family of techniques that includes gradient-boosted trees. Useful background for understanding why GBT often outperforms other approaches on tabular fraud data.

Vendor and Industry Resources

UK Finance (annual). Fraud: The Facts. Annual UK Finance report on fraud statistics across UK financial services — card fraud, APP fraud, online banking fraud. Provides the industry context for fraud rates, fraud typologies, and detection effectiveness. Free at ukfinance.org.uk.

Feedzai Research: feedzai.com/resources — Feedzai is a major fraud detection vendor whose research publications cover ML for financial crime. Practitioner-level content on feature engineering, model performance, and regulatory compliance.

Stripe Engineering Blog: stripe.com/blog/engineering — Stripe has published detailed technical posts on its fraud detection architecture (Radar). Particularly useful for understanding real-time scoring architecture and the feature store pattern in production.

Netflix Tech Blog (anomaly detection posts): netflixtechblog.com — While not financial services-specific, Netflix's engineering blog on anomaly detection and streaming feature computation is referenced by financial services fraud teams for its architectural insights. The concepts transfer directly to card fraud detection systems.

Further Reading

Chapter 25: Machine Learning in Fraud Detection

Essential Reading