Further Reading
Chapter 4: Technology Foundations: AI, ML, NLP, and Automation in Compliance
Essential Reading
Géron, A. (2022). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (3rd ed.). O'Reilly Media. The most accessible technical introduction to machine learning for practitioners. You do not need to read it cover to cover — Chapters 1-4 (foundations) and Chapter 7 (ensemble methods, including gradient boosting) are most relevant. Available at your local library.
Board of Governors of the Federal Reserve System (2011). SR 11-7: Guidance on Model Risk Management. The US model risk management standard. Even if you operate outside the US, SR 11-7's framework for model inventory, validation, and documentation is the closest thing to an international standard. Essential reading for anyone responsible for governing ML systems in compliance. Free at federalreserve.gov.
FATF (2021). Opportunities and Challenges of New Technologies for AML/CFT. FATF's comprehensive review of how AI, machine learning, and digital identity technologies can be applied to AML/CFT compliance, along with their associated risks and governance challenges. Free at fatf-gafi.org.
For Practitioners
Molnar, C. Interpretable Machine Learning (molnar.fyi/interpretable-ml). A free online book that covers explainability techniques (SHAP, LIME, partial dependence plots) in accessible language. Essential for anyone implementing or governing explainable AI in compliance contexts.
Wolf, M. & Zaboli, A. (2021). Machine Learning in AML: Finding the Balance Between Automation and Human Oversight. Journal of Financial Crime. A practitioner-oriented paper on the organizational challenges of deploying ML in AML programs, with specific attention to analyst workflow integration and governance.
UK Finance (2022). Artificial Intelligence in Financial Services: A Guide for Practitioners. The UK banking trade association's practical guide to AI implementation, including compliance applications. Written for practitioners rather than technologists. Free from ukfinance.org.uk.
NetworkX Documentation (networkx.org). The complete documentation for the Python graph analytics library used in Chapter 4's code examples. Includes tutorials for AML-relevant graph analysis patterns.
For the Curious
Bishop, C.M. (2006). Pattern Recognition and Machine Learning. Springer. The classic academic textbook on machine learning. Dense and mathematical — for the reader who wants to understand the underlying theory, not just the application. Freely available from Microsoft Research.
Jurafsky, D. & Martin, J.H. (2024). Speech and Language Processing (3rd ed., online draft). The standard academic NLP textbook. Free online at web.stanford.edu/~jurafsky/slp3/. The chapters on text classification and on named entity recognition (sequence labeling) are most relevant to compliance applications; chapter numbers change between draft releases, so locate them by title.
Wasserman, S. & Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge University Press. The foundational text on graph analysis and network science. Dense but authoritative — the source material for understanding the mathematical basis of the graph analytics techniques described in Section 4.6.
Pearl, J. & Mackenzie, D. (2018). The Book of Why: The New Science of Cause and Effect. Basic Books. An accessible exploration of causal inference and of why the correlations an ML model learns should not be mistaken for causal analysis. Directly relevant to governing ML models that flag transactions as suspicious without establishing that they actually are.
Regulatory Sources
| Document | Relevance |
|---|---|
| FRB SR 11-7: Model Risk Management | US model governance standard; adopted broadly |
| EBA Guidelines on Internal Governance | EU bank governance, relevant to model oversight |
| FCA CP21/13: A New Consumer Duty | Context for explainability requirements in retail AI |
| EU AI Act (Regulation 2024/1689) | High-risk AI requirements; Chapter 30 covers in depth |
| FATF Guidance on Digital Identity | Standards for biometric and digital KYC verification |
| FinCEN 2021 AML National Priorities | Identifies TBML and other priority detection areas |
Python Libraries for RegTech Applications
| Library | Use Case | Documentation |
|---|---|---|
| scikit-learn | General ML (classification, clustering) | scikit-learn.org |
| xgboost / lightgbm | Gradient boosting (transaction scoring) | xgboost.readthedocs.io |
| shap | Explainability (feature importance per prediction) | shap.readthedocs.io |
| networkx | Graph analytics (AML network analysis) | networkx.org |
| spacy | NLP (NER, text processing) | spacy.io |
| transformers | Large language models (regulatory text) | huggingface.co/docs |
| pandas | Data manipulation (compliance data pipelines) | pandas.pydata.org |
| imbalanced-learn | Class imbalance handling | imbalanced-learn.org |