Further Reading

Chapter 4: Technology Foundations: AI, ML, NLP, and Automation in Compliance


Essential Reading

Géron, A. (2022). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (3rd ed.). O'Reilly Media. The most accessible technical introduction to machine learning for practitioners. You do not need to read it cover to cover — Chapters 1-4 (foundations) and Chapter 7 (ensemble methods, including gradient boosting) are most relevant. Available at your local library.

Board of Governors of the Federal Reserve System (2011). SR 11-7: Supervisory Guidance on Model Risk Management. The US model risk management standard, issued jointly with the OCC. Even if you operate outside the US, SR 11-7's framework for model inventory, validation, and documentation is the closest thing to an international standard. Essential reading for anyone responsible for governing ML systems in compliance. Free at federalreserve.gov.

FATF (2021). Opportunities and Challenges of New Technologies for AML/CFT. FATF's comprehensive review of how AI, machine learning, and digital identity technologies can be applied to AML/CFT compliance, along with their associated risks and governance challenges. Free at fatf-gafi.org.


For Practitioners

Interpretable Machine Learning by Christoph Molnar (molnar.fyi/interpretable-ml). A free online book covering explainability techniques (SHAP, LIME, partial dependence plots) in accessible language. Essential for anyone implementing or governing explainable AI in compliance contexts.

Wolf, M. & Zaboli, A. (2021). Machine Learning in AML: Finding the Balance Between Automation and Human Oversight. Journal of Financial Crime. A practitioner-oriented paper on the organizational challenges of deploying ML in AML programs, with specific attention to analyst workflow integration and governance.

UK Finance (2022). Artificial Intelligence in Financial Services: A Guide for Practitioners. The UK banking trade association's practical guide to AI implementation, including compliance applications. Written for practitioners rather than technologists. Free from ukfinance.org.uk.

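NetworkX Documentation (networkx.org). The complete documentation for the Python graph analytics library used in Chapter 4's code examples. Its tutorial and algorithm reference cover the graph analysis patterns most relevant to AML work, such as centrality, connected components, and cycle detection.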


For the Curious

Bishop, C.M. (2006). Pattern Recognition and Machine Learning. Springer. The classic academic textbook on machine learning. Dense and mathematical — for the reader who wants to understand the underlying theory, not just the application. Freely available from Microsoft Research.

Jurafsky, D. & Martin, J.H. (2024). Speech and Language Processing (3rd ed., online draft). The standard academic NLP textbook. Free online at web.stanford.edu/~jurafsky/slp3/. The chapters on text classification and on sequence labeling for named entity recognition are most relevant to compliance applications. Chapter numbers shift between draft releases, so locate these chapters by title rather than by number.

Wasserman, S. & Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge University Press. The foundational text on graph analysis and network science. Dense but authoritative — the source material for understanding the mathematical basis of the graph analytics techniques described in Section 4.6.

Pearl, J. & Mackenzie, D. (2018). The Book of Why: The New Science of Cause and Effect. Basic Books. An accessible exploration of causal inference, and of why the correlations an ML model learns should not be confused with causal analysis. Directly relevant to the governance of ML models that flag transactions as suspicious without establishing that the underlying activity actually is.


Regulatory Sources

| Document | Relevance |
|---|---|
| FRB SR 11-7: Model Risk Management | US model governance standard; adopted broadly |
| EBA Guidelines on Internal Governance | EU bank governance, relevant to model oversight |
| FCA CP21/13: A New Consumer Duty | Context for explainability requirements in retail AI |
| EU AI Act (Regulation (EU) 2024/1689) | High-risk AI requirements; Chapter 30 covers in depth |
| FATF Guidance on Digital Identity | Standards for biometric and digital KYC verification |
| FinCEN 2021 AML National Priorities | Identifies TBML and other priority detection areas |

Python Libraries for RegTech Applications

| Library | Use Case | Documentation |
|---|---|---|
| scikit-learn | General ML (classification, clustering) | scikit-learn.org |
| xgboost / lightgbm | Gradient boosting (transaction scoring) | xgboost.readthedocs.io / lightgbm.readthedocs.io |
| shap | Explainability (feature importance per prediction) | shap.readthedocs.io |
| networkx | Graph analytics (AML network analysis) | networkx.org |
| spacy | NLP (NER, text processing) | spacy.io |
| transformers | Large language models (regulatory text) | huggingface.co/docs |
| pandas | Data manipulation (compliance data pipelines) | pandas.pydata.org |
| imbalanced-learn | Class imbalance handling | imbalanced-learn.org |
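
As a starting point for exploring the networkx entry above, the following is a minimal sketch of two graph patterns often discussed in AML network analysis: fan-in (one account receiving from many senders) and circular flows. It is not taken from Chapter 4. The transfer data, the function names, and the threshold of three senders are all hypothetical, chosen only for illustration.

```python
import networkx as nx

# Hypothetical transfers: (sender, receiver, amount). Illustrative data only.
transfers = [
    ("A", "B", 9500.0),
    ("B", "C", 9400.0),
    ("C", "A", 9300.0),  # closes a loop A -> B -> C -> A
    ("D", "E", 1200.0),
    ("F", "E", 800.0),
    ("G", "E", 950.0),   # E receives from three distinct senders
]


def build_graph(rows):
    """Build a directed graph with one edge per sender/receiver pair."""
    g = nx.DiGraph()
    g.add_weighted_edges_from(rows, weight="amount")
    return g


def fan_in_accounts(g, min_senders=3):
    """Accounts receiving from at least `min_senders` distinct senders.

    The threshold is an assumption for this sketch, not a regulatory value.
    """
    return sorted(n for n, deg in g.in_degree() if deg >= min_senders)


def circular_flows(g):
    """Membership of each directed cycle in the graph.

    Nodes are sorted for stable display, so traversal order is not preserved.
    """
    return [sorted(cycle) for cycle in nx.simple_cycles(g)]


graph = build_graph(transfers)
hubs = fan_in_accounts(graph)
loops = circular_flows(graph)
print("Fan-in accounts:", hubs)
print("Circular flows:", loops)
```

On real data, cycle enumeration can become expensive as the graph grows, so a production pipeline would likely restrict the search (for example by time window or amount) before calling it.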