Chapter 4 Key Takeaways
Technology Foundations: AI, ML, NLP, and Automation in Compliance
The Big Picture
RegTech solutions use a range of technologies — rules, machine learning, NLP, RPA, and graph analytics — each suited to different types of compliance problems. Compliance professionals who understand these technologies at a conceptual level can govern, evaluate, and be accountable for the systems they are responsible for. Those who cannot are at a systematic disadvantage when working with vendors, technologists, and regulators.
Essential Points
1. Match Technology to Problem Type
| Problem Type | Technology Best Fit |
|---|---|
| Detection (find needles in haystack) | ML classification, rule-based + ML hybrid |
| Verification (confirm a fact is true) | Computer vision, database lookup, APIs |
| Calculation/aggregation | Deterministic engines, data pipelines |
| Text understanding | NLP, large language models |
Don't use ML where a deterministic calculation engine is needed. Don't use RPA where a proper API integration is feasible.
2. Rules Are Deterministic, Auditable — and Gameable
Rules-based systems are transparent and easy to audit. They are also brittle, labor-intensive to maintain, and gameable by anyone who knows the rules. They produce high false positive rates at the sensitivity levels regulators require. They are necessary but insufficient for sophisticated compliance.
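To make "auditable but gameable" concrete, the sketch below applies a single deterministic rule to two sets of deposits. The 10,000 threshold, the `Transaction` type, and the account names are assumptions chosen only for illustration, not figures from any specific regulation or system.

```python
from dataclasses import dataclass

# Hypothetical reporting threshold, chosen only for illustration.
THRESHOLD = 10_000


@dataclass
class Transaction:
    account: str
    amount: float


def flag_large_cash(tx: Transaction, threshold: float = THRESHOLD) -> bool:
    """Deterministic rule: flag any single transaction at or above the threshold."""
    return tx.amount >= threshold


# One large deposit is caught by the rule...
single = [Transaction("A-1", 12_000)]

# ...but the same total, split into smaller deposits, passes unflagged.
structured = [
    Transaction("A-2", 4_000),
    Transaction("A-2", 4_000),
    Transaction("A-2", 4_000),
]

single_flags = [flag_large_cash(t) for t in single]
structured_flags = [flag_large_cash(t) for t in structured]
structured_total = sum(t.amount for t in structured)

print(single_flags)                        # flags for the single deposit
print(structured_flags, structured_total)  # flags and total for the split deposits
```

The rule is easy to explain to an auditor because its logic is one comparison, and that same simplicity is what lets someone who knows the threshold stay under it.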
3. The Four ML Metrics You Must Know
| Metric | What It Measures |
|---|---|
| Precision | Of flagged transactions, % that are truly suspicious |
| Recall | Of all suspicious transactions, % that were caught |
| F1 Score | Balanced measure combining precision and recall |
| AUC-ROC | Overall discriminative ability across thresholds |
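High recall = fewer missed suspicious transactions (lower regulatory risk). High precision = fewer false positives (lower analyst burden). Lowering the alert threshold catches more suspicious activity but flags more legitimate activity, and raising it does the reverse, so you cannot maximize both simultaneously. The threshold is a compliance decision.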
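The four metrics can be computed by hand on a small example. The sketch below uses ten invented transactions (three truly suspicious) with made-up model scores, and evaluates three candidate thresholds. All labels, scores, thresholds, and function names are illustrative assumptions.

```python
def precision_recall_f1(labels, scores, threshold):
    """Flag every score at or above the threshold, then compute the three metrics."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


def auc_roc(labels, scores):
    """Share of (suspicious, legitimate) pairs where the suspicious case
    scores higher; ties count as half. Independent of any threshold."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented data: 1 = truly suspicious, 0 = legitimate.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.5, 0.3, 0.2, 0.2, 0.1, 0.1]

results = {t: precision_recall_f1(labels, scores, t) for t in (0.35, 0.60, 0.85)}
auc = auc_roc(labels, scores)

for threshold, (p, r, f1) in results.items():
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"AUC-ROC={auc:.3f}")
```

On this toy data the lowest threshold catches all three suspicious transactions at the cost of two false alerts, while the highest threshold produces no false alerts but misses two of the three. AUC-ROC stays the same whichever threshold is chosen, which is why it describes the model and the threshold describes the policy.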
4. NLP Makes Text Machine-Readable
- Text classification → categorize regulatory updates, news, communications
- Named entity recognition → extract names, organizations, obligations from text
- Semantic search → find regulatory requirements without exact keyword match
- LLMs → powerful, but risk of hallucination makes them unsuitable for definitive compliance determinations without human review
5. RPA ≠ AI — But Delivers Real Value
RPA automates repetitive clicks and data entry. It does not learn or adapt. Its value: 60-80% effort reduction on high-volume manual processes (report population, alert enrichment, list refresh). Its limitation: fragile to interface changes; not a substitute for proper system integration.
6. Graph Analytics Sees What Transaction-Level Analysis Cannot
Financial crime networks are not visible at the transaction level. Graph analytics reveals:
- Money mule networks (many nodes receiving and forwarding funds)
- Circular payment schemes (funds returning to source)
- Shell company webs (shared controllers across entities)
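Two of these patterns can be sketched with a few lines of plain Python over a list of payment edges. The account names, the `min_senders` cutoff, and the helper functions below are hypothetical, and a production system would more likely use a graph database or a dedicated graph library than this minimal depth-first search.

```python
from collections import defaultdict

# Hypothetical payment edges (sender, receiver); account names are invented.
payments = [
    ("A", "B"), ("B", "C"), ("C", "A"),  # funds return to A: circular scheme
    ("D", "M"), ("E", "M"), ("F", "M"),  # many senders pay into M...
    ("M", "X"),                          # ...which forwards onward: mule-like
]


def build_graph(edges):
    """Adjacency map: sender -> set of receivers."""
    graph = defaultdict(set)
    for sender, receiver in edges:
        graph[sender].add(receiver)
    return graph


def find_cycle(graph):
    """Return one directed cycle as a list of nodes, or None (depth-first search)."""
    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = defaultdict(int)
    path = []

    def visit(node):
        state[node] = ON_PATH
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            if state[nxt] == ON_PATH:          # reached a node already on the path
                return path[path.index(nxt):]
            if state[nxt] == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in sorted(graph):
        if state[node] == UNSEEN:
            found = visit(node)
            if found:
                return found
    return None


def fan_in_forwarders(edges, min_senders=3):
    """Accounts that receive from many distinct senders and also send funds on."""
    senders, receivers = defaultdict(set), defaultdict(set)
    for sender, receiver in edges:
        senders[receiver].add(sender)
        receivers[sender].add(receiver)
    return sorted(n for n in senders
                  if len(senders[n]) >= min_senders and receivers.get(n))


cycle = find_cycle(build_graph(payments))
mule_candidates = fan_in_forwarders(payments)
print(cycle, mule_candidates)
```

No single edge in `payments` looks unusual on its own; the cycle and the fan-in pattern only appear once the edges are viewed together, which is the point of moving from transaction-level to network-level analysis.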
7. AI Readiness Has Four Dimensions
| Dimension | Key Questions |
|---|---|
| Data | Clean? Complete? Labeled? Traceable? |
| Technology | Cloud? APIs? MLOps? |
| Governance | Model inventory? Validation? Documentation? Explainability? |
| People | Can analysts interpret scores? Do they understand model limits? |
Most organizations are Data-ready before they are Governance-ready.
Self-Check Questions
- A vendor claims its AI system reduces AML false positives by 50%. What three questions would you ask before accepting this claim?
- Explain the precision-recall trade-off in plain language, using an AML monitoring scenario.
- Why is graph analytics important for AML, when most transaction monitoring systems monitor individual transactions?
- What is the difference between RPA and machine learning? Give a compliance use case where each is the appropriate choice.
- Why are large language models risky for generating definitive compliance determinations without human review?