Chapter 4 Key Takeaways

DataField.Dev


Technology Foundations: AI, ML, NLP, and Automation in Compliance

The Big Picture

RegTech solutions use a range of technologies — rules, machine learning, NLP, RPA, and graph analytics — each suited to different types of compliance problems. Compliance professionals who understand these technologies at a conceptual level can govern, evaluate, and be accountable for the systems they are responsible for. Those who cannot are at a systematic disadvantage when working with vendors, technologists, and regulators.

Essential Points

1. Match Technology to Problem Type

Problem Type	Technology Best Fit
Detection (find needles in a haystack)	ML classification, rule-based + ML hybrid
Verification (confirm a fact is true)	Computer vision, database lookup, APIs
Calculation/aggregation	Deterministic engines, data pipelines
Text understanding	NLP, large language models

Don't use ML where a deterministic calculation engine is needed. Don't use RPA where a proper API integration is feasible.

2. Rules Are Deterministic, Auditable — and Gameable

Rules-based systems are transparent and easy to audit. They are also brittle, labor-intensive to maintain, and gameable by anyone who knows the rules. They produce high false positive rates at the sensitivity levels regulators require. They are necessary but insufficient for sophisticated compliance.
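To make the "gameable" point concrete, here is a minimal sketch of a single-transaction threshold rule. The threshold value, the `Transaction` type, and the account names are hypothetical and chosen only for illustration; they are not taken from any specific regulation or product.

```python
from dataclasses import dataclass

# Hypothetical threshold, chosen only for illustration.
THRESHOLD = 10_000


@dataclass(frozen=True)
class Transaction:
    account: str
    amount: float


def flag_large_transactions(transactions, threshold=THRESHOLD):
    """Deterministic rule: flag any single transaction at or above the threshold."""
    return [t for t in transactions if t.amount >= threshold]


# One large deposit is caught by the rule.
single = [Transaction("A-1", 12_000)]

# The same total, split into three smaller deposits, passes unflagged.
structured = [Transaction("A-2", 4_000) for _ in range(3)]

print(len(flag_large_transactions(single)))      # 1
print(len(flag_large_transactions(structured)))  # 0
```

The rule is easy to audit because its behavior is fully specified by one comparison. The same property lets anyone who knows the threshold stay just under it, which is why such rules are usually paired with aggregation logic or ML-based detection.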

3. The Four ML Metrics You Must Know

Metric	What It Measures
Precision	Of flagged transactions, % that are truly suspicious
Recall	Of all suspicious transactions, % that were caught
F1 Score	Balanced measure combining precision and recall
AUC-ROC	Overall discriminative ability across thresholds

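High recall = fewer missed suspicious transactions (lower regulatory risk). High precision = fewer false positives (lower analyst burden). For a given model you generally cannot maximize both at once: lowering the alert threshold tends to raise recall and lower precision, and raising it tends to do the opposite. Where to set the threshold is a compliance decision.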
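The trade-off can be seen by computing the metrics at two thresholds. The sketch below uses the standard definitions of precision, recall, and F1; the scores and labels are invented for illustration and do not come from any real monitoring system.

```python
def precision_recall_f1(scores, labels, threshold):
    """Compute precision, recall, and F1 when flagging scores >= threshold."""
    tp = fp = fn = 0
    for score, is_suspicious in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_suspicious:
            tp += 1          # correctly flagged
        elif flagged and not is_suspicious:
            fp += 1          # false positive (analyst burden)
        elif not flagged and is_suspicious:
            fn += 1          # missed case (regulatory risk)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical model scores and ground truth (1 = truly suspicious).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

for threshold in (0.85, 0.35):
    p, r, f1 = precision_recall_f1(scores, labels, threshold)
    print(f"{threshold}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# 0.85: precision=1.00 recall=0.50 f1=0.67
# 0.35: precision=0.67 recall=1.00 f1=0.80
```

With this toy data, the strict threshold produces no false positives but misses half of the suspicious transactions, while the loose threshold catches all of them at the cost of more alerts for analysts to review.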

4. NLP Makes Text Machine-Readable

Text classification → categorize regulatory updates, news, communications
Named entity recognition → extract names, organizations, obligations from text
Semantic search → find regulatory requirements without exact keyword match
LLMs → powerful, but the risk of hallucination makes them unsuitable for definitive compliance determinations without human review
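As a rough illustration of the first item, text classification, the toy sketch below routes a regulatory update to a category by counting keyword overlaps. The categories and keyword sets are hypothetical. A production classifier would normally be a trained model rather than a hand-written keyword list, so treat this only as a picture of the input and output, not of the method.

```python
import re

# Hypothetical categories and keyword sets, invented for illustration.
CATEGORY_KEYWORDS = {
    "aml": {"laundering", "suspicious", "sanctions", "beneficial"},
    "privacy": {"personal", "consent", "data", "breach"},
    "reporting": {"filing", "disclosure", "report", "deadline"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def classify(text):
    """Return the category with the most keyword hits, or 'unclassified'."""
    tokens = tokenize(text)
    scores = {cat: len(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unclassified"


print(classify("New guidance on suspicious activity and money laundering controls"))
print(classify("Firms must obtain consent before processing personal data"))
print(classify("Quarterly board meeting agenda"))
```

Items that fall into "unclassified" are the natural candidates for human triage, which mirrors the human-review point made for LLMs above.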

5. RPA ≠ AI — But Delivers Real Value

RPA automates repetitive clicks and data entry. It does not learn or adapt. Its value: 60-80% effort reduction on high-volume manual processes (report population, alert enrichment, list refresh). Its limitation: fragile to interface changes; not a substitute for proper system integration.

6. Graph Analytics Sees What Transaction-Level Analysis Cannot

Financial crime networks are not visible at the transaction level. Graph analytics reveals:
- Money mule networks (many nodes receiving and forwarding funds)
- Circular payment schemes (funds returning to source)
- Shell company webs (shared controllers across entities)
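Two of these patterns can be sketched with a small directed graph of payments. The edge list, the function names, and the `min_senders` cut-off are all assumptions made for illustration. The path search is a naive depth-first walk suited to tiny examples; real systems would typically rely on a graph database or a dedicated graph library.

```python
from collections import defaultdict

# Hypothetical payment edges (sender, receiver), invented for illustration.
payments = [
    ("A", "B"), ("B", "C"), ("C", "A"),     # funds return to A
    ("P1", "M"), ("P2", "M"), ("P3", "M"),  # many payers into M
    ("M", "X"),                             # M forwards funds onward
]


def build_graph(edges):
    """Map each sender to the set of accounts it has paid."""
    graph = defaultdict(set)
    for sender, receiver in edges:
        graph[sender].add(receiver)
    return graph


def find_cycle_from(graph, start):
    """Return one payment path that leaves `start` and returns to it, else None."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        for nxt in sorted(graph.get(node, ())):
            if nxt == start:
                return path + [start]
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None


def fan_in_forwarders(edges, min_senders=3):
    """Accounts that receive from many distinct senders and also pay onward."""
    senders = defaultdict(set)
    receivers = defaultdict(set)
    for sender, receiver in edges:
        senders[receiver].add(sender)
        receivers[sender].add(receiver)
    return sorted(
        n for n in senders if len(senders[n]) >= min_senders and receivers[n]
    )


graph = build_graph(payments)
print(find_cycle_from(graph, "A"))   # ['A', 'B', 'C', 'A']
print(find_cycle_from(graph, "P1"))  # None
print(fan_in_forwarders(payments))   # ['M']
```

No single edge in this data looks unusual on its own. The circular flow and the collect-and-forward account only appear once the payments are viewed as a connected structure.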

7. AI Readiness Has Four Dimensions

Dimension	Key Questions
Data	Clean? Complete? Labeled? Traceable?
Technology	Cloud? APIs? MLOps?
Governance	Model inventory? Validation? Documentation? Explainability?
People	Can analysts interpret scores? Do they understand model limits?

Most organizations are Data-ready before they are Governance-ready.

Self-Check Questions

A vendor claims its AI system reduces AML false positives by 50%. What three questions would you ask before accepting this claim?
Explain the precision-recall trade-off in plain language, using an AML monitoring scenario.
Why is graph analytics important for AML, when most transaction monitoring systems monitor individual transactions?
What is the difference between RPA and machine learning? Give a compliance use case where each is the appropriate choice.
Why are large language models risky for generating definitive compliance determinations without human review?