Chapter 5 Key Takeaways

Data Architecture for Regulatory Compliance


The Big Picture

Data quality and governance are prerequisites for every RegTech solution. The algorithms and the infrastructure are available; the bottleneck is almost always data that is incomplete, inconsistent, untraced, or fragmented across systems. Building a compliance data architecture that is complete, accurate, auditable, and fit for regulatory examination is the most important technical investment a compliance function can make.


Essential Points

1. BCBS 239 Is the Governance Template

Even for institutions not directly subject to Basel's risk data aggregation principles, BCBS 239's eleven principles for banks define what good compliance data governance looks like:

- Senior management accountability
- Integrated, scalable architecture
- Automated, reconciled pipelines with audit trails
- Complete capture, timely availability, adaptability for ad hoc requests

2. Four Types of Compliance Data Require Different Treatment

| Data Type | Examples | Key Quality Dimension | Key Risk |
|---|---|---|---|
| Customer | Name, DOB, KYC status | Completeness, currency | Stale KYC, duplicates |
| Transaction | Amount, parties, date | Accuracy, timeliness | Missing records, delays |
| Reference | Sanctions lists, LEIs | Currency, completeness | Using outdated lists |
| Regulatory reporting | Capital ratios, positions | Accuracy, traceability | Incorrect submissions |

3. The Six Data Quality Dimensions

| Dimension | Definition | Compliance Consequence of Failure |
|---|---|---|
| Completeness | All required records present | Missing customer records, gaps in monitoring |
| Accuracy | Data correctly reflects reality | Incorrect SAR, wrong capital calculation |
| Consistency | Same fact represented the same way across systems | Entity matching failure, monitoring gaps |
| Timeliness | Data available when needed | Late monitoring, missed fraud |
| Validity | Data conforms to required format | Report rejection, invalid screening |
| Uniqueness | Each entity represented exactly once | Monitoring split across duplicate records |
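
Three of these dimensions (completeness, validity, uniqueness) can be checked mechanically on a single dataset. The following is a minimal sketch rather than a production control: the record layout, the field names, and the `quality_report` function are assumptions introduced here for illustration. Accuracy, consistency, and timeliness are left out because testing them needs an external reference, a second system, or arrival timestamps.

```python
from collections import Counter
from datetime import date

# Assumed schema: these field names are illustrative, not a standard.
REQUIRED_FIELDS = ("customer_id", "name", "dob")

# Hypothetical customer records with one defect of each kind.
records = [
    {"customer_id": "C001", "name": "Ana Ruiz", "dob": "1985-03-14"},
    {"customer_id": "C002", "name": "Ben Okoro", "dob": "14/03/1985"},      # invalid date format
    {"customer_id": "C002", "name": "Benjamin Okoro", "dob": "1985-03-14"}, # duplicate id
    {"customer_id": "C003", "name": "", "dob": "1990-11-02"},               # missing name
]


def is_iso_date(value: str) -> bool:
    """Validity check: the value must parse as an ISO 8601 calendar date."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def quality_report(rows):
    """Return row indexes or ids that fail each of three quality dimensions."""
    # Completeness: every required field is present and non-empty.
    incomplete = [i for i, r in enumerate(rows)
                  if any(not r.get(f) for f in REQUIRED_FIELDS)]
    # Validity: date of birth conforms to the required format.
    invalid = [i for i, r in enumerate(rows)
               if r.get("dob") and not is_iso_date(r["dob"])]
    # Uniqueness: each customer_id appears exactly once.
    counts = Counter(r["customer_id"] for r in rows if r.get("customer_id"))
    duplicated = sorted(cid for cid, n in counts.items() if n > 1)
    return {
        "incomplete_rows": incomplete,
        "invalid_dob_rows": invalid,
        "duplicate_ids": duplicated,
    }


report = quality_report(records)
print(report)
```

In this sketch each failure is reported by row index or identifier instead of being silently corrected, so that the finding itself can be logged and remediated at the source.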

4. Data Lineage = Regulatory Defensibility

You must be able to trace every compliance output — every reported number, every flagged transaction, every risk rating — back to its source data. Without lineage, you cannot answer a regulator's "show me how you calculated that" question.
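
One way to picture lineage is as a graph in which every derived value records the inputs and the transformation that produced it. The sketch below is illustrative only: `LineageGraph`, `LineageNode`, and the node identifiers are hypothetical names, and real lineage tooling typically captures this metadata automatically from the pipeline instead of through manual registration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageNode:
    node_id: str
    description: str
    transformation: str = "source"  # how this node was derived
    inputs: tuple = ()              # ids of the upstream nodes it was derived from


class LineageGraph:
    """Hypothetical in-memory lineage store (illustration only)."""

    def __init__(self):
        self._nodes = {}

    def add(self, node_id, description, transformation="source", inputs=()):
        # Refuse to register an output whose upstream inputs are unknown.
        missing = [i for i in inputs if i not in self._nodes]
        if missing:
            raise KeyError(f"unknown upstream nodes: {missing}")
        self._nodes[node_id] = LineageNode(
            node_id, description, transformation, tuple(inputs)
        )

    def trace_to_sources(self, node_id):
        """Walk upstream and return the ids of all source nodes, sorted."""
        seen, stack, sources = set(), [node_id], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            node = self._nodes[current]
            if not node.inputs:
                sources.add(current)
            stack.extend(node.inputs)
        return sorted(sources)


graph = LineageGraph()
graph.add("core_banking.txn_4711", "Wire transfer record")
graph.add("kyc.customer_C001", "Customer KYC profile")
graph.add("dw.txn_enriched_4711", "Transaction joined to customer",
          "join on customer_id", ("core_banking.txn_4711", "kyc.customer_C001"))
graph.add("alert_982", "Flagged transaction",
          "rule: amount over threshold", ("dw.txn_enriched_4711",))

print(graph.trace_to_sources("alert_982"))
# ['core_banking.txn_4711', 'kyc.customer_C001']
```

Rejecting outputs with unregistered inputs is a deliberate choice in this sketch: an output that cannot name its inputs is exactly the case that cannot be defended in an examination.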

5. The Customer Golden Record Is the Foundation of Compliance

A single, authoritative customer identifier that links across all systems is required to:

- Monitor activity across all of a customer's accounts
- Apply consistent risk ratings
- Aggregate SAR history
- Meet beneficial ownership requirements
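
The linking step behind a golden record can be sketched as follows. This is a deliberately simplified, deterministic match on a normalized name plus date of birth; the functions `match_key` and `build_golden_records`, the system names, and the `G0001`-style identifiers are all assumptions made for illustration. Production master data management usually adds fuzzy or probabilistic matching and manual review of borderline cases.

```python
import unicodedata
from collections import defaultdict

# Hypothetical records for the same person held in three systems.
source_records = [
    {"system": "core_banking", "local_id": "CB-17", "name": "José  Álvarez", "dob": "1979-05-02"},
    {"system": "cards", "local_id": "CR-903", "name": "ALVAREZ, Jose", "dob": "1979-05-02"},
    {"system": "lending", "local_id": "LN-44", "name": "jose alvarez", "dob": "1979-05-02"},
    {"system": "cards", "local_id": "CR-118", "name": "Maria Chen", "dob": "1988-09-21"},
]


def match_key(name: str, dob: str):
    """Normalize a name (accents, case, punctuation, token order) and pair it with DOB."""
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    cleaned = "".join(ch if ch.isalnum() else " " for ch in ascii_name.lower())
    tokens = sorted(cleaned.split())
    return (" ".join(tokens), dob)


def build_golden_records(rows):
    """Group source records by match key and assign one golden id per group."""
    groups = defaultdict(list)
    for rec in rows:
        groups[match_key(rec["name"], rec["dob"])].append(
            (rec["system"], rec["local_id"])
        )
    return {
        f"G{n:04d}": sorted(links)
        for n, (_, links) in enumerate(sorted(groups.items()), start=1)
    }


golden = build_golden_records(source_records)
for golden_id, links in golden.items():
    print(golden_id, links)
```

Here the three spellings of the same customer collapse to one golden identifier, which is what allows monitoring, risk rating, and SAR history to be aggregated across the core banking, cards, and lending systems.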

6. Cloud Requires Regulatory Planning, Not Just Technical Planning

Cloud data architecture creates four regulatory obligations:

- Data residency: Know where your data is stored
- Exit strategy: Demonstrate you can migrate without disruption
- Audit rights: Ensure your contract gives you (and regulators) access
- Concentration risk: Assess dependency on any single cloud provider

7. Data Integration Time Is Always Underestimated

Allow 50–100% more time than vendor quotes for data integration. This is the industry norm, not pessimism. The integration between the new compliance system and the existing source systems is typically the hardest, longest, and least well-documented part of any RegTech implementation.
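
As a worked illustration of that buffer, the arithmetic can be written as a small helper. The function name `buffered_estimate` and the 12-week vendor quote are hypothetical; only the 50–100% range comes from the guidance above.

```python
def buffered_estimate(vendor_weeks: float, low: float = 0.5, high: float = 1.0):
    """Return (optimistic, conservative) planning durations in weeks.

    low and high are the fractional buffers added to the vendor quote;
    the defaults encode the 50-100% rule of thumb.
    """
    return (vendor_weeks * (1 + low), vendor_weeks * (1 + high))


optimistic, conservative = buffered_estimate(12)
print(optimistic, conservative)  # 18.0 24.0
```

A vendor quote of 12 weeks therefore becomes a planning range of 18 to 24 weeks.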


Compliance Data Architecture in Summary

Sources → Data Lake (raw) → Data Warehouse (governed) → Compliance Applications
         [Completeness]    [Quality, Lineage, MDM]      [Monitoring, Reporting, KYC]
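
The [Completeness] control at the raw landing stage can be sketched as a reconciliation between what a source system extracted and what arrived in the data lake. The `reconcile` function and the transaction identifiers below are hypothetical; in practice the comparison might also cover control totals such as summed amounts, not only record identifiers.

```python
def reconcile(source_ids, landed_ids):
    """Compare record ids extracted from a source with ids landed in the raw layer."""
    source, landed = set(source_ids), set(landed_ids)
    missing = sorted(source - landed)      # sent by the source, never landed
    unexpected = sorted(landed - source)   # landed, but not in the source extract
    return {
        "source_count": len(source),
        "landed_count": len(landed),
        "missing_in_lake": missing,
        "unexpected_in_lake": unexpected,
        "reconciled": not missing and not unexpected,
    }


result = reconcile(["T1", "T2", "T3", "T4"], ["T1", "T2", "T4", "T9"])
print(result)
```

Note that the two counts match in this example (four records on each side) even though the load is not reconciled, which is why a count-only check is a weaker control than comparing identifiers.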

Self-Check Questions

  1. A peer argues: "We have great ML models — data quality is a detail we can fix later." What is the strongest response to this argument?
  2. What is data lineage, and why is it a regulatory requirement, not just an operational nicety?
  3. A financial institution discovers that the same customer is registered under slightly different names in three different systems. Which data quality dimension has failed, and what approach would you use to fix it?
  4. What are the four regulatory obligations that cloud adoption creates for financial institutions?
  5. Why is the customer golden record important for AML compliance specifically?