Further Reading
Chapter 5: Data Architecture for Regulatory Compliance
Essential Reading
Basel Committee on Banking Supervision (2013). BCBS 239: Principles for Effective Risk Data Aggregation and Risk Reporting. Bank for International Settlements. The foundational document for compliance data governance in banking. It sets out fourteen principles, eleven addressed to banks and three to supervisors. Even firms not directly subject to BCBS 239 will find that the bank-facing principles provide the most comprehensive framework available for compliance data governance. Free at bis.org.
DAMA International (2017). DAMA-DMBOK: Data Management Body of Knowledge (2nd ed.). Technics Publications. The standard reference for data management practitioners. Chapter 3 (data governance) and Chapter 13 (data quality) are most relevant to compliance applications. Comprehensive, detailed, and practical.
Kimball, R. & Ross, M. (2013). The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling (3rd ed.). Wiley. The classic text on data warehouse design. Compliance data warehouses need to be built on sound architectural principles; this book provides the foundation. Technical but accessible.
For Practitioners
Information Commissioner's Office (UK). Guide to the UK GDPR. The ICO's comprehensive guide to UK GDPR requirements. Free at ico.org.uk. Essential for anyone designing data architecture that involves personal data of UK residents.
Article 29 Working Party (2017). Guidelines on the Right to Data Portability (WP 242 rev.01). Later endorsed by the European Data Protection Board. Explains the GDPR right to data portability in practical terms; relevant for designing systems that can respond to data subject requests.
FATF (2020). Guidance on Digital Identity. FATF's guidance on using digital identity solutions for AML/CFT compliance, including the data standards and verification approaches that satisfy KYC obligations. Free at fatf-gafi.org.
IBM Data Governance Council (various). Data Governance White Papers. IBM has published a series of practitioner-level papers on data governance implementation. While commercially motivated, the frameworks are sound and practically useful.
Domo (annual). Data Never Sleeps. Annual infographic showing the volume of data generated globally per minute. Useful for calibrating the scale of data management challenges in financial services.
For the Curious
Stonebraker, M. & Çetintemel, U. (2005). "One Size Fits All": An Idea Whose Time Has Come and Gone. Proceedings of ICDE 2005. A seminal paper on why different data workloads require different database architectures. Relevant to the data lake vs. data warehouse architectural choice described in Section 5.7.
Kleppmann, M. (2017). Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O'Reilly Media. The best technical introduction to modern data systems architecture — databases, data pipelines, stream processing. Not specifically about compliance, but essential for compliance data architects.
Louppe, G. (2014). Understanding Random Forests: From Theory to Practice. PhD thesis, Université de Liège. For the technically curious: a rigorous treatment of decision trees and random forests, the tree-based ensemble methods that, alongside gradient boosting, are among the ML approaches most used in compliance scoring. Free on arXiv.
European Union Agency for Fundamental Rights (2018). Handbook on European Data Protection Law. The definitive reference on EU data protection law, including GDPR. Free and comprehensive.
Regulatory Sources for Data Architecture
| Document | Source | Relevance |
|---|---|---|
| BCBS 239: Risk Data Aggregation | bis.org | Foundation for compliance data governance |
| FCA PS21/3: Building Operational Resilience | fca.org.uk | Cloud and data architecture for FCA-regulated firms |
| EBA Guidelines on ICT and Security Risk | eba.europa.eu | EU data architecture requirements |
| GDPR (Regulation (EU) 2016/679) | eur-lex.europa.eu | EU data protection architecture requirements |
| UK GDPR (as retained EU law) | legislation.gov.uk | UK-specific GDPR requirements |
| OCC Bulletin 2020-10: Third-Party Risk | occ.gov | US cloud and data vendor management |
| NIST SP 800-53: Security and Privacy Controls | nist.gov | US federal data security standards |
Python Libraries for Data Quality and Architecture
| Library | Use Case |
|---|---|
| great_expectations | Data quality validation and documentation |
| pandas_profiling / ydata-profiling | Automated data quality reports |
| apache-airflow | Data pipeline orchestration |
| sqlalchemy | Database ORM for data lineage tracking |
| dbt (data build tool) | SQL-based data transformation with lineage |
| delta-lake | ACID transactions on data lakes (Azure/AWS) |
| apache-spark | Large-scale data processing |
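
To make "data quality validation" concrete, here is a minimal sketch of the kinds of checks (completeness, uniqueness, validity) that dedicated validation libraries automate and document at scale. It uses only the Python standard library and does not reproduce any listed library's API. The record fields, sample values, and function names are hypothetical and chosen purely for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical customer records; field names and values are illustrative only.
RECORDS = [
    {"customer_id": "C001", "country": "GB", "date_of_birth": "1980-04-12"},
    {"customer_id": "C002", "country": "", "date_of_birth": "1975-11-30"},
    {"customer_id": "C002", "country": "FR", "date_of_birth": "30/11/1975"},
]

# Assumed validity rule: dates must be written as YYYY-MM-DD.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


@dataclass
class CheckResult:
    """Outcome of one check: its name and the indexes of failing rows."""
    name: str
    failing_rows: list = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return not self.failing_rows


def check_not_empty(records, column):
    """Completeness: every row has a non-empty value in `column`."""
    failing = [i for i, row in enumerate(records) if not row.get(column)]
    return CheckResult(f"{column} is populated", failing)


def check_unique(records, column):
    """Uniqueness: flag every row whose value was already seen earlier."""
    seen = set()
    failing = []
    for i, row in enumerate(records):
        value = row.get(column)
        if value in seen:
            failing.append(i)
        seen.add(value)
    return CheckResult(f"{column} is unique", failing)


def check_matches(records, column, pattern):
    """Validity: every value in `column` matches the compiled `pattern`."""
    failing = [
        i for i, row in enumerate(records)
        if not pattern.match(row.get(column, ""))
    ]
    return CheckResult(f"{column} matches {pattern.pattern}", failing)


def run_checks(records):
    """Run the three illustrative checks and return their results in order."""
    return [
        check_not_empty(records, "country"),
        check_unique(records, "customer_id"),
        check_matches(records, "date_of_birth", ISO_DATE),
    ]


if __name__ == "__main__":
    for result in run_checks(RECORDS):
        status = "PASS" if result.passed else f"FAIL rows {result.failing_rows}"
        print(f"{result.name}: {status}")
```

Recording the failing row indexes, rather than a bare pass/fail flag, is a deliberate choice: compliance teams generally need to trace each data quality exception back to the specific record for remediation and audit evidence.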
Professional Standards and Certifications
CDMP (Certified Data Management Professional) — DAMA International's professional certification for data management practitioners. Relevant for compliance data architects.
CIPM (Certified Information Privacy Manager) — IAPP certification focused on privacy program management. Relevant for privacy-compliance intersection roles.
CISA (Certified Information Systems Auditor) — ISACA certification with data governance components. Relevant for compliance technology audit roles.