Further Reading

Chapter 5: Data Architecture for Regulatory Compliance


Essential Reading

Basel Committee on Banking Supervision (2013). BCBS 239: Principles for Effective Risk Data Aggregation and Risk Reporting. Bank for International Settlements. The foundational document for compliance data governance in banking. Even firms not directly subject to BCBS 239 will find that its fourteen principles (eleven addressed to banks, three to supervisors) provide the most comprehensive framework available for compliance data governance. Free at bis.org.

DAMA International (2017). DAMA-DMBOK: Data Management Body of Knowledge (2nd ed.). Technics Publications. The standard reference for data management practitioners. Chapter 3 (data governance) and Chapter 13 (data quality) are most relevant to compliance applications. Comprehensive, detailed, and practical.

Kimball, R. & Ross, M. (2013). The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling (3rd ed.). Wiley. The classic text on data warehouse design. Compliance data warehouses need to be built on sound architectural principles; this book provides the foundation. Technical but accessible.


For Practitioners

Information Commissioner's Office (UK). Guide to the UK GDPR. The ICO's comprehensive guide to UK GDPR requirements. Free at ico.org.uk. Essential for anyone designing data architecture that involves personal data of UK residents.

Article 29 Working Party (2017). Guidelines on the Right to Data Portability (WP 242 rev.01). Subsequently endorsed by the European Data Protection Board. Explains the GDPR right to data portability in practical terms. Relevant for designing systems that can respond to data subject requests.

FATF (2020). Guidance on Digital Identity. FATF's guidance on using digital identity solutions for AML/CFT compliance, including the data standards and verification approaches that satisfy KYC obligations. Free at fatf-gafi.org.

IBM Data Governance Council (various). Data Governance White Papers. IBM has published a series of practitioner-level papers on data governance implementation. While commercially motivated, the frameworks are sound and practically useful.

Domo (annual). Data Never Sleeps. Annual infographic showing the volume of data generated globally per minute. Useful for calibrating the scale of data management challenges in financial services.


For the Curious

Stonebraker, M. & Çetintemel, U. (2005). "One Size Fits All": An Idea Whose Time Has Come and Gone. ICDE 2005. A seminal paper on why different data workloads require different database architectures. Relevant to the data lake vs. data warehouse architectural choice described in Section 5.7.

Kleppmann, M. (2017). Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O'Reilly Media. The best technical introduction to modern data systems architecture — databases, data pipelines, stream processing. Not specifically about compliance, but essential for compliance data architects.

Louppe, G. (2014). Understanding Random Forests: From Theory to Practice. PhD thesis, Université de Liège. For the technically curious: a rigorous treatment of decision trees and random forests, which together with gradient boosting make up the tree-ensemble methods most used in compliance scoring. Free on arXiv.

European Union Agency for Fundamental Rights (2018). Handbook on European Data Protection Law. The definitive reference on EU data protection law, including GDPR. Free and comprehensive.


Regulatory Sources for Data Architecture

| Document | Source | Relevance |
| --- | --- | --- |
| BCBS 239: Risk Data Aggregation | bis.org | Foundation for compliance data governance |
| FCA PS21/3: Building Operational Resilience | fca.org.uk | Cloud and data architecture for FCA-regulated firms |
| EBA Guidelines on ICT and Security Risk Management | eba.europa.eu | EU data architecture requirements |
| GDPR (Regulation (EU) 2016/679) | eur-lex.europa.eu | EU data protection architecture requirements |
| UK GDPR (as retained EU law) | legislation.gov.uk | UK-specific GDPR requirements |
| OCC Bulletin 2020-10: Third-Party Relationships (FAQs) | occ.gov | US cloud and data vendor management |
| NIST SP 800-53: Security and Privacy Controls | nist.gov | US federal data security standards |

Python Libraries for Data Quality and Architecture

| Library | Use Case |
| --- | --- |
| great_expectations | Data quality validation and documentation |
| ydata-profiling (formerly pandas-profiling) | Automated data quality reports |
| apache-airflow | Data pipeline orchestration |
| sqlalchemy | SQL toolkit and ORM; a database access layer through which lineage metadata can be recorded |
| dbt (data build tool) | SQL-based data transformation with lineage |
| delta-lake | ACID transactions on data lakes (for example on Azure or AWS storage) |
| pyspark (Apache Spark) | Large-scale data processing |
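
To make the "data quality validation" use case concrete, the sketch below shows the kind of rule a validation library such as great_expectations is designed to automate and document. It deliberately uses only the Python standard library, so it does not represent any listed library's API. The record layout, the column names, and the helper functions (`check_not_null`, `check_unique`, `check_in_set`) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    failing_rows: list  # indices of the records that failed the check


def check_not_null(records, column):
    """Completeness: every record has a non-empty value in `column`."""
    failing = [i for i, r in enumerate(records) if r.get(column) in (None, "")]
    return CheckResult(f"{column} is not null", not failing, failing)


def check_unique(records, column):
    """Uniqueness: flag every record that repeats an earlier value."""
    seen = set()
    failing = []
    for i, r in enumerate(records):
        value = r.get(column)
        if value in seen:
            failing.append(i)
        seen.add(value)
    return CheckResult(f"{column} is unique", not failing, failing)


def check_in_set(records, column, allowed):
    """Validity: the value of `column` belongs to an agreed reference set."""
    failing = [i for i, r in enumerate(records) if r.get(column) not in allowed]
    return CheckResult(f"{column} in {sorted(allowed)}", not failing, failing)


# Hypothetical customer records (illustrative data, not from the chapter).
customers = [
    {"customer_id": "C001", "country": "GB", "risk_rating": "low"},
    {"customer_id": "C002", "country": "", "risk_rating": "high"},
    {"customer_id": "C002", "country": "FR", "risk_rating": "severe"},
]

results = [
    check_not_null(customers, "country"),
    check_unique(customers, "customer_id"),
    check_in_set(customers, "risk_rating", {"low", "medium", "high"}),
]

for r in results:
    status = "PASS" if r.passed else "FAIL"
    print(f"{status}  {r.name}  failing rows: {r.failing_rows}")
```

Each check returns the failing row indices rather than a bare pass/fail flag, so that the result can be stored as evidence alongside the data. The libraries in the table add what this sketch omits: declarative rule definitions, scheduling, and generated documentation.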

Professional Standards and Certifications

CDMP (Certified Data Management Professional) — DAMA International's professional certification for data management practitioners. Relevant for compliance data architects.

CIPM (Certified Information Privacy Manager) — IAPP certification focused on privacy program management. Relevant for privacy-compliance intersection roles.

CISA (Certified Information Systems Auditor) — ISACA certification with data governance components. Relevant for compliance technology audit roles.