Further Reading: Data Governance Frameworks and Institutions
The sources below provide deeper engagement with the themes introduced in Chapter 22. They include foundational frameworks, practitioner guides, academic research, and resources for the chapter's Python components.
Foundational Frameworks and Standards
DAMA International. DAMA-DMBOK: Data Management Body of Knowledge. 2nd ed. Bradley Beach, NJ: Technics Publications, 2017. The definitive reference for data management and governance, organizing the discipline into eleven knowledge areas. The DMBOK is to data governance what the PMBOK is to project management — a comprehensive body of knowledge that establishes shared vocabulary, concepts, and best practices. Essential for any student or practitioner seeking to implement the governance structures described in this chapter.
ISO 8000 Series. Data Quality — Parts 1-150. International Organization for Standardization. The international standard for data quality, covering concepts, vocabulary, measurement, and management. While more technical and less accessible than the DAMA-DMBOK, the ISO 8000 series provides the formal, internationally recognized framework for data quality that organizations can reference in contracts, audits, and regulatory compliance.
Ladley, John. Data Governance: How to Design, Deploy, and Sustain an Effective Data Governance Program. 2nd ed. Cambridge, MA: Academic Press, 2019. The most practical guide to building a data governance program from the ground up. Ladley covers organizational design, policy development, stewardship roles, executive sponsorship, change management, and sustainability — all the operational challenges that the DAMA-DMBOK framework describes in principle but that Ladley addresses in practice. Directly relevant to the NovaCorp case study.
Data Quality: Theory and Practice
Redman, Thomas C. Data Driven: Profiting from Your Most Important Business Asset. Boston: Harvard Business Review Press, 2008. Redman makes the business case for data quality with the same logic Ray Zhao used at NovaCorp — quantifying the cost of poor data and demonstrating the return on quality investment. Though published before the current data governance wave, Redman's core argument remains timely: organizations cannot be "data-driven" if their data is unreliable.
Batini, Carlo, and Monica Scannapieco. Data and Information Quality: Dimensions, Principles and Techniques. Cham: Springer, 2016. The most comprehensive academic treatment of data quality dimensions, measurement techniques, and improvement methodologies. Batini and Scannapieco provide the theoretical foundation for the six quality dimensions introduced in this chapter, with detailed discussion of measurement approaches, benchmarking, and the relationships between dimensions. Essential for students interested in the technical depth behind the DataQualityAuditor class.
Sebastian-Coleman, Laura. Measuring Data Quality for Ongoing Improvement: A Data Quality Assessment Framework. Amsterdam: Morgan Kaufmann, 2013. A practitioner-focused guide to building data quality measurement programs. Sebastian-Coleman provides detailed guidance on defining quality metrics, establishing baselines, setting targets, monitoring trends, and reporting results — the operational practices that make the quality dimensions from Section 22.4 actionable in real organizations.
Organizational Governance and Stewardship
Seiner, Robert S. Non-Invasive Data Governance: The Path of Least Resistance and Greatest Success. Bradley Beach, NJ: Technics Publications, 2014. Seiner's central argument — that effective data governance should leverage existing organizational roles and relationships rather than imposing new bureaucratic structures — provides a valuable counterpoint to the council-and-steward model described in this chapter. His "non-invasive" approach may be particularly relevant for organizations (like NovaCorp at the outset) where governance is perceived as bureaucratic overhead.
Plotkin, David. Data Stewardship: An Actionable Guide to Effective Data Management and Data Governance. Cambridge, MA: Morgan Kaufmann, 2013. The most detailed guide to the data steward role — responsibilities, skills, selection criteria, performance measurement, and organizational positioning. Directly relevant to the chapter's discussion of stewardship as a governance function and to the practical challenge of appointing and supporting stewards in organizations where the role does not yet exist.
Khatri, Vijay, and Carol V. Brown. "Designing Data Governance." Communications of the ACM 53, no. 1 (2010): 148–152. A concise, influential article that proposes five decision domains for data governance: data principles, data quality, metadata, data access, and data lifecycle. The framework provides a useful alternative lens to the DAMA-DMBOK for understanding what governance must address, and is particularly valuable for its clarity about the relationship between governance decisions and organizational authority.
Public Sector and Government Data Governance
UK Government. "National Data Strategy." Department for Digital, Culture, Media and Sport. London, September 2020. The full text of the UK NDS, providing the strategic framework examined in Case Study 2. Available freely online. Reading the strategy document alongside the case study analysis reveals the gap between aspiration and implementation that characterizes many government data initiatives.
Goldacre, Ben, and Jessica Morley. "Better, Broader, Safer: Using Health Data for Research and Analysis." Department of Health and Social Care, April 2022. Commissioned by the UK government, this independent report provides one of the best analyses of health data governance challenges — covering trusted research environments, data access models, public trust, and the institutional reforms needed to realize the potential of health data while maintaining public confidence.
Open Data Institute. Various publications and reports. Available at https://theodi.org. The Open Data Institute, co-founded by Sir Tim Berners-Lee and Sir Nigel Shadbolt, publishes extensive resources on data governance, data ethics, data sharing, and data infrastructure. Their Data Ethics Canvas and Data Governance Handbook are particularly relevant to the themes of this chapter.
Python and Technical Resources
McKinney, Wes. Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter. 3rd ed. Sebastopol, CA: O'Reilly Media, 2022. The standard reference for pandas, the Python library that underpins the DataQualityAuditor class. Chapters on data cleaning, missing data handling, duplicate detection, and data transformation provide the technical foundation for implementing the quality checks described in this chapter.
Great Expectations (open-source project). Available at https://greatexpectations.io. Great Expectations is an open-source Python library for data quality testing and documentation. It implements many of the quality concepts from this chapter (including expectations for completeness, uniqueness, validity, and consistency) in a production-ready framework. Students who enjoyed building the DataQualityAuditor will find Great Expectations a natural next step toward real-world data quality tooling.
Apache Atlas (open-source project). Available at https://atlas.apache.org. An open-source metadata management and data governance framework that provides data catalog, lineage tracking, and classification capabilities. Relevant to the chapter's discussion of metadata management and data catalogs, Atlas demonstrates how the concepts from Section 22.5 are implemented in enterprise-scale systems.
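For readers who want a concrete starting point before opening these resources, the sketch below shows how three of the quality concepts named above (completeness, uniqueness, and validity) might be measured in plain pandas. It is an illustration only: the function name `quality_summary`, its parameters, and the sample records are assumptions made for this example, and it is not the chapter's DataQualityAuditor class or the Great Expectations API.

```python
import pandas as pd


def quality_summary(df: pd.DataFrame, key: str, valid_values: dict) -> dict:
    """Return simple completeness, uniqueness, and validity scores in [0, 1].

    Hypothetical helper written for illustration; not the chapter's class.
    """
    # Completeness: share of non-missing cells in each column.
    completeness = df.notna().mean().to_dict()

    # Uniqueness: share of rows whose key does not repeat an earlier row's key.
    uniqueness = float(1.0 - df.duplicated(subset=[key]).mean())

    # Validity: share of non-missing values that belong to an allowed set.
    validity = {}
    for column, allowed in valid_values.items():
        present = df[column].dropna()
        validity[column] = float(present.isin(allowed).mean()) if len(present) else 1.0

    return {
        "completeness": completeness,
        "uniqueness": uniqueness,
        "validity": validity,
    }


# Hypothetical customer records, invented for this example only.
records = pd.DataFrame(
    {
        "customer_id": [1, 2, 2, 4],
        "country": ["UK", "US", "US", "XX"],
        "email": ["a@example.com", None, "b@example.com", "c@example.com"],
    }
)

summary = quality_summary(
    records,
    key="customer_id",
    valid_values={"country": ["UK", "US"]},
)
print(summary)
```

Each score is a simple proportion, so a result can be compared against a target threshold or tracked over time. Dedicated tools such as Great Expectations express similar checks declaratively and add documentation and reporting on top.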
These readings extend the chapter's coverage from governance frameworks to implementation details. As Part 4 continues with cross-border data flows (Chapter 23) and sector-specific governance (Chapter 24), the internal governance foundations laid here will interact with external regulatory requirements to create the complex, multi-layered governance landscape that real organizations must navigate.