Case Study: Data Governance in Government: The UK's National Data Strategy

"Government is the largest data holder in most countries. How it governs that data shapes what is possible — and what is just." — Sir Nigel Shadbolt, co-founder, Open Data Institute

Overview

When the UK government published its National Data Strategy in 2020, it was attempting something few governments had done systematically: articulate a comprehensive vision for how the state should manage, share, protect, and derive value from the enormous quantities of data it collects about its citizens, its services, and its operations.

This case study examines the UK's approach to government data governance — the institutional structures it created, the frameworks it adopted, the successes it achieved, and the tensions it struggled to resolve. It provides a counterpoint to the NovaCorp case study: where NovaCorp illustrates governance in a private company driven by commercial incentives, the UK illustrates governance in a public institution driven by democratic accountability, public service delivery, and the distinctive challenges of governing data that citizens did not choose to provide.

Skills Applied:

  • Analyzing data governance in a public-sector context
  • Evaluating the tension between data sharing for public benefit and individual privacy
  • Assessing institutional governance structures at national scale
  • Comparing public-sector and private-sector governance challenges


The Context: Government as Data Holder

The Scale

The UK government is one of the largest data holders in the world. Its departments and agencies collect and manage:

  • Tax records (HM Revenue & Customs): Income, employment, and tax data for approximately 45 million individuals and 5 million businesses.
  • Health records (NHS): Clinical records for approximately 66 million people, managed through the NHS Spine digital infrastructure.
  • Education records (Department for Education): School enrollment, examination results, attendance, and special needs data for millions of students.
  • Social security records (Department for Work and Pensions): Benefits, employment history, and disability assessments.
  • Criminal justice records (Home Office, Ministry of Justice): Police records, court records, prison records, and immigration data.
  • Census data (Office for National Statistics): The most comprehensive demographic survey, conducted decennially (the 2021 Census was the first to be primarily digital).

Each of these datasets is enormous, sensitive, and collected under statutory authority — meaning that citizens are often legally required to provide data, rather than choosing to do so voluntarily.

The Governance Challenge

Government data governance faces challenges that private-sector governance does not:

Democratic accountability. Government collects data using public authority. Citizens have a right to know how that data is used, who has access, and what decisions it informs. Government data governance must therefore be transparent to a degree that corporate governance is rarely required to match.

Departmental silos. UK government departments operate with significant autonomy. Each department has its own data systems, its own data culture, and its own interpretation of data governance requirements. Cross-departmental data sharing — essential for efficient public services — is impeded by technical incompatibilities, legal uncertainties, and institutional reluctance.

The dual mandate. Government is simultaneously obligated to protect citizens' data (privacy, security, proportionality) and to use that data effectively for public benefit (fraud detection, public health, evidence-based policy). These obligations often point in different directions.

Legacy systems. Many government data systems are decades old. The NHS's Patient Administration Systems, DWP's legacy benefits systems, and HMRC's tax processing infrastructure were built before modern data governance concepts existed. Retrofitting governance onto these systems is technically and organizationally challenging.


The National Data Strategy (2020)

Structure and Ambitions

Published in September 2020, the UK National Data Strategy (NDS) set out a framework organized around five priority "missions":

  1. Unlocking the value of data across the economy. Using data to drive economic growth, innovation, and productivity.
  2. Securing a pro-growth and trusted data regime. Building a regulatory framework that enables data use while maintaining trust.
  3. Transforming government's use of data. Improving data-driven decision-making within government.
  4. Ensuring the security and resilience of data infrastructure. Protecting data from cyber threats and ensuring system reliability.
  5. Championing the international flow of data. Positioning the UK as a global hub for data (particularly significant post-Brexit, because continued data flows from the EU depended on the UK securing an EU adequacy decision).

The Chief Data Officer Function

A key institutional innovation was the establishment of a cross-government Chief Data Officer (CDO) function, housed within the Central Digital and Data Office (CDDO) in the Cabinet Office. The government CDO was responsible for:

  • Setting data standards across departments
  • Establishing a government data catalog (the "Data Marketplace")
  • Developing a cross-government data quality framework
  • Coordinating data sharing initiatives
  • Building data capability (training, recruitment, professional development)

This mirrored the private-sector CDO role described in Chapter 22 — but with the added complexity of operating across dozens of semi-autonomous departments, each with its own minister, budget, and institutional culture.

The Data Quality Framework

The CDDO published a Government Data Quality Framework in 2021, establishing quality dimensions directly aligned with those in the DAMA-DMBOK and Chapter 22 of this textbook:

  • Accuracy: Data correctly represents the real-world entity or event
  • Completeness: All expected data is present
  • Uniqueness: No unintended duplicate records
  • Consistency: Data is consistent across systems and over time
  • Timeliness: Data is sufficiently current for its intended use
  • Validity: Data conforms to defined rules, formats, and constraints

Departments were expected to measure data quality against these dimensions, establish baselines, and set improvement targets. The framework was advisory rather than mandatory — a crucial distinction that shaped its effectiveness.
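
To make the idea of measuring against these dimensions concrete, the following is a minimal sketch of how a department might compute baseline scores for four of the six dimensions. The record layout, the simplified postcode pattern, and the 365-day freshness threshold are illustrative assumptions and are not taken from the framework itself.

```python
from datetime import date
import re

# Hypothetical records; field names and values are illustrative only.
RECORDS = [
    {"id": "A1", "postcode": "SW1A 2AA", "updated": date(2024, 1, 10)},
    {"id": "A2", "postcode": None, "updated": date(2023, 12, 1)},
    {"id": "A2", "postcode": "EC1A 1BB", "updated": date(2021, 6, 30)},
    {"id": "A4", "postcode": "not known", "updated": date(2024, 2, 2)},
]

# Simplified postcode shape, assumed here purely to illustrate a validity rule.
POSTCODE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? \d[A-Z]{2}$")


def completeness(records, field):
    """Share of records in which the field is populated."""
    return sum(r[field] is not None for r in records) / len(records)


def uniqueness(records, key):
    """Share of records that carry a distinct key value."""
    return len({r[key] for r in records}) / len(records)


def validity(records, field, pattern):
    """Share of populated values that match the expected format."""
    values = [r[field] for r in records if r[field] is not None]
    return sum(bool(pattern.match(v)) for v in values) / len(values)


def timeliness(records, field, as_of, max_age_days):
    """Share of records updated within max_age_days of the as_of date."""
    fresh = sum((as_of - r[field]).days <= max_age_days for r in records)
    return fresh / len(records)


def quality_report(records, as_of):
    """Collect one score per dimension so a baseline can be recorded."""
    return {
        "completeness": completeness(records, "postcode"),
        "uniqueness": uniqueness(records, "id"),
        "validity": validity(records, "postcode", POSTCODE),
        "timeliness": timeliness(records, "updated", as_of, 365),
    }


report = quality_report(RECORDS, as_of=date(2024, 3, 1))
```

Accuracy and cross-system consistency are omitted from the sketch because both require a reference source or a second system to compare against, which is part of why they are harder to baseline in practice.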


Implementation: Successes and Struggles

Success: NHS COVID-19 Data Sharing

The most dramatic example of government data governance in action came during the COVID-19 pandemic. In March 2020, the UK government invoked emergency powers to enable rapid sharing of NHS health data for pandemic response. Control of Patient Information (COPI) notices, issued under the Health Service (Control of Patient Information) Regulations 2002, temporarily relaxed normal data sharing restrictions, allowing:

  • Real-time sharing of hospital admission and ICU data for capacity planning
  • Linkage of health records with testing data, vaccination records, and mortality data
  • Research access to de-identified clinical data through the OpenSAFELY platform

The OpenSAFELY platform, developed by the DataLab at the University of Oxford, represented a governance innovation: rather than extracting patient data and sharing it with researchers, OpenSAFELY allowed researchers to run approved analyses within the NHS data environment, with only aggregated, non-identifying results leaving the system. This "trusted research environment" model balanced research utility with privacy protection.
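
The general pattern can be sketched in a few lines. This is not OpenSAFELY's actual interface: the function name, the allow-list of approved groupings, and the small-number suppression threshold of 5 are all hypothetical, chosen only to show the idea that patient-level rows stay inside the environment and only checked aggregates are released.

```python
from collections import Counter

# Hypothetical patient-level rows; in a trusted research environment these
# stay inside the secure system and are never returned to the researcher.
_PATIENT_ROWS = (
    [{"age_band": "18-49", "region": "North"}] * 7
    + [{"age_band": "50-69", "region": "South"}] * 6
    + [{"age_band": "70+", "region": "North"}] * 2
)

APPROVED_FIELDS = {"age_band"}  # assumed allow-list of approved groupings
MIN_CELL_COUNT = 5              # assumed small-number suppression threshold


def run_approved_analysis(group_by):
    """Run inside the environment and return only aggregated counts.

    Groups smaller than MIN_CELL_COUNT are suppressed (returned as None)
    so that small cells cannot be used to single out individuals.
    """
    if group_by not in APPROVED_FIELDS:
        raise PermissionError(f"grouping by {group_by!r} is not approved")
    counts = Counter(row[group_by] for row in _PATIENT_ROWS)
    return {g: (n if n >= MIN_CELL_COUNT else None) for g, n in counts.items()}


# The researcher sees only this released summary, never the rows themselves.
released = run_approved_analysis("age_band")
```

The design choice worth noticing is that the governance controls (what may be asked, what may leave) live in the environment rather than in a data sharing agreement that relies on the recipient's good behavior.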

The pandemic data sharing enabled critical insights — identifying high-risk populations, tracking variant spread, evaluating vaccine effectiveness — and was widely regarded as a success. But it also raised concerns: emergency data sharing frameworks were established with minimal public consultation, the COPI notices suspended normal consent requirements, and some citizens were alarmed to learn how extensively their health data could be shared under emergency powers.

Struggle: The GP Data Controversy

In contrast to the pandemic response, the government's attempt to create a centralized GP (general practitioner) data extraction system — the General Practice Data for Planning and Research (GPDPR) initiative — encountered fierce public opposition.

Announced in 2021, GPDPR proposed to extract data from GP records across England and transfer it to a central NHS Digital database for planning and research purposes. The initiative was technically sound and potentially valuable. But its governance was disastrous:

  • Communication failure: Most patients learned about the data extraction from media reports, not from their GPs or the NHS. The opt-out process was confusing and poorly publicized.
  • Trust deficit: The initiative was announced shortly after the pandemic data sharing (which many had accepted as emergency-justified), creating a perception that the government was using emergency norms to expand routine data collection.
  • Governance concerns: The initial proposal lacked a clear data governance framework for how the extracted data would be managed, who would have access, and what purposes it would serve. Critics, including the British Medical Association, argued that the proposal prioritized data extraction over data governance.
  • Outcome: The initiative was paused following public outcry. The government eventually committed to an improved governance framework, including clearer opt-out mechanisms, independent oversight, and a "trusted research environment" model similar to OpenSAFELY.

The GPDPR controversy illustrated a fundamental lesson: in government data governance, legitimacy is as important as legality. The government had the legal authority to extract the data. But without public trust — built through transparent governance, meaningful consent, and demonstrated benefit — legal authority was insufficient.

Struggle: Cross-Departmental Data Sharing

Despite the NDS's emphasis on breaking down silos, cross-departmental data sharing remained difficult. Three barriers persisted:

Legal uncertainty. Different data types are governed by different statutes (the Data Protection Act, HMRC confidentiality provisions, NHS confidentiality obligations, census confidentiality requirements). Departments, wary of legal liability, often defaulted to not sharing rather than risk a violation. The result was a system where citizens had to provide the same information to multiple departments because the departments could not (or would not) share it among themselves.

Technical fragmentation. Government IT systems, built over decades by different departments with different vendors, were rarely interoperable. Data formats, coding standards, and identifier systems differed. A "customer" in DWP's systems could not be reliably matched to the same individual in HMRC's systems without extensive manual reconciliation — precisely the kind of consistency and uniqueness problem that the data quality framework was designed to address.
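
A minimal sketch of the matching problem follows. It assumes, for illustration only, that both systems hold a National Insurance number style identifier (two letters, six digits, a suffix letter A to D) but record it in different formats. The department labels, record layout, and sample values are hypothetical.

```python
import re

# Hypothetical extracts from two departmental systems. The same person may
# appear with the identifier typed in a different style in each system.
DEPT_A = [
    {"ref": "a-1", "nino": "qq 12 34 56 c"},
    {"ref": "a-2", "nino": "QQ654321A"},
    {"ref": "a-3", "nino": "unknown"},
]
DEPT_B = [
    {"ref": "b-9", "nino": "QQ123456C"},
    {"ref": "b-7", "nino": "QQ-65-43-21-A"},
    {"ref": "b-8", "nino": "QQ111111B"},
]


def normalise_nino(raw):
    """Strip spacing and punctuation, uppercase, and check the basic shape.

    Returns None when the value cannot be a usable identifier.
    """
    cleaned = re.sub(r"[^A-Za-z0-9]", "", raw or "").upper()
    return cleaned if re.fullmatch(r"[A-Z]{2}\d{6}[A-D]", cleaned) else None


def link(left, right):
    """Match records on the normalised identifier.

    Returns (matched pairs, left-hand refs that need manual reconciliation).
    """
    index = {}
    for rec in right:
        key = normalise_nino(rec["nino"])
        if key is not None:
            index[key] = rec["ref"]

    matched, unmatched = [], []
    for rec in left:
        key = normalise_nino(rec["nino"])
        if key in index:
            matched.append((rec["ref"], index[key]))
        else:
            unmatched.append(rec["ref"])
    return matched, unmatched


matched, unmatched = link(DEPT_A, DEPT_B)
```

Even in this toy case one record falls out for manual review. Real linkage across departments also has to cope with missing identifiers, name and address variation, and records that legitimately exist in only one system.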

Cultural resistance. Departments viewed their data as an institutional asset — something they controlled and guarded. The NDS's vision of data as a shared government resource conflicted with departmental cultures that valued control over collaboration.


The Governance Model: Assessment

Strengths

The UK's approach to government data governance had several notable strengths:

  • Strategic vision. The NDS provided a clear, publicly stated framework for what the government was trying to achieve with data, enabling public scrutiny and accountability.
  • Institutional innovation. The creation of the CDDO and the government CDO function established governance infrastructure that did not previously exist.
  • Quality framework. The Government Data Quality Framework provided a common language and common metrics for assessing data quality across departments.
  • Trusted research environments. The OpenSAFELY model demonstrated that data could be used for public benefit without sacrificing privacy, providing a governance template for future initiatives.

Weaknesses

  • Advisory, not mandatory. The quality framework and many governance standards were advisory. Departments with limited resources or competing priorities could — and often did — deprioritize governance implementation.
  • Funding gaps. The NDS was aspirational but not fully funded. Many governance initiatives depended on departmental budgets that were already stretched.
  • Trust deficit. The GPDPR controversy revealed that the government's data governance efforts had not yet earned sufficient public trust to support ambitious data sharing initiatives.
  • Legacy system inertia. Technical barriers to cross-departmental sharing remained largely unresolved, as modernizing government IT systems required investments measured in billions of pounds and decades of effort.

Lessons for Public-Sector Data Governance

  1. Legitimacy is a governance requirement. In the private sector, governance serves the organization and its stakeholders. In the public sector, governance must also serve democratic accountability. Citizens whose data is collected under statutory authority — not by choice — deserve governance that is transparent, proportionate, and subject to meaningful public scrutiny.

  2. Emergency norms must not become permanent norms. The pandemic data sharing was justified by crisis. But the ease with which emergency powers were invoked raised the question of whether those powers would be relinquished. Governance frameworks must include sunset provisions and independent review of emergency measures.

  3. Data sharing requires governance before implementation. The GPDPR controversy could have been avoided if the governance framework — opt-out mechanisms, access controls, purpose limitations, independent oversight — had been established before the data extraction was announced. Trust is built before the ask, not after.

  4. Quality measurement is necessary but not sufficient. The Government Data Quality Framework provided excellent metrics. But metrics without mandates produce assessments without improvement. Making quality standards mandatory — with accountability and resources to support compliance — is essential.


Discussion Questions

  1. The OpenSAFELY model — bringing analysis to the data rather than data to the analyst — is sometimes called "privacy-preserving analytics." Could this model be applied beyond health data? What are its limitations?

  2. Compare NovaCorp's governance journey (private sector) to the UK government's experience (public sector). What governance principles are universal? What principles are sector-specific?

  3. The GPDPR controversy shows that legal authority to collect data is not the same as social license to collect data. How should governments build and maintain the social license necessary for data governance? Is public consultation sufficient, or are stronger mechanisms needed?

  4. Eli is working on a municipal data governance ordinance for Detroit. What lessons from the UK's experience are most relevant to his work? What lessons are specific to the UK context and may not transfer?


Your Turn: Mini-Project

Option A: Research the government data governance framework of a country other than the UK (e.g., Estonia's e-governance model, Singapore's Smart Nation initiative, or the US Federal Data Strategy). Write a 1,000-word comparison identifying what the other country does differently and what the UK could learn.

Option B: Design a public consultation process for a government data sharing initiative. Your design should address: how citizens are informed, how feedback is collected, how concerns are addressed, and how the results of the consultation influence the final governance framework.

Option C: Using the Government Data Quality Framework dimensions, design a quality assessment for a government dataset you interact with (e.g., public transit schedules, crime statistics, school performance data). Identify likely quality issues and propose governance measures to address them.


References

  • UK Government. "National Data Strategy." Department for Digital, Culture, Media and Sport. London, September 2020.

  • Central Digital and Data Office. "Government Data Quality Framework." London, 2021.

  • Goldacre, Ben, and Jessica Morley. "Better, Broader, Safer: Using Health Data for Research and Analysis." Department of Health and Social Care, April 2022.

  • OpenSAFELY Collaborative. "OpenSAFELY: Factors Associated with COVID-19 Death in 17 Million Patients." Nature 584 (2020): 430–436.

  • Vezyridis, Paraskevas, and Stephen Timmons. "Understanding the Care.Data Conundrum: New Information Flows for Economic Growth." Big Data & Society 4, no. 1 (2017).

  • Shadbolt, Nigel, and Roger Hampson. The Digital Ape: How to Live (in Peace) with Smart Machines. London: Scribe, 2018.

  • Ada Lovelace Institute. "The Data Divide: Public Attitudes to the Use of Data by Government." London, 2021.