Learning Objectives
- Explain why traditional property law concepts do not map cleanly onto data
- Compare and contrast at least four theories of data ownership (property, labor, rights-based, commons)
- Analyze the interests of different stakeholders in data ownership disputes
- Evaluate the strengths and weaknesses of data trusts, cooperatives, and other collective governance models
- Describe the principles of indigenous data sovereignty and their implications for governance frameworks
- Apply data ownership frameworks to a real-world scenario involving VitraMed's patient data
In This Chapter
- Chapter Overview
- 3.1 The Ownership Problem
- 3.2 Theories of Data Ownership
- 3.3 Emerging Governance Models
- 3.4 Indigenous Data Sovereignty
- 3.5 Data Ownership in Practice: Three Scenarios
- 3.6 Toward a Pluralistic Approach
- 3.7 Chapter Summary
- What's Next
- Chapter 3 Exercises → exercises.md
- Chapter 3 Quiz → quiz.md
- Case Study: The Sidewalk Labs Toronto Data Trust → case-study-01.md
- Case Study: Indigenous Genomic Data and the HeLa Cells Legacy → case-study-02.md
Chapter 3: Who Owns Your Data?
"Personal data is the new oil. But unlike oil, it's also part of you." — European Consumer Organisation (BEUC), 2017
Chapter Overview
When Mira's father Vikram first built VitraMed, the question seemed simple: clinics owned their patients' medical records, patients had a right to access those records, and VitraMed — as the software provider — stored the data on the clinics' behalf. Clean lines, clear responsibilities.
Then VitraMed started building predictive models. The models were trained on aggregated patient data from hundreds of clinics. The predictions they generated — which patients were at risk for which conditions — were derived from the data but were not the data themselves. Who owned the predictions? The clinics that contributed the data? The patients whose conditions the models predicted? VitraMed, which built the algorithms and bore the computational costs? Or no one?
This chapter tackles one of the most contested questions in data governance: Who owns your data? The answer, as you'll discover, depends entirely on what you mean by "owns," what you mean by "your," and what you mean by "data."
In this chapter, you will learn to:
- Recognize why data ownership is legally, philosophically, and practically complex
- Compare multiple frameworks for thinking about data rights
- Evaluate emerging models like data trusts and cooperatives
- Understand how indigenous data sovereignty challenges Western data governance assumptions
- Analyze a real-world data ownership dispute using multiple frameworks
3.1 The Ownership Problem
3.1.1 Why Property Law Struggles with Data
Property law, in most legal traditions, was designed for rivalrous and excludable goods. A loaf of bread is rivalrous — if I eat it, you can't. A house is excludable — I can lock the door. These properties make ownership intuitive: the owner controls access to a thing that others cannot simultaneously use.
Data has neither property:
- Non-rivalrous: If I share a dataset with you, I still have the dataset. Copying data doesn't deplete it. A million people can use the same dataset simultaneously.
- Non-excludable (in practice): Once data is shared, it's extremely difficult to prevent further sharing. You can't "un-know" something, and you can't reliably prevent digital copies from propagating.
This is why applying traditional property law to data produces paradoxes. If you "own" your medical data the way you own your car, what happens when your doctor makes a copy? Has your property been stolen? What about the insurance company that received the data under a contractual agreement? What about the aggregated, anonymized dataset that includes a derivative of your data but cannot be traced back to you?
Intuition: Think of data ownership less like owning a car and more like owning a conversation. You participated in the conversation — your words, your ideas. But the other person heard them, remembers them, and may have written them down. You can't "own" the conversation in the same way you own a physical object. Data ownership is similarly entangled.
3.1.2 The Stakeholder Map
In any data transaction, multiple parties have legitimate claims:
| Stakeholder | Claim | Basis |
|---|---|---|
| The data subject | "It's my life, my behavior, my body" | Autonomy, dignity, informational self-determination |
| The data collector | "We invested in the infrastructure, the survey, the sensors" | Investment, labor, contractual agreements |
| The data processor | "We cleaned, organized, and added value to the raw data" | Value-added labor, intellectual property |
| The algorithm builder | "We created the model that makes the data useful" | Inventive contribution, IP in the model |
| Society at large | "This data has public health/safety/research value" | Common good, public interest |
VitraMed's situation involves all five. The patient's body generated the health data. The clinic collected it. VitraMed processed and stored it. VitraMed's data scientists built predictive models from it. And public health researchers want access to aggregated data for epidemiological studies.
"Every one of these stakeholders has a legitimate point," Dr. Adeyemi told her class. "The question isn't who is right. The question is how we design governance systems that respect all of these interests without letting any one of them dominate."
3.2 Theories of Data Ownership
3.2.1 Data as Property
The most intuitive theory treats data as property — something that can be owned, bought, sold, and traded. Under this view, you own your personal data the way you own your labor or your creative works. You should have the right to sell it, license it, or refuse to share it.
Arguments for:
- Gives individuals a legal basis to control their data
- Creates market incentives for responsible data handling (if data has a price, companies must weigh the cost)
- Aligns with familiar legal frameworks

Arguments against:
- Deepens inequality: wealthy people can afford to withhold data; poor people may feel compelled to sell it
- Impractical at scale: the average person generates data through hundreds of interactions daily; negotiating each one is impossible
- Ignores the relational nature of data: your social media data includes information about your friends, and your genetic data includes information about your relatives
- May legitimize rather than constrain data markets
3.2.2 Data as Labor
Computer scientist and author Jaron Lanier and others have proposed that the data we generate should be treated as a form of labor and compensated accordingly. Under this theory, every Google search, Facebook post, and Amazon purchase is an act of productive labor that creates value for the platform. Users should receive payment for their contribution, just as workers receive wages.
Arguments for:
- Recognizes the genuine economic value users create
- Could redistribute some of the enormous profits of platform companies
- Provides a framework for collective bargaining ("data unions")

Arguments against:
- The value of any individual's data is typically trivial (pennies per interaction), so payments would be too small to be meaningful
- Frames the problem as one of compensation rather than power: paying people for their data doesn't address the surveillance, manipulation, and discrimination that data enables
- Assumes the problem is that users aren't being paid enough, rather than that certain uses of data shouldn't happen at all
"I don't want to be paid for my data," Eli argued in class. "I want to not be surveilled. If the city pays me five dollars a month for the data from the lampposts in my neighborhood, does that make the surveillance okay? Of course not."
3.2.3 Data as a Rights-Based Framework
The European tradition, rooted in the concept of informational self-determination articulated by the German Federal Constitutional Court in 1983, treats data protection not as a property right but as a fundamental right — akin to freedom of speech or freedom of assembly.
Under this view, individuals don't "own" their data; they have a right to control how it is used. The GDPR embodies this approach: it doesn't give you property rights over your data, but it gives you rights of access, rectification, erasure, portability, and objection.
Arguments for:
- Inalienable: you can't sell away your fundamental rights, which protects against coercion
- Applies regardless of economic status
- Places obligations on data controllers rather than burdens on data subjects
- Connects data protection to broader human rights frameworks

Arguments against:
- Rights without enforcement mechanisms are aspirational rather than practical
- Can conflict with other rights (freedom of expression, freedom of information, public safety)
- May be too rigid for dynamic, context-dependent data flows
Global Perspective: The rights-based approach is strongest in Europe and has influenced legislation worldwide, including Brazil's LGPD and India's DPDP Act. The United States, by contrast, has historically treated data protection through a sectoral approach (HIPAA for health, FERPA for education, COPPA for children) rather than as a comprehensive fundamental right. China takes yet another approach, balancing individual data rights against state interests in data access. We'll explore these regulatory models in detail in Part 4.
3.2.4 Data as Commons
A fourth perspective draws on Elinor Ostrom's Nobel Prize-winning work on commons governance — the management of shared resources that belong to no one and everyone simultaneously. Under this theory, data generated by communities (health data, environmental data, urban data) should be governed as a shared resource, with community-determined rules for access, use, and benefit sharing.
Arguments for:
- Recognizes that much data is relational, generated by communities and not just individuals
- Provides governance models that balance access with protection
- Draws on well-studied governance frameworks (Ostrom's design principles)

Arguments against:
- Defining the "community" is often contested and politically fraught
- Commons governance works best for relatively stable, bounded communities; the internet is neither
- Risk of free-riding and tragedy of the commons without strong institutional frameworks
3.3 Emerging Governance Models
3.3.1 Data Trusts
A data trust is a legal structure in which an independent trustee manages data on behalf of a defined group of beneficiaries. The trustee has a fiduciary duty — a legal obligation to act in the beneficiaries' best interests — similar to how a financial trustee manages assets on behalf of a client.
Data trusts have gained attention as a potential solution to the power imbalance between individuals and platforms. Rather than each person negotiating data terms alone, a data trust aggregates negotiating power and provides expert governance.
Examples:
- The Open Data Institute (UK) has piloted data trust frameworks for urban mobility data
- Sidewalk Labs (an Alphabet subsidiary and sister company of Google) proposed a data trust for its Toronto smart city project, though the project was ultimately canceled in 2020 amid privacy concerns
- MIDATA in Switzerland operates as a health data cooperative that gives members collective control over their medical data
Connection: Data trusts connect to a broader theme we'll develop in Chapter 22 (Data Governance Frameworks): the institutionalization of data stewardship through structures that are accountable, transparent, and oriented toward beneficiaries rather than shareholders.
3.3.2 Data Cooperatives
A data cooperative is a member-owned organization where data subjects collectively own, manage, and benefit from their data. Unlike a data trust, where a trustee makes decisions on behalf of beneficiaries, a cooperative is governed democratically by its members.
Data cooperatives draw on the long history of cooperative movements in agriculture, housing, and financial services. The key principle is that the people who generate the data should collectively control it.
Example: Driver's Seat Cooperative. Ride-share and delivery drivers created Driver's Seat to pool their trip data collectively. Instead of each driver's data flowing exclusively to Uber or DoorDash, Driver's Seat aggregates data across platforms, allowing drivers to identify the most profitable times, locations, and platforms. This turns data from a tool of management into a tool of worker empowerment.
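The governance difference between a trust and a cooperative can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the simple-majority rule is an assumption (real cooperatives set their own voting rules), and the trustee's fiduciary duty is not modeled at all.

```python
def trust_decides(trustee_approves: bool) -> bool:
    """Data trust: a trustee decides on behalf of the beneficiaries."""
    return trustee_approves


def cooperative_decides(member_votes: list[bool]) -> bool:
    """Data cooperative: members decide. A simple majority is ASSUMED here."""
    return sum(member_votes) > len(member_votes) / 2


# Hypothetical data-sharing request put to both structures.
votes = [True, False, False]
print(trust_decides(trustee_approves=True))
print(cooperative_decides(votes))
```

With the same request, the two structures can reach different outcomes: the trust's answer depends on one accountable decision-maker, while the cooperative's depends on how its members vote.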
3.3.3 Data Portability
Data portability — the right to receive your data in a structured, commonly used format and transmit it to another service — is enshrined in the GDPR (Article 20) and has been adopted by several other jurisdictions.
Portability addresses one aspect of ownership: the ability to take your data and leave. If you're unhappy with how Facebook handles your data, you can download your profile and take it to a competitor — in theory.
In practice, portability faces significant challenges:
- The data you can export is often less useful than the inferences drawn from it, which are not portable
- Network effects mean your social graph has no value if your friends don't also switch
- Technical formats may be interoperable in theory but incompatible in practice
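To make the first of these challenges concrete, here is a minimal Python sketch of a portability export. Every name in it (`UserRecord`, `provided`, `inferred`, `export_portable`) is hypothetical and does not describe any real platform's export tool. It assumes, for illustration, that only the data a user supplied is exported, while platform-derived inferences stay behind.

```python
import json
from dataclasses import dataclass, field


@dataclass
class UserRecord:
    """Hypothetical record a platform might hold about one user."""
    user_id: str
    provided: dict = field(default_factory=dict)  # data the user supplied
    inferred: dict = field(default_factory=dict)  # platform-derived inferences


def export_portable(record: UserRecord) -> str:
    """Serialize user-provided data to JSON, a structured, commonly used format.

    ASSUMPTION for illustration: inferences are left out of the export.
    """
    payload = {"user_id": record.user_id, "provided": record.provided}
    return json.dumps(payload, indent=2, sort_keys=True)


record = UserRecord(
    user_id="u-001",
    provided={"display_name": "Sam", "posts": ["Hello world"]},
    inferred={"likely_interests": ["cycling"]},
)
exported = json.loads(export_portable(record))
print(sorted(exported.keys()))
```

The exported file is complete in a formal sense, yet the `inferred` field, which may be the most commercially valuable part of the record, never leaves the platform.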
3.3.4 The Right to Be Forgotten
The right to be forgotten (or right to erasure) — established in the EU by the 2014 Google Spain ruling and codified in GDPR Article 17 — gives individuals the right to request deletion of personal data when it is no longer necessary, when consent is withdrawn, or when the data was unlawfully processed.
This right represents a partial solution to the "data never dies" problem described in Chapter 1. But it creates its own tensions:
- Freedom of expression vs. privacy: Should a newspaper be required to delete a factual article because the subject finds it embarrassing?
- Jurisdictional conflicts: Should a deletion request in Europe apply to search results visible in the United States?
- Technical feasibility: As Chapter 1 explored, truly deleting data from all copies, backups, and derivative works is extraordinarily difficult
Mira discovered this firsthand when a patient contacted VitraMed requesting erasure of all their data. "Technically, we can delete their records," she told her father. "But the predictive model was trained on their data. We can't 'un-train' the model. Their patterns are embedded in the weights. Is that deletion?"
Vikram didn't have an answer.
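Mira's question can be illustrated with a deliberately tiny Python sketch. It is not VitraMed's system: the record store, the patient IDs, the lab values, and the one-parameter "model" (a plain mean) are all hypothetical stand-ins for a far more complex pipeline.

```python
from statistics import mean

# Hypothetical record store: patient ID -> a single lab value.
records = {"p1": 5.4, "p2": 6.1, "p3": 7.9}


def train(store: dict) -> float:
    """Toy 'model': one learned parameter, the mean of all stored values."""
    return mean(store.values())


model_weight = train(records)  # learned from p1, p2 and p3

# Erasure request: remove p3 from the record store.
del records["p3"]

# What the parameter would be if the model were retrained without p3.
retrained_weight = train(records)

print("p3" in records)                   # is the record still stored?
print(model_weight != retrained_weight)  # does the old parameter still reflect p3?
```

Deleting the record changes the store but not the parameter that was already learned from it. In this toy case, retraining without the record removes its influence; whether that is feasible or legally required for a large production model is exactly the open question the scenario raises.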
3.4 Indigenous Data Sovereignty
3.4.1 The CARE Principles
Perhaps the most fundamental challenge to Western data ownership frameworks comes from indigenous data sovereignty movements, which argue that indigenous peoples have collective rights to data about their communities, lands, languages, and cultural practices — rights that exist independently of individual data rights.
The CARE Principles for Indigenous Data Governance, developed by the Global Indigenous Data Alliance, stand for:
- C — Collective Benefit: Data ecosystems should be designed to enable indigenous peoples to derive benefit from data
- A — Authority to Control: Indigenous peoples have rights to govern the collection, ownership, and application of data about their peoples, lands, and resources
- R — Responsibility: Those working with indigenous data have a responsibility to support indigenous data governance
- E — Ethics: Indigenous peoples' rights and wellbeing should be the primary concern in data collection and use
3.4.2 Why This Challenges Western Frameworks
The CARE Principles challenge Western data ownership on several fronts:
- Collective rights, not just individual rights. The GDPR protects individual data subjects. Indigenous data sovereignty asserts collective rights: the right of a people to control data about their language, genome, or sacred sites, regardless of whether any individual data point is personally identifiable.
- Historical context. Indigenous data has been extracted by colonial powers for centuries, from anthropological studies that treated communities as research subjects to genetic research that patented indigenous biological knowledge. Data sovereignty is inseparable from broader struggles for self-determination.
- Different ontologies of data. Western frameworks typically treat data as a resource to be managed. Some indigenous frameworks treat data as a living entity with relational obligations: not something to be "owned" but something to be cared for within a web of responsibilities.
Thought Experiment: A pharmaceutical company wants to study genetic data from an indigenous community to develop treatments for a rare disease. The community would benefit from the treatment, but the company would also profit. Individual community members have given informed consent. But the community's elders argue that collective consent is also required — that the community as a whole should decide whether its genetic information enters the commercial sphere.
Under a Western individual-rights framework, the individual consents are sufficient. Under indigenous data sovereignty principles, they are not. Which framework better protects the community? Which better respects individual autonomy? Can both be honored simultaneously?
3.5 Data Ownership in Practice: Three Scenarios
3.5.1 Your Fitness Tracker Data
You wear a fitness tracker that records your steps, heart rate, sleep patterns, and GPS location. Who owns this data?
| Framework | Answer |
|---|---|
| Property | You do — it's data about your body, generated by your movement |
| Labor | You do — your physical activity created the data; you should be compensated |
| Rights-based | You have rights to access, correct, port, and delete the data, but "ownership" is not the right frame |
| Corporate/contractual | The manufacturer does — you agreed to their terms of service when you activated the device |
| Commons | If aggregated health data has public health value, the community has an interest |
In practice, most fitness tracker companies retain extensive rights to your data under their terms of service. Fitbit's (now Google's) privacy policy allows the company to use your data for product improvement, research partnerships, and "de-identified" data sharing. Your legal rights vary dramatically by jurisdiction.
3.5.2 Your Social Media Posts
You post a photograph on Instagram. Who owns it?
You retain copyright in the image — you created it. But Instagram's terms of service grant the company a "non-exclusive, royalty-free, transferable, sub-licensable, worldwide license" to use, modify, distribute, and create derivative works from your content. You own it; they can do almost anything with it.
Moreover, the metadata about your post (when you posted, where you were, who liked it, how long other users looked at it) is collected and controlled by Meta. And the inferences drawn from your posting behavior (your interests, emotional state, political leanings) are treated by Meta as its own proprietary asset.
3.5.3 VitraMed's Predictive Models
VitraMed trains a model on data from 200,000 patients across 500 clinics. The model can predict Type 2 diabetes risk with 87% accuracy. Who owns the model?
- The patients contributed the data, but the model is not their data — it's an abstraction of patterns across many patients
- The clinics collected the data, but under their agreements with VitraMed, they licensed it for processing
- VitraMed designed the algorithm, managed the computation, and bore the financial cost
- Public health researchers argue the model has immense social value and should be publicly accessible
This is, in essence, the fundamental question of AI-era data ownership: Who owns the intelligence extracted from data contributed by many?
3.6 Toward a Pluralistic Approach
No single theory of data ownership is adequate for all contexts. The property framework works well for clearly individual data (your diary, your creative works) but poorly for relational data (social graphs, genetic data). The labor framework highlights exploitation but risks legitimizing data markets. The rights framework protects dignity but can be hard to enforce. The commons framework empowers communities but struggles with boundary-drawing.
What emerges from this analysis is not a single answer but a pluralistic approach — different governance mechanisms for different contexts, informed by the nature of the data, the power dynamics involved, and the values at stake.
Best Practice: When evaluating a data ownership question, don't ask "who owns this data?" as if there's a single answer. Instead, ask:
1. What kind of data is this? (Personal, relational, collective, derived?)
2. What interests are at stake? (Autonomy, profit, public health, cultural sovereignty?)
3. What governance mechanisms exist? (Contracts, regulation, community norms, technical controls?)
4. What power dynamics shape the negotiation? (Who has leverage? Who lacks it?)
5. What outcomes should governance produce? (Fairness, innovation, protection, equity?)
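One way to keep all five questions in view is to treat them as fields of a worksheet. The Python sketch below is only a suggestion: the class name, the field names, and the `unanswered` helper are hypothetical, and the sample answers are placeholders rather than a recommended analysis.

```python
from dataclasses import dataclass, fields


@dataclass
class OwnershipAssessment:
    """Hypothetical worksheet with one field per question in the checklist."""
    data_type: str = ""
    interests: str = ""
    mechanisms: str = ""
    power_dynamics: str = ""
    desired_outcomes: str = ""

    def unanswered(self) -> list[str]:
        """Return the names of the questions that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Partially completed worksheet for a derived-data question.
assessment = OwnershipAssessment(
    data_type="derived: a model trained on many patients' records",
    interests="patient autonomy, clinic contracts, public health",
)
print(assessment.unanswered())
```

The point of the structure is not automation but discipline: an analysis that leaves power dynamics or desired outcomes blank is visibly incomplete.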
3.7 Chapter Summary
Key Concepts
- Traditional property law struggles with data because data is non-rivalrous and practically non-excludable
- Four major theories of data ownership: data as property, data as labor, data as rights, and data as commons
- Emerging governance models include data trusts, data cooperatives, data portability, and the right to be forgotten
- Indigenous data sovereignty (CARE Principles) challenges Western assumptions by asserting collective rights and relational obligations
- No single theory is adequate; a pluralistic approach matches governance mechanisms to contexts
Key Debates
- Should individuals be able to sell their personal data?
- Is the problem with data exploitation best addressed through compensation (labor theory) or prohibition (rights theory)?
- How should collective data rights (indigenous sovereignty, community health data) interact with individual rights?
- Who owns the intelligence extracted from data contributed by many?
Applied Framework
For any data ownership question, analyze through five lenses: type of data, interests at stake, governance mechanisms available, power dynamics, and desired outcomes.
What's Next
In Chapter 4: The Attention Economy, we'll examine how the data flows described in Chapters 1-3 are monetized through a business model that treats human attention as a scarce resource to be captured, measured, and sold. We'll explore how dark patterns, engagement optimization, and behavioral surplus extraction shape the digital environment — and what governance responses are emerging.
Before moving on, complete the exercises and quiz to solidify your understanding of data ownership frameworks.