Learning Objectives

  • Define the digital divide in its multiple dimensions — access, skills, and outcomes — and explain how each dimension compounds the others
  • Analyze digital redlining as a form of discriminatory infrastructure investment and its consequences for data-driven services
  • Apply the data colonialism framework to contemporary extraction of data from marginalized communities
  • Evaluate indigenous data sovereignty principles and their implications for mainstream data governance
  • Apply data feminism principles to identify and challenge structural biases in data systems
  • Connect the digital divide to algorithmic bias, surveillance harms, and health data equity

Chapter 32: Digital Divide, Data Justice, and Equity

"If you have come here to help me, you are wasting your time. But if you have come because your liberation is bound up with mine, then let us work together." — Lilla Watson, Aboriginal Australian activist and educator

Chapter Overview

In March 2020, when schools across the United States shifted to remote learning virtually overnight, a stark reality emerged: millions of students had no reliable internet access at home. In Detroit, where Eli grew up, an estimated 40% of households lacked broadband internet. Students sat in fast-food parking lots to access WiFi for their classes. Others simply disappeared from attendance rosters entirely.

The pandemic did not create this inequality. It revealed it — violently, visibly, and undeniably. The "digital divide" had been documented for decades, but the assumption had always been that it was narrowing, that time and market forces would close the gap. The pandemic demonstrated that the gap was not narrowing. In many communities, it was deepening — and the consequences were not merely inconvenience but compounding disadvantage that intersected with every data-driven system we've examined in this book.

This chapter goes beyond access to examine the deeper structures of data-driven inequality. The digital divide is not just about who has broadband; it is about who is represented in data, who benefits from data-driven systems, who is harmed by algorithmic decisions, and whose knowledge is recognized as legitimate. These are questions of data justice — and they connect directly to the Power Asymmetry, Consent Fiction, and Accountability Gap themes that run throughout this text.

In this chapter, you will learn to:

  • Analyze the digital divide as a multi-dimensional problem of access, skills, and outcomes
  • Recognize digital redlining as a contemporary form of structural discrimination
  • Apply data colonialism and data feminism as critical frameworks for understanding data-driven inequality
  • Evaluate models for more equitable data governance, including indigenous data sovereignty
  • Connect digital inequality to the algorithmic bias, surveillance, and health data equity challenges examined in earlier chapters


32.1 The Digital Divide: Beyond Access

32.1.1 Three Levels of the Digital Divide

The term "digital divide" was popularized in the late 1990s, initially referring to the gap between those who had access to the internet and those who did not. Since then, researchers have identified three distinct levels of digital inequality (van Dijk, 2020):

First-level divide: Access. Do people have reliable, affordable internet access and functional devices? As of 2024, approximately 2.6 billion people worldwide lack internet access entirely (ITU, 2024). In the United States, the FCC estimates that approximately 24 million Americans lack broadband access — though independent researchers argue the actual number is closer to 42 million (BroadbandNow, 2023).

Second-level divide: Skills. Among those who have access, how effectively can they use digital tools? Digital literacy encompasses operational skills (navigating a browser, using email), information skills (evaluating information credibility, protecting privacy), and strategic skills (using digital tools to achieve economic, educational, or civic goals). Access without skills reproduces inequality in a different form.

Third-level divide: Outcomes. Even among people with equivalent access and skills, digital engagement produces unequal outcomes depending on social position. A job seeker in a wealthy suburb with professional networks benefits differently from the same job search tools as a job seeker in an underserved community without those networks. The platforms are the same; the outcomes are not.

"These three levels interact," Dr. Adeyemi explained. "A student without broadband at home falls behind on digital skills. Without skills, they can't use digital tools strategically. Without strategic use, they experience worse outcomes. The three levels don't add — they multiply."

32.1.2 Who Is on the Wrong Side?

The digital divide does not fall randomly across the population. It tracks preexisting lines of inequality:

Income. In the United States, 99% of households earning more than $100,000 per year have broadband internet, compared to 57% of households earning less than $25,000 (Pew Research Center, 2023).

Race and ethnicity. Black Americans are approximately 1.4 times more likely than white Americans to lack home broadband, and Hispanic Americans approximately 1.3 times more likely (NTIA, 2023). On many Native American reservations, broadband access rates fall below 20%.

Geography. Rural areas consistently lag behind urban areas. In the United States, approximately 22% of rural residents lack broadband access, compared to 1.5% of urban residents (FCC, 2023). The gap is even more pronounced in the Global South.

Age. Older adults are significantly less likely to use the internet and less likely to have the digital skills needed to navigate complex online systems — including the healthcare portals, government services, and financial platforms that increasingly require digital access.

Disability. People with disabilities face both access barriers (inaccessible interfaces, incompatible assistive technologies) and representation barriers (datasets that exclude or misrepresent disabled populations).

Callout Box: The Compounding Effect

These categories intersect. An elderly, low-income, Black woman living in rural Mississippi faces compounding disadvantages — not merely the sum of each individual disadvantage but their multiplication. This is what legal scholar Kimberlé Crenshaw calls intersectionality — the insight that overlapping systems of oppression cannot be understood by examining each axis of disadvantage in isolation. Digital inequality is intersectional by nature.

32.1.3 From Digital Divide to Data Divide

The digital divide has a downstream consequence that is often overlooked: a data divide. Communities without reliable internet access are systematically underrepresented in the data that drives decision-making. This creates a vicious cycle:

  1. Underrepresented communities generate less data — fewer clicks, fewer searches, fewer digital transactions.
  2. Algorithms trained on incomplete data perform worse for these communities — less accurate recommendations, less relevant services, more biased predictions.
  3. Worse algorithmic performance reduces the value of digital services for these communities, further discouraging adoption.
  4. Reduced adoption means even less data, perpetuating the cycle.

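The compounding dynamic of this cycle can be made concrete with a toy model. The sketch below is purely illustrative — the growth rate and the saturating quality function are assumptions chosen for demonstration, not empirical estimates — but it shows why proportional feedback widens a gap rather than holding it steady.

```python
# Toy model of the data-divide feedback loop described above.
# All parameter values are illustrative assumptions, not estimates.

def simulate_cycle(data_a: float = 100.0, data_b: float = 40.0,
                   years: int = 8) -> None:
    """Track yearly data generated by a well-connected community (A)
    and an under-connected community (B). Service quality saturates
    with available data (steps 1-2); next year's usage -- and thus
    data -- grows with service quality (steps 3-4)."""
    for year in range(years):
        quality_a = data_a / (data_a + 50.0)   # more data -> better service
        quality_b = data_b / (data_b + 50.0)
        data_a *= 1 + 0.10 * quality_a         # better service -> more use
        data_b *= 1 + 0.10 * quality_b
        print(f"year {year}: A={data_a:6.1f}  B={data_b:5.1f}  "
              f"ratio={data_a / data_b:.2f}")

simulate_cycle()
```

Because each community's growth is proportional to the quality of service it already receives, the ratio between the two communities widens every year — the disadvantages multiply rather than add, as Dr. Adeyemi put it earlier.
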
Eli saw this pattern clearly in Detroit. "The neighborhood I grew up in doesn't show up in Google Maps the way downtown Detroit does. The street view images are outdated. The business listings are wrong. The transit recommendations don't account for the buses that actually run. It's not that the technology doesn't work — it's that the technology was never designed with us as the audience. We're the missing data."


32.2 Digital Redlining: Discriminatory Infrastructure

32.2.1 From Physical Redlining to Digital Redlining

In the 1930s, the Home Owners' Loan Corporation (HOLC) created maps of American cities that graded neighborhoods by perceived lending risk. Neighborhoods with large Black populations were typically graded "D" — marked in red — and denied access to federally backed mortgages. This practice, known as redlining, systematically denied Black communities access to the capital necessary to build wealth, with consequences that persist to this day.

Digital redlining is the contemporary equivalent: discriminatory patterns in the deployment of digital infrastructure that map onto and reinforce historical patterns of exclusion. The term, developed by researchers including Christopher Ali and Safiya Umoja Noble, describes how telecom companies' investment decisions systematically disadvantage communities of color and low-income communities.

32.2.2 The Evidence

Research has documented digital redlining across multiple dimensions:

Broadband infrastructure. A landmark study by The Markup (2022) analyzed broadband pricing in 38 US cities and found that AT&T, Verizon, EarthLink, and CenturyLink offered slower speeds at higher prices in neighborhoods with higher percentages of Black, Hispanic, and low-income residents. The disparities persisted even after controlling for factors like housing density and distance from network infrastructure.

5G deployment. The deployment of 5G networks has followed patterns consistent with digital redlining. Mobile carriers have prioritized deployment in affluent urban areas, leaving rural and low-income communities with slower, less reliable service.

Municipal broadband. When cities have attempted to build their own broadband networks to address access gaps, they have often been blocked by state laws — passed at the urging of telecom industry lobbyists — that prohibit municipal broadband. As of 2024, approximately 17 US states have laws restricting municipal broadband initiatives.

Algorithmic pricing. Research by Hao and Sweeney (2020) documented how algorithmic pricing systems charged higher prices for online services in zip codes with higher proportions of minority residents — a form of digital redlining that operates through automated systems rather than explicit human decisions.

32.2.3 The Compounding Effect: When Infrastructure Meets Algorithms

Digital redlining is not merely an access problem. It compounds every algorithmic harm we've examined in this book:

  • Predictive policing (Chapter 14) is deployed in the same communities that experience digital redlining — producing surveillance saturation in neighborhoods already denied equitable infrastructure investment.
  • Algorithmic credit scoring (Chapter 15) performs less accurately in communities with less digital footprint data — leading to higher denial rates and worse terms for the same communities denied broadband infrastructure.
  • Health technology operates less effectively in communities without reliable internet — VitraMed's telehealth features, for instance, are functionally unavailable to patients without broadband, precisely the patients most likely to need expanded healthcare access.

"This is what structural inequality looks like," Eli said during a seminar on digital infrastructure policy. "It's not that someone decides 'let's give Black neighborhoods worse internet.' It's that investment decisions follow expected return, expected return follows existing wealth, existing wealth follows centuries of discriminatory policy, and the result is a system that reproduces inequality without anyone having to make an explicitly discriminatory decision. The Accountability Gap is built into the architecture."

Connection to Chapter 14: The algorithmic bias patterns examined in Chapter 14 are partly downstream consequences of digital redlining. Biased training data often reflects not just historical discrimination but ongoing infrastructure inequality. A model trained on data that underrepresents digitally redlined communities will perform worse for those communities — not because the algorithm is explicitly biased but because the data infrastructure that feeds it is structurally unequal.


32.3 Data Colonialism: Extraction and Exploitation

32.3.1 The Concept

In Chapter 5, we introduced the concept of data colonialism — the idea that contemporary data extraction practices reproduce the logic of historical colonialism: powerful actors extracting value from less powerful populations, with minimal compensation, consent, or reciprocal benefit.

Nick Couldry and Ulises Mejias (2019), who developed the framework in The Costs of Connection, argue that data colonialism represents a new stage of capitalism in which the raw material being extracted is not physical resources but human life itself — relationships, behaviors, emotions, movements — converted into data and used to generate value for platform corporations.

The parallels to historical colonialism are specific:

| Colonial Dynamic | Historical Form | Data Colonial Form |
| --- | --- | --- |
| Extraction | Gold, rubber, labor | Behavioral data, social graphs, attention |
| Appropriation | Land seizure through legal mechanisms | Data capture through terms of service |
| Value export | Raw materials shipped to metropole for processing | Raw data shipped to Silicon Valley for algorithm training |
| Dependency creation | Colonial infrastructure designed to serve the colonizer | Platform infrastructure creating lock-in and dependency |
| Erasure | Indigenous knowledge systems delegitimized | Local knowledge overridden by algorithmic classification |
| Resistance | Anti-colonial movements | Data sovereignty movements, data cooperatives |

32.3.2 Digital Extractivism in Practice

Data colonialism manifests in concrete, observable practices:

Platform dependency. Communities in the Global South increasingly depend on platforms controlled by companies in the Global North for basic communication (WhatsApp), economic activity (Facebook Marketplace, Uber), and information access (Google). The data generated by these activities flows to servers in the United States and Europe, where it is used to train algorithms and generate advertising revenue — little of which returns to the communities that generated the data.

Free labor. Users of social media platforms generate the content that makes those platforms valuable. This labor is uncompensated — users "pay" with their data and attention but receive no share of the advertising revenue their activity generates. When Meta's AI systems are trained on user posts, user images, and user conversations, this is extraction of labor without compensation.

Research extraction. Academic and corporate researchers collect data from marginalized communities, publish findings, and advance their careers — often without returning meaningful benefits to the communities studied. Genetic data from indigenous populations has been used to develop pharmaceutical products without consent or benefit-sharing. Urban sensor data from low-income neighborhoods has been used to train smart city algorithms deployed in affluent communities.

"Data colonialism isn't a metaphor," Sofia Reyes said during a policy workshop. "It's a structural analysis. The flows of value — data flowing from communities to corporations, profits flowing from corporations to shareholders, harms flowing from algorithms back to communities — follow the same extractive logic as the flows of sugar, cotton, and gold. The technology is new. The logic is centuries old."

32.3.3 Challenging Data Colonialism

Resistance to data colonialism takes multiple forms:

Data sovereignty movements assert communities' right to control data about themselves and their members (Section 32.4).

Data cooperatives organize collective data governance, allowing communities to negotiate data use on their own terms.

Platform cooperatives build alternative platforms owned and governed by their users rather than by venture capital investors.

Regulatory interventions like the EU's GDPR and the AU's Data Policy Framework establish legal constraints on data extraction, though enforcement remains uneven.

Counter-data practices collect and publish data that challenges dominant narratives — community-controlled environmental monitoring, citizen science projects, independent audit initiatives.


32.4 Indigenous Data Sovereignty: A Deeper Examination

32.4.1 Foundations

In Chapter 3, we introduced indigenous data sovereignty (IDS) as an emerging framework for data governance. Here we examine it in greater depth as a model for community-controlled data governance that challenges the assumptions of both corporate and state data systems.

The Global Indigenous Data Alliance (GIDA) defines indigenous data sovereignty as "the right of Indigenous peoples to govern the collection, ownership, and application of data about Indigenous communities, peoples, lands, and resources."

This right is grounded in:

  • The United Nations Declaration on the Rights of Indigenous Peoples (UNDRIP, 2007), which affirms Indigenous peoples' right to self-determination and to maintain and develop their own institutions.
  • The reality that data has historically been collected by colonial authorities, about Indigenous populations, for the purposes of the colonizer — from census data used to administer reservations to genetic data extracted without consent.
  • The recognition that Indigenous knowledge systems — including oral histories, ecological knowledge, and cultural practices — constitute data that requires governance according to Indigenous values and protocols, not Western intellectual property frameworks.

32.4.2 The CARE Principles

The CARE Principles for Indigenous Data Governance (Carroll et al., 2020) complement the FAIR Principles (Findable, Accessible, Interoperable, Reusable) that guide mainstream open data practices:

C — Collective Benefit. Data ecosystems should be designed and function in ways that enable Indigenous peoples to derive benefit from the data.

A — Authority to Control. Indigenous peoples' rights and interests in Indigenous data must be recognized, and their authority to control such data must be empowered.

R — Responsibility. Those working with Indigenous data have a responsibility to share how those data are used to support Indigenous peoples' self-determination and collective benefit.

E — Ethics. Indigenous peoples' rights and wellbeing should be the primary concern at all stages of the data life cycle.

Callout Box: FAIR vs. CARE — Complementary, Not Competing

The FAIR Principles prioritize data accessibility — making data findable, interoperable, and reusable. The CARE Principles prioritize data justice — ensuring that data accessibility serves the communities the data concerns. Without CARE, FAIR principles can facilitate data extraction: making Indigenous data maximally accessible to non-Indigenous researchers without ensuring Indigenous communities benefit. Together, they point toward a data governance model that is both scientifically productive and ethically grounded.

32.4.3 Case Studies in Indigenous Data Sovereignty

The Māori Data Sovereignty Network (Te Mana Raraunga) in Aotearoa New Zealand has developed protocols for the governance of Māori data — data about Māori people, language, culture, resources, and environment. Their framework recognizes Māori data as a "taonga" (treasure) subject to collective Māori governance, and has influenced New Zealand government data policy.

The First Nations Information Governance Centre (FNIGC) in Canada operates under OCAP principles — Ownership, Control, Access, and Possession — which assert First Nations' collective ownership of community data and their right to control all aspects of data management and use.

The US Census and Native American undercounting. The US Census has historically undercounted Native American populations, leading to reduced federal funding and political representation. Native-led census outreach initiatives have worked to improve count accuracy while maintaining tribal sovereignty over how census data is used.

These examples demonstrate that indigenous data sovereignty is not an abstract principle but an operational framework with concrete institutional expressions.


32.5 Data Feminism: Rethinking Data Through an Intersectional Lens

32.5.1 The Framework

Catherine D'Ignazio and Lauren Klein's Data Feminism (2020) applies intersectional feminist theory to data science, identifying seven principles for more just and equitable data practices:

  1. Examine power. Data science is not neutral — it reflects and reinforces power structures. Begin every data project by asking: who has power, who doesn't, and how does this data system affect the distribution of power?

  2. Challenge power. Use data science to challenge unjust power structures, not merely to describe them. Data systems can be designed to illuminate inequality, amplify marginalized voices, and redistribute resources.

  3. Elevate emotion and embodiment. Data science's emphasis on objectivity and rationality often devalues emotional knowledge, lived experience, and embodied understanding. A more complete data science would incorporate multiple ways of knowing.

  4. Rethink binaries and hierarchies. Data classification systems impose binary categories (male/female, citizen/non-citizen, employed/unemployed) that erase the complexity of human experience. Feminist data science questions these categories and their consequences.

  5. Embrace pluralism. No single perspective captures the full picture. Data systems should be designed to incorporate multiple viewpoints, especially those of communities most affected by the systems' decisions.

  6. Consider context. Data is never "raw" — it is always shaped by the context of its collection. Feminist data science insists on transparency about how data was collected, by whom, for what purpose, and with what limitations.

  7. Make labor visible. The labor that produces and maintains data systems — from the underpaid content moderators (Chapter 31) to the data entry workers who clean and label training datasets — is often invisible. Making this labor visible is an ethical and analytical imperative.

32.5.2 Missing Data and Structural Silence

One of data feminism's most powerful insights concerns missing data — the systematic absence of data about marginalized populations, which renders those populations invisible to data-driven decision-making.

Missing data is not accidental. It reflects power:

Femicide data. In many countries, data on femicide (the killing of women because of their gender) is not systematically collected. Activist groups like Data Cívica in Mexico and Feminicidio.net in Spain have built independent databases, demonstrating that the absence of official data was itself a form of structural violence — a refusal to count that made the problem invisible.

Police violence data. In the United States, no comprehensive national database of police killings existed until journalists and activists built them (the Washington Post's Fatal Force database, The Guardian's The Counted, and the community-maintained Mapping Police Violence). The absence of official data was not an oversight — it was a structural choice that served the interests of those who would rather the problem remain unmeasured.

Trans and non-binary populations. Most national statistical systems do not collect data on transgender and non-binary populations. This means that the specific needs, health outcomes, and economic circumstances of these communities are invisible to evidence-based policymaking.

"Silence in data is not just an absence," Dr. Adeyemi observed. "It is a statement about whose experiences are worth measuring, whose suffering is worth counting, whose needs are worth knowing about. Missing data is not missing by accident. It is missing by design — or, more precisely, by the design of systems that were built by and for people who didn't need that data."

32.5.3 Counter-Data and Data Activism

Data feminism doesn't just critique existing data systems — it supports the creation of alternative data practices:

Community data collection. Organizations like the Anti-Eviction Mapping Project in San Francisco collect and visualize data on displacement, eviction, and housing speculation that official data sources don't capture. Their work makes visible the human costs of gentrification and provides evidence for policy advocacy.

Participatory data design. Rather than designing data systems for marginalized communities, participatory approaches design systems with them — incorporating community knowledge, values, and priorities into the data collection and analysis process.

Data auditing. Algorithmic auditing initiatives (Chapter 17) conduct independent tests of automated systems to identify bias. These audits often reveal disparities that the systems' designers either didn't test for or chose not to disclose.

Eli connected data feminism to his community organizing work. "In Detroit, we don't need someone from Silicon Valley to come collect data about us. We need the resources and infrastructure to collect data for ourselves — data that answers our questions, supports our priorities, and is governed by our community. That's what data justice looks like."


32.6 VitraMed: The Equity Gap in Predictive Health

32.6.1 Discovering the Disparity

In the wake of the data breach crisis (Chapter 30), VitraMed's newly hired Data Protection Officer conducted a comprehensive audit of the company's predictive health models. The audit revealed a disparity that the company had suspected but not systematically measured: VitraMed's patient risk scoring models were significantly less accurate for rural and underserved populations.

The reasons were structural, not intentional:

Training data skew. VitraMed's models had been trained primarily on data from its earliest clinic clients — suburban and urban practices serving insured, predominantly white populations. Patients from rural clinics, community health centers, and safety-net providers were underrepresented in the training data.

Feature availability gaps. The models relied on features — electronic health record completeness, lab result frequency, specialist referral history — that correlated with access to healthcare. Patients in underserved communities had fewer recorded health encounters, less complete records, and fewer specialist visits — not because they were healthier but because they had less access to the care that generated the data.

Outcome label bias. The models were trained to predict "adverse health events" as recorded in the EHR. But the recording of adverse events depended on a patient being in contact with the healthcare system. Patients who experienced adverse events but did not seek care (due to cost, distance, distrust, or transportation barriers) were labeled as "no adverse event" in the training data — a systematic undercount of harm in precisely the populations most at risk.

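A small simulation makes the label-bias mechanism concrete. In the sketch below — a toy illustration with invented rates, not VitraMed's actual data — two patient groups have the identical true adverse-event rate, but an event becomes a training label only when the patient is in contact with the healthcare system.

```python
# Toy illustration of outcome label bias. Both groups have the same
# true adverse-event rate; an event is recorded (and becomes a
# training label) only if the patient contacts the healthcare system.
# All rates are invented for illustration.
import random

random.seed(0)
TRUE_EVENT_RATE = 0.20
CONTACT_RATE = {"high_access": 0.95, "low_access": 0.55}
N = 10_000

for group, contact_rate in CONTACT_RATE.items():
    labeled_events = 0
    for _ in range(N):
        had_event = random.random() < TRUE_EVENT_RATE
        seen_by_system = random.random() < contact_rate
        if had_event and seen_by_system:   # unseen events are labeled 0
            labeled_events += 1
    print(f"{group}: true rate = {TRUE_EVENT_RATE:.2f}, "
          f"labeled rate = {labeled_events / N:.2f}")
```

A model trained on these labels would "learn" that the low-access group is at lower risk, when in fact its risk is identical and simply under-recorded.
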
32.6.2 The Equity Audit

Mira, working from within the company during her summer internship, advocated for a formal equity audit. The audit examined model performance across demographic groups and geographic regions, producing findings that were uncomfortable but illuminating:

  • Model sensitivity (the ability to correctly identify patients at risk) was 23% lower for patients in rural ZIP codes than for patients in urban and suburban ZIP codes.
  • False negative rates — cases where the model incorrectly classified a patient as "low risk" when they were actually at risk — were 31% higher for Black patients than for white patients.
  • The model performed worst for the patients who needed it most: low-income, rural, minority patients with limited healthcare access and incomplete records.

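Disparities like these surface only when performance is computed per group rather than in aggregate. The sketch below illustrates the kind of groupwise check such an audit might run; the function, group names, and sample numbers are our own illustration, not VitraMed's actual tooling.

```python
# Minimal sketch of a groupwise sensitivity / false-negative-rate
# check. Group names and sample data are illustrative only.
from collections import defaultdict

def groupwise_sensitivity(records):
    """records: iterable of (group, actually_at_risk, model_flagged).
    For each group, sensitivity = TP / (TP + FN) among patients who
    were actually at risk; the false-negative rate is its complement."""
    tallies = defaultdict(lambda: {"tp": 0, "fn": 0})
    for group, at_risk, flagged in records:
        if at_risk:                            # condition on true risk
            tallies[group]["tp" if flagged else "fn"] += 1
    for group, t in sorted(tallies.items()):
        sens = t["tp"] / (t["tp"] + t["fn"])
        print(f"{group:>6}: sensitivity={sens:.2f}  FNR={1 - sens:.2f}")

# Toy cohort: the model catches 80% of at-risk urban patients but
# only 55% of at-risk rural patients.
cohort = ([("urban", True, True)] * 80 + [("urban", True, False)] * 20
          + [("rural", True, True)] * 55 + [("rural", True, False)] * 45)
groupwise_sensitivity(cohort)
```

Aggregate sensitivity over this toy cohort (0.68) would mask the gap; only the per-group breakdown reveals who the model is failing.
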
"This is the data divide in action," Mira told her father. "Our model works best for the patients who already have the best healthcare and worst for the patients who already have the worst. We're not reducing health disparities — we're encoding them."

Vikram Chakravarti, VitraMed's founder, was shaken. "We built this to help people," he said. "How did we end up in a position where our technology reinforces the very inequalities it was supposed to address?"

The answer, as this chapter has shown, is structural. VitraMed did not intend to create a biased model. But the data ecosystem in which it operated — shaped by digital redlining, data colonialism, and the structural silences of missing data — produced biased outcomes as a natural consequence of building on an unequal foundation.

Callout Box: What VitraMed Did Next

VitraMed's response to the equity audit included:

  1. Partnering with community health centers and rural clinics to improve training data representation
  2. Developing separate performance benchmarks for underserved populations, with public reporting
  3. Engaging community advisory boards in model development and evaluation
  4. Implementing "data supplements" — alternative data sources (community health surveys, social determinants data) to compensate for EHR gaps
  5. Publishing the equity audit methodology so other health-tech companies could replicate it

These steps did not eliminate the disparity overnight. But they demonstrated that acknowledging the problem — rather than hiding behind claims of "algorithmic objectivity" — was the first step toward meaningful equity in data-driven health technology.


32.7 Eli's Detroit: The Digital Divide as Compounding Factor

32.7.1 A Neighborhood Case Study

Eli's neighborhood on Detroit's east side illustrates how digital inequality compounds every harm we've examined in this book:

Surveillance without connectivity. The Smart City sensors deployed in Eli's neighborhood (Chapter 1) collect data continuously — but residents lack the broadband access necessary to use the city's transparency portals, file complaints about surveillance practices, or access the public records that document how their data is being used. They are surveilled but cannot participate in the governance of that surveillance.

Predictive policing without digital recourse. The predictive policing algorithm (Chapter 14) targets Eli's neighborhood based on data that overrepresents communities of color. Residents who are stopped or arrested as a result of algorithmic predictions often cannot access the digital tools — legal databases, civil rights organization portals, algorithmic audit reports — that might help them challenge those predictions.

Health data without health access. VitraMed's models underperform for patients in communities like Eli's — and those patients are the least likely to have the broadband access, digital literacy, and institutional connections needed to advocate for better model performance.

"The digital divide isn't just about who has WiFi," Eli told the class. "It's about who has the power to shape the systems that shape their lives. Every data system we've studied in this class — surveillance, predictive policing, credit scoring, health tech — works differently depending on which side of the digital divide you're on. And the same communities that are on the wrong side of the digital divide are on the wrong side of every one of those systems."

32.7.2 Community Responses

Eli's community organizing work has focused on bridging the digital divide through collective action:

Community broadband initiatives. Eli has been advocating for municipal broadband in Detroit, arguing that internet access is essential infrastructure — like water and electricity — that should not be left entirely to the market. The Detroit Community Technology Project, a real organization that has been building community-owned wireless networks since 2016, provides a model.

Digital literacy programs. Community organizations in Detroit offer digital literacy training — but Eli argues that "digital literacy" must include not just technical skills (how to use a computer) but political literacy (how to understand and challenge the data systems that affect your community).

Data governance advocacy. Eli's testimony before the city council on the data governance ordinance (Chapter 25) included demands for community representation on data governance boards, mandatory equity audits of city data systems, and public reporting on how city data investments are distributed across neighborhoods.


32.8 Toward Data Justice: Models and Frameworks

32.8.1 The Data Justice Framework

Linnet Taylor (2017) defines data justice as the pursuit of fairness in the way people are made visible, represented, and treated as a result of their production of digital data. The framework identifies three pillars:

  1. (In)visibility. Who is seen and who is unseen in data systems? Being visible can enable access to services but also enable surveillance. Being invisible can provide privacy but also deny access. Data justice requires control over one's own visibility.

  2. Engagement with technology. On what terms do people engage with data-collecting technologies? Are those terms genuinely voluntary? Are they informed? Are they equitable? Data justice requires meaningful participation in the governance of technologies that affect people's lives.

  3. Non-discrimination. Are data-driven decisions free from bias? Do automated systems treat people equitably regardless of race, gender, income, geography, or other protected characteristics? Data justice requires both technical fairness (Chapter 15) and structural equity.

32.8.2 From Individual Rights to Collective Justice

A key insight of the data justice framework is that individual rights — privacy rights, data access rights, consent rights — are necessary but not sufficient for data equity. Individual rights assume a level playing field: each person exercises their rights independently, and the aggregate result is fair.

But the playing field is not level. Individual data rights mean little if:

  • You lack the broadband access to exercise them
  • You lack the digital literacy to understand what you're consenting to
  • You lack the economic power to choose alternatives when you disagree with data practices
  • You lack the political representation to influence the regulations that govern data systems

Data justice therefore requires collective mechanisms — community governance, cooperative structures, solidarity networks, political organizing — that can counterbalance the structural power of data-collecting institutions.

Reflection: Consider a data system that affects your daily life (a credit score, a social media algorithm, a health app, a university LMS). Using Taylor's three pillars — visibility, engagement, and non-discrimination — evaluate whether that system is data-just. What would need to change?


32.9 Chapter Summary

Key Concepts

  • The digital divide operates on three levels — access, skills, and outcomes — that compound each other and track preexisting lines of inequality (race, income, geography, age, disability).
  • Digital redlining reproduces historical patterns of discriminatory infrastructure investment in the digital domain, with documented disparities in broadband pricing, 5G deployment, and algorithmic service delivery.
  • Data colonialism (Couldry & Mejias, 2019) analyzes how contemporary data extraction reproduces the logic of historical colonialism — extraction without consent, value export without compensation, dependency creation.
  • Indigenous data sovereignty asserts Indigenous peoples' right to govern data about their communities, operationalized through frameworks like the CARE Principles and institutions like the FNIGC and Te Mana Raraunga.
  • Data feminism (D'Ignazio & Klein, 2020) applies intersectional feminist analysis to data systems, identifying structural biases including the systematic production of missing data about marginalized populations.
  • Data justice (Taylor, 2017) provides a framework for evaluating data systems through three pillars: visibility, engagement, and non-discrimination.

Key Debates

  • Is the digital divide primarily a market failure (requiring public investment) or a policy failure (requiring regulatory reform)?
  • Does the concept of "data colonialism" illuminate structural dynamics or trivialize historical colonialism by metaphorical extension?
  • Can individual data rights achieve data justice, or are collective governance mechanisms necessary?
  • How should data systems handle the tension between visibility (which enables services) and invisibility (which protects privacy) for marginalized communities?

Applied Framework

The Data Equity Audit:

  1. Representation — Who is in the data? Who is missing? What are the consequences of absence?
  2. Access — Who can access and use the data system? What barriers exist?
  3. Benefit — Who benefits from the data system? Are benefits equitably distributed?
  4. Harm — Who is harmed? Are harms disproportionately borne by specific communities?
  5. Governance — Who governs the data system? Are affected communities meaningfully represented in governance decisions?

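For teams that want to operationalize the audit, the five questions can be encoded as a structured report that every data system must file before deployment. The sketch below is one possible encoding — the class and field names are hypothetical, not an established schema.

```python
# One possible encoding of the five-step Data Equity Audit as a
# structured report. The schema is illustrative, not a standard.
from dataclasses import dataclass

@dataclass
class DataEquityAudit:
    system_name: str
    representation: str  # 1. Who is in the data? Who is missing?
    access: str          # 2. Who can access and use the system?
    benefit: str         # 3. Who benefits, and is it equitable?
    harm: str            # 4. Who is harmed, and disproportionately so?
    governance: str      # 5. Are affected communities represented?

    def is_complete(self) -> bool:
        """An audit with any unanswered section should not be filed."""
        return all(value.strip() for value in vars(self).values())

audit = DataEquityAudit(
    system_name="patient risk scoring model",
    representation="rural and safety-net patients underrepresented",
    access="clinicians only; no patient-facing view",
    benefit="insured urban patients receive the most accurate scores",
    harm="higher false-negative rates for rural and Black patients",
    governance="no community advisory board as of this audit",
)
print(audit.is_complete())  # True: all five questions were answered
```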

What's Next

The structural inequalities examined in this chapter extend into the workplace. In Chapter 33: Labor, Automation, and the Gig Economy, we examine how data-driven systems are reshaping work itself — from algorithmic management that tracks every keystroke to gig economy platforms that classify workers as independent contractors to avoid labor protections. Sofia Reyes takes center stage as her work at the DataRights Alliance leads her to investigate the data asymmetries that define the modern workplace.


Chapter 32 Exercises → exercises.md

Chapter 32 Quiz → quiz.md

Case Study: Digital Redlining — Broadband Discrimination in American Cities → case-study-01.md

Case Study: Data Feminism in Practice — Challenging Missing Data → case-study-02.md