Case Study 18.1: The New York Times Location Dataset — What "Anonymous" Data Reveals
Overview
In December 2019, the New York Times Opinion section published a landmark investigation titled "One Nation, Tracked," in which reporters Stuart A. Thompson and Charlie Warzel analyzed a dataset of more than 50 billion location pings from the smartphones of more than 12 million Americans, collected over several months. The dataset was provided to the Times by a source within the location data industry who was concerned about its privacy implications.
The investigation demonstrated, with specific examples (reported without identifying the individuals involved), that "anonymous" location data was trivially re-identifiable and contained intimate details about millions of people's lives — details that the people themselves did not know were being collected or sold.
This case study examines the investigation's findings, the industry response, and the regulatory consequences.
What the Dataset Showed
The Times' reporters were given access to a dataset typical of those sold by location data brokers. Each record contained the following fields (a schematic example follows the list):
- A device identifier (not a name or phone number, but a persistent ID tied to a specific device)
- Latitude and longitude coordinates, often precise to within a few meters
- A timestamp
- An accuracy score (how precise the location measurement was)
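The Times did not publish the dataset's schema, so the following is a minimal sketch with hypothetical field names, showing what a single broker-style record plausibly looks like:

```python
from dataclasses import dataclass

@dataclass
class LocationPing:
    """One row of a broker-style location feed (field names are hypothetical)."""
    device_id: str     # persistent advertising-style identifier, not a name
    latitude: float    # decimal degrees
    longitude: float   # decimal degrees
    timestamp: int     # Unix epoch seconds
    accuracy_m: float  # estimated measurement error, in meters

# One ping: an unnamed device in midtown Manhattan, December 2016
ping = LocationPing("a1b2c3d4-example", 40.7553, -73.9868, 1481151600, 12.0)
```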
For many devices, the dataset contained hundreds or thousands of records per day, creating a near-continuous location trace. Reporters could follow the movement of a specific device from home to work, through leisure activities, to medical appointments, and through residential neighborhoods.
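The Times did not publish its analysis as code, but the core inference is simple enough to sketch. Assuming records shaped like the LocationPing above, the following simplified Python guesses a device's home as its most frequent overnight grid cell; a real analysis would use local time zones and proper spatial clustering, but dense traces make even this crude version effective:

```python
from collections import Counter
from datetime import datetime, timezone

def likely_home(pings, grid=0.001):
    """Guess a device's home: its most frequent overnight grid cell.

    A grid of 0.001 degrees is roughly a 100 m cell at mid-latitudes.
    """
    cells = Counter()
    for p in pings:
        hour = datetime.fromtimestamp(p.timestamp, tz=timezone.utc).hour
        if hour >= 22 or hour < 6:  # keep only overnight pings
            cells[(round(p.latitude / grid), round(p.longitude / grid))] += 1
    if not cells:
        return None
    lat_cell, lon_cell = cells.most_common(1)[0][0]
    return lat_cell * grid, lon_cell * grid  # approximate cell center
```

The same modal-cell trick applied to weekday business hours yields a likely workplace, and a home-plus-workplace pair, cross-referenced against property records or employer directories, is often enough to put a name to a device identifier.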
The White House finding: By filtering for devices that appeared in restricted areas near the White House and the Pentagon, reporters identified devices that likely belonged to Secret Service agents and military personnel, and could follow those devices to home addresses, workplaces, and daily commuting routes. The implication was that a foreign intelligence agency, or any private company with access to this dataset, could potentially identify and locate security personnel.
The domestic violence shelter finding: The dataset included records of devices that had visited domestic violence shelters — locations that victims are told are confidential. The presence of a device at a shelter, combined with the device's history at other locations (including a previous home address), could identify a victim and their new location to an abuser.
The psychiatric facility finding: Devices that had visited psychiatric facilities, addiction treatment centers, and HIV clinics were identifiable from their broader location histories, revealing medical information about the devices' owners without any medical record access.
The religious and political activity finding: Devices that had attended specific churches, mosques, and temples, or that had appeared at political rallies and labor organizing meetings, were identifiable from their histories — creating records of religious affiliation and political activity for millions of people.
In each case, the reporters could identify these patterns without knowing the device owner's name. The patterns pointed unambiguously to specific individuals who could be identified by cross-referencing with publicly available data (home address records, employer directories, social media profiles).
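To see how little analysis a finding like the White House one requires, here is a sketch of the filtering step: a bounding-box geofence over per-device traces, composed with the likely_home sketch above. The function names and coordinates are illustrative assumptions, not the reporters' actual method:

```python
from collections import defaultdict

def devices_seen_inside(pings, lat_min, lat_max, lon_min, lon_max):
    """Group pings by device; keep devices with any ping inside the box."""
    traces = defaultdict(list)
    for p in pings:
        traces[p.device_id].append(p)
    return {
        dev: trace for dev, trace in traces.items()
        if any(lat_min <= p.latitude <= lat_max and
               lon_min <= p.longitude <= lon_max for p in trace)
    }

# Hypothetical workflow: which devices entered a restricted area,
# and where do those devices spend their nights?
# visitors = devices_seen_inside(all_pings, 38.8970, 38.8990, -77.0385, -77.0355)
# homes = {dev: likely_home(trace) for dev, trace in visitors.items()}
```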
The Industry Response
Major location data brokers and the apps that fed their databases responded to the Times investigation with several lines of defense:
"The data is anonymized." Brokers maintained that removing names and phone numbers from the dataset constituted adequate privacy protection. The Times investigation directly and visibly refuted this claim, but it remained the industry's primary legal and public relations position.
"Users consented through app permissions." Apps that contributed location data to broker pipelines pointed to their privacy policies and location permission requests as user consent. As discussed in the chapter, this consent is formal rather than meaningful — most users do not read privacy policies and do not understand that granting location access to a weather app may include sharing their location with third parties.
"The data is used for beneficial purposes." Brokers emphasized legitimate commercial uses: retail analytics, real estate research, public health analysis. These uses are real. They do not address the question of whether the data's existence creates risks that outweigh its benefits, particularly for vulnerable populations.
The industry did not, as a direct response to the investigation, change its data collection practices.
Regulatory Consequences
The Times investigation contributed to a series of regulatory actions that, while significant, fell short of comprehensively addressing the location data broker ecosystem:
FTC action against X-Mode/Outlogic: In January 2024, the FTC settled with X-Mode Social (by then renamed Outlogic), requiring the company to stop selling sensitive location data and to delete existing datasets covering sensitive locations. The FTC described it as its first settlement with a data broker over the collection and sale of sensitive location data.
FTC action against Venntel: Later in 2024, the FTC reached a settlement with Venntel, a location data broker, alleging that the company had sold sensitive location data (including visits to health facilities and religious sites) without adequate consent. The settlement prohibited Venntel from selling data for sensitive location categories and required it to delete previously collected data.
State legislative responses: California, Virginia, Colorado, and Connecticut passed or amended comprehensive privacy laws with some location data provisions. None created a comprehensive prohibition on location data brokerage.
Congress: Multiple bills addressing location data broker practices were introduced in Congress between 2019 and 2023 without passing. The American Data Privacy and Protection Act, which would have created federal data minimization requirements, advanced out of committee in the House in 2022 but never received a floor vote.
As of this writing, the location data broker industry continues to operate with substantial latitude. The specific practices documented in the Times investigation — collection of sensitive location visits, sale to government agencies, inadequate anonymization — have been subject to incremental regulatory action but not comprehensive reform.
Analysis
The Public Interest Case for the Investigation
The Times investigation is an important example of journalism as surveillance accountability. The reporters obtained a dataset that was being commercially traded, analyzed it in ways that documented specific harms and vulnerabilities, and published the results without identifying any of the individuals in the data. The public interest justification for the investigation — demonstrating that a commercial ecosystem was creating serious privacy risks without public awareness — is strong.
The investigation also illustrates the role of investigative journalism in the governance gap. Regulatory action against Venntel and X-Mode followed the Times investigation and subsequent similar reporting. The investigation generated the public pressure, and documented the harms, that regulatory agencies needed in order to act. Without the journalism, the regulatory action might not have occurred.
The Limits of Consent Frameworks
The investigation's most fundamental finding is that consent-based frameworks for location data privacy do not work. The apps that fed the broker ecosystem had obtained user consent in the formal sense, by presenting permission requests and privacy policies. That formal consent produced a commercial data ecosystem in which millions of people's most sensitive location activities were available to anonymous commercial buyers, without those people's awareness.
Consent-based frameworks fail when the party seeking consent has structural advantages — designing permission requests, writing policies, controlling the default settings — that render "consent" formal rather than genuine. The location data ecosystem is a demonstration at scale that when consent is designed to be given rather than to be meaningful, it produces the forms of permission without the substance of informed agreement.
The Vulnerable Populations Problem
The most troubling finding in the Times investigation involves vulnerable populations: domestic violence survivors, people receiving psychiatric care or addiction treatment, people whose religious or political affiliations could expose them to discrimination or violence. These are the people for whom location data exposure has the highest stakes and the least capacity for self-protection.
Location data brokers do not know or care whether a device belongs to a domestic violence survivor fleeing an abuser or to a tech executive attending a conference. The data infrastructure treats all devices equivalently. But the harms from exposure are radically unequal. A policy framework that focuses on average-case privacy risks misses the population for whom the risks are catastrophic.
The FTC's actions against Venntel and X-Mode, which specifically targeted sales of data about sensitive location categories (health facilities, religious sites), represent a recognition of this vulnerable populations problem. The limitation of this approach is that "sensitive categories" are hard to define comprehensively, and location data's power to reveal sensitive information comes precisely from its combinability — no single data point may look sensitive, but the pattern across many data points reveals what matters.
Discussion Questions
- The Times investigation used location data to identify patterns about Secret Service agents, domestic violence survivors, and psychiatric patients without accessing any communication content. What does this demonstrate about the adequacy of legal frameworks that strongly protect communication content but treat location data as less sensitive metadata?
- Location data brokers argue that users consent to data collection through app permissions. Evaluate this consent claim using the "meaningful consent" framework: is it informed, voluntary, and specific?
- The FTC's enforcement actions against Venntel and X-Mode required demonstrating that specific harmful practices occurred. Why is this "harm-specific" enforcement approach insufficient for the structural problem the Times investigation identified?
- The investigation demonstrates that "anonymous" location data is practically identifiable. Should the legal treatment of location data change based on this technical finding? What would change if "anonymous" location data received the same legal protections as identified location data?
- The Times decided to report the investigation without identifying any of the individuals in the dataset, even when it could have done so. Was this the right editorial decision? What ethical principles support it? Are there scenarios in which identifying individuals from the dataset would have been justified?