Case Study 31-2: AI and the Global Water Crisis — Data Centers in Drought Regions
Overview
When Pengfei Li and colleagues published "Making AI Less 'Thirsty'" in 2023, they quantified for the first time what many data center engineers and environmental advocates had long suspected: AI systems are significant freshwater consumers, and at scale their water footprint is substantial enough to raise serious questions about siting data centers in water-stressed regions. The paper estimated that ChatGPT — OpenAI's conversational AI product, running on Microsoft's Azure infrastructure — consumed approximately 500 milliliters (half a liter) of freshwater per conversation of roughly 20 to 50 queries. At hundreds of millions of daily conversations, this represents millions of liters of water per day.
This case study examines the water dimension of AI's environmental impact: how data centers consume water, what the Li et al. findings reveal, where the most serious geographic conflicts arise, and what responsible siting and disclosure practices for AI infrastructure would look like.
How Data Centers Consume Water
Cooling Technology and Water Use
The servers and power infrastructure in a large data center convert nearly all of their electrical input into heat: a 100-megawatt data center (a medium-large facility) must continuously reject roughly 100 megawatts of waste heat. This heat must be dissipated to maintain safe server operating temperatures and prevent equipment failure.
The dominant cooling technology in modern large data centers is evaporative cooling, also called evaporative cooling towers or cooling tower heat rejection. In this system, hot water absorbs heat from server systems (via heat exchangers), then flows to cooling towers where it is exposed to ambient air. A portion of the hot water evaporates, removing heat from the remaining water through the latent heat of evaporation. The cooled water is recirculated to the servers. The evaporated water is lost from the system and must be replaced with fresh water.
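The latent-heat arithmetic puts a rough lower bound on evaporative water use. The sketch below works through it; the figures are textbook approximations, and real cooling towers consume additional water through drift and blowdown, so actual consumption runs higher.

```python
# Rough physics of evaporative cooling: how much water must evaporate
# to reject a given amount of server heat? A sketch under textbook
# assumptions -- real cooling towers also lose water to drift and
# blowdown, so actual consumption is higher than this lower bound.

LATENT_HEAT_MJ_PER_KG = 2.26   # latent heat of vaporization of water, ~30 C
MJ_PER_KWH = 3.6               # 1 kWh = 3.6 MJ

def liters_evaporated_per_kwh_heat() -> float:
    """Theoretical minimum liters of water evaporated per kWh of heat rejected."""
    kg = MJ_PER_KWH / LATENT_HEAT_MJ_PER_KG  # ~1.59 kg of water per kWh
    return kg  # 1 kg of water is ~1 liter

def daily_water_m3(facility_mw: float) -> float:
    """Cubic meters evaporated per day if a facility rejects its entire
    electrical load as heat through evaporative cooling."""
    kwh_per_day = facility_mw * 1000 * 24
    liters = kwh_per_day * liters_evaporated_per_kwh_heat()
    return liters / 1000

print(round(liters_evaporated_per_kwh_heat(), 2))  # 1.59 L/kWh
print(round(daily_water_m3(100)))                  # 3823 m^3/day
```

Under these assumptions, a 100 MW facility cooled entirely by evaporation would evaporate on the order of 3,800 cubic meters (about a million US gallons) of water per day, which is why the choice of cooling technology dominates a facility's water footprint.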
The efficiency of evaporative cooling depends on ambient temperature and humidity. In hot, dry climates — where land and power are often cheap — evaporative cooling is particularly effective at keeping servers cool, but requires more water than in cool, humid climates where less evaporation is needed. This creates a paradox: the climates most amenable to data center siting from a cooling perspective are often the climates where water is most scarce.
Alternative cooling approaches include:
- Air cooling: Using fans and ambient air without water. Less efficient at high temperatures but consumes no water. Most suitable for cool climates.
- Seawater cooling: Using ocean or bay water as a heat sink. Microsoft has experimented with undersea data centers (Project Natick). Eliminates freshwater consumption but has other environmental considerations.
- Liquid cooling at the chip level: Bringing coolant fluid directly to the chip, enabling more efficient heat transfer and reducing or eliminating the need for facility-level evaporative cooling. Emerging and potentially transformative for high-performance computing.
The technology choice is primarily economic: evaporative cooling is well-understood, reliable, and cost-effective in most climates, which is why it dominates despite its water consumption.
The Li et al. Methodology
Li and colleagues estimated AI water consumption by combining several data sources: Microsoft's published Water Use Effectiveness (WUE) metrics for its data centers (liters of water consumed per kWh of server energy); the geographic distribution of Microsoft's AI-serving data centers and the local WUE that applies in each climate; and estimates of the energy consumed per GPT conversation derived from published hardware specifications and estimated query volumes.
The key finding: ChatGPT consumes approximately 500 mL of water per conversation, with significant variation depending on which geographic region's data centers serve the query. Queries served by data centers in temperate climates (Pacific Northwest, Netherlands) require less cooling water; queries served by data centers in hot, dry climates (Arizona, Texas) require more.
The paper also estimated GPT-3 training water consumption at approximately 700,000 liters, comparable to the water consumed by 23 average American households in a month, or enough to fill roughly 2,800 bathtubs (at about 250 liters each).
The methodology has limitations acknowledged by the authors: actual WUE varies by season and time of day; the geographic distribution of ChatGPT serving infrastructure is not publicly disclosed; and OpenAI and Microsoft do not publish the query volumes on which consumption estimates depend. The figures are credible approximations rather than precise measurements — and the approximations are only possible because of partial information available from Microsoft's voluntary sustainability reporting.
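The structure of the estimate can be illustrated with back-of-envelope arithmetic. Every input below (energy per query, conversation length, WUE, PUE, and the water intensity of grid electricity) is an illustrative assumption, not a figure from the paper; the point is the shape of the calculation, which adds on-site cooling water to off-site power-plant cooling water.

```python
# Back-of-envelope sketch of a Li et al.-style estimate. All inputs are
# illustrative placeholders, not the paper's actual parameters.

ENERGY_PER_QUERY_WH = 4.0        # assumed server energy per query (Wh)
QUERIES_PER_CONVERSATION = 30    # assumed length of a "conversation"
ONSITE_WUE_L_PER_KWH = 0.55      # assumed on-site WUE (liters per server kWh)
OFFSITE_WATER_L_PER_KWH = 3.1    # assumed water intensity of grid electricity
PUE = 1.2                        # power usage effectiveness (facility/IT energy)

def ml_water_per_conversation() -> float:
    server_kwh = ENERGY_PER_QUERY_WH * QUERIES_PER_CONVERSATION / 1000
    facility_kwh = server_kwh * PUE                      # includes cooling overhead
    onsite_l = server_kwh * ONSITE_WUE_L_PER_KWH         # cooling tower evaporation
    offsite_l = facility_kwh * OFFSITE_WATER_L_PER_KWH   # power plant cooling
    return (onsite_l + offsite_l) * 1000                 # liters -> milliliters

print(round(ml_water_per_conversation()))  # 512 mL, the same order as ~500 mL
```

Note how sensitive the result is to the assumed WUE: swapping the on-site figure for a hot, dry climate roughly doubles the on-site term, which is exactly the geographic variation the paper describes.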
The Geographic Conflicts
Arizona: AI Data Centers vs. Colorado River
The Phoenix metropolitan area has become one of the most significant data center hubs in the United States. Microsoft, Google, Meta, Apple, Oracle, and multiple colocation providers operate large data centers in the greater Phoenix area, attracted by cheap power (historically, though rates are rising), large available land parcels, business-friendly state policy, and the effective evaporative cooling enabled by the hot, dry climate.
The water dimension of this concentration is alarming in context. The Phoenix area receives approximately 8 inches of rain annually — far below the 28-inch US average. It depends for its water supply on the Colorado River (via the Central Arizona Project canal), Salt River Project reservoirs, and groundwater. The Colorado River itself is severely overallocated: the Law of the River — the series of compacts, court decisions, and federal statutes governing Colorado River water — allocated the river based on early 20th century measurements that substantially overestimated its average flow. The gap between allocated withdrawals and actual average flow has been managed by drawing down storage in Lake Mead and Lake Powell over decades, toward historically unprecedented low levels.
In 2021, the Bureau of Reclamation declared the first-ever Tier 1 shortage on the Colorado River, triggering mandatory reductions in Arizona's water allocation beginning in 2022. Agricultural users in central Arizona bore the first significant cuts. The Bureau of Reclamation and the basin states reached a consensus plan for further cuts in 2023, with subsequent negotiations over long-term basin management. Data centers, which draw on municipal water supplies rather than holding Colorado River entitlements directly, are not subject to shortage restrictions, but they compete for water in a system where every gallon used by a data center is a gallon unavailable for other uses in an acutely water-limited basin.
The political economy of this conflict is instructive. Data centers bring construction jobs, high-wage operational jobs (though fewer than the size of facilities might suggest — large data centers are highly automated), and significant tax revenue. Cities and counties in the Phoenix area have competed vigorously to attract data centers through tax incentives, streamlined permitting, and favorable land deals. The water consumption of those data centers — measured in millions of gallons per year per facility — has received less attention in the economic development calculus.
Communities in the Phoenix area and water advocates have raised concerns about the cumulative water impact of data center concentration, but data centers are typically served by municipal water utilities that don't publicly characterize their largest industrial customers, making quantification of the data center water footprint difficult. Individual data center operators do not typically disclose their facility-level water consumption.
Netherlands: Moratoriums and Political Controversy
The Netherlands provides a contrasting case: a water-sufficient country (approximately 32 inches of annual rainfall, extensive water management infrastructure) that nonetheless faced significant political controversy over data center water use and eventually imposed policy restrictions.
The Netherlands, and particularly the Amsterdam metropolitan area, became the most significant European data center hub through a combination of: excellent transatlantic internet connectivity (the Amsterdam Internet Exchange is one of the world's most important), business-friendly English-language environment, EU membership (important for data sovereignty), and historically reliable power and water.
By 2022, data centers in the Netherlands consumed approximately 3-4 TWh of electricity annually — roughly 3-4% of national electricity demand — and were growing rapidly. Several municipalities, led by Amsterdam, imposed temporary moratoria on new data center development, citing electricity grid constraints, land use impacts, water use, and a desire to ensure that data center growth served local economic development objectives rather than simply serving global cloud demand.
The Dutch national government subsequently developed a data center localization policy that attempted to direct future data center development away from densely populated areas and toward locations with power and water capacity. Requirements for efficient water use and preference for data centers that could reuse their waste heat (for district heating or industrial processes) were incorporated.
The Dutch response illustrates the policy tools available to governments that wish to govern data center development: moratorium, localization requirements, efficiency standards, and heat reuse mandates. These are not prohibitions on data center development but conditions on responsible development — an approach that could be applied in water-stressed regions in the United States and globally.
Chile and the Atacama Region
Chile has emerged as a target for major data center investment, with Microsoft, Google, Amazon, and multiple colocation providers announcing or building facilities primarily in the Santiago region and the northern regions that offer cheap renewable energy (Chile has abundant solar and wind resources). The northern regions — including the Atacama Desert, the world's driest non-polar desert — offer solar irradiance levels among the highest on earth, making them attractive for renewable energy-powered computing.
The water dimension of data centers in the Atacama context is acute. The Atacama region has some of the lowest annual rainfall on earth (less than 1 millimeter in some areas) and its rivers and aquifers are severely stressed by existing demands from copper mining, agriculture, and human settlement. Indigenous communities — Atacameño, Aymara — depend on these water resources for livelihood and cultural practice and have already experienced significant water stress from mining operations.
Data center development in the Atacama region that relies on evaporative cooling would consume water in a context where any additional water demand creates zero-sum competition with communities whose water resources are already under severe pressure. Air cooling or seawater cooling (where coastal siting is possible) could potentially allow data center development without freshwater competition, but these approaches require more careful engineering and may be more expensive.
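The water-for-energy trade-off behind that engineering choice can be made concrete. The parameters below (PUE and WUE for each cooling option, and grid carbon intensity) are hypothetical illustrations, not measured values for any real facility.

```python
# Sketch of the water-energy trade-off between evaporative and air (dry)
# cooling. All parameters are hypothetical illustrations, not measurements.
from dataclasses import dataclass

@dataclass
class CoolingOption:
    name: str
    pue: float            # facility energy / IT energy (air cooling runs hotter)
    wue_l_per_kwh: float  # on-site liters evaporated per IT kWh

def annual_footprint(option: CoolingOption, it_mw: float,
                     grid_kgco2_per_kwh: float = 0.4):
    """Annual on-site water (m^3) and energy-related CO2 (tonnes)."""
    it_kwh = it_mw * 1000 * 8760
    facility_kwh = it_kwh * option.pue
    water_m3 = it_kwh * option.wue_l_per_kwh / 1000
    co2_tonnes = facility_kwh * grid_kgco2_per_kwh / 1000
    return water_m3, co2_tonnes

evap = CoolingOption("evaporative", pue=1.15, wue_l_per_kwh=1.8)
air  = CoolingOption("dry/air",     pue=1.40, wue_l_per_kwh=0.0)

for opt in (evap, air):
    water, co2 = annual_footprint(opt, it_mw=50)
    print(f"{opt.name}: {water:,.0f} m^3 water, {co2:,.0f} t CO2 per year")
```

Under these assumed numbers, switching a 50 MW facility from evaporative to air cooling saves roughly 788,000 cubic meters of water per year but adds roughly 44,000 tonnes of CO2, which is the trade-off a siting decision in the Atacama would have to weigh explicitly.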
What Responsible Siting and Disclosure Would Look Like
A Framework for Responsible Data Center Water Stewardship
Drawing on the geographic case studies and the science of water stress assessment, a responsible framework for AI data center water stewardship would include:
Water Stress Screening: Before siting any new data center, conduct an assessment of the local water stress level using established tools such as the World Resources Institute's Aqueduct Water Risk Atlas. Facilities in regions classified as "high" or "extremely high" water stress should face heightened scrutiny and require demonstrated water-efficient design.
Water Use Transparency: Require facility-level disclosure of actual water consumption (not just design targets), including both direct water use (evaporative cooling) and indirect water use (power plant cooling for the electricity supply). This disclosure should be public, standardized, and verified.
Water Use Efficiency Standards: Commit to water use efficiency targets — not just PUE (power usage effectiveness) as the dominant metric, but WUE (water usage effectiveness) as an equally prominent metric. The Green Grid's WUE standard provides a benchmark; leading data centers achieve WUE below 0.5 liters/kWh.
Alternative Cooling Preference in Stressed Regions: Where data centers are to be built in water-stressed regions, require or strongly prefer cooling technologies that do not use evaporative freshwater cooling — air cooling, closed-loop cooling that reuses the same water without evaporation, or seawater cooling where coastal siting is feasible.
Rainwater Harvesting and Water Recycling: Data centers in locations with seasonal rainfall can reduce their net freshwater withdrawal through on-site rainwater harvesting and through use of reclaimed water (treated wastewater) for cooling rather than potable water. Microsoft and Google have made commitments to achieve net positive water balance — returning more water to local watersheds than they withdraw — in water-stressed regions.
Community Water Impact Disclosure: Disclose to local communities and water authorities the projected and actual water withdrawal associated with data centers, enabling informed local governance of shared water resources.
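The screening and efficiency elements of the framework can be sketched as simple decision logic. The stress categories follow WRI Aqueduct's naming, but the thresholds and flag wording below are illustrative assumptions, not regulatory standards.

```python
# Sketch of the siting-review logic described above. Stress categories
# follow WRI Aqueduct naming; thresholds and flags are illustrative.

HIGH_STRESS = {"high", "extremely high"}
WUE_BENCHMARK_L_PER_KWH = 0.5   # leading data centers achieve WUE below this

def wue(annual_site_water_l: float, annual_it_energy_kwh: float) -> float:
    """Water Usage Effectiveness: site water consumed per unit of IT energy."""
    return annual_site_water_l / annual_it_energy_kwh

def siting_review(stress_category: str, design_wue: float,
                  uses_evaporative: bool) -> list:
    """Return the heightened-scrutiny flags a proposed site would trigger."""
    flags = []
    if stress_category.lower() in HIGH_STRESS:
        flags.append("high water stress: heightened scrutiny required")
        if uses_evaporative:
            flags.append("prefer air, closed-loop, or seawater cooling")
    if design_wue > WUE_BENCHMARK_L_PER_KWH:
        flags.append("design WUE exceeds 0.5 L/kWh benchmark")
    return flags

# An evaporative-cooled design proposed in an extremely-high-stress basin
# triggers all three flags.
print(siting_review("extremely high", design_wue=1.2, uses_evaporative=True))
```

The same logic run against a low-stress site with an efficient design returns no flags, which is the point of the framework: conditions on development, not a prohibition.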
The Microsoft and Google Commitments
Both Microsoft and Google have made public commitments on water that go beyond current standard practice. Microsoft committed in 2020 to be "water positive" by 2030 — replenishing more water than it withdraws in stressed regions, through water conservation projects that return water to local watersheds. Google committed to replenishing 120% of its freshwater consumption in water-stressed regions by 2030.
These commitments are meaningful if implemented with integrity — they acknowledge the local water impact of data center operations and commit to offsetting it through watershed investments. Critics note that water replenishment projects (protecting a watershed, restoring a wetland) are not equivalent to returning water to the specific communities or aquifers affected by data center extraction; geographic and hydrological distance between the water withdrawn and the water replenished can make the "net positive" claim misleading.
The commitments also require disclosure and verification to be credible. If Microsoft and Google do not publish facility-level water consumption data, communities and regulators cannot independently verify whether the commitments are being met.
Analysis: What the Water Story Teaches
The Invisible Resource
Water is what economists call an "off-balance-sheet" resource in most AI company reporting: it is consumed, it appears as a cost (water utility bills), but it does not appear in sustainability reporting in any way comparable to carbon. The development of the GHG Protocol and the associated CDP disclosure framework created the infrastructure for carbon accounting; no equivalent framework with similar uptake exists for water.
The Li et al. paper was significant not primarily for its precise numerical estimates (which are approximations) but for making the water footprint of AI visible at all — for providing a framework that allows journalists, policymakers, and communities to ask "how much water does this AI system use?" and to receive a reasoned, evidence-based answer rather than no answer.
Making water visible requires the same infrastructure as making carbon visible: standardized measurement methodology, mandatory disclosure requirements, independent verification, and public access to facility-level data.
The Justice of Siting
The communities most likely to bear the water costs of data center siting are those with the least power to resist it: rural communities in water-stressed regions that face economic development pressure, indigenous communities whose traditional water rights are often inadequately protected in legal frameworks, and low-income communities that lack the political resources to challenge industrial permitting decisions.
The communities capturing the benefits of the AI systems those data centers support — corporate users, knowledge workers, high-income consumers — are largely different communities. This is the justice dimension of data center water use: it reproduces the same pattern as AI's global carbon impact, but at a local scale. The water withdrawn from Arizona aquifers to cool Microsoft data centers serving GPT-4 queries is water unavailable to Arizona farmers, Arizona municipal water systems, and Arizona ecosystems. The connection between those queries and those water withdrawals is invisible in current accounting — but it is real.
Discussion Questions
- The Phoenix, Arizona, metropolitan area faces acute water scarcity, and data centers compete for water with agricultural users and residents. Should local or state governments limit data center construction in water-stressed regions? What criteria should govern such restrictions?
- Microsoft and Google have committed to being "water positive" by 2030 through watershed restoration projects. Critics argue that watershed restoration in distant locations does not compensate for local aquifer depletion adjacent to specific data centers. Is the "water positive" framework an adequate accountability mechanism or a form of water offset accounting that obscures local impact?
- The Li et al. paper estimated ChatGPT's water footprint but acknowledged significant uncertainty. Microsoft and OpenAI have not confirmed or corrected these estimates. What does this refusal to engage with independent research on environmental impact suggest about corporate accountability?
- Air cooling requires more energy than evaporative cooling in hot climates, meaning that reducing water use through air cooling increases energy and carbon consumption. How should this trade-off be evaluated, and who should make the decision?
- Should individual AI users have a right to know the water and carbon cost of their specific AI queries? Would this information change your behavior? What would responsible "environmental label" disclosure for AI products look like?