Further Reading: Environmental Data Ethics and Climate

The environmental impact of data infrastructure is a rapidly evolving area where technical measurement, policy analysis, and justice scholarship intersect. The sources below are organized by theme and selected to deepen your engagement with the core questions of Chapter 34: How much environmental harm do data systems produce? Who bears the costs? And what governance mechanisms can ensure that efficiency gains translate into actual reductions rather than expanded consumption? Start with Strubell et al. (2019) and Schwartz et al. (2020) for the foundational arguments, then follow the threads that matter most to your work.


AI Energy Consumption and Carbon Footprint Measurement

Strubell, Emma, Ananya Ganesh, and Andrew McCallum. "Energy and Policy Considerations for Deep Learning in NLP." Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (2019): 3645-3650. The paper that launched the conversation. Strubell et al. estimated the carbon footprint of training a large NLP model (a Transformer with neural architecture search) at approximately 284 tonnes of CO2 — equivalent to five times the lifetime emissions of an average American car. The paper's most influential contribution was not the specific number but the argument that environmental cost should be reported alongside performance metrics in AI research. Required reading for anyone working at the intersection of AI and sustainability.
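Strubell et al.'s headline figure rests on a simple accounting identity: average power drawn during training, scaled by data-center overhead (PUE) and the carbon intensity of the local grid. A minimal sketch of that arithmetic follows; all numeric inputs are illustrative placeholders, not figures from the paper:

```python
# Simplified training-carbon estimate in the spirit of Strubell et al. (2019).
# All inputs below are illustrative assumptions, not the paper's measurements.

def training_co2_kg(avg_power_kw: float, hours: float,
                    pue: float, grid_kg_co2_per_kwh: float) -> float:
    """Estimate CO2 (kg) for a single training run.

    avg_power_kw         -- mean draw of accelerators plus host hardware (kW)
    hours                -- wall-clock training time
    pue                  -- data-center Power Usage Effectiveness (>= 1.0)
    grid_kg_co2_per_kwh  -- carbon intensity of the local electricity grid
    """
    energy_kwh = avg_power_kw * hours * pue
    return energy_kwh * grid_kg_co2_per_kwh

# The same hypothetical job on a coal-heavy grid (~0.7 kg CO2/kWh) versus a
# low-carbon grid (~0.06 kg CO2/kWh, e.g. nuclear- or hydro-dominated):
dirty = training_co2_kg(avg_power_kw=300, hours=1000, pue=1.5,
                        grid_kg_co2_per_kwh=0.7)
clean = training_co2_kg(avg_power_kw=300, hours=1000, pue=1.1,
                        grid_kg_co2_per_kwh=0.06)
print(f"{dirty / 1000:.1f} t vs {clean / 1000:.1f} t CO2")
```

The order-of-magnitude gap between the two results is the crux of the Strubell–Patterson debate below: the same training job can carry very different footprints depending on where and how it runs.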

Patterson, David, et al. "Carbon Emissions and Large Neural Network Training." arXiv:2104.10350, 2021. A Google Research team's response to Strubell et al., arguing that the original estimates were too high and that careful engineering choices — efficient hardware, low-carbon data centers, optimized training procedures — can reduce emissions by orders of magnitude. Patterson et al. reported that training GPT-3 on Google's infrastructure would have produced 78% less carbon than published estimates. The paper is valuable both for its technical detail on emission reduction strategies and as an example of how measurement methodology shapes environmental narratives. Read alongside Strubell et al. for the full debate.

Luccioni, Alexandra Sasha, Sylvain Viguier, and Anne-Laure Ligozat. "Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model." Journal of Machine Learning Research 24, no. 253 (2023): 1-15. A rare example of transparent carbon accounting for a large language model. The BLOOM team disclosed that training their 176-billion-parameter model produced approximately 25 tonnes of CO2 — substantially less than comparable models — largely because training occurred in France, where nuclear power provides low-carbon electricity. The paper demonstrates that location decisions have enormous environmental consequences and provides a methodological template for carbon disclosure that other organizations could adopt.

Li, Pengfei, et al. "Making AI Less 'Thirsty': Uncovering and Addressing the Secret Water Footprint of AI Models." arXiv:2304.03271, 2023. Expands the environmental analysis beyond carbon to water consumption — a dimension often invisible in sustainability discussions. Li et al. estimate that training GPT-3 consumed approximately 700,000 liters of fresh water for data center cooling, and that ChatGPT consumes roughly 500 ml of water for every 20 to 50 question-and-answer exchanges. In regions facing water scarcity, these numbers raise governance questions that carbon metrics alone cannot capture. Essential reading for understanding why environmental impact assessment must be multi-dimensional.
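The water accounting parallels the carbon accounting above: energy use multiplied by a water-intensity factor, for both on-site cooling (water usage effectiveness, WUE) and the water embedded in off-site electricity generation (often abbreviated EWIF). A toy sketch with assumed coefficients, not the paper's exact values:

```python
# Simplified water-footprint estimate in the spirit of Li et al. (2023).
# WUE and EWIF coefficients below are illustrative assumptions.

def water_liters(energy_kwh: float, wue_l_per_kwh: float,
                 ewif_l_per_kwh: float) -> float:
    """On-site cooling water (WUE) plus off-site water embedded in
    electricity generation (EWIF), both expressed in liters per kWh."""
    return energy_kwh * (wue_l_per_kwh + ewif_l_per_kwh)

# A hypothetical 1 GWh training run with evaporative cooling:
total = water_liters(1_000_000, wue_l_per_kwh=0.5, ewif_l_per_kwh=2.0)
print(f"{total:,.0f} liters")
```

As with carbon, the coefficients are location-dependent: a data center in a cool, humid climate on a hydro-heavy grid and one in an arid region on thermal generation can differ severalfold on both terms.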


Green AI, Efficiency, and the Rebound Effect

Schwartz, Roy, et al. "Green AI." Communications of the ACM 63, no. 12 (2020): 54-63. The manifesto for treating computational efficiency as a first-class research metric. Schwartz et al. coined the distinction between "Red AI" (pursuing accuracy at any computational cost) and "Green AI" (optimizing the accuracy-efficiency trade-off). They proposed that AI publications report compute budgets alongside performance results, enabling the research community to evaluate whether marginal accuracy gains justify their environmental costs. The paper's framework remains the clearest articulation of how research incentives shape environmental outcomes.

Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the Knowledge in a Neural Network." arXiv:1503.02531, 2015. The foundational paper on knowledge distillation — training smaller "student" models to replicate the behavior of larger "teacher" models. While not framed as an environmental paper, knowledge distillation has become one of the most important techniques for reducing the computational cost of AI deployment. Understanding this technique is essential for evaluating claims about AI efficiency and for the model compression exercises in this chapter.
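The core of the technique is training the student against the teacher's temperature-softened output distribution. A minimal NumPy sketch of the distillation loss on a single example (the logits are invented, and a real setup would backpropagate this loss through the student):

```python
import numpy as np

# Minimal sketch of the Hinton et al. (2015) distillation loss for one example.
# Logits below are invented for illustration.

def softmax(logits, T=1.0):
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=4.0):
    """Cross-entropy between temperature-softened teacher and student
    distributions, scaled by T**2 as in the paper. The full training
    objective also adds a standard cross-entropy term on the hard labels."""
    p = softmax(teacher_logits, T)   # soft targets from the large model
    q = softmax(student_logits, T)   # student's softened predictions
    return -T**2 * np.sum(p * np.log(q))

loss = distillation_loss([8.0, 2.0, -1.0], [5.0, 3.0, 0.0])
print(f"distillation loss: {loss:.3f}")
```

The temperature T spreads probability mass onto the teacher's "wrong" classes, exposing the inter-class similarity structure that makes soft targets more informative than hard labels alone.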

Alcott, Blake. "Jevons' Paradox." Ecological Economics 54, no. 1 (2005): 9-21. A rigorous examination of the rebound effect (Jevons paradox) — the phenomenon where efficiency improvements reduce per-unit costs and thereby increase total consumption. Alcott traces the concept from Jevons' 1865 observation about coal efficiency to modern applications in energy economics. The paper provides the theoretical foundation for Section 34.4.3's argument that technical efficiency alone cannot solve AI's environmental challenge without complementary governance mechanisms such as carbon pricing and emissions caps.
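The rebound logic can be made concrete with a toy constant-elasticity demand model; the elasticity values below are illustrative, not drawn from Alcott's paper:

```python
# Toy constant-elasticity model of the rebound effect (Jevons paradox).
# Elasticity values are illustrative assumptions for exposition only.

def total_resource_use(efficiency_gain: float, elasticity: float,
                       baseline_use: float = 100.0) -> float:
    """Total resource use after a proportional efficiency gain.

    efficiency_gain -- 0.5 means each unit of service needs 50% less input
    elasticity      -- price elasticity of demand for the service (positive)

    Demand responds to the lower effective price with constant elasticity:
    service demand scales by (1 - efficiency_gain) ** -elasticity.
    """
    per_unit_input = 1.0 - efficiency_gain
    demand_multiplier = per_unit_input ** -elasticity
    return baseline_use * per_unit_input * demand_multiplier

# elasticity < 1: efficiency saves resources despite some rebound;
# elasticity > 1: "backfire" -- total use rises above the baseline of 100.
print(total_resource_use(0.5, elasticity=0.5))
print(total_resource_use(0.5, elasticity=1.5))
```

In this toy model a 50% efficiency gain cuts total use only when demand is inelastic; with elasticity above 1, cheaper compute induces enough new demand that aggregate consumption exceeds the baseline — the backfire case Alcott analyzes.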


Environmental Monitoring, Climate Data, and Dual-Use Tensions

Rolnick, David, et al. "Tackling Climate Change with Machine Learning." ACM Computing Surveys 55, no. 2 (2022): 1-96. The most comprehensive survey of AI applications for climate change mitigation and adaptation, covering electricity systems, transportation, buildings, industry, agriculture, forestry, carbon capture, climate prediction, and societal adaptation. At 96 pages, it is a reference work rather than casual reading, but it is indispensable for understanding the "environmental benefit" side of the dual-role tension described in Section 34.5. The paper acknowledges but does not resolve the paradox that climate AI requires the same computational infrastructure that contributes to climate change.

Garnett, Stephen T., et al. "A Spatial Overview of the Global Importance of Indigenous Lands for Conservation." Nature Sustainability 1 (2018): 369-374. Demonstrates that indigenous peoples manage or have tenure over approximately 25% of the world's land surface, including approximately 40% of terrestrial protected areas and 37% of ecologically intact landscapes. This paper provides the empirical foundation for the argument in Section 34.5.2 that indigenous land management is essential for global conservation — and therefore that environmental monitoring on indigenous territories must be governed under indigenous data sovereignty principles.

Kukutai, Tahu, and John Taylor (eds.). Indigenous Data Sovereignty: Toward an Agenda. Canberra: ANU Press, 2016. The foundational collection on indigenous data sovereignty, establishing the principle that indigenous peoples have inherent rights to govern data about their communities, territories, and knowledge systems. Several chapters directly address environmental data — from land management records to biodiversity databases to climate adaptation knowledge. Essential reading for understanding how the CARE Principles apply to the environmental monitoring cases discussed in Section 34.5.2 and Case Study 2.


Environmental Justice and Data Infrastructure

Hogan, Mél. "Data Flows and Water Woes: The Utah Data Center." Big Data & Society 2, no. 2 (2015). A pioneering analysis of the material infrastructure behind "the cloud," focused on the NSA's Utah Data Center and its water consumption in an arid region. Hogan's work connects the abstract concept of data processing to its concrete environmental consequences — water diversion, land use, community impact — in ways that anticipate the environmental data justice framework of Section 34.6. The paper demonstrates that data center siting is an environmental justice issue, not merely a technical or economic one.

Crawford, Kate. Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. New Haven: Yale University Press, 2021. A sweeping account of AI's material infrastructure — from lithium mines in Nevada to Amazon warehouses to data centers in Iowa. Crawford traces the full supply chain of AI, revealing the human labor, natural resources, and environmental consequences that are rendered invisible by narratives of technological immateriality. Chapter 2 ("Compute") and Chapter 3 ("Data") are particularly relevant to Chapter 34's themes. The book complements the technical measurement focus of Strubell and Patterson with a political economy perspective that asks not just "how much?" but "who pays?"

International Energy Agency. "Electricity 2024: Analysis and Forecast to 2026." IEA, 2024. The authoritative source for the data center energy consumption statistics cited in Section 34.1.1. The IEA estimates that data centers consumed approximately 460 TWh globally in 2022 — roughly 2% of global electricity demand — and projects that consumption could roughly double by 2026, driven in large part by AI workloads. The report provides the macroeconomic context for understanding why individual model training decisions aggregate into a systemic environmental challenge. Updated annually; check for the most recent edition.

Lepawsky, Josh. Reassembling Rubbish: Worlding Electronic Waste. Cambridge, MA: MIT Press, 2018. A critical examination of the global e-waste trade that challenges simplistic narratives about "dumping" while documenting the real health and environmental harms of electronics recycling in the Global South. Lepawsky's analysis is essential for understanding the full lifecycle environmental cost of the GPUs, servers, and networking equipment that constitute AI infrastructure. The book connects to Section 34.6.1's discussion of how hardware disposal costs are externalized to communities with the least political power.


Policy and Governance Frameworks

European Commission. "European Green Deal: Fit for 55 Package." 2021. The EU's comprehensive climate policy framework, which includes provisions increasingly relevant to data infrastructure — energy efficiency requirements for data centers, carbon pricing through the Emissions Trading System, and sustainability reporting obligations under the Corporate Sustainability Reporting Directive. As the EU Digital Strategy and Green Deal converge, this framework may become the template for governing AI's environmental impact. Compare with the absence of equivalent requirements in US federal policy discussed in Section 34.6.2.


These sources represent the state of knowledge as of early 2025. The field is evolving rapidly — new measurement methodologies, efficiency techniques, and governance proposals emerge regularly. For the most current research, monitor the proceedings of ACM FAccT, NeurIPS (particularly the Climate Change AI workshop), and the journals Nature Energy and Nature Sustainability.