Deep Science Ventures, Exalsius, logsight.ai GmbH, Technical University of Berlin (TU Berlin)
Feb 26, 2026
arXiv:2602.22760

Distributed LLM Pretraining During Renewable Curtailment Windows: A Feasibility Study

Philipp Wiesner, Soeren Becker, Brett Cornick, Dominik Scheinert, Alexander Acker, Odej Kao

AI Summary

This paper presents a system for distributed LLM pretraining that elastically schedules training across geo-distributed GPU clusters during renewable energy curtailment windows, switching between local single-site training and federated multi-site synchronization as sites become available or unavailable. The goal is to run LLM pretraining on curtailed renewable energy, reducing both cost and carbon emissions. The authors demonstrate feasibility by pretraining a 561M-parameter transformer model across three clusters using the Flower federated learning framework. Preliminary results show operational emissions reduced to 5-12% of single-site baselines (a reduction of 88-95%) while preserving training quality.

Key Contribution

Train LLMs on otherwise-wasted renewable energy and slash operational emissions by up to 95% without sacrificing model quality.

Abstract

Training large language models (LLMs) requires substantial compute and energy. At the same time, renewable energy sources regularly produce more electricity than the grid can absorb, leading to curtailment, the deliberate reduction of clean generation that would otherwise go to waste. These periods represent an opportunity: if training is aligned with curtailment windows, LLMs can be pretrained using electricity that is both clean and cheap. This technical report presents a system that performs full-parameter LLM training across geo-distributed GPU clusters during regional curtailment windows, elastically switching between local single-site training and federated multi-site synchronization as sites become available or unavailable. Our prototype trains a 561M-parameter transformer model across three clusters using the Flower federated learning framework, with curtailment periods derived from real-world marginal carbon intensity traces. Preliminary results show that curtailment-aware scheduling preserves training quality while reducing operational emissions to 5-12% of single-site baselines.
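The elastic switching described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the Flower API. The site names, the hour-based window format, and the "paused" state for periods with no curtailed site are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a curtailment window is a half-open
# interval (start_hour, end_hour) during which a site may train.
Window = Tuple[float, float]


def active_sites(windows: Dict[str, List[Window]], t: float) -> List[str]:
    """Return the (sorted) sites whose curtailment window covers time t."""
    return sorted(
        site
        for site, spans in windows.items()
        if any(start <= t < end for start, end in spans)
    )


def training_mode(windows: Dict[str, List[Window]], t: float) -> str:
    """Pick a training mode from the number of currently curtailed sites.

    Assumption: training pauses when no site is curtailed. The abstract
    only describes the local and federated modes.
    """
    n = len(active_sites(windows, t))
    if n == 0:
        return "paused"
    if n == 1:
        return "local"      # single-site training, no synchronization
    return "federated"      # multi-site training with synchronization


if __name__ == "__main__":
    # Made-up windows for three hypothetical sites.
    example = {
        "site_a": [(0, 4)],
        "site_b": [(2, 6)],
        "site_c": [(10, 12)],
    }
    for hour in (1, 3, 8, 11):
        print(hour, active_sites(example, hour), training_mode(example, hour))
```

In a real deployment the windows would come from marginal carbon intensity traces, as the abstract states, rather than from fixed hourly intervals.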

Distributed Systems & Hardware, Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 41

Year: 2026

Venue: N/A
