Sungshin Women’s University, Jan 7, 2026

A Carbon-Efficient Framework for Deep Learning Workloads on GPU Clusters

AI Summary

The paper introduces a Carbon-Aware Resource Management (CA-RM) framework for GPU clusters that reduces carbon emissions by dynamically adjusting GPU core frequency and placing workloads according to real-time renewable energy availability. It defines a performance-per-carbon (PPC) metric and formulates carbon-constrained, performance-constrained, and PPC-driven optimization objectives that balance DNN training deadlines, inference latency, and carbon emission budgets. Simulations using real-world renewable energy traces and NVIDIA RTX 4090 GPU profiling data show an average carbon reduction of approximately 35% compared to competing approaches while maintaining service-level agreement (SLA) targets.

Key Contribution

A carbon-aware resource manager that cuts deep learning carbon emissions by approximately 35% on average in simulation, without changing hardware or models: it adjusts GPU core frequency and workload placement to align computation with renewable energy availability.

Abstract

The explosive growth of artificial intelligence (AI) services has led to massive scaling of GPU computing clusters, causing sharp rises in power consumption and carbon emissions. Although hardware-level accelerator enhancements and deep neural network (DNN) model compression techniques can improve power efficiency, they often encounter deployment barriers and risk accuracy loss in practice. To address these issues without altering hardware or model architectures, we propose a novel Carbon-Aware Resource Management (CA-RM) framework for GPU clusters. To minimize carbon emissions, the CA-RM framework dynamically adjusts energy usage by combining real-time GPU core frequency scaling with intelligent workload placement, aligning computation with the temporal availability of renewable generation. We introduce a new metric, performance-per-carbon (PPC), and develop three optimization formulations (carbon-constrained, performance-constrained, and PPC-driven) that simultaneously respect DNN model training deadlines, inference latency requirements, and carbon emission budgets. Through extensive simulations using real-world renewable energy traces and profiling data collected from an NVIDIA RTX 4090 GPU running representative DNN workloads, we show that the CA-RM framework substantially reduces carbon emissions while satisfying service-level agreement (SLA) targets across a wide range of workload characteristics. The evaluation shows that the proposed CA-RM framework achieves approximately 35% carbon reduction on average compared to competing approaches, while still ensuring acceptable processing performance across diverse workload behaviors.
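The abstract does not give the exact PPC formula or the optimization formulations, so the following Python is only a minimal sketch of the general idea under stated assumptions: PPC is taken here as throughput divided by the carbon emission rate, renewable supply is treated as carbon-free, and only the power shortfall is drawn from the grid. The names `FreqProfile`, `carbon_rate_g_per_s`, `ppc`, and `pick_frequency`, and all numbers, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FreqProfile:
    """One profiled operating point of a GPU (hypothetical values)."""
    freq_mhz: int       # GPU core frequency
    throughput: float   # work rate at this frequency, e.g. samples/s
    power_w: float      # average power draw in watts


def carbon_rate_g_per_s(power_w: float, renewable_w: float,
                        grid_intensity_g_per_kwh: float) -> float:
    """Grams of CO2 per second, assuming renewables are carbon-free
    and only the shortfall (power - renewable) comes from the grid."""
    grid_w = max(power_w - renewable_w, 0.0)
    return grid_w / 1000.0 * grid_intensity_g_per_kwh / 3600.0


def ppc(profile: FreqProfile, renewable_w: float,
        grid_intensity_g_per_kwh: float) -> float:
    """Assumed performance-per-carbon: throughput per gram of CO2 per second.
    A small epsilon avoids division by zero when fully renewable-powered."""
    rate = carbon_rate_g_per_s(profile.power_w, renewable_w,
                               grid_intensity_g_per_kwh)
    return profile.throughput / (rate + 1e-9)


def pick_frequency(profiles: List[FreqProfile], renewable_w: float,
                   grid_intensity_g_per_kwh: float,
                   min_throughput: float) -> Optional[FreqProfile]:
    """Pick the operating point with the highest assumed PPC among those
    that meet a minimum throughput (a stand-in for a deadline/SLA)."""
    feasible = [p for p in profiles if p.throughput >= min_throughput]
    if not feasible:
        return None
    return max(feasible,
               key=lambda p: ppc(p, renewable_w, grid_intensity_g_per_kwh))


# Hypothetical profiling table, not measured data.
profiles = [
    FreqProfile(1200, 600.0, 200.0),
    FreqProfile(1800, 800.0, 300.0),
    FreqProfile(2500, 1000.0, 450.0),
]
choice = pick_frequency(profiles, renewable_w=250.0,
                        grid_intensity_g_per_kwh=400.0, min_throughput=700.0)
print(choice.freq_mhz if choice else "no feasible frequency")
```

In this sketch, abundant renewable power pushes the choice toward higher frequencies, and a shortfall pushes it toward the lowest frequency that still meets the throughput floor. Workload placement across nodes, which the framework also performs, is not modeled here.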

Architecture Design (Transformers, SSMs, MoE); Inference & Quantization

Citation Metrics

Citations: 0

Influential citations: 0

References: 12

Year: 2026

Venue: Applied Sciences
