Karlstad · Feb 16, 2026 · arXiv:2602.14578

RNM-TD3: N:M Semi-structured Sparse Reinforcement Learning From Scratch

AI Summary

The paper introduces RNM-TD3, a reinforcement learning framework that leverages N:M semi-structured sparsity within the TD3 algorithm to balance model compression, performance, and hardware efficiency. By enforcing row-wise N:M sparsity during training, the method maintains compatibility with hardware accelerators designed for sparse matrix operations. Empirical results on continuous control tasks demonstrate that RNM-TD3 achieves superior performance compared to its dense counterpart at sparsity levels of 50%-75%, and remains competitive even at 87.5% sparsity.

Key Contribution

Surprisingly, a semi-structured sparse RL agent (RNM-TD3) not only matches but *outperforms* its dense counterpart at high sparsity levels (50-75%) on continuous control tasks, opening doors to hardware-accelerated RL training.

Abstract

Sparsity is a well-studied technique for compressing deep neural networks (DNNs) without compromising performance. In deep reinforcement learning (DRL), neural networks retaining as little as 5% of their original weights can still be trained with minimal performance loss compared to their dense counterparts. However, most existing methods rely on unstructured fine-grained sparsity, which limits hardware acceleration opportunities due to irregular computation patterns. Structured coarse-grained sparsity enables hardware acceleration, yet typically degrades performance and increases pruning complexity. In this work, we present, to the best of our knowledge, the first study on N:M structured sparsity in RL, which balances compression, performance, and hardware efficiency. Our framework enforces row-wise N:M sparsity throughout training for all networks in off-policy RL (TD3), maintaining compatibility with accelerators that support N:M sparse matrix operations. Experiments on continuous-control benchmarks show that RNM-TD3, our N:M sparse agent, outperforms its dense counterpart at 50%-75% sparsity (e.g., 2:4 and 1:4), achieving up to a 14% increase in performance at 2:4 sparsity on the Ant environment. RNM-TD3 remains competitive even at 87.5% sparsity (1:8), while enabling potential training speedups.
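To make the row-wise N:M constraint concrete, the following is a minimal NumPy sketch of what such a mask looks like: in every group of M consecutive weights along a row, at most N are kept. The helper names `nm_mask` and `nm_sparsity` are hypothetical, and keeping the largest-magnitude weights is an assumption made for illustration; the abstract does not say how RNM-TD3 chooses which weights to keep or how the mask evolves during training.

```python
import numpy as np


def nm_mask(weights: np.ndarray, n: int, m: int) -> np.ndarray:
    """Build a row-wise N:M mask (hypothetical helper, not from the paper).

    Each row is split into groups of m consecutive weights, and the n
    entries with the largest magnitude in each group are kept.
    Magnitude-based selection is an assumption for illustration only.
    """
    if not 0 < n <= m:
        raise ValueError("need 0 < n <= m")
    rows, cols = weights.shape
    if cols % m != 0:
        raise ValueError("number of columns must be divisible by m")

    # View each row as (cols // m) groups of m consecutive weights.
    groups = np.abs(weights).reshape(rows, cols // m, m)
    # argsort is ascending, so the last n indices are the largest magnitudes.
    keep = np.argsort(groups, axis=-1)[..., m - n:]

    mask = np.zeros_like(groups)
    np.put_along_axis(mask, keep, 1, axis=-1)
    return mask.reshape(rows, cols)


def nm_sparsity(n: int, m: int) -> float:
    """Fraction of weights removed by an N:M pattern."""
    return 1.0 - n / m


rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8))      # toy weight matrix, 8 inputs per row
mask = nm_mask(w, n=2, m=4)      # 2:4 pattern
sparse_w = w * mask              # exactly 2 nonzeros in every group of 4
```

Under this definition, `nm_sparsity` gives 0.5 for 2:4, 0.75 for 1:4, and 0.875 for 1:8, which matches the 50%, 75%, and 87.5% sparsity levels quoted in the abstract.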

Distributed Systems & Hardware · Inference & Quantization · Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
