Feb 23, 2026 | arXiv:2602.19805

Decision MetaMamba: Enhancing Selective SSM in Offline RL with Heterogeneous Sequence Mixing

AI Summary

The paper introduces Decision MetaMamba (DMM), an architecture for offline reinforcement learning that addresses the information loss Mamba-based models suffer when selective scanning omits key steps in a sequence. DMM replaces Mamba's token mixer with a dense layer-based sequence mixer that considers all channels simultaneously and is applied before Mamba, which mitigates losses from both selective scanning and residual gating. The authors report state-of-the-art performance across diverse RL tasks with a compact parameter footprint.

Key Contribution

Mamba's selective scan can hurt offline RL, but a dense-layer sequence mixer can fix it, boosting performance to state-of-the-art while keeping the parameter count low.

Abstract

Mamba-based models have drawn much attention in offline RL. However, their selective mechanism is often detrimental when key steps in RL sequences are omitted. To address this issue, we propose a simple yet effective structure, called Decision MetaMamba (DMM), which replaces Mamba's token mixer with a dense layer-based sequence mixer and modifies the positional structure to preserve local information. By performing sequence mixing that considers all channels simultaneously before Mamba, DMM prevents information loss due to selective scanning and residual gating. Extensive experiments demonstrate that DMM delivers state-of-the-art performance across diverse RL tasks. Furthermore, DMM achieves these results with a compact parameter footprint, demonstrating strong potential for real-world applications.
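
The idea of a dense sequence mixer that "considers all channels simultaneously" can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name, the causal window, the zero padding, and the weight shape are all assumptions made for illustration. Each step's output is computed from a flattened window of recent steps through a single dense layer, so every output channel can depend on every input channel in the window.

```python
import numpy as np

def dense_sequence_mixer(x, weight, bias, window):
    """Hypothetical dense mixer: one dense layer over a causal window of steps.

    x:      (seq_len, d_model) token embeddings
    weight: (window * d_model, d_model) dense weights (assumed shape)
    bias:   (d_model,)
    """
    seq_len, d_model = x.shape
    # Left-pad with zeros so step t only sees steps <= t (causal by assumption).
    pad = np.zeros((window - 1, d_model), dtype=x.dtype)
    padded = np.concatenate([pad, x], axis=0)
    # Row t is the flattened window [x[t-window+1], ..., x[t]] across all channels.
    windows = np.stack([padded[t:t + window].reshape(-1) for t in range(seq_len)])
    # A single dense layer mixes every channel of every step in the window.
    return windows @ weight + bias

rng = np.random.default_rng(0)
seq_len, d_model, window = 6, 4, 3
x = rng.normal(size=(seq_len, d_model))
weight = rng.normal(size=(window * d_model, d_model)) * 0.1
bias = np.zeros(d_model)

mixed = dense_sequence_mixer(x, weight, bias, window)
print(mixed.shape)
```

In the design the abstract describes, the mixed sequence would then be passed on to Mamba, so local information from every channel has already been combined before selective scanning is applied.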

Architecture Design (Transformers, SSMs, MoE) | Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
