Jun 10, 2026

arXiv:2606.11797

Space-sampled Value Decay: Forgetting Mechanisms for Non-stationary Deep Reinforcement Learning

Felix Störck, Fabian Hinder, Barbara Hammer

AI Summary

This paper introduces Space-sampled Value Decay, an explicit forgetting mechanism for value-based deep reinforcement learning (RL) in non-stationary environments. The mechanism is meant to let agents adapt their behavior without explicit information about environmental changes, such as task IDs or context. The authors evaluate modified Deep Q-networks (DQN) and Soft Actor-Critic (SAC) on non-stationary environments and report both positive effects and limitations in the achieved returns.

Key Contribution

An explicit forgetting mechanism can help value-based deep RL agents adapt to changing environments, even without explicit drift information.

Abstract

Studies on rodents such as mice have shown that they can adapt their behavior to changing parameters ("drift") of the environment even when no information about the change is provided (uncertainty), a behavior that can be modeled by forgetting mechanisms. Non-stationary Reinforcement Learning (NSRL) adapts state-of-the-art RL methods to changing environments; however, these methods usually require (partially) perfect information about the drift, such as "task IDs" or "context". To mitigate the effects of drift, this work develops Space-sampled Value Decay, a simple yet effective explicit forgetting mechanism for value-based deep RL architectures. In particular, we demonstrate and discuss positive effects, but also limitations, in the achieved returns of modified Deep Q-networks (DQN) and Soft Actor-Critic (SAC) when evaluated on non-stationary environments.

World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
