This paper introduces the Visual Degraded Control Suite (VDCS), a benchmark for visual RL robustness against dynamic, Markov-switching corruptions. The authors show that existing methods struggle because perturbation artifacts become entangled in latent representations, a failure they analyze information-theoretically. To address this, they propose Agent-Centric Observations with Mixture-of-Experts (ACO-MoE), which decouples perception from perturbation, achieving near-clean performance on VDCS and state-of-the-art results on DMControl Generalization.
Visual RL agents can recover near-perfect performance even under severe, dynamically changing visual corruptions by learning to disentangle task-relevant foreground from perturbation artifacts.
Visual reinforcement learning aims to learn policies directly from visual observations, yet it remains vulnerable to dynamic visual perturbations such as unpredictable shifts in corruption type. To study this systematically, we introduce the Visual Degraded Control Suite (VDCS), a benchmark extending the DeepMind Control Suite with Markov-switching degradations that simulate non-stationary real-world perturbations. Experiments on VDCS reveal severe performance degradation in existing methods. Through information-theoretic analysis, we prove that this failure stems from reconstruction-based objectives inevitably entangling perturbation artifacts into latent representations. To mitigate this, we propose Agent-Centric Observations with Mixture-of-Experts (ACO-MoE), which robustifies visual RL against such perturbations. The framework employs specialized agent-centric restoration experts that restore corrupted observations and extract task-relevant foreground, decoupling perception from perturbation before the observation reaches the RL agent. Extensive experiments on VDCS show that ACO-MoE outperforms strong baselines, recovering 95.3% of clean performance under challenging Markov-switching corruptions. It also achieves state-of-the-art results on DMControl Generalization with random-color and video-background perturbations, demonstrating a high level of robustness.
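The Markov-switching degradation idea above can be sketched in a few lines: at each step, the active corruption type either persists or switches according to a transition matrix, so perturbations are non-stationary but temporally correlated. The corruption names and transition probabilities below are illustrative assumptions, not taken from the paper.

```python
import random

# Hypothetical corruption types for the sketch (not the paper's actual set).
CORRUPTIONS = ["clean", "gaussian_noise", "motion_blur", "occlusion"]

# Row-stochastic transition matrix over CORRUPTIONS. High self-transition
# probability makes corruption regimes persist for a while before switching.
TRANSITIONS = {
    "clean":          [0.90, 0.04, 0.03, 0.03],
    "gaussian_noise": [0.05, 0.90, 0.03, 0.02],
    "motion_blur":    [0.05, 0.03, 0.90, 0.02],
    "occlusion":      [0.05, 0.02, 0.03, 0.90],
}

def markov_switch(state: str, rng: random.Random) -> str:
    """Sample the next corruption type given the current one."""
    return rng.choices(CORRUPTIONS, weights=TRANSITIONS[state])[0]

def rollout_schedule(steps: int, seed: int = 0) -> list[str]:
    """Generate a per-step corruption schedule for one episode."""
    rng = random.Random(seed)
    state = "clean"
    schedule = []
    for _ in range(steps):
        schedule.append(state)
        state = markov_switch(state, rng)
    return schedule

print(rollout_schedule(10))
```

At render time, each observation would be passed through the corruption function named by the schedule before the agent sees it, which is what forces methods to handle abrupt, unannounced shifts in perturbation type.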