Beihang · Apr 24, 2026

Adviser–Actor–Critic: A Precision-Oriented Reinforcement Learning Framework for Space Robotics Control

Donghe Chen, Jiaxuan Yue, Yubin Peng, Lin Cheng, Sheng-hao Gong

AI Summary

This paper introduces the adviser–actor–critic (AAC) framework, which couples classical control theory with deep reinforcement learning to improve precision in space robotics tasks. Within a dual-loop architecture, a lightweight proportional-integral adviser generates virtual goals that compensate for accumulated tracking errors while a pretrained goal-conditioned actor runs unmodified. Simulations show over 80% average reduction in steady-state error across diverse RL backbones, and hardware experiments on a quadrotor platform improve attitude regulation from 1° to 0.03°, supporting its use in precision-critical space applications.

Key Contribution

The AAC framework merges classical control with deep reinforcement learning and achieves over 80% average reduction in steady-state error in simulation.

Abstract

High-precision control is critical for autonomous space robotics tasks, such as space-pointing observation, debris removal, and on-orbit assembly, where even submillimeter or subdegree errors can jeopardize mission safety. Classical feedback controllers, such as proportional–integral–derivative, can effectively eliminate steady-state error but are difficult to design for nonlinear multi-input multi-output systems, whereas deep reinforcement learning (DRL) offers strong real-time optimal decision-making and adaptability to complex dynamics but suffers from significant steady-state error due to neural approximation errors. This article proposes an adviser–actor–critic (AAC) framework that couples traditional control theory with reinforcement learning (RL) through a dual-loop architecture featuring a lightweight proportional-integral-based adviser. During deployment, the adviser generates virtual goals that proactively compensate for accumulated tracking errors, while a pretrained goal-conditioned actor handles complex dynamics without modification. A control-theoretic analysis shows that under mild assumptions, AAC can eliminate steady-state error for a broad class of systems. Simulations across standard manipulation, dexterous-hand benchmarks, and space-relevant docking scenarios demonstrate over 80% average steady-state error reduction across diverse RL backbones. Hardware experiments on a quadrotor platform validate that AAC improves attitude regulation from $1^{\circ}$ to $0.03^{\circ}$ under realistic disturbances, highlighting its practicality for precision-critical space applications.
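The dual-loop idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `PIAdviser` class, its gains, the `biased_actor` stand-in for a pretrained goal-conditioned policy (a proportional law with a fixed bias that mimics approximation error), and the single-integrator plant. The sketch only shows one plausible way a proportional-integral outer loop could shift the goal fed to an unmodified actor so that the accumulated tracking error is driven toward zero.

```python
from dataclasses import dataclass


@dataclass
class PIAdviser:
    """Hypothetical outer-loop adviser: maps (goal, achieved) to a virtual goal."""
    kp: float
    ki: float
    dt: float
    integral: float = 0.0

    def virtual_goal(self, goal: float, achieved: float) -> float:
        error = goal - achieved
        self.integral += error * self.dt  # accumulate tracking error
        # Shift the goal in the direction that cancels the observed error.
        return goal + self.kp * error + self.ki * self.integral


def biased_actor(state: float, goal: float) -> float:
    """Stand-in for a pretrained goal-conditioned policy (assumption).

    The constant -0.5 term imitates a systematic approximation error
    that leaves a steady-state offset when the actor is used alone.
    """
    return 2.0 * (goal - state) - 0.5


def rollout(use_adviser: bool, goal: float = 1.0,
            steps: int = 2000, dt: float = 0.01) -> float:
    """Simulate a single-integrator plant and return the final |goal - state|."""
    adviser = PIAdviser(kp=0.5, ki=2.0, dt=dt)
    x = 0.0
    for _ in range(steps):
        # Inner loop is unchanged; only the goal it receives differs.
        g = adviser.virtual_goal(goal, x) if use_adviser else goal
        u = biased_actor(x, g)
        x += dt * u  # toy plant: x' = u
    return abs(goal - x)


if __name__ == "__main__":
    print("actor only:      ", rollout(use_adviser=False))
    print("adviser + actor: ", rollout(use_adviser=True))
```

In this toy setting the actor alone settles with a persistent offset from the goal, while the adviser's integral term keeps moving the virtual goal until the real error vanishes. The actor itself is never retrained or modified, which matches the deployment-time role the abstract describes, although the paper's actual adviser design and analysis may differ.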

Robotics & Embodied AI · Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 62

Year: 2026

Venue: IEEE Transactions on Aerospace and Electronic Systems
