This paper introduces Physics-Informed Policy Optimization via Analytic Dynamics Regularization (PIPER), a novel RL framework that integrates physical constraints into neural policy optimization using a differentiable Lagrangian residual as a regularization term. By incorporating this residual, derived from the robot's simulator description, PIPER biases policy updates towards dynamically consistent solutions without modifying the simulator or RL algorithm. Experiments show PIPER improves learning efficiency, stability, and control accuracy in robotic control tasks.
Achieve more efficient and physically plausible robot control by baking differentiable Lagrangian mechanics directly into your RL policy optimization.
Reinforcement learning (RL) has achieved strong performance in robotic control; however, state-of-the-art policy learning methods, such as actor-critic methods, still suffer from high sample complexity and often produce physically inconsistent actions. This limitation arises because neural policies must implicitly rediscover complex physics from data alone, even though accurate dynamics models are readily available in simulators. In this paper, we introduce a novel physics-informed RL framework, called PIPER, that integrates analytical soft physics constraints directly into neural policy optimization. At the core of our method is a differentiable Lagrangian residual that serves as a regularization term within the actor's objective. This residual, extracted from a robot's simulator description, biases policy updates towards dynamically consistent solutions. Crucially, this physics integration is realized through an additional loss term during policy optimization, requiring no alterations to existing simulators or core RL algorithms. Extensive experiments demonstrate that our method significantly improves learning efficiency, stability, and control accuracy, establishing a new paradigm for efficient and physically consistent robotic control.
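To make the idea of a Lagrangian residual used as a soft penalty concrete, here is a minimal sketch. It is not the paper's implementation: it assumes the residual takes the standard rigid-body form M(q)q'' + C(q, q')q' + g(q) - tau, uses a hypothetical single-joint pendulum in place of a robot's simulator description, and picks a mean-squared penalty with an arbitrary weight. All function names and parameter values are illustrative. A real setup would compute the same quantity in an autodiff framework so that gradients flow back into the policy.

```python
import numpy as np

# Hypothetical 1-DoF pendulum parameters (illustrative, not from the paper).
MASS, LENGTH, GRAVITY = 1.0, 1.0, 9.81


def lagrangian_residual(q, qd, qdd, tau):
    """Residual of M(q) q'' + C(q, q') q' + g(q) = tau for a single pendulum.

    q is the joint angle measured from the hanging-down position, qd and qdd
    are its velocity and acceleration, and tau is the torque proposed by the
    policy. A residual of zero means the torque is dynamically consistent.
    """
    inertia = MASS * LENGTH**2                      # M(q), constant here
    coriolis = 0.0                                  # C(q, q') q' vanishes for one joint (qd unused)
    gravity = MASS * GRAVITY * LENGTH * np.sin(q)   # g(q)
    return inertia * qdd + coriolis + gravity - tau


def regularized_actor_loss(q_values, q, qd, qdd, tau, weight=0.1):
    """Usual actor loss (negative critic value) plus a soft physics penalty.

    The mean-squared form and the default weight are assumptions made for
    this sketch only.
    """
    rl_loss = -np.mean(q_values)
    physics_penalty = np.mean(lagrangian_residual(q, qd, qdd, tau) ** 2)
    return rl_loss + weight * physics_penalty


# Holding the pendulum horizontal at rest requires tau = m * g * l.
critic_values = np.array([1.0, 3.0])
consistent = regularized_actor_loss(critic_values, np.pi / 2, 0.0, 0.0, 9.81)
inconsistent = regularized_actor_loss(critic_values, np.pi / 2, 0.0, 0.0, 0.0)
print(consistent < inconsistent)
```

Because the penalty enters only as an extra loss term, the simulator and the underlying RL update stay untouched in this sketch, which mirrors the property the abstract emphasizes.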