This paper introduces a novel physics-informed regularization method for offline goal-conditioned reinforcement learning (GCRL) based on the viscosity solution of the Hamilton-Jacobi-Bellman (HJB) equation. The approach leverages the Feynman-Kac theorem to reformulate the PDE solution as an expectation, enabling stable Monte Carlo estimation and avoiding numerical instability associated with higher-order gradients. Experiments demonstrate improved geometric consistency and applicability to navigation and complex manipulation tasks, showcasing the benefits of grounding learning in optimal control theory.
By recasting the solution of the Hamilton-Jacobi-Bellman equation as an expectation that can be estimated with Monte Carlo sampling, this work stabilizes physics-informed RL and extends it to high-dimensional control tasks.
Offline goal-conditioned reinforcement learning (GCRL) learns goal-conditioned policies from static pre-collected datasets. However, accurate value estimation remains a challenge due to the limited coverage of the state-action space. Recent physics-informed approaches have sought to address this by imposing physical and geometric constraints on the value function through regularization defined over first-order partial differential equations (PDEs), such as the Eikonal equation. These formulations, however, can often be ill-posed in complex, high-dimensional environments. In this work, we propose a physics-informed regularization derived from the viscosity solution of the Hamilton-Jacobi-Bellman (HJB) equation. By providing a physics-based inductive bias, our approach grounds the learning process in optimal control theory, explicitly regularizing and bounding updates during value iterations. Furthermore, we leverage the Feynman-Kac theorem to recast the PDE solution as an expectation, enabling a tractable Monte Carlo estimation of the objective that avoids numerical instability in higher-order gradients. Experiments demonstrate that our method improves geometric consistency, making it broadly applicable to navigation and high-dimensional, complex manipulation tasks. Open-source code is available at https://github.com/HrishikeshVish/phys-fk-value-GCRL.
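
To illustrate the Feynman-Kac idea mentioned in the abstract, the sketch below estimates the solution of a toy linear PDE by averaging a terminal function over sampled Brownian endpoints, with no derivatives of the solution required. This is not the paper's HJB regularizer or its estimator, which the abstract does not specify. The PDE (u_t + 0.5 * sigma^2 * u_xx = 0 with terminal condition u(x, T) = x^2), the function names `feynman_kac_estimate` and `closed_form`, and all parameter values are assumptions chosen only because this case has a known closed form to compare against.

```python
import numpy as np


def feynman_kac_estimate(x0, horizon, sigma, terminal_fn, n_samples=200_000, seed=0):
    """Monte Carlo estimate of u(x0, t) = E[g(X_T) | X_t = x0].

    Assumes (for illustration only) driftless dynamics dX = sigma * dW and
    horizon = T - t, so X_T = x0 + sigma * W_horizon with W_horizon ~ N(0, horizon).
    """
    rng = np.random.default_rng(seed)
    # Sample the Brownian displacement over the remaining time horizon.
    brownian = rng.standard_normal(n_samples) * np.sqrt(horizon)
    endpoints = x0 + sigma * brownian
    # Feynman-Kac: the PDE solution equals the expected terminal value.
    return float(np.mean(terminal_fn(endpoints)))


def closed_form(x0, horizon, sigma):
    """Exact solution for g(x) = x^2: E[(x0 + sigma * W)^2] = x0^2 + sigma^2 * horizon."""
    return x0 ** 2 + sigma ** 2 * horizon


# Hypothetical values, chosen only for the demonstration.
estimate = feynman_kac_estimate(1.0, 0.5, 0.8, lambda x: x ** 2)
exact = closed_form(1.0, 0.5, 0.8)
print(f"Monte Carlo: {estimate:.4f}  closed form: {exact:.4f}")
```

The estimate is a plain sample mean, so its error shrinks at the usual Monte Carlo rate of one over the square root of the sample count, and no second-order derivatives of the solution are ever computed. That property is what the abstract refers to when it says the expectation form avoids instability from higher-order gradients.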