This paper introduces a model-based RL framework that learns linear lifted dynamics of nonlinear robotic systems using Koopman operator theory and integrates the learned model into an actor-critic architecture. To reduce computational cost and mitigate rollout errors, policy gradients are estimated from one-step predictions of the learned dynamics rather than multi-step propagation. Experiments on simulated benchmarks and real-world robotic systems (Kinova Gen3 arm, Unitree Go1 quadruped) demonstrate improved sample efficiency over model-free RL and control performance comparable to classical model-based control.
Achieve better sample efficiency than model-free RL, with control performance comparable to classical model-based control, by sidestepping multi-step rollout errors via one-step Koopman dynamics predictions.
This paper presents a model-based reinforcement learning (RL) framework for optimal closed-loop control of nonlinear robotic systems. The proposed approach learns linear lifted dynamics through Koopman operator theory and integrates the resulting model into an actor-critic architecture for policy optimization, where the policy represents a parameterized closed-loop controller. To reduce computational cost and mitigate model rollout errors, policy gradients are estimated using one-step predictions of the learned dynamics rather than multi-step propagation. This leads to an online mini-batch policy gradient framework that enables policy improvement from streamed interaction data. The proposed framework is evaluated on several simulated nonlinear control benchmarks and two real-world hardware platforms, including a Kinova Gen3 robotic arm and a Unitree Go1 quadruped. Experimental results demonstrate improved sample efficiency over model-free RL baselines, superior control performance relative to model-based RL baselines, and control performance comparable to classical model-based methods that rely on exact system dynamics.
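
The two ideas in the abstract, a linear model in lifted coordinates and a policy gradient taken through a single predicted step, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy system, the lifting `lift`, the fixed diagonal quadratic critic weights `P`, the linear feedback policy, and all constants. In particular, the lifted matrices `A` and `B` are written down analytically here, whereas the paper learns its lifted model from data.

```python
# Toy system (hypothetical, chosen because its Koopman lifting is exact):
#   x1' = LAM * x1
#   x2' = MU * x2 + (LAM**2 - MU) * x1**2 + u
# With z = (x1, x2, x1**2) the dynamics become linear: z' = A z + B u.
LAM, MU, GAMMA, R = 0.9, 0.5, 0.95, 0.1
P = (1.0, 1.0, 0.1)  # diagonal weights of a stand-in quadratic critic

def step(x, u):
    """True nonlinear one-step dynamics."""
    x1, x2 = x
    return (LAM * x1, MU * x2 + (LAM**2 - MU) * x1**2 + u)

def lift(x):
    """Lifting (observable) function z = (x1, x2, x1^2)."""
    x1, x2 = x
    return [x1, x2, x1 * x1]

A = [[LAM, 0.0, 0.0],
     [0.0, MU, LAM**2 - MU],
     [0.0, 0.0, LAM**2]]
B = [0.0, 1.0, 0.0]

def predict(z, u):
    """One-step prediction in lifted coordinates: z' = A z + B u."""
    return [sum(a * zj for a, zj in zip(row, z)) + b * u
            for row, b in zip(A, B)]

def policy(theta, z):
    """Linear feedback on the lifted state: u = -theta . z."""
    return -sum(t * zi for t, zi in zip(theta, z))

def objective(theta, x):
    """Stage cost plus discounted critic value at the one-step prediction."""
    z = lift(x)
    u = policy(theta, z)
    z_next = predict(z, u)
    cost = x[0] ** 2 + x[1] ** 2 + R * u ** 2
    return cost + GAMMA * sum(p * zi * zi for p, zi in zip(P, z_next))

def gradient(theta, x):
    """Analytic gradient of `objective` w.r.t. theta via the chain rule."""
    z = lift(x)
    u = policy(theta, z)
    z_next = predict(z, u)
    dJ_du = 2 * R * u + GAMMA * sum(2 * p * zi * b
                                    for p, zi, b in zip(P, z_next, B))
    return [-dJ_du * zi for zi in z]  # du/dtheta_i = -z_i

def minibatch_update(theta, batch, lr=0.05):
    """One gradient-descent step on the mini-batch mean objective."""
    grads = [gradient(theta, x) for x in batch]
    mean = [sum(g[i] for g in grads) / len(batch) for i in range(len(theta))]
    return [t - lr * g for t, g in zip(theta, mean)]
```

Calling `minibatch_update([0.0, 0.0, 0.0], [(0.7, -0.3), (-0.5, 0.4), (0.2, 0.9)])` performs one policy update from a small batch of observed states. Because each gradient depends only on a single predicted step, model error is not compounded over a rollout horizon, which is the motivation the abstract gives for avoiding multi-step propagation.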