The paper introduces Feasibility-Aware Behavior Cloning from Observation (FABCO) to address the challenges of imitation learning from observation when the demonstrator's motions are infeasible for the robot. FABCO integrates behavior cloning from observation with a learned robot-dynamics model to estimate the feasibility of demonstrated motions. The estimated feasibility is then used to provide multimodal (visual and haptic) feedback to the demonstrator and to weight the demonstration data during policy learning, leading to improved policy robustness and performance.
Robots can learn significantly better from human demonstrations when the humans receive real-time feedback about whether the robot can actually perform the demonstrated actions.
Imitation learning frameworks that learn robot control policies from demonstrators' motions via hand-mounted demonstration interfaces have attracted increasing attention. However, because demonstrators and robots differ in their physical characteristics, this approach faces two limitations: i) the demonstration data do not include robot actions, and ii) the demonstrated motions may be infeasible for robots. These limitations make policy learning difficult. To address them, we propose Feasibility-Aware Behavior Cloning from Observation (FABCO). FABCO integrates behavior cloning from observation, which fills in the missing robot actions using robot-dynamics models, with feasibility estimation. In feasibility estimation, the demonstrated motions are evaluated using a robot-dynamics model, learned from the robot's execution data, to assess whether they can be reproduced under the robot's dynamics. The estimated feasibility is used in two ways: multimodal feedback, which improves the demonstrator's motions, and feasibility-aware policy learning, which yields robust policies. Multimodal feedback conveys feasibility through the demonstrator's visual and haptic senses to encourage feasible demonstrated motions. Feasibility-aware policy learning reduces the influence of demonstrated motions that are infeasible for robots, enabling the learning of policies that robots can execute stably. We conducted experiments with 15 participants on two tasks and confirmed that FABCO improves imitation learning performance by more than 3.2 times compared to the case without feasibility feedback.
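The following is a minimal sketch of how feasibility estimation and feasibility-aware policy learning could fit together. It is not the authors' implementation. The abstract does not specify the dynamics model, the feasibility score, or the policy class, so everything here is an assumption made for illustration: a hand-written clipped-velocity model stands in for the learned robot-dynamics model, feasibility is taken to be an exponential of the one-step reproduction error, and the policy is a linear map fitted by weighted least squares. The names `A_MAX`, `SIGMA`, `inverse_dynamics`, `forward_dynamics`, `feasibility`, and `weighted_bc` are hypothetical.

```python
import numpy as np

A_MAX = 0.1   # hypothetical per-step action limit of the robot
SIGMA = 0.05  # hypothetical scale mapping reproduction error to a score


def inverse_dynamics(s, s_next):
    """Stand-in for a learned inverse model: the action the robot would
    take to move from s toward s_next, limited to what it can execute."""
    return np.clip(s_next - s, -A_MAX, A_MAX)


def forward_dynamics(s, a):
    """Stand-in for a learned forward model: next state under action a."""
    return s + np.clip(a, -A_MAX, A_MAX)


def feasibility(states):
    """Infer actions for each demonstrated transition and score how well
    the robot model can reproduce it (1 = reproducible, near 0 = not)."""
    s, s_next = states[:-1], states[1:]
    a_hat = inverse_dynamics(s, s_next)
    err = np.linalg.norm(forward_dynamics(s, a_hat) - s_next, axis=1)
    return a_hat, np.exp(-err / SIGMA)


def weighted_bc(states, actions, weights):
    """Fit a linear policy a = s @ W by weighted least squares, so that
    low-feasibility transitions contribute little to the fit."""
    sw = np.sqrt(weights)[:, None]
    W, *_ = np.linalg.lstsq(states * sw, actions * sw, rcond=None)
    return W


# Toy 2-D demonstration; the third transition jumps 0.5 in one step,
# which exceeds A_MAX and is therefore infeasible for this robot model.
demo = np.array([
    [0.00, 0.00],
    [0.05, 0.00],
    [0.10, 0.05],
    [0.60, 0.05],
    [0.65, 0.10],
])

actions, w = feasibility(demo)
W = weighted_bc(demo[:-1], actions, w)
```

In this sketch the same per-transition score `w` could also drive the feedback channel, for example by mapping low values to a stronger visual or haptic cue, although the abstract does not say how the feedback signal is computed.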