This paper investigates how action space design affects imitation learning for robotic manipulation policies. Through a large-scale empirical study involving over 13,000 real-world rollouts and more than 500 trained models, the authors analyze how temporal (absolute vs. delta actions) and spatial (joint-space vs. task-space) action representations affect policy learnability and control stability. The results suggest that predicting delta actions consistently improves performance, while joint-space and task-space representations offer complementary strengths: joint space favors control stability and task space favors generalization.
Stop guessing about action spaces for robot manipulation: a massive empirical study reveals that predicting delta actions boosts performance, while joint vs. task space offers a stability vs. generalization tradeoff.
The specification of the action space plays a pivotal role in imitation-based robotic manipulation policy learning, fundamentally shaping the optimization landscape. While recent advances have focused heavily on scaling training data and model capacity, the choice of action space remains guided by ad-hoc heuristics or legacy designs, leading to an ambiguous understanding of robotic policy design philosophies. To address this ambiguity, we conduct a large-scale, systematic empirical study, confirming that the action space has significant and complex impacts on robotic policy learning. We dissect the action design space along temporal and spatial axes, facilitating a structured analysis of how these choices govern both policy learnability and control stability. Based on 13,000+ real-world rollouts on a bimanual robot and evaluation of 500+ trained models over four scenarios, we examine the trade-offs between absolute vs. delta representations, and joint-space vs. task-space parameterizations. Our large-scale results suggest that properly designing the policy to predict delta actions consistently improves performance, while joint-space and task-space representations offer complementary strengths, favoring control stability and generalization, respectively.
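To make the temporal axis concrete, the following minimal sketch converts a short sequence of absolute action targets into delta actions and back. It assumes one common convention, where each delta is the difference from the previous target and the first delta is taken relative to a known starting state; the abstract does not specify the paper's exact definition, so this may differ from it. The function names `to_delta` and `to_absolute` are hypothetical, and the vectors are treated as plain numbers (for example joint angles or end-effector positions). Task-space orientations would need a proper rotation difference, which this sketch ignores.

```python
from typing import List, Sequence

Vector = List[float]


def to_delta(absolute: Sequence[Sequence[float]], start: Sequence[float]) -> List[Vector]:
    """Express each absolute target as a change from the previous target.

    Assumed convention: the first delta is measured from `start`.
    """
    deltas: List[Vector] = []
    previous = list(start)
    for target in absolute:
        deltas.append([t - p for t, p in zip(target, previous)])
        previous = list(target)
    return deltas


def to_absolute(deltas: Sequence[Sequence[float]], start: Sequence[float]) -> List[Vector]:
    """Recover absolute targets by accumulating deltas from `start`."""
    absolute: List[Vector] = []
    current = list(start)
    for delta in deltas:
        current = [c + d for c, d in zip(current, delta)]
        absolute.append(current)
    return absolute


# Hypothetical two-dimensional trajectory, chosen only for illustration.
start_state = [0.0, 1.0]
absolute_actions = [[0.5, 1.0], [1.0, 1.5], [1.0, 2.0]]

delta_actions = to_delta(absolute_actions, start_state)
print(delta_actions)                            # [[0.5, 0.0], [0.5, 0.5], [0.0, 0.5]]
print(to_absolute(delta_actions, start_state))  # [[0.5, 1.0], [1.0, 1.5], [1.0, 2.0]]
```

Both representations carry the same trajectory given the starting state. What differs is the prediction target: absolute values span the whole workspace, while deltas are small offsets centered near zero. That difference is the kind of design choice the study evaluates.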