Feb 26, 2026 · arXiv:2602.22862

GraspLDP: Towards Generalizable Grasping Policy via Latent Diffusion

Enda Xiang, Haoxiang Ma, Xinzhu Ma, Zicheng Liu, Di Huang

AI Summary

The paper introduces GraspLDP, a latent diffusion policy framework for robotic grasping that improves precision and generalization in imitation learning. It incorporates grasp pose priors to guide action chunk decoding and uses a self-supervised reconstruction objective to embed graspness priors by reconstructing wrist-camera images from intermediate diffusion representations. Experiments in simulation and on a real robot show that GraspLDP outperforms baselines and exhibits strong dynamic grasping capabilities.

Key Contribution

Achieve more precise and generalizable robot grasping by injecting grasp pose priors into latent diffusion policies, outperforming existing imitation learning methods.

Abstract

This paper focuses on enhancing the grasping precision and generalization of manipulation policies learned via imitation learning. Diffusion-based policy learning methods have recently become the mainstream approach for robotic manipulation tasks. Because grasping is a critical subtask in manipulation, the ability of imitation-learned policies to execute precise and generalizable grasps merits particular attention. Existing imitation learning techniques for grasping often suffer from imprecise grasp execution, limited spatial generalization, and poor object generalization. To address these challenges, we incorporate grasp prior knowledge into the diffusion policy framework. In particular, we employ a latent diffusion policy that guides action chunk decoding with a grasp pose prior, ensuring that generated motion trajectories adhere closely to feasible grasp configurations. Furthermore, we introduce a self-supervised reconstruction objective during diffusion to embed the graspness prior: at each reverse diffusion step, we reconstruct wrist-camera images with back-projected graspness from the intermediate representations. Both simulation and real-robot experiments demonstrate that our approach significantly outperforms baseline methods and exhibits strong dynamic grasping capabilities.
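
The pipeline described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract gives no architecture, so every name here (toy_denoiser, toy_recon_head, decode_action_chunk, the linear maps, the noise schedule values) is a hypothetical stand-in. The sketch assumes a standard DDPM reverse update in a small latent space, a denoiser that is conditioned on a grasp pose prior embedding, a reconstruction error computed from the intermediate latent at each reverse step, and a linear decoder that turns the final latent into an action chunk.

```python
import numpy as np

def make_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear DDPM noise schedule (values are illustrative assumptions)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def toy_denoiser(z_t, t, grasp_prior, alpha_bars):
    """Hypothetical stand-in for a learned noise predictor eps(z_t, t, prior).

    It assumes the clean latent equals the grasp prior embedding and
    returns the noise that is consistent with that assumption.
    """
    a_bar = alpha_bars[t]
    return (z_t - np.sqrt(a_bar) * grasp_prior) / np.sqrt(1.0 - a_bar)

def toy_recon_head(z_t, recon_w):
    """Hypothetical linear head: intermediate latent -> flattened image."""
    return z_t @ recon_w

def decode_action_chunk(z_0, decoder_w, horizon, action_dim):
    """Hypothetical linear decoder: final latent -> (horizon, action_dim)."""
    return (z_0 @ decoder_w).reshape(horizon, action_dim)

def sample_latent(grasp_prior, recon_w, target_image, num_steps=50, seed=0):
    """Run the DDPM reverse process in latent space.

    Returns the final latent and the per-step reconstruction errors.
    In training, such errors would be a loss term; here they are only logged.
    """
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    z = rng.standard_normal(grasp_prior.shape)  # start from pure noise
    recon_errors = []
    for t in reversed(range(num_steps)):
        eps_hat = toy_denoiser(z, t, grasp_prior, alpha_bars)
        # Standard DDPM posterior mean under the epsilon parameterization.
        mean = (z - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No noise is added at the last step.
        noise = rng.standard_normal(z.shape) if t > 0 else np.zeros_like(z)
        z = mean + np.sqrt(betas[t]) * noise
        # Reconstruct the (toy) wrist-camera target from the intermediate latent.
        recon = toy_recon_head(z, recon_w)
        recon_errors.append(float(np.mean((recon - target_image) ** 2)))
    return z, recon_errors

if __name__ == "__main__":
    prior = np.array([0.5, -1.0, 2.0, 0.0])   # hypothetical grasp prior embedding
    recon_w = np.eye(4)                        # identity "image" head for the demo
    z_0, errors = sample_latent(prior, recon_w, target_image=prior, num_steps=20)
    decoder_w = np.ones((4, 6))                # 2 steps x 3 action dimensions
    chunk = decode_action_chunk(z_0, decoder_w, horizon=2, action_dim=3)
    print(chunk.shape, errors[-1])
```

Because the toy denoiser always points at the prior embedding, the sampled latent collapses onto it, which makes the conditioning easy to see. A learned denoiser would instead trade off the prior against the visual observation.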

Computer Vision · Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 55

Year: 2026

Venue: N/A
