Beihang University
Integrating visual frames with textual reasoning steps, VTI-CoT achieves state-of-the-art video reasoning performance while boosting training efficiency.
Single-view RGB input can let robots perceive and manipulate transparent objects, with reliable grasping and no complex depth sensing.
Injecting grasp pose priors into latent diffusion policies yields more precise and generalizable robot grasping, outperforming existing imitation learning methods.