This paper introduces EgoTactile, a benchmark that pairs egocentric video with full-hand grasp pressure supervision for diverse everyday objects. The authors first establish EgoPressureFormer as a discriminative baseline, then develop EgoPressureDiff, a conditional diffusion framework that adapts a large-scale pre-trained video diffusion backbone and adds a Physically-Informed Feature Rectification layer to handle the uncertainty in partial observations. They report that this approach achieves superior performance on the benchmark and transfers robustly to in-the-wild scenarios.
EgoPressureDiff is reported to achieve superior performance on the benchmark and to resolve visual-physical ambiguities in grasp pressure estimation for complex 3D object interactions.
Estimating full-hand grasp pressure from egocentric video is critical for immersive VR and robotic manipulation, yet dense tactile sensing often relies on intrusive hardware. Existing vision-based methods predominantly rely on planar surfaces or fingertip contacts, failing to generalize to complex 3D object interactions. Therefore, we introduce EgoTactile, a benchmark pairing egocentric video with full-hand pressure supervision for diverse everyday objects, incorporating a bare-hand transfer subset to enable generalization to natural scenarios. Leveraging this benchmark, we first establish EgoPressureFormer as a discriminative baseline. Beyond this, to explicitly address the uncertainty in partial observations, we propose EgoPressureDiff, a conditional diffusion framework that adapts a large-scale pre-trained video diffusion backbone. By combining rich world knowledge priors with a Physically-Informed Feature Rectification layer to inject semantic constraints, our approach effectively infers plausible contact patterns and resolves visual-physical ambiguities. Extensive experiments demonstrate that our method achieves superior performance on the benchmark and robust transferability to in-the-wild scenarios. Our project page is available at https://egotactile.github.io/.