Exp-Force addresses pre-contact grasp force selection for robotic manipulation by retrieving relevant prior grasping experiences and conditioning a vision-language model on them for in-context inference. The approach does not rely on analytic contact models or manual heuristics. On a dataset of 129 object instances, it reduces mean absolute error by 72% relative to zero-shot inference (best-case MAE of 0.43 N), and in real-world tests on 30 unseen objects it raises the appropriate force selection rate from 63% to 87%.
Forget hand-engineered force models: a vision-language model conditioned on past grasp experiences can predict a suitable pre-grasp force, raising the appropriate force selection rate by 24 percentage points (63% to 87%) in real-world tests.
Accurate pre-contact grasp force selection is critical for safe and reliable robotic manipulation. Adaptive controllers regulate force after contact but still require a reasonable initial estimate. Starting a grasp with too little force requires reactive adjustment, while starting with too much force risks damaging fragile objects. This trade-off is particularly challenging for compliant grippers, whose contact mechanics are difficult to model analytically. We propose Exp-Force, an experience-conditioned framework that predicts the minimum feasible grasping force from a single RGB image. The method retrieves a small set of relevant prior grasping experiences and conditions a vision-language model on these examples for in-context inference, without analytic contact models or manually designed heuristics. On 129 object instances, Exp-Force achieves a best-case MAE of 0.43 N, reducing error by 72% over zero-shot inference. In real-world tests on 30 unseen objects, it improves the appropriate force selection rate from 63% to 87%. These results demonstrate that Exp-Force enables reliable and generalizable pre-grasp force selection by leveraging prior interaction experiences. http://expforcesubmission.github.io/Exp-Force-Website/
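
The retrieve-then-condition idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Experience` record, the use of cosine similarity over a hypothetical image embedding, the prompt wording, and the toy force values. The call to the vision-language model itself is omitted; the sketch only shows how retrieved experiences might be turned into in-context examples.

```python
import math
from dataclasses import dataclass


@dataclass
class Experience:
    """One prior grasp (hypothetical record layout)."""
    name: str
    embedding: list[float]  # assumed image embedding of the object
    min_force_n: float      # minimum feasible grasp force in newtons (toy value)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: list[float], memory: list[Experience], k: int = 2) -> list[Experience]:
    """Return the k stored experiences most similar to the query embedding."""
    ranked = sorted(memory, key=lambda e: cosine(query, e.embedding), reverse=True)
    return ranked[:k]


def build_prompt(examples: list[Experience]) -> str:
    """Format retrieved experiences as in-context examples (assumed wording)."""
    lines = ["Prior grasping experiences:"]
    for e in examples:
        lines.append(f"- {e.name}: minimum feasible force {e.min_force_n:.2f} N")
    lines.append("Predict the minimum feasible grasp force (N) for the object in the attached image.")
    return "\n".join(lines)


# Toy experience memory; embeddings and forces are made up for illustration.
memory = [
    Experience("paper cup", [0.9, 0.1, 0.0], 1.2),
    Experience("steel mug", [0.1, 0.9, 0.2], 6.5),
    Experience("foam ball", [0.8, 0.2, 0.1], 0.8),
]

query_embedding = [0.85, 0.15, 0.05]  # hypothetical embedding of a new object image
examples = retrieve(query_embedding, memory, k=2)
prompt = build_prompt(examples)
print(prompt)
```

In this toy memory the two light, deformable objects are retrieved and the rigid mug is left out, so the prompt sent alongside the image would contain only force values from similar objects. A real system would replace the hand-written embeddings with features from an image encoder and pass `prompt` together with the RGB image to the vision-language model.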