Mar 9, 2026 | arXiv:2603.08668

Exp-Force: Experience-Conditioned Pre-Grasp Force Selection with Vision-Language Models

Siqi Shang, Minchao Huang, Bill Fan, Lillian Chin

AI Summary

Exp-Force addresses pre-contact grasp force selection for robotic manipulation by retrieving relevant prior grasping experiences and conditioning a vision-language model on them for in-context inference. The approach does not rely on analytic contact models or manual heuristics. On a dataset of 129 object instances, it reaches a best-case MAE of 0.43 N, a 72% reduction in error relative to zero-shot inference. In real-world tests on 30 unseen objects, the appropriate force selection rate rises from 63% to 87%.

Key Contribution

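Instead of hand-engineered force models, a vision-language model conditioned on a few retrieved grasping experiences predicts the minimum feasible pre-grasp force from a single RGB image. In real-world tests on 30 unseen objects, this raises the appropriate force selection rate from 63% to 87%, a gain of 24 percentage points.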

Abstract

Accurate pre-contact grasp force selection is critical for safe and reliable robotic manipulation. Adaptive controllers regulate force after contact but still require a reasonable initial estimate. Starting a grasp with too little force requires reactive adjustment, while starting a grasp with too high a force risks damaging fragile objects. This trade-off is particularly challenging for compliant grippers, whose contact mechanics are difficult to model analytically. We propose Exp-Force, an experience-conditioned framework that predicts the minimum feasible grasping force from a single RGB image. The method retrieves a small set of relevant prior grasping experiences and conditions a vision-language model on these examples for in-context inference, without analytic contact models or manually designed heuristics. On 129 object instances, Exp-Force achieves a best-case MAE of 0.43 N, reducing error by 72% over zero-shot inference. In real-world tests on 30 unseen objects, it improves appropriate force selection rate from 63% to 87%. These results demonstrate that Exp-Force enables reliable and generalizable pre-grasp force selection by leveraging prior interaction experiences. http://expforcesubmission.github.io/Exp-Force-Website/
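
The retrieve-then-condition idea described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the `Experience` record, the use of cosine similarity over precomputed embeddings, the prompt wording, and the toy force values are all hypothetical. A text description stands in for the RGB image, and the call to the vision-language model is left out.

```python
import math
from dataclasses import dataclass


@dataclass
class Experience:
    """One prior grasp (hypothetical record layout)."""
    description: str        # stand-in for the stored object image
    embedding: list[float]  # assumed precomputed feature vector
    min_force_n: float      # measured minimum feasible force, in newtons


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_embedding: list[float], memory: list[Experience], k: int = 3) -> list[Experience]:
    """Return the k stored experiences most similar to the query."""
    ranked = sorted(
        memory,
        key=lambda e: cosine_similarity(query_embedding, e.embedding),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query_description: str, examples: list[Experience]) -> str:
    """Format retrieved experiences as in-context examples (assumed wording)."""
    lines = ["Prior grasping experiences:"]
    for i, ex in enumerate(examples, start=1):
        lines.append(f"{i}. {ex.description}: minimum feasible force {ex.min_force_n:.2f} N")
    lines.append(f"New object: {query_description}")
    lines.append("Predict the minimum feasible grasping force in newtons.")
    return "\n".join(lines)


# Toy memory with placeholder values (not data from the paper).
memory = [
    Experience("ripe tomato", [1.0, 0.0], 1.2),
    Experience("steel mug", [0.0, 1.0], 6.0),
    Experience("soft peach", [0.9, 0.1], 1.0),
]

examples = retrieve([1.0, 0.05], memory, k=2)
prompt = build_prompt("plum", examples)
print(prompt)
```

In a full pipeline the prompt and the query image would be passed to a vision-language model, and its numeric answer would be used as the initial grasp force before any post-contact adaptive control takes over.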

Computer Vision, Multimodal Models, Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 25

Year: 2026

Venue: N/A
