HKUST | Muka Robotics | NJU | PKU | Jun 7, 2026 | arXiv:2606.08737

Dream-Tac: A Unified Tactile World Action Model for Contact-Rich Robot Manipulation

Yunfan Lou, Yifan Ye, Yankai Fu, Jun Cen, Xiaowei Chi, Yaoxu Lyu, Peidong Jia, Sirui Han, Zhihe Lu, Shanghang Zhang

AI Summary

This paper introduces Dream-Tac, a unified Tactile-World Action Model for contact-rich robot manipulation that jointly models actions, future visual observations, and tactile dynamics. It uses contact-gated visuotactile fusion to selectively integrate tactile signals and a contact-aware attention bias to regulate cross-modal interactions. A dual-level acceleration strategy yields up to 2.9× faster training and 1.8× faster inference, and across six contact-rich manipulation tasks Dream-Tac improves action accuracy by 31.7% on average.

Key Contribution

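Dream-Tac improves action accuracy by 31.7% on average across six contact-rich manipulation tasks by jointly modeling tactile and visual signals, with training and inference acceleration aimed at real-time deployment.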

Abstract

World action models inherit the predictive capability of world models, enabling action generation to be guided by anticipated future observations. However, they rely primarily on vision and often fail in contact-rich manipulation, where critical cues arise from physical interaction. In this paper, we propose Dream-Tac, a unified Tactile-World Action Model that jointly models actions, future visual observations, and tactile dynamics. Specifically, Dream-Tac introduces (i) contact-gated visuotactile fusion to selectively integrate tactile signals and (ii) a contact-aware attention bias to better regulate cross-modal interactions during manipulation. To support real-time deployment, we further design a dual-level acceleration strategy, reformulating the contact-aware bias to preserve the fused attention path during training and introducing cache-based diffusion acceleration at inference, achieving up to 2.9× faster training and 1.8× faster inference. Across six contact-rich manipulation tasks, Dream-Tac improves action accuracy by 31.7% on average, demonstrating the effectiveness of unified visuotactile world modeling. Code is available at https://github.com/LYFCLOUDFAN/Dream-Tac.
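The abstract names two mechanisms, contact-gated fusion and a contact-aware attention bias, without giving their formulation. The NumPy sketch below is only an illustration of what such components could look like, not the authors' implementation. The function names, the scalar `contact_prob` gate, the residual form of the fusion, and the `bias_scale` value are all assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def contact_gated_fusion(visual, tactile, contact_prob):
    # Hypothetical gate: tactile features are added to the visual
    # features in proportion to an estimated contact probability.
    gate = np.clip(contact_prob, 0.0, 1.0)
    return visual + gate * tactile

def attention_with_contact_bias(q, k, v, is_tactile_key, contact_prob,
                                bias_scale=4.0):
    # Scaled dot-product attention with an additive bias on tactile keys.
    # Assumed rule: with no contact the tactile keys receive a negative
    # bias and are down-weighted; with full contact the bias is zero.
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)
    bias = -bias_scale * (1.0 - contact_prob) * is_tactile_key.astype(float)
    weights = softmax(logits + bias[None, :], axis=-1)
    return weights @ v, weights

# Toy example: 2 query tokens, 4 key tokens (2 visual, 2 tactile).
rng = np.random.default_rng(0)
q = rng.normal(size=(2, 8))
k = rng.normal(size=(4, 8))
v = rng.normal(size=(4, 8))
is_tactile = np.array([False, False, True, True])

_, w_free = attention_with_contact_bias(q, k, v, is_tactile, contact_prob=0.0)
_, w_touch = attention_with_contact_bias(q, k, v, is_tactile, contact_prob=1.0)
```

In this toy setup, the attention mass placed on the tactile keys is strictly smaller when `contact_prob` is 0 than when it is 1. That is the qualitative behavior a contact-aware bias is meant to provide. The paper's actual gating and bias terms may differ.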

Multimodal Models | Robotics & Embodied AI | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
