Fudan, INSAIT, Laboratory, Lumos Robotics, SJTU, Soochow, Xi'an Jiaotong-Liverpool University
Jun 9, 2026
arXiv:2606.10382

UMI-Bench 1.0: An Open and Reproducible Real-World Benchmark for Tabletop Robotic Manipulation with UMI Data

Shi Jin, Yuntian Wang, Yuhui Duan, Di Wu, Gaoqi Dong, Xiaohang Liu, Xiaotong Li, Hongfei Jia, Zehao Zhang, Tianyu Wang, Zhongjie Jia, Yuanqi Yao, Chenjia Bai, Zhaxizhuoma, Siao Liu, Nieqing Cao, Jin Wang, Chao Yu, Yan Ding

AI Summary

This paper introduces UMI-Bench 1.0, a benchmark for real-world evaluation of Universal Manipulation Interface (UMI)-style robotic manipulation policies. Existing real-world benchmarks are not designed around the UMI data-to-deployment setting; UMI-Bench addresses this by aligning data collection, scene reset, policy execution, result logging, and task-factor analysis within a unified protocol. Its central claim is that making the full evaluation process reproducible and auditable yields a practical testbed for measuring how UMI-trained policies generalize to real physical manipulation.

Key Contribution

UMI-Bench 1.0 is, to the best of the authors' knowledge, the first benchmark dedicated to real-world evaluation of UMI-based manipulation models. It standardizes the full evaluation process so that results are reproducible and auditable.

Abstract

Real-robot evaluation is essential for understanding whether learned manipulation policies can operate reliably outside curated demonstrations. This need is particularly pressing for Universal Manipulation Interface (UMI)-style policies, whose performance depends on the coupling between wrist-view observations, action representation, data collection, and physical deployment. Existing real-world benchmarks have made important progress, but they are not designed around this UMI data-to-deployment setting. We present UMI-Bench 1.0, a local-first real-robot benchmark for standardized evaluation of UMI-style manipulation policies. To the best of our knowledge, this is the first benchmark dedicated to real-world evaluation of UMI-based manipulation models. UMI-Bench aligns data collection, scene reset, policy execution, result logging, and task-factor analysis within a unified protocol. By making the full evaluation process reproducible and auditable, UMI-Bench provides a practical testbed for measuring how UMI-trained policies generalize to real physical manipulation.

Eval Frameworks & Benchmarks, Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
