Northwestern University

$$\boldsymbol{\xi}^{*}_{1:T}=\arg\min_{\boldsymbol{\xi}_{1:T}}\;\sum_{c\in\mathcal{C}}\sum_{t=1}^{T}\bigl[\max\bigl(0,\,c(\mathbf{k}_{t})\bigr)\bigr]^{2}+\lambda\sum_{t=1}^{T}\bigl\|\boldsymbol{\xi}_{t}-\boldsymbol{\xi}^{(0)}_{t}\bigr\|^{2}, \tag{5}$$
where $\mathbf{k}_{t}$ denotes the

H. Li is with the Department of Data Science and Artificial Intelligence, Monash University, Melbourne, Australia. H. Li is also with the ARC Centre of Excellence for the Weather of the 21st Century.
H. Wang and J. Shen are with the School of Computing and Information Technology, University of Wollongong, Wollongong, Australia.
Y. Lin and Y. Xu are with the Shenzhen Key Laboratory of Visual Object Detection and Recognition, Harbin Institute of Technology (Shenzhen), Shenzhen, China.
X. Luo is with the College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China.
Q. Zhu is with the College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing, China.
J. Shi is with the Centre for Nutrition and Food Sciences, The University of Queensland, Brisbane, Australia.
H. Chen is with the School of Electrical and Computer Engineering, University of Sydney, Sydney, Australia.
B. Du is with the Department of Management, Griffith University, Brisbane, Australia.
J. Barthelemy is with NVIDIA, Santa Clara, USA.
Z. Xue is with The University of New South Wales, Sydney, Australia.
Corresponding authors: Jun Shen (email: jshen@uow.edu.au) and Yong Xu (email: laterfall@hit.edu.cn).
NVIDIA Research
By combining video generation and vision-language models, EmboAlign achieves a 43% boost in real-world robot manipulation success without any task-specific training.