Qi Zhu

Northwestern University

NVIDIA Research

Papers on Lattice

Total citations

Topics

h-index

Research focus

Multimodal Models (1) · Robotics & Embodied AI (1) · World Models & Planning (1)

Frequent co-authors

Gehao Zhang (1) · Zhenyang Ni (1) · Payal Mohapatra (1) · Hangxu Liu (1)

Papers (1)

Mar 5, 2026 · also NVIDIA, Fudan, Northwestern, Shanghai AI Lab

EmboAlign: Aligning Video Generation with Compositional Constraints for Zero-Shot Manipulation

By combining video generation and vision-language models, EmboAlign achieves a 43% boost in real-world robot manipulation success without any task-specific training.

Gehao Zhang, Zhenyang Ni, Payal Mohapatra +3

Multimodal Models · Robotics & Embodied AI · World Models & Planning
