Hangjun Ye

Publication activity (papers/week, last 8 weeks)

Research focus

Robotics & Embodied AI (6), Multimodal Models (5), World Models & Planning (4), Computer Vision (2)

Frequent co-authors

Hongwei Xie (4), Yingbo Tang (3), Guang Chen (3), Yuncheng Jiang (2)

Papers (7)

Jun 4, 2026

Ziyang Yao +13 · 6d ago

Discrete-WAM: Unified Discrete Vision-Action Token Editing for World-Policy Learning

Discrete-WAM enables compositional causal reasoning in autonomous driving, outperforming traditional methods that struggle with complex state-action dynamics.

Ziyang Yao, Haochen Liu, Yuncheng Jiang +11

Multimodal Models, Robotics & Embodied AI, World Models & Planning

May 31, 2026

Tsinghua AI · 1w ago · also Ant Group, CAS, HKUST, Pengcheng Laboratory +1

OneVLA: A Unified Framework for Embodied Tasks

OneVLA unifies navigation and manipulation tasks in a single framework, enabling robots to interpret commands and interact with their environments.

Lingfeng Zhang, Xiaoshuai Hao, Yingbo Tang +10

Multimodal Models, Robotics & Embodied AI

May 21, 2026

2w ago

LVDrive: Latent Visual Representation Enhanced Vision-Language-Action Autonomous Driving Model

Ditch pixel-perfect reconstruction: LVDrive shows that learning future scene representations in a high-level latent space dramatically improves autonomous driving performance.

Xiaodong Mei, Diankun Zhang, Hongwei Xie +2

Multimodal Models, Robotics & Embodied AI, World Models & Planning

Apr 29, 2026

Tsinghua AI · Apr 29, 2026 · also CAS, Fudan, HFUT, Pengcheng Laboratory +2

Walk With Me: Long-Horizon Social Navigation for Human-Centric Outdoor Assistance

Robots can now navigate complex outdoor environments using only high-level human instructions and readily available GPS/map data, bypassing the need for expensive HD maps or limited short-horizon policies.

Lingfeng Zhang, Xiaoshuai Hao, Xizhou Bu +10

Natural Language Processing, Robotics & Embodied AI

Apr 20, 2026

Jinghui Lu +55 · Apr 20, 2026 · also CAS, PolyU, SYSU, Xiaomi Inc.

OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

Latent reasoning can beat explicit Chain-of-Thought – but only if you force it to learn causal dynamics via a visual world model, not just language.

Jinghui Lu, Jiayi Guan, Zhijian Huang +53

Multimodal Models, Reasoning & Chain-of-Thought, World Models & Planning

Apr 20, 2026 · also Tsinghua AI, Rimbot

XEmbodied: A Foundation Model with Enhanced Geometric and Physical Cues for Large-Scale Embodied Environments

Endowing VLMs with intrinsic 3D geometric awareness and physical interaction cues via XEmbodied substantially boosts performance on spatial reasoning and embodied tasks, surpassing existing 2D image-text pretrained models.

Kangan Qian, ChuChu Xie, Yang Zhong +13

Computer Vision, Multimodal Models, Robotics & Embodied AI

Apr 5, 2026

Microsoft Research · Apr 5, 2026 · also Cambridge, HKUST

DriveVA: Video Action Models are Zero-Shot Drivers

Autonomous driving models can achieve zero-shot generalization by using large-scale video generation models to jointly predict future actions and visuals.

Mengmeng Liu, Diankun Zhang, Jiuming Liu +5

Computer Vision, Robotics & Embodied AI, World Models & Planning
