Institute of Artificial Intelligence (TeleAI), China Telecom

Abstract

The high cost of collecting real-robot data has made robotic simulation a scalable platform for both evaluation and data generation. Yet most existing benchmarks concentrate on simple manipulation tasks such as pick-and-place, failing to capture the non-Markovian characteristics of real-world tasks and the complexity of articulated object interactions. To address this limitation, we present RuleSafe, a new articulated-manipulation benchmark built upon a scalable, LLM-aided simulation framework. RuleSafe features safes with diverse unlocking mechanisms (such as key, password, and logic locks) that require distinct multi-stage reasoning and manipulation strategies. These LLM-generated rules yield non-Markovian, long-horizon tasks that demand temporal modeling and memory-based reasoning. We further propose VQ-Memory, a compact and structured temporal representation that leverages vector-quantized variational autoencoders (VQ-VAEs) to encode past proprioceptive states into discrete latent tokens. This representation filters out low-level noise while preserving high-level task-phase context, providing lightweight yet robust temporal cues that are compatible with existing Vision-Language-Action (VLA) models. Extensive experiments on state-of-the-art VLA models and diffusion policies demonstrate that VQ-Memory consistently improves long-horizon planning, enhances generalization to unseen configurations, and achieves more efficient manipulation with reduced computational cost. Project page: https://vqmemory.github.io.
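The core of the VQ-Memory idea, as described above, is mapping a history of continuous proprioceptive states to discrete codebook tokens via a VQ-VAE-style nearest-neighbor quantizer. The sketch below illustrates only that quantization step; the function name, dimensions, and random codebook are illustrative assumptions, not details from the paper (in practice the codebook would be learned jointly with an encoder/decoder).

```python
import numpy as np

def quantize_history(states: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map each state vector to the index of its nearest codebook entry.

    states:   (T, D) array of past proprioceptive states
    codebook: (K, D) array of learned code vectors
    returns:  (T,) array of discrete token indices
    """
    # Squared Euclidean distance between every state and every code vector.
    dists = ((states[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    # Nearest-neighbor lookup: each state becomes one discrete token.
    return dists.argmin(axis=1)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))   # K=8 codes, D=4 proprioceptive dims (illustrative)
history = rng.normal(size=(16, 4))   # T=16 past states
tokens = quantize_history(history, codebook)
print(tokens.shape)                  # (16,) -- one token per past state
```

Because small perturbations of a state usually map to the same code, the token sequence discards low-level noise while still distinguishing high-level task phases, which is the property the abstract attributes to VQ-Memory.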