VLA models get a 1.73x speedup with only 5-7% overhead thanks to RAPID, a new edge-cloud collaborative inference framework that is robust to visual noise and exploits motion continuity.
Stop struggling with compounding errors in long-horizon robotic tasks: AtomVLA leverages LLMs and latent world models to decompose tasks and score actions, boosting success rates to 97% on LIBERO.
By integrating kinematic prediction with speculative decoding, KERV enables VLA models to achieve a 27-37% speedup in robot control tasks without sacrificing success rate.
Achieve up to a 10.94x reduction in end-to-end latency for on-device agentic RAG by intelligently scheduling tasks across heterogeneous mobile SoC hardware.
Attention entropy reveals exploitable sparsity in VAR models, enabling 3.4x faster image generation without sacrificing quality.