Mar 19, 2026 · arXiv:2603.19199

FASTER: Rethinking Real-Time Flow VLAs

Yuxiang Lu, Zhe Liu, Xianzhe Fan, Zhenya Yang, Jinghua Hou, Junyi Li, Kaixin Ding, Hengshuang Zhao

AI Summary

This paper analyzes reaction-time bottlenecks in flow-based Vision-Language-Action (VLA) models. It identifies the constant sampling schedule as a source of inefficiency, because every sampling step must finish before the first action can execute. To address this, the authors propose FASTER, which introduces a Horizon-Aware Schedule that prioritizes near-term actions during flow sampling. FASTER compresses the denoising of the immediate reaction tenfold, into a single step, while preserving long-horizon trajectory quality. Real-robot experiments, including a dynamic table tennis task, show reduced reaction latency with accurate, smooth trajectories.

Key Contribution

By adaptively prioritizing near-term actions during flow sampling, FASTER compresses the denoising of the immediate reaction tenfold, into a single step, which substantially reduces the effective reaction latency of flow-based VLAs on real robots.

Abstract

Real-time execution is crucial for deploying Vision-Language-Action (VLA) models in the physical world. Existing asynchronous inference methods primarily optimize trajectory smoothness, but neglect the critical latency in reacting to environmental changes. By rethinking the notion of reaction in action chunking policies, this paper presents a systematic analysis of the factors governing reaction time. We show that reaction time follows a uniform distribution determined jointly by the Time to First Action (TTFA) and the execution horizon. Moreover, we reveal that the standard practice of applying a constant schedule in flow-based VLAs can be inefficient and forces the system to complete all sampling steps before any movement can start, forming the bottleneck in reaction latency. To overcome this issue, we propose Fast Action Sampling for ImmediaTE Reaction (FASTER). By introducing a Horizon-Aware Schedule, FASTER adaptively prioritizes near-term actions during flow sampling, compressing the denoising of the immediate reaction by tenfold (e.g., in $\pi_{0.5}$ and X-VLA) into a single step, while preserving the quality of long-horizon trajectory. Coupled with a streaming client-server pipeline, FASTER substantially reduces the effective reaction latency on real robots, especially when deployed on consumer-grade GPUs. Real-world experiments, including a highly dynamic table tennis task, prove that FASTER unlocks unprecedented real-time responsiveness for generalist policies, enabling rapid generation of accurate and smooth trajectories.
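The claim that reaction time is uniformly distributed can be illustrated with a small simulation. This is a minimal sketch under a simplified model that is an assumption here, not the paper's derivation: inference starts once per execution cycle, an environmental change arrives at a uniformly random moment within that cycle, and the robot reacts once the next inference has produced its first action. The function name `simulate_reaction_times` and the values for `ttfa` and `horizon` (in seconds) are hypothetical.

```python
import random


def simulate_reaction_times(ttfa, horizon, n_samples=100_000, seed=0):
    """Sample reaction times under a simplified periodic-inference model.

    Assumption (not taken from the paper): a new inference starts every
    `horizon` seconds, and the first action of that inference is available
    `ttfa` seconds after it starts.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # The change happens at a random point in the current cycle, so the
        # wait until the next inference starts is uniform on [0, horizon].
        wait = rng.uniform(0.0, horizon)
        # The reaction becomes visible once the first action is ready.
        samples.append(wait + ttfa)
    return samples


# Hypothetical numbers: 100 ms time to first action, 500 ms execution horizon.
ttfa, horizon = 0.1, 0.5
samples = simulate_reaction_times(ttfa, horizon)
mean_reaction = sum(samples) / len(samples)
print(f"min={min(samples):.3f}s max={max(samples):.3f}s mean={mean_reaction:.3f}s")
```

Under these assumptions the samples fall between `ttfa` and `ttfa + horizon`, with a mean near `ttfa + horizon / 2`. In this model, lowering TTFA shifts the whole distribution toward zero, while shortening the execution horizon narrows it.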

Inference & Quantization · Multimodal Models · Robotics & Embodied AI

Citation Metrics

Citations: 0
Influential citations: 0
References: 112
Year: 2026
Venue: N/A
