Mar 18, 2026 · arXiv:2603.17850

ProbeFlow: Training-Free Adaptive Flow Matching for Vision-Language-Action Models

Zhou Fang, Jiaqi Wang, Yi Zhou, Yi Zhou, Q. Shi, Qiongfeng Shi

AI Summary

This paper introduces ProbeFlow, a training-free adaptive inference framework for Flow Matching-based Vision-Language-Action (VLA) models to reduce action decoding latency in robotics. ProbeFlow dynamically schedules integration steps based on trajectory complexity, measured by the cosine similarity between initial and lookahead velocity vectors, pruning redundant network evaluations. Experiments on MetaWorld and LIBERO benchmarks demonstrate significant speedups (14.8x action decoding, 2.8x end-to-end) without compromising manipulation success, validated in real-world deployments.
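As a rough consistency check on the two reported speedups, one can assume (this is not stated in the source) that only the action-decoding stage is accelerated and that the stages run sequentially, so Amdahl's law applies. Under that assumption, the fraction f of baseline end-to-end latency spent in action decoding can be backed out as follows; f is a hypothetical symbol introduced here for illustration.

```latex
\[
S_{\text{e2e}} = \frac{1}{(1 - f) + \dfrac{f}{S_{\text{dec}}}}
\quad\Longrightarrow\quad
f = \frac{1 - 1/S_{\text{e2e}}}{1 - 1/S_{\text{dec}}}
  = \frac{1 - 1/2.8}{1 - 1/14.8}
  \approx 0.69
\]
```

Read this way, the reported figures would be consistent with action decoding accounting for roughly two thirds of baseline latency, which is in line with the paper's framing of the action head as the bottleneck. This is an inference from the summary numbers only, not a measurement from the paper.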

Key Contribution

Robot control gets a whole lot faster: ProbeFlow slashes action decoding latency by 14.8x in Vision-Language-Action models, all without retraining.

Abstract

Recent Vision-Language-Action (VLA) models equipped with Flow Matching (FM) action heads achieve state-of-the-art performance in complex robot manipulation. However, the multi-step iterative ODE solving required by FM introduces inference latency that precludes responsive physical control. While current acceleration efforts optimize the Vision-Language Model (VLM) backbone, the action head bottleneck remains overlooked. To address this, we propose ProbeFlow, a training-free adaptive inference framework tailored for continuous robotic control. By evaluating geometric trajectory complexity via the cosine similarity between initial and lookahead velocity vectors, ProbeFlow dynamically schedules integration steps to prune redundant network evaluations. On the MetaWorld benchmark, it accelerates action decoding by 14.8x (reducing average steps from N = 50 to 2.6) and cuts end-to-end system latency by 2.8x without compromising the manipulation success rate. On the long-horizon LIBERO benchmark, the probe automatically allocates a denser schedule to navigate semantic bottlenecks, effectively resolving the flow solver delay. Real-world physical deployments confirm that ProbeFlow successfully mitigates action decoding latency while ensuring execution stability, offering a highly practical solution for low-latency continuous generative policies.
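The probing idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lookahead time, the linear mapping from cosine similarity to step count, the Euler integrator, and all function names (`cosine_similarity`, `choose_num_steps`, `probe_and_integrate`) are assumptions made here for illustration. The abstract only specifies that the similarity between the initial and lookahead velocity vectors drives the step schedule.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two flat vectors (eps avoids division by zero)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def choose_num_steps(cos_sim, min_steps=1, max_steps=50):
    """Hypothetical schedule: near-parallel velocities (cos ~ 1) get few steps,
    strongly diverging velocities get more. The linear mapping is an assumption."""
    curvature = np.clip(1.0 - cos_sim, 0.0, 1.0)
    return int(round(min_steps + curvature * (max_steps - min_steps)))


def probe_and_integrate(velocity_fn, x0, lookahead=0.5, min_steps=1, max_steps=50):
    """Probe trajectory straightness, then integrate dx/dt = v(x, t) from t=0 to t=1.

    velocity_fn(x, t) stands in for the flow-matching action head (assumed signature).
    """
    # Probe: compare the initial velocity with the velocity at a lookahead point.
    v0 = velocity_fn(x0, 0.0)
    x_look = x0 + lookahead * v0
    v_look = velocity_fn(x_look, lookahead)
    cos_sim = cosine_similarity(v0.ravel(), v_look.ravel())

    # Schedule: pick the number of Euler steps from the probe result.
    n = choose_num_steps(cos_sim, min_steps, max_steps)
    dt = 1.0 / n

    # Integrate with fixed-step Euler, reusing v0 for the first step.
    x = x0.copy()
    for i in range(n):
        v = v0 if i == 0 else velocity_fn(x, i * dt)
        x = x + dt * v
    return x, n, cos_sim
```

With a constant velocity field (a perfectly straight trajectory) this sketch collapses to a single Euler step, while a rotating field receives a denser schedule. A practical version would likely tune the thresholds per task and may use a discrete set of step counts rather than a linear map.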

Inference & Quantization, Multimodal Models, Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 27

Year: 2026

Venue: N/A
