B [26] visual backbone. The action head is a conditional Flow Matching network implemented via an 8-layer Diffusion Transformer (DiT [16]) with a 1024 hidden dimension, trained to predict trajectories of horizon T=
Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, China
Robot control gets a whole lot faster: ProbeFlow slashes action decoding latency by 14.8x in Vision-Language-Action models, all without retraining.
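To make the latency claim concrete, here is a minimal sketch of how a flow-matching action head typically decodes actions: it starts from Gaussian noise and integrates a learned velocity field from t=0 to t=1, so each integration step costs one forward pass of the network. This is a generic illustration, not ProbeFlow's method, which the text above does not describe. The names `euler_decode` and `make_toy_velocity` are hypothetical, and the toy velocity field stands in for the trained DiT.

```python
import random


def euler_decode(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity_fn(x, t)  # in a real model: one forward pass of the action head
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target):
    """Hypothetical stand-in for a learned network: the exact velocity field
    of a straight-line path from any point toward a single target action."""
    calls = {"n": 0}

    def velocity_fn(x, t):
        calls["n"] += 1  # count evaluations, a proxy for decoding latency
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]

    return velocity_fn, calls


random.seed(0)
target = [0.5, -0.25, 1.0]                        # assumed toy action vector
noise = [random.gauss(0.0, 1.0) for _ in target]  # x_0 ~ N(0, I)
velocity_fn, calls = make_toy_velocity(target)
actions = euler_decode(velocity_fn, noise, num_steps=10)
```

In this sketch the number of velocity evaluations equals `num_steps`, which is why reducing the number of integration steps is one natural lever on decoding latency.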