Search papers, labs, and topics across Lattice.
VLA models get a 1.73x speedup with only 5-7% overhead thanks to RAPID, a new edge-cloud collaborative inference framework that accounts for visual noise and exploits motion continuity.
By integrating kinematic prediction with speculative decoding, KERV enables VLA models to achieve a 27-37% speedup in robot control tasks without sacrificing success rate.
Achieve up to a 10.94x speedup in end-to-end latency for on-device agentic RAG by intelligently scheduling tasks across heterogeneous mobile SoC hardware.
Attention entropy reveals exploitable sparsity in VAR models, enabling 3.4x faster image generation without sacrificing quality.