Search papers, labs, and topics across Lattice.
4
0
8
MemoryVLA++ achieves up to 28% performance gains in robotic manipulation tasks by integrating memory and imagination, transforming how robots handle temporal dependencies.
Achieving comparable text-to-image quality with a linearized model that accelerates inference by up to 1.47 times, all while leveraging pretrained weights.
By explicitly modeling both consensus and discrepancy between RGB and IR data, this text-guided multispectral object detector significantly boosts performance on multispectral benchmarks.
MLLMs struggle to effectively zoom into relevant details in ultra-high-resolution remote sensing imagery, but a new staged training framework can teach them when and where to focus for substantial accuracy gains.