VLMs can achieve up to 7.8x faster prefilling with only a minor accuracy drop by pruning redundant visual tokens, *without* any retraining.
Robots can recover from failures during manipulation tasks by explicitly tracking progress against spatial subgoals, without requiring extra training data or additional models.
A dual-branch Transformer with safe cross-attention compensates for missing visual cues in emotion recognition by dynamically relying on audio, achieving state-of-the-art results on Aff-Wild2.