Ditch the cache: Prototype-Based Test-Time Adaptation (PTA) boosts vision-language model accuracy by nearly 4% while *doubling* inference speed compared to existing cache-based methods.
Achieve a >97% FLOPs reduction in large vision-language model (LVLM) inference with minimal performance loss by pruning redundant visual tokens, all without retraining.