Peking University, Nanyang Technological University
Pruning 77.8% of visual tokens without losing performance could substantially improve the efficiency of multimodal large language models.
FORCE achieves a 79% increase in success rates for VLA models while eliminating the need for costly human interventions during training.
Achieving a 6.7x speedup in 3D scene reconstruction without sacrificing quality could redefine efficiency benchmarks in visual geometry tasks.