Tsinghua University
Compressing visual tokens doesn't have to mean sacrificing performance: VEN-VL's ensemble-MoE approach recovers accuracy while maintaining efficiency in multimodal models.