Search papers, labs, and topics across Lattice.
University of Science and Technology of China, Hefei, China
1
0
3
2
Freezing most weights and only LoRA-tuning a vision-language model achieves near state-of-the-art multimodal interleaved reasoning performance, proving that targeted adaptation can rival full fine-tuning.