Nanjing University
Current audio-language models are surprisingly bad at controlling and interpreting subtle vocal cues, failing in nearly half of situational dialogue scenarios.
Forget static, single-turn personalization – PersonaVLM unlocks long-term, evolving user alignment in MLLMs, even surpassing GPT-4o.