Nanjing University
Current audio-language models are surprisingly bad at controlling and interpreting subtle vocal cues, failing in nearly half of situational dialogue scenarios.