Instruction-based steering can redirect attention in large audio-language models (LALMs) toward acoustically relevant regions, achieving over 60% overlap with ground-truth sound event locations without any training.
Widely used emotion embedding similarity metrics for speech generation are more sensitive to speaker and linguistic features than to emotion itself, making them unreliable for evaluating emotional expressiveness.
Semantic-level uncertainty estimation methods significantly enhance the reliability of audio-aware language models, outperforming traditional approaches in critical reasoning tasks.