LLMs can achieve state-of-the-art audio-visual speech recognition by sparsely aligning modalities and refining with visual unit guidance, substantially boosting robustness in noisy environments.