Search papers, labs, and topics across Lattice.
2
0
4
15
Multimodal models can now achieve state-of-the-art performance in real-world tasks like document understanding and audio-video comprehension with significantly reduced inference latency thanks to novel token-reduction techniques.
Bridging the offline-streaming gap in ASR is now more achievable: a single RNN-Transducer model can deliver high accuracy in both settings, thanks to a novel consistency regularization technique.