Idiap Research Institute, Martigny, Switzerland
LLMs can judge speech recognition quality with near-human accuracy, far outperforming traditional metrics such as Word Error Rate.
Just 4 hours of speech data closes the modality gap in LLM-based ASR, rivaling full-dataset fine-tuning and unlocking effective domain adaptation.
LLM-based ASR can get a context boost without the compute cost: compressing prior audio turns into learned latent tokens while retaining their transcripts recovers accuracy and shrinks the audio footprint.