LLM-based ASR can be shrunk to 2.3B parameters and still beat larger models in real-world scenarios by carefully delineating encoder and LLM roles and using a multi-stage training approach.
LLM-based ASR models can achieve state-of-the-art performance and reduce hallucinations by strategically allocating entropy reduction between the speech encoder and LLM during training.
Scanning every token to focus attention is now passé: HISA prunes irrelevant context blocks *before* token-level scoring, slashing compute without sacrificing selection fidelity.