Search papers, labs, and topics across Lattice.
3
0
6
9
Strong translation quality doesn't guarantee high speech or temporal fidelity, revealing critical gaps in existing evaluation practices for speech translation systems.
Over-reliance on agentic decomposition can actually *hurt* audio understanding when a strong audio frontend already provides sufficient information, highlighting the importance of conditional evidence acquisition.
ASR systems can now be more trustworthy: this work shows how to train them to abstain from transcribing uncertain segments, leading to more reliable outputs.