Forget specialized architectures: StepAudio 2.5 proves a single audio-language foundation, shaped by RLHF, can dominate ASR, TTS, and real-time dialogue simultaneously.
MLLMs still struggle with real-world document understanding, but a new benchmark and reinforcement learning approach can significantly improve their ability to extract structured information from receipts.