Search papers, labs, and topics across Lattice.
Nanyang Technological University, Hunyuan Team, Tencent, Tianjin University
2
0
5
Current audio editing models are failing spectacularly, with an Exact Match Rate below 5% in complex tasks, exposing a critical need for improvement.
Over-reliance on agentic decomposition can actually *hurt* audio understanding when a strong audio frontend already provides sufficient information, highlighting the importance of conditional evidence acquisition.