Audio-language models can now reason about 30-minute-long audio clips with timestamp-grounded intermediate steps, unlocking a new level of fine-grained understanding.
A 3B-parameter model, Audio Flamingo 2, now rivals larger proprietary models in audio understanding and reasoning, even handling audio segments up to 5 minutes long.