Explicit reasoning steps ("thinking mode") boost spatial audio question answering accuracy by 5.1%, especially when combined with source separation.
You can now build a real-time, privacy-preserving conversational assistant for procedural tasks using *only* audio and IMU data, thanks to a new finetuning method that makes the assistant less chatty and more helpful.
By grounding LLMs in timestamped acoustic events instead of raw audio, LongAudio-RAG enables accurate question answering over multi-hour audio, outperforming standard RAG and text-to-SQL baselines.