Tencent Hunyuan
Spatial-Omni achieves superior spatial audio understanding in multimodal LLMs by effectively incorporating spatial cues without modifying existing architectures.
By explicitly modeling 3D space with learned spatial audio representations, JAEGER enables AV-LLMs to perform joint spatial grounding and reasoning far beyond the capabilities of 2D-centric models.