Fudan University
MLLMs can "hear" a little, but EgoSound reveals they're still largely deaf to the nuances of sound in egocentric video, especially when it comes to spatial and causal reasoning.