MLLMs still cannot handle time-sensitive multimodal reasoning: they often fail to integrate auditory and visual cues in dynamic environments such as a 4D escape room.
By incorporating language guidance into federated learning, SurgFed tackles the long-standing problem of tissue and task heterogeneity in surgical video understanding, leading to improved segmentation and depth estimation across diverse surgical settings.