Vision-language models falter at the fine-grained temporal recognition crucial for surgical video understanding, while SurgRec excels.