Vision-language models falter at the fine-grained temporal recognition crucial for surgical video understanding, while SurgRec excels.
Web agents can achieve 3x faster search and higher final accuracy by dynamically adapting their context management strategy based on the current state, rather than sticking to a single fixed approach.
Forget real-world video datasets: training VLMs on just 7.7K synthetic videos with temporal primitives beats 165K real-world examples, unlocking surprisingly effective transfer learning for video reasoning.
Directly modeling 3D geometry in dental scans unlocks a 9.58% accuracy boost in multi-disease diagnosis compared to methods relying on 2D or multi-view image representations.