Fudan University
Forget trying to wrangle dynamic 4D scenes with recurrent networks – DynamicVGGT achieves state-of-the-art reconstruction accuracy using a surprisingly effective feed-forward approach.
Stop naively aggregating knowledge for KB-VQA: MaS-VQA's Mask-and-Select mechanism shows how to prune irrelevant image regions and knowledge fragments for better reasoning.
Multimodal LLMs (MLLMs) can now spot subtle image forgeries with state-of-the-art accuracy by strategically using forensic tools to expose hidden inconsistencies, outperforming traditional text-centric approaches.