Search papers, labs, and topics across Lattice.
Zhejiang University
4
0
7
FadeMem achieves superior video generation by intelligently consolidating memory, ensuring that fine details fade while essential scene structures remain intact over longer time horizons.
Current vision-language models can *see* point cloud defects, but can't reliably *diagnose* them, highlighting a critical gap in grounded quality understanding.
Despite showing promise in reading raw height data, today's MLLMs often fail to translate geometric perception into reliable semantic reasoning about natural scenes, even performing worse than RGB-only models when both modalities are needed.
Unlock 2x faster reinforcement learning by distilling group feedback into actionable language refinements that guide exploration.