Search papers, labs, and topics across Lattice.
Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications
7
0
9
Achieving a score of $2.68 \times 10^{-3}$ in a depth estimation challenge reveals the untapped potential of zero-shot learning in complex visual tasks.
Autonomous code generation combined with rigorous semantic review can drastically enhance scenario mining accuracy in complex driving environments.
Zero-shot learning can now predict traffic accidents in real-time without the need for costly annotated datasets, achieving competitive results in a major competition.
By rethinking text-to-3D generation as a planning problem, this approach significantly reduces error propagation and enhances scene realism.
Achieving nearly 90% accuracy in retrieving videos based on nuanced edit instructions could redefine standards in video retrieval systems.
Achieving nearly 93% accuracy in video relational reasoning, this approach reveals how structured evidence can dramatically enhance model performance in complex visual contexts.
Transforming VLMs into active agents with cognitive maps leads to a staggering 53.2% boost in spatial reasoning accuracy.