Shanghai Jiao Tong University
Reconstruct a high-fidelity, full-head 3D avatar from a single image in under one second, finally breaking the quality-speed tradeoff.
MLLMs can achieve state-of-the-art multimodal retrieval by learning to compress information into a handful of "bottleneck" tokens, forcing the model to distill relevant semantics.