Xiaohongshu Inc.
UniNote's two-stage training, combining contrastive SFT and RL, leapfrogs existing multimodal embeddings, delivering SOTA item-to-item retrieval performance with improved cost efficiency in real-world deployments.
Fine-grained 3D object grounding gets a boost: SSR3D-LLM uses latent spatial reasoning steps to iteratively refine candidate rankings, outperforming single-pointer methods and setting a new standard for unified 3D-LLMs.
Bridging the gap between human manipulation and robotic control, JoyAI-RA unlocks enhanced cross-embodiment behavior learning through multi-source pretraining.
Current multimodal LLMs struggle with guideline-constrained clinical reasoning, but a simple multi-agent framework can significantly boost their performance on real-world lung cancer diagnosis and treatment.