Meituan LongCat Interaction Team
Stop drowning your MLLMs in irrelevant context: FES-RAG shows that carefully selecting multimodal fragments boosts factual accuracy by up to 27% and slashes context length.
Multimodal agents can now reason, plan, and execute actions more effectively by integrating perception as a core component, not just an auxiliary interface.
LLMs can model user preferences more effectively by disentangling intent into multiple latent factors, leading to improved recommendation accuracy and interpretability.
Fragmented medical data hurts MLLM performance: this paper shows how a hierarchical medical knowledge graph can be used to engineer training data that substantially improves MLLM accuracy on complex clinical tasks.
Semantic grounding, not token probability, is the key to better multimodal RAG.