Shanghai Jiao Tong University
EvoMaster sets a new state of the art in autonomous scientific discovery, outperforming traditional frameworks by up to 316%.
Training a multimodal agent from scratch beats retrofitting existing LMMs with search tools, especially when long interaction histories are compressed into visual summaries.
Audio language models (ALMs) can now pinpoint sounds in time far more accurately, thanks to a new training method that stops them from hallucinating timestamps.
Current Composed Image Retrieval benchmarks are misleading, as a new evaluation reveals that models struggle with query ambiguity and interactive scenarios.
Injecting rare disease knowledge into data synthesis and using self-supervised RL on pseudo-labels improves medical reasoning in LLMs, outperforming existing methods by up to 5.93% on rare disease tasks.
MLLMs can achieve near-identical performance on long-form visual tasks with just 2.5% of the original visual tokens by mimicking human visual attention.
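The summary does not specify how tokens are selected, but the core idea of attention-guided token pruning can be sketched as follows. This is a minimal illustration, not the paper's method: the function name, the use of mean cross-attention as an importance score, and the 2.5% keep ratio applied per-sequence are all assumptions for the example.

```python
import numpy as np

def prune_visual_tokens(tokens, attn_scores, keep_ratio=0.025):
    """Keep only the highest-scoring fraction of visual tokens.

    tokens:      (N, D) array of visual token embeddings
    attn_scores: (N,) importance score per token -- e.g. text-to-image
                 cross-attention averaged over heads (an assumption here)
    keep_ratio:  fraction of tokens to retain (2.5% per the summary)
    """
    n_keep = max(1, int(round(len(tokens) * keep_ratio)))
    # indices of the top-scoring tokens
    top = np.argsort(attn_scores)[-n_keep:]
    # restore original spatial/temporal order before feeding downstream
    top.sort()
    return tokens[top], top

# toy example: 1000 visual tokens reduced to 25 (2.5%)
rng = np.random.default_rng(0)
tokens = rng.normal(size=(1000, 64))
scores = rng.random(1000)
kept, idx = prune_visual_tokens(tokens, scores)
```

Re-sorting the surviving indices preserves the tokens' original order, which matters because positional structure carries spatial and temporal meaning in long-form visual inputs.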
Soccer tactics, previously viewed as too stochastic for accurate modeling, can now be realistically simulated with a diffusion model that captures nuanced team styles and predicts future outcomes.
Omni-LLMs struggle to identify the same objects across different modalities, but a new dataset and training strategies can significantly improve their cross-modal reasoning.