GPT-5.1 can barely crack 50% accuracy when distinguishing real from AI-generated academic images, highlighting a stark gap between generative capabilities and forensic detection.
Continuous-time trajectory estimation with spline parameterization unlocks robust visual-inertial odometry, even with sparse and noisy ranging data.
Training a multimodal agent from scratch beats retrofitting existing LMMs with search tools, especially when you compress long interaction histories into visual summaries.
Just one carefully crafted poisoned document can cripple an LLM's reasoning abilities in retrieval-augmented generation.
AudioLLMs don't have to choose between reasoning and perception: a unified schema can boost both.
MLLMs can achieve near-identical performance on long-form visual tasks with just 2.5% of the original visual tokens by mimicking human visual attention.