Even the best search-augmented agents, like Gemini Deep Research, are easily distracted by noisy web content, leading to surprisingly poor performance (40% accuracy) on a new multimodal reasoning benchmark.
Even state-of-the-art multimodal LLMs struggle to accurately cite their sources when reasoning across video, audio, and text, often hallucinating citations despite generating correct answers.