Struct-Searcher reports a 17.2% accuracy gain in multimodal information seeking by managing conflicting evidence through a dynamic structural graph.
Forget scalar rewards: GenEvolve distills structured visual experiences from successful and failed generation trajectories, enabling token-level supervision for self-improving image generation agents.
Current multimodal agents still struggle to combine ambiguous visual cues with open-web verification, highlighting a critical gap in their ability to perform complex geolocation tasks.
Even the best multimodal agents struggle with realistic visual scenarios, achieving only 27% accuracy on the new AgentVista benchmark that demands long-horizon tool use across web search, image search, and code.