Search papers, labs, and topics across Lattice.
Rice University
3
0
6
0
A hierarchical agent that separates visual and textual contexts drastically improves multi-step reasoning on complex charts, outperforming monolithic MLLMs.
LLMs can autonomously discover novel neural architectures that achieve state-of-the-art performance in specialized domains, suggesting a path towards automated scientific discovery.
Current visual grounding models struggle to infer objects from contextual roles and intentions, highlighting a critical gap in their ability to perform true scene understanding.