The paper introduces FAST-EQA, a novel framework for Embodied Question Answering (EQA) that enhances efficiency through question-conditioned visual target identification, global region scoring for navigation, and Chain-of-Thought reasoning over a fixed-capacity visual memory. By prioritizing relevant regions and maintaining a bounded memory of region-target hypotheses, FAST-EQA achieves improved scene coverage and answer reliability. Experiments on HMEQA and EXPRESS-Bench demonstrate state-of-the-art performance with faster inference times compared to existing methods, while also showing competitive results on OpenEQA and MT-HM3D.
Forget slow, memory-intensive embodied agents: FAST-EQA slashes inference time while boosting accuracy by focusing on question-relevant regions and maintaining a compact, actionable memory.
Embodied Question Answering (EQA) combines visual scene understanding, goal-directed exploration, and spatial and temporal reasoning under partial observability. A central challenge is to confine physical search to question-relevant subspaces while maintaining a compact, actionable memory of observations. Furthermore, fast inference during exploration is crucial for real-world deployment. We introduce FAST-EQA, a question-conditioned framework that (i) identifies likely visual targets, (ii) scores global regions of interest to guide navigation, and (iii) employs Chain-of-Thought (CoT) reasoning over visual memory to answer confidently. FAST-EQA maintains a bounded scene memory that stores a fixed-capacity set of region-target hypotheses and updates them online, enabling robust handling of both single- and multi-target questions without unbounded growth. To expand coverage efficiently, a global exploration policy treats narrow openings and doors as high-value frontiers, complementing local target seeking with minimal computation. Together, these components focus the agent's attention, improve scene coverage, and increase answer reliability while running substantially faster than prior approaches. On HMEQA and EXPRESS-Bench, FAST-EQA achieves state-of-the-art performance, while performing competitively on OpenEQA and MT-HM3D.
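To make the idea of a bounded scene memory concrete, here is a minimal Python sketch of a fixed-capacity store of region-target hypotheses that is updated online. It is an illustration under stated assumptions, not the FAST-EQA implementation: the abstract does not specify how hypotheses are scored or evicted, so the names `Hypothesis` and `BoundedMemory`, the scalar `score` field, and the evict-the-lowest-score rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Hypothesis:
    region: str   # hypothetical region label, e.g. "kitchen"
    target: str   # hypothetical target label, e.g. "kettle"
    score: float  # assumed question-relevance score; higher is better


class BoundedMemory:
    """Hold at most `capacity` region-target hypotheses (assumed policy:
    evict the lowest-scored entry when the capacity is exceeded)."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items: Dict[Tuple[str, str], Hypothesis] = {}

    def update(self, hyp: Hypothesis) -> None:
        key = (hyp.region, hyp.target)
        current = self._items.get(key)
        # Re-observing a pair keeps the higher score rather than adding a duplicate.
        if current is None or hyp.score > current.score:
            self._items[key] = hyp
        # Memory never grows beyond the fixed capacity.
        if len(self._items) > self.capacity:
            weakest = min(self._items, key=lambda k: self._items[k].score)
            del self._items[weakest]

    def top(self) -> List[Hypothesis]:
        # Hypotheses ordered from most to least relevant.
        return sorted(self._items.values(), key=lambda h: h.score, reverse=True)


memory = BoundedMemory(capacity=2)
memory.update(Hypothesis("kitchen", "kettle", 0.4))
memory.update(Hypothesis("hallway", "door", 0.2))
memory.update(Hypothesis("living room", "sofa", 0.9))  # evicts the hallway entry
print([(h.region, h.target) for h in memory.top()])
```

With a capacity of two, the third update pushes out the weakest hypothesis, so the memory stays the same size however long the agent explores. A multi-target question would simply keep several region-target pairs alive at once within the same budget.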