Mar 19, 2026arXiv:2603.18624

REST: Receding Horizon Explorative Steiner Tree for Zero-Shot Object-Goal Navigation

Shuqi Xiao, Maani Ghaffari, Chengzhong Xu, Hui Kong

AI Summary

This paper introduces REST, a training-free framework for zero-shot object-goal navigation that constructs a tree of paths as the option space for an agent. REST builds an explicit 3D map, grows an agent-centric tree of paths via sampling-based planning, and uses chain-of-thought LLM reasoning to select the next-best path based on spatial narratives. Experiments on Gibson, HM3D, and HSSD show REST achieves competitive success rates with superior path efficiency compared to existing methods.

Key Contribution

LLMs can navigate more efficiently in unfamiliar environments by reasoning over a tree of possible paths, not just isolated waypoints, enabling them to consider en-route information gain and prune unpromising branches.

Abstract

Zero-shot object-goal navigation (ZSON) requires navigating unknown environments to find a target object without task-specific training. Prior hierarchical training-free solutions invest in scene understanding (\textit{belief}) and high-level decision-making (\textit{policy}), yet overlook the design of \textit{option}, i.e., a subgoal candidate proposed from evolving belief and presented to policy for selection. In practice, options are reduced to isolated waypoints scored independently: single destinations hide the value gathered along the journey; an unstructured collection obscures the relationships among candidates. Our insight is that the option space should be a \textit{tree of paths}. Full paths expose en-route information gain that destination-only scoring systematically neglects; a tree of shared segments enables coarse-to-fine LLM reasoning that dismisses or pursues entire branches before examining individual leaves, compressing the combinatorial path space into an efficient hierarchy. We instantiate this insight in \textbf{REST} (Receding Horizon Explorative Steiner Tree), a training-free framework that (1) builds an explicit open-vocabulary 3D map from online RGB-D streams; (2) grows an agent-centric tree of safe and informative paths as the option space via sampling-based planning; and (3) textualizes each branch into a spatial narrative and selects the next-best path through chain-of-thought LLM reasoning. Across the Gibson, HM3D, and HSSD benchmarks, REST consistently ranks among the top methods in success rate while achieving the best or second-best path efficiency, demonstrating a favorable efficiency-success balance.

Computer Vision Robotics & Embodied AI World Models & Planning

Citation Metrics

Citations0

Influential citations0

References41

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

REST: Receding Horizon Explorative Steiner Tree for Zero-Shot Object-Goal Navigation

Related Papers