This paper introduces a human-robot joint planning system that addresses uncertainty in both task knowledge and human intent. The system uses an LLM-assisted active elicitation mechanism with hypothesis-augmented A* search to minimize interaction costs in resolving semantic ambiguity and object uncertainty. Additionally, it employs a real-time intent-aware collaboration module that infers human intent from spatial and directional cues to enable dynamic task selection. Experiments in simulation and real-world UAV deployments demonstrate a 51.9% reduction in interaction cost and a 25.4% reduction in task execution time compared to baselines.
Human-robot teams can cut interaction cost by about 52% and task execution time by about 25% relative to baselines when robots actively resolve task uncertainty and infer human intent using LLMs and spatial reasoning.
Effective human-robot collaboration in open-world environments requires joint planning under uncertain conditions. However, existing approaches often treat humans as passive supervisors, preventing autonomous agents from becoming human-like teammates that can actively model teammate behaviors, reason about knowledge gaps, query, and elicit responses through communication to resolve uncertainties. To address these limitations, we propose a unified human-robot joint planning system designed to tackle dual sources of uncertainty: task-relevant knowledge gaps and latent human intent. Our system operates in two complementary modes. First, an uncertainty-mitigation joint planning module enables two-way conversations to resolve semantic ambiguity and object uncertainty. It utilizes an LLM-assisted active elicitation mechanism and a hypothesis-augmented A* search, subsequently computing an optimal querying policy via dynamic programming to minimize interaction and verification costs. Second, a real-time intent-aware collaboration module maintains a probabilistic belief over the human's latent task intent via spatial and directional cues, enabling dynamic, coordination-aware task selection for agents without explicit communication. We validate the proposed system in both Gazebo simulations and real-world UAV deployments integrated with a Vision-Language Model (VLM)-based 3D semantic perception pipeline. Experimental results demonstrate that the system significantly cuts the interaction cost by 51.9% in uncertainty-mitigation planning and reduces the task execution time by 25.4% in intent-aware cooperation compared to the baselines.
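To make the intent-aware mode more concrete, the following is a minimal sketch of how a probabilistic belief over the human's task intent could be updated from spatial and directional cues, and how a robot might then choose a non-conflicting task. It is not the paper's implementation: the abstract does not specify the likelihood model, so the von Mises-style heading term, the exponential distance decay, the selection rule, and all names and parameters (`update_intent_belief`, `select_robot_task`, `kappa`, `dist_scale`) are assumptions made for illustration.

```python
import math


def update_intent_belief(belief, human_pos, human_heading, task_positions,
                         kappa=2.0, dist_scale=5.0):
    """One Bayesian update of the belief over which task the human intends.

    Assumed (hypothetical) observation model:
      - directional cue: exp(kappa * cos(angle between heading and task bearing))
      - spatial cue:     exp(-distance / dist_scale)
    """
    unnormalized = {}
    for task, prior in belief.items():
        tx, ty = task_positions[task]
        dx, dy = tx - human_pos[0], ty - human_pos[1]
        distance = math.hypot(dx, dy)
        bearing = math.atan2(dy, dx)

        # Tasks the human is facing are more likely to be the intended one.
        direction_lik = math.exp(kappa * math.cos(bearing - human_heading))
        # Closer tasks are more likely to be the intended one.
        distance_lik = math.exp(-distance / dist_scale)

        unnormalized[task] = prior * direction_lik * distance_lik

    total = sum(unnormalized.values())
    return {task: weight / total for task, weight in unnormalized.items()}


def select_robot_task(belief, task_positions, robot_pos):
    """Assumed selection rule: skip the human's most likely task, then
    take the nearest remaining task."""
    human_task = max(belief, key=belief.get)
    candidates = [t for t in belief if t != human_task]
    return min(candidates,
               key=lambda t: math.dist(robot_pos, task_positions[t]))


# Illustrative scenario (made-up coordinates, not data from the paper).
tasks = {"A": (4.0, 0.0), "B": (0.0, 4.0), "C": (-4.0, 0.0)}
prior = {task: 1.0 / len(tasks) for task in tasks}

# The human stands at the origin and faces along +x, i.e. toward task A.
posterior = update_intent_belief(prior, (0.0, 0.0), 0.0, tasks)
robot_task = select_robot_task(posterior, tasks, robot_pos=(0.0, 5.0))
```

In this scenario all three tasks are equally distant from the human, so only the heading cue separates them and the belief concentrates on task A; the robot, sitting near task B, then takes B. A real system would repeat the update at every perception step and would likely use a richer coordination cost than the nearest-remaining-task rule assumed here.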