This paper presents a novel autonomous navigation framework that integrates Large Language Models (LLMs) with multimodal sensor fusion to enable dynamic obstacle avoidance and human-aware path planning in diverse environments. The proposed system leverages an FPGA-accelerated fusion pipeline, combining LiDAR and vision data for real-time perception. A Hungarian algorithm-based object matching technique ensures robust tracking, while a bird's-eye view (BEV) representation enhances spatial reasoning and occlusion handling. The fused sensory inputs are processed by a fine-tuned LLM, which contextualizes pedestrian behavior and environmental constraints to generate adaptive, human-centric navigation strategies. Unlike traditional rule-based methods, the LLM generalizes to novel scenarios, significantly improving interaction with vulnerable pedestrians such as children, elderly individuals, and wheelchair users. Extensive evaluations in both simulated and real-world scenarios confirm the system's ability to reduce collisions and enhance navigation efficiency in high-density environments. By bridging semantic reasoning and robotic control, this work lays the foundation for next-generation intelligent navigation systems that are both safety-aware and scalable across autonomous platforms.
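To make the tracking step concrete: Hungarian algorithm-based object matching, as described in the abstract, amounts to solving a linear assignment problem between existing tracks and new detections. The sketch below is illustrative, not the paper's implementation; the Euclidean-distance cost and the `match_detections` helper are assumptions, and `scipy.optimize.linear_sum_assignment` is used as an off-the-shelf Hungarian-algorithm solver.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_detections(tracks, detections):
    """Associate tracked objects with new detections by minimizing the
    total pairwise distance (the linear assignment / Hungarian problem).

    tracks, detections: (N, 2) and (M, 2) arrays of centroids in metres.
    Returns a list of (track_index, detection_index) pairs.
    """
    # Cost matrix: Euclidean distance between every track/detection pair.
    cost = np.linalg.norm(tracks[:, None, :] - detections[None, :, :], axis=2)
    row_idx, col_idx = linear_sum_assignment(cost)
    return [(int(r), int(c)) for r, c in zip(row_idx, col_idx)]

# Two tracked centroids and two detections; the optimal assignment
# pairs each track with its nearest detection.
tracks = np.array([[0.0, 0.0], [5.0, 5.0]])
detections = np.array([[5.1, 4.9], [0.2, -0.1]])
print(match_detections(tracks, detections))
```

In a full pipeline the cost would typically also incorporate appearance features or IoU between bounding boxes, but the assignment step itself is unchanged.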