This paper introduces Semantic-Augmented DRL (SA-DRL), a framework that uses LLMs to improve UAV deployment in VANETs by giving the agent a semantic understanding of road topology. The authors transform a general-purpose LLM into a domain-specific topology expert through a four-stage pipeline and inject its reasoning into a PPO agent via a Logit Fusion mechanism. In simulation, the resulting SA-PPO algorithm improves two connectivity metrics by 13.2% and 23.5% over competing methods, reduces energy consumption to 28.2% of the baseline, and reaches baseline performance levels using only 26.6% of the training episodes.
Forget blind exploration: injecting LLM-derived semantic understanding into DRL dramatically boosts UAV-aided network connectivity and slashes energy consumption.
Vehicular Ad-hoc Networks (VANETs) are the digital cornerstone of autonomous driving, yet they suffer from severe network fragmentation in urban environments due to physical obstructions. Unmanned Aerial Vehicles (UAVs), with their high mobility, have emerged as a vital solution to bridge these connectivity gaps. However, traditional Deep Reinforcement Learning (DRL)-based UAV deployment strategies lack semantic understanding of road topology, often resulting in blind exploration and sample inefficiency. By contrast, Large Language Models (LLMs) possess strong reasoning capabilities that can identify topologically important locations, though applying them to control tasks remains challenging. To address this, we propose the Semantic-Augmented DRL (SA-DRL) framework. First, we propose a fragmentation quantification method based on Road Topology Graphs (RTG) and Dual Connected Graphs (DCG). Second, we design a four-stage pipeline to transform a general-purpose LLM into a domain-specific topology expert. Finally, we propose the Semantic-Augmented PPO (SA-PPO) algorithm, which employs a Logit Fusion mechanism to inject the LLM's semantic reasoning directly into the policy as a prior, guiding the agent toward critical intersections. Extensive high-fidelity simulations demonstrate that SA-PPO achieves state-of-the-art performance while training efficiently: it reaches baseline performance levels using only 26.6% of the training episodes. SA-PPO also improves two key connectivity metrics by 13.2% and 23.5% over competing methods, while reducing energy consumption to just 28.2% of the baseline.
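The abstract does not state the Logit Fusion formula, so the following is only a minimal sketch of one plausible reading: the LLM's per-action scores are added to the policy's logits before the softmax, so that they act as a prior over candidate intersections. The additive form, the weight `beta`, the function names, and the example numbers are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_logits(policy_logits, prior_logits, beta=1.0):
    """Assumed additive fusion: policy logits plus a weighted semantic prior.

    beta is a hypothetical weight; beta = 0 recovers the unmodified policy.
    """
    if len(policy_logits) != len(prior_logits):
        raise ValueError("policy and prior must cover the same actions")
    return [p + beta * q for p, q in zip(policy_logits, prior_logits)]


# Hypothetical example with three candidate intersections.
policy_logits = [0.0, 0.0, 0.0]   # untrained policy: uniform over actions
prior_logits = [2.0, 0.0, -1.0]   # made-up LLM importance scores

probs = softmax(fuse_logits(policy_logits, prior_logits, beta=1.0))
unfused = softmax(fuse_logits(policy_logits, prior_logits, beta=0.0))
```

Under this assumption, adding in logit space is equivalent to multiplying the policy's action probabilities by a prior proportional to `exp(beta * prior_logits)` and renormalizing. An untrained agent would therefore start out favoring the intersections the LLM ranks highest, while the learned logits can still override the prior as training progresses.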