The paper introduces HTP, a hierarchical trajectory generation method that uses LLMs to synthesize realistic urban mobility data as an alternative to privacy-restricted real datasets. HTP uses a trajectory-specific residual quantization VAE (RQ-VAE) to encode GPS trajectories into travel pattern tokens, adds these tokens to the LLM vocabulary, and fine-tunes the LLM for conditional trajectory generation. Experiments on two real-world datasets show that HTP outperforms the strongest baseline by an average of 29.78% in generation quality.
LLMs can synthesize realistic urban mobility data that mitigates privacy risks by generating travel patterns first and GPS points second, improving generation quality by nearly 30% over the strongest baseline.
Urban trajectories play a crucial role in modeling urban dynamics and supporting various smart city applications. However, privacy concerns restrict access to large-scale and high-quality trajectory datasets. Trajectory generation provides a promising alternative by synthesizing realistic data to mitigate privacy risks. However, existing methods fail to explicitly capture travel patterns and can only generate fixed-length trajectories under a single condition. To address these limitations, we propose HTP, which Hierarchically generates Travel patterns first and then generates GPS Points by using large language models (LLMs), rather than directly generating GPS points. We first design a trajectory-specific residual quantization variational autoencoder (RQ-VAE) that quantizes micro-level GPS trajectories into compact, macro-level travel pattern tokens in a coarse-to-fine manner. These tokens capture rich segment spatial irregularities, such as point density variations caused by traffic conditions. Then, we extend the LLM vocabulary with travel pattern tokens to align trajectory representations with the LLM input, and apply supervised fine-tuning (SFT) to align the LLM with the trajectory generation task, enabling generation of travel pattern sequences under various conditions. Extensive experiments on two real-world datasets show that HTP outperforms the strongest baseline by an average of 29.78% in terms of generation quality. Our code is available at https://github.com/slzhou-xy/HTP.
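To illustrate the coarse-to-fine quantization step, here is a minimal sketch of generic residual quantization in plain Python. It is not the authors' implementation: the codebooks, the 2-D latent vector, and the `<tp_level_index>` token naming are all hypothetical, and the encoder, decoder, and training losses of an actual RQ-VAE are omitted. Each level picks the nearest code for whatever the previous levels left unexplained, so early indices carry coarse structure and later ones refine it.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    best_index, best_dist = 0, float("inf")
    for index, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def residual_quantize(vector, codebooks):
    """Quantize `vector` level by level; each level encodes the remaining residual."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        index = nearest_code(residual, codebook)
        indices.append(index)
        # Subtract the chosen code so the next level only sees what is left.
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return indices, residual


def to_pattern_tokens(indices):
    """Hypothetical naming scheme: one new vocabulary entry per (level, index)."""
    return [f"<tp_{level}_{index}>" for level, index in enumerate(indices)]


# Made-up codebooks: level 0 is coarse, level 1 holds small corrections.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]

# Stand-in for an encoder output describing one trajectory segment.
latent = [1.2, 0.8]

indices, residual = residual_quantize(latent, codebooks)
tokens = to_pattern_tokens(indices)
print(indices, tokens)
```

In a setup like the one the abstract describes, strings such as these would be added to the LLM vocabulary as new tokens before supervised fine-tuning; how HTP actually names and orders its tokens is not stated here.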