This paper introduces MapGCLR, a self-supervised contrastive learning method that improves BEV feature representations for online HD map construction. It enforces geospatial consistency between overlapping BEV feature grids via a contrastive loss, generating training pairs by analyzing traversal overlap within a dataset. Trained in a semi-supervised fashion on limited labeled data together with a broader unlabeled set, MapGCLR outperforms a supervised baseline on vectorized map perception.
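The geospatial-consistency objective can be illustrated with a minimal sketch: an InfoNCE-style contrastive loss in which features of the same map cell seen from two overlapping traversals are positives and all other cells act as negatives. The function name, the temperature value, and the NumPy formulation below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def info_nce_bev(feats_a, feats_b, temperature=0.1):
    """InfoNCE-style loss between two aligned sets of BEV cell features.
    feats_a, feats_b: (N, D) features for the N geospatially overlapping
    cells of two traversals; row i of each array describes the same cell,
    so positives sit on the diagonal of the similarity matrix."""
    # L2-normalise so dot products are cosine similarities
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature               # (N, N) similarity matrix
    # cross-entropy with the matching cell as the correct "class"
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# toy check: two noisy views of the same 4 overlapping cells should score
# a lower loss than a pairing with geospatially unrelated features
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 16))
loss_pos = info_nce_bev(base, base + 0.01 * rng.normal(size=(4, 16)))
loss_rand = info_nce_bev(base, rng.normal(size=(4, 16)))
```

Minimizing such a loss pulls the representations of the same map region from different traversals together while pushing apart unrelated cells, which is the consistency property the summary describes.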
Self-supervised learning can dramatically improve online HD map construction, outperforming supervised methods even with limited labeled data by leveraging geospatial consistency in BEV feature representations.
Autonomous vehicles rely on map information to understand the world around them. However, the creation and maintenance of offline high-definition (HD) maps remain costly. A more scalable alternative is online HD map construction, which requires map annotations only at training time. To further reduce the need for vast amounts of training labels, self-supervised training offers an alternative. This work improves the latent bird's-eye-view (BEV) feature grid representation within a vectorized online HD map construction model by enforcing geospatial consistency between overlapping BEV feature grids through a contrastive loss function. To ensure geospatial overlap for contrastive pairs, we introduce an approach that analyzes the overlap between traversals within a given dataset and generates subsidiary dataset splits according to adjustable multi-traversal requirements. We train the same model supervised on a reduced set of single-traversal labeled data and self-supervised on a broader unlabeled set selected by our multi-traversal requirements, effectively implementing a semi-supervised approach. Our approach outperforms the supervised baseline across the board, both quantitatively on the downstream task of vectorized map perception and qualitatively in the segmentation apparent in principal component analysis (PCA) visualizations of the BEV feature space.
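The multi-traversal split generation described above can be sketched roughly as follows. The grid-bucketing approximation, the `cell_size` and `min_traversals` parameters, and the function name are illustrative assumptions, not the paper's actual overlap-analysis procedure.

```python
from collections import defaultdict

def multi_traversal_split(samples, cell_size=30.0, min_traversals=2):
    """Keep only samples whose location is covered by at least
    `min_traversals` distinct traversals.  `samples` are
    (traversal_id, x, y) tuples; traversal overlap is approximated
    by bucketing ego positions into square grid cells (a hypothetical
    stand-in for the paper's overlap analysis)."""
    def cell(x, y):
        return (int(x // cell_size), int(y // cell_size))
    coverage = defaultdict(set)              # grid cell -> traversal ids
    for tid, x, y in samples:
        coverage[cell(x, y)].add(tid)
    return [s for s in samples
            if len(coverage[cell(s[1], s[2])]) >= min_traversals]

# two traversals pass near the origin; a third drives elsewhere,
# so only the overlapping pair qualifies for contrastive training
samples = [("a", 0.0, 0.0), ("b", 5.0, 5.0), ("c", 100.0, 100.0)]
kept = multi_traversal_split(samples)
```

An adjustable threshold like `min_traversals` lets the requirement be tightened (more traversals per region, fewer but richer contrastive pairs) or relaxed, mirroring the "adjustable multi-traversal requirements" in the abstract.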