The paper introduces Semap, a new open benchmark dataset of 1,439 manually annotated historical map patches designed to reflect the diversity of historical map documents. It also presents a semantic segmentation framework combining procedural data synthesis with multiscale integration to improve robustness and transferability across heterogeneous map collections. The framework achieves state-of-the-art performance on both the HCMSSD and Semap datasets, demonstrating the viability and benefits of a diversity-driven approach to map recognition.
Forget specialized models: a single segmentation framework, trained on diverse historical maps, now achieves state-of-the-art performance across collections, scales, and regions.
Historical map collections are highly diverse in style, scale, and geographic focus, often consisting of many single-sheet documents. Yet most work in map recognition focuses on specialist models tailored to homogeneous map series. In contrast, this article aims to develop generalizable semantic segmentation models and a generalizable ontology. First, we introduce Semap, a new open benchmark dataset comprising 1,439 manually annotated patches designed to reflect the variety of historical map documents. Second, we present a segmentation framework that combines procedural data synthesis with multiscale integration to improve robustness and transferability. This framework achieves state-of-the-art performance on both the HCMSSD and Semap datasets, showing that a diversity-driven approach to map recognition is not only viable but also beneficial. The results indicate that segmentation performance remains largely stable across map collections, scales, geographic regions, and publication contexts. By proposing benchmark datasets and methods for the generic segmentation of historical maps, this work opens the way to integrating the long tail of cartographic archives into historical geographic studies.