Search papers, labs, and topics across Lattice.
This paper introduces a novel, scalable, and fully nonparametric framework for regionalizing spatial time series data based on the minimum description length (MDL) principle. The method jointly infers a spatial partition and representative time series archetypes ("drivers") to optimally compress the spatiotemporal dataset. Experiments on synthetic and real-world air quality and vegetation index data demonstrate the method's ability to accurately recover planted regional structures and extract meaningful structural regularities.
Discovering spatial regions and their temporal signatures in massive time series data just got much faster and easier, thanks to a new method that scales log-linearly with the number of time series.
Regionalization aims to partition a spatial domain into contiguous regions that share similar characteristics, enabling more effective spatial analysis, policy making, and resource management. Existing approaches for spatial regionalization typically rely on static spatial snapshots rather than evolving time series. Meanwhile, most time series clustering methods ignore spatial structure or enforce spatial continuity through ad hoc regularization, constraining the number of inferred regions a priori either explicitly or implicitly. Utilizing the minimum description length principle from information theory, here we propose an efficient and fully nonparametric framework for the regionalization of spatial time series. Our method jointly infers a spatial partition along with a set of representative time series archetypes ("drivers") that best compress a spatiotemporal dataset, with a runtime log-linear in the number of time series. We demonstrate that this method can accurately recover planted regional structure and drivers in synthetic time series, and can extract meaningful structural regularities in large-scale empirical air quality and vegetation index records. Our method provides a principled and scalable framework for spatially contiguous partitioning, allowing interpretable temporal patterns and homogeneous regions to emerge directly from the data itself.