Apr 23, 2026 · arXiv:2604.21903

A Scale-Adaptive Framework for Joint Spatiotemporal Super-Resolution with Diffusion Models

Max Defez, F. Quarenghi, M. Vrac, Stephan Mandt, Tom Beucler

AI Summary

This paper introduces a scale-adaptive framework for joint spatiotemporal super-resolution using diffusion models. It addresses a limitation of existing methods, which are typically designed for a single pair of super-resolution factors. The framework decomposes the super-resolution task into a deterministic prediction of the conditional mean and a residual conditional diffusion model, with an optional mass-conservation transform. By retuning factor-dependent hyperparameters (diffusion noise schedule, temporal context length, and mass-conservation function) before retraining, the same architecture covers a wide range of spatial and temporal upscaling factors, as demonstrated on reanalysis precipitation data.
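One simple way to picture a mass-conservation transform is a multiplicative rescaling that forces each block of high-resolution pixels to average back to its low-resolution input value. The sketch below is illustrative only: the names `block_mean` and `conserve_mass` are hypothetical, the field is assumed non-negative (as precipitation is), and this plain rescaling is an assumption rather than the paper's function f, which the abstract describes as tapered for large factors.

```python
import numpy as np

def block_mean(field, s):
    """Average a (H*s, W*s) field over non-overlapping s-by-s blocks."""
    H, W = field.shape[0] // s, field.shape[1] // s
    return field.reshape(H, s, W, s).mean(axis=(1, 3))

def conserve_mass(y_hr, x_lr, s, eps=1e-12):
    """Rescale a non-negative high-res field so its block means equal x_lr.

    Hypothetical helper: plain multiplicative correction, not the paper's f.
    """
    coarse = block_mean(y_hr, s)
    has_mass = coarse > eps
    # Avoid dividing by zero in blocks where the prediction is empty.
    safe = np.where(has_mass, coarse, 1.0)
    ratio = np.where(has_mass, x_lr / safe, 0.0)
    ones = np.ones((s, s))
    rescaled = y_hr * np.kron(ratio, ones)
    # Empty blocks: spread the low-res value uniformly over the block.
    fill = np.kron(np.where(has_mass, 0.0, x_lr), ones)
    return rescaled + fill

rng = np.random.default_rng(0)
y_hr = rng.random((8, 8))   # stand-in for mean prediction plus residual sample
y_hr[:4, :4] = 0.0          # one block with no predicted precipitation
x_lr = rng.random((2, 2))   # low-resolution input (block averages)
y_out = conserve_mass(y_hr, x_lr, s=4)
```

A pure rescaling like this multiplies every pixel in a block by the same ratio, so it can amplify local extremes when the factor is large, which is the concern the tapered function is meant to address.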

Key Contribution

A reusable architecture for climate-data super-resolution: with factor-dependent hyperparameters retuned before retraining, the same diffusion-based architecture covers spatial super-resolution factors from 1 to 25 and temporal factors from 1 to 6.

Abstract

Deep-learning video super-resolution has progressed rapidly, but climate applications typically super-resolve (increase the resolution of) either space or time, and joint spatiotemporal models are often designed for a single pair of super-resolution (SR) factors (the spatial and temporal upscaling ratios between the low-resolution and high-resolution sequences), limiting transfer across spatial resolutions and temporal cadences (frame rates). We present a scale-adaptive framework that reuses the same architecture across factors by decomposing spatiotemporal SR into a deterministic prediction of the conditional mean, with attention, and a residual conditional diffusion model, with an optional mass-conservation transform (same precipitation amount in inputs and outputs) to preserve aggregated totals. Assuming that larger SR factors primarily increase underdetermination (and hence the required context and residual uncertainty) rather than changing the conditional-mean structure, scale adaptivity is achieved by retuning three factor-dependent hyperparameters before retraining: the diffusion noise schedule amplitude beta (larger for larger factors, to increase diversity), the temporal context length L (set to maintain comparable attention horizons across cadences), and, optionally, the mass-conservation function f (tapered to limit the amplification of extremes for large factors). Demonstrated on reanalysis precipitation over France (Comephore), the same architecture spans super-resolution factors from 1 to 25 in space and 1 to 6 in time, yielding a reusable architecture and tuning recipe for joint spatiotemporal super-resolution across scales.
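To make the two always-retuned hyperparameters concrete, here is a minimal sketch under stated assumptions. It assumes a DDPM-style linear noise schedule whose upper end plays the role of the amplitude beta, and it assumes the context length L is chosen so that L input frames cover a fixed physical time horizon. The function names, the default step count, and the 24-hour horizon are hypothetical and are not taken from the paper.

```python
import math
import numpy as np

def linear_beta_schedule(beta_max, n_steps=1000, beta_min=1e-4):
    """DDPM-style linear schedule; beta_max stands in for the amplitude."""
    return np.linspace(beta_min, beta_max, n_steps)

def terminal_signal_fraction(betas):
    """Product of (1 - beta_t): share of signal variance left at the last step."""
    return float(np.prod(1.0 - betas))

def context_length(horizon_hours, input_step_hours):
    """Number of input frames needed to cover a fixed time horizon."""
    return max(1, math.ceil(horizon_hours / input_step_hours))

# A larger amplitude leaves less of the original signal at the final step,
# so the sampler has more freedom to generate diverse residuals.
small = terminal_signal_fraction(linear_beta_schedule(0.02))
large = terminal_signal_fraction(linear_beta_schedule(0.04))

# Same 24-hour horizon (hypothetical) at two input cadences.
L_hourly = context_length(24, 1)
L_six_hourly = context_length(24, 6)
```

Under these assumptions, a coarser input cadence needs fewer frames to span the same horizon, which is one reading of keeping attention horizons comparable across cadences.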

Computer Vision · Scientific Discovery & Drug Design

Citation Metrics

Citations: 0
Influential citations: 0
References: 20
Year: 2026
Venue: N/A
