The paper introduces Reverso, a family of efficient time series foundation models for zero-shot forecasting, and demonstrates that small hybrid models can match the performance of much larger transformer-based models. The authors achieve this by interleaving long convolution layers with linear RNN layers (specifically DeltaNet layers), and by adding data augmentation and inference strategies that further improve performance. The resulting models are more than a hundred times smaller than comparable transformer-based models, which significantly improves the performance-efficiency trade-off in zero-shot time series forecasting.
Forget massive transformers: tiny hybrid models can match much larger transformer-based models at zero-shot time series forecasting with over 100x fewer parameters.
Learning time series foundation models has been shown to be a promising approach for zero-shot time series forecasting across diverse time series domains. Insofar as scaling has been a critical driver of performance of foundation models in other modalities such as language and vision, much recent work on time series foundation modeling has focused on scaling. This has resulted in time series foundation models with hundreds of millions of parameters that are, while performant, inefficient and expensive to use in practice. This paper describes a simple recipe for learning efficient foundation models for zero-shot time series forecasting that are orders of magnitude smaller. We show that large-scale transformers are not necessary: small hybrid models that interleave long convolution and linear RNN layers (in particular DeltaNet layers) can match the performance of larger transformer-based models while being more than a hundred times smaller. We also describe several data augmentation and inference strategies that further improve performance. This recipe results in Reverso, a family of efficient time series foundation models for zero-shot forecasting that significantly push the performance-efficiency Pareto frontier.
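To make the two layer types named in the abstract concrete, here is a minimal, dependency-free sketch of a causal long convolution and a delta-rule linear recurrence of the kind used in DeltaNet-style layers. It is an illustration under stated assumptions, not the Reverso implementation: the function names, the toy kernel, the fixed unit key, and the absence of learned projections, normalization, gating, and residual connections are all simplifications introduced here.

```python
# Toy sketch of the two layer types named in the abstract.
# All names, shapes, and weights are hypothetical; this is not the Reverso code.

def causal_long_conv(x, kernel):
    """Causal convolution: y[t] = sum over s of kernel[s] * x[t - s], for s <= t.

    The kernel may be as long as the sequence itself, which is what makes it "long".
    """
    return [
        sum(kernel[s] * x[t - s] for s in range(min(t + 1, len(kernel))))
        for t in range(len(x))
    ]


def delta_rule_scan(queries, keys, values, betas):
    """Linear RNN with a delta-rule state update.

    The state S is a (d_v x d_k) matrix. At each step:
        S <- S - beta * (S k - v) k^T   (correct the value stored under key k)
        o  = S q                        (read out with query q)
    Keys are assumed to be unit-norm.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    S = [[0.0] * d_k for _ in range(d_v)]
    outputs = []
    for q, k, v, beta in zip(queries, keys, values, betas):
        # What the memory currently predicts for key k.
        pred = [sum(S[i][j] * k[j] for j in range(d_k)) for i in range(d_v)]
        for i in range(d_v):
            err = pred[i] - v[i]
            for j in range(d_k):
                S[i][j] -= beta * err * k[j]
        outputs.append(
            [sum(S[i][j] * q[j] for j in range(d_k)) for i in range(d_v)]
        )
    return outputs


# Chain the two primitives on a toy series, conv first, then the recurrence.
series = [1.0, 2.0, 3.0]
mixed = causal_long_conv(series, kernel=[1.0, 0.5, 0.25])
unit_keys = [[1.0, 0.0]] * len(mixed)  # hypothetical fixed key, reused as the query
out = delta_rule_scan(unit_keys, unit_keys, [[m] for m in mixed], betas=[1.0] * len(mixed))
```

With a fixed unit key and `beta = 1`, the delta rule fully overwrites the stored value at each step, so the readout simply tracks the convolved series. A smaller `beta` would move the stored value only part of the way toward the new one. A hybrid model in the abstract's sense would alternate several layers of these two kinds, with learned parameters in place of the fixed ones used here.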