This paper investigates the utility of large language models (LLMs) for time series forecasting (TSF) across a large-scale dataset of 8 billion observations, challenging previous studies that found limited benefits. The authors demonstrate that LLMs significantly improve forecasting performance, particularly in cross-domain generalization scenarios, and that pre-alignment strategies outperform post-alignment. They further show that both the pre-trained knowledge and the model architecture of LLMs contribute to performance, with pre-training being crucial for distribution shifts and architecture excelling at modeling complex temporal dynamics.
LLMs actually *do* improve time series forecasting, especially for cross-domain generalization, overturning prior doubts with a massive 8-billion-observation study.
Large language models (LLMs) have been introduced to time series forecasting (TSF) to incorporate contextual knowledge beyond numerical signals. However, existing studies question whether LLMs provide genuine benefits, often reporting comparable performance without LLMs. We show that such conclusions stem from limited evaluation settings and do not hold at scale. We conduct a large-scale study of LLM-based TSF (LLM4TSF) across 8 billion observations, 17 forecasting scenarios, 4 horizons, multiple alignment strategies, and both in-domain and out-of-domain settings. Our results demonstrate that *LLM4TSF indeed improves forecasting performance*, with especially large gains in cross-domain generalization. Pre-alignment outperforms post-alignment in over 90% of tasks. Both the pretrained knowledge and the model architecture of LLMs contribute and play complementary roles: pretraining is critical under distribution shifts, while architecture excels at modeling complex temporal dynamics. Moreover, under large-scale mixed distributions, a fully intact LLM becomes indispensable, as confirmed by token-level routing analysis and prompt-based improvements. Overall, our findings overturn prior negative assessments, establish clear conditions under which LLMs are useful, and provide practical guidance for effective model design. We release our code at https://github.com/EIT-NLP/LLM4TSF.