This study benchmarks time series foundation models (TSFMs) against classical machine learning and deep learning models for forecasting oil well production, using a real-world dataset from a Middle Eastern reservoir and evaluating performance across multiple forecast horizons, both with and without exogenous variables. The key finding is that TimesFM, a pretrained TSFM, significantly outperforms an XGBoost baseline, achieving up to 23% improvement in RMSE at short horizons and maintaining 10-11% gains at longer horizons, while other TSFMs such as Chronos and GPT4TS excel at long-term forecasting without exogenous variables. The inclusion of exogenous variables consistently improves performance across all models, underscoring their importance for capturing operational dynamics.
Pretrained time series foundation models can boost oil production forecasting accuracy by over 20%, but inconsistent handling of operational data remains a barrier to widespread adoption.
Forecasting oil well production is critical for optimizing reservoir management and planning field operations. Traditional approaches, such as Decline Curve Analysis (DCA) and physics-based simulation, often fall short in capturing nonlinear dynamics and operational variability, particularly in mature or unconventional fields. In this study, we benchmark a wide range of data-driven forecasting models, including classical machine learning, deep learning, and time series foundation models (TSFMs), on a real-world oil production dataset from a supergiant Middle Eastern reservoir. We evaluate each model across five forecast horizons (6, 12, 24, 36, and 48 months), using normalized Mean Absolute Error (nMAE) and normalized Root Mean Squared Error (nRMSE) as unitless metrics. Each model is tested both with and without exogenous variables such as wellhead pressure, choke size, and uptime, allowing us to assess the impact of operational context on predictive performance. The evaluation is conducted per well, using hold-out windows tailored to each horizon to simulate realistic deployment conditions. Our baseline, an XGBoost-based autoregressive forecaster (XGB-AR), is outperformed by several advanced models, most notably TimesFM, a pretrained TSFM, which delivers up to 23% improvement in RMSE and 22% in MAE at short horizons and maintains 10-11% gains at longer ones. Other deep learning models, such as N-BEATS and N-HiTS, improve on the baseline's MAE at selected horizons. At long horizons without exogenous variables, foundation models such as Chronos, GPT4TS, and TimeXer outperform all others, indicating stronger generalization. Averaged across all models, the inclusion of exogenous features yields consistent improvements of up to 13.7% in RMSE, highlighting their importance for capturing operational dynamics. Without exogenous inputs, differences between model families narrow, and traditional methods remain competitive at short horizons.
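To make the evaluation setup concrete, the sketch below shows one plausible reading of the two metrics and of a recursive autoregressive baseline. The exact normalizer for nMAE/nRMSE is not specified in the abstract, so normalizing by the mean absolute level of the actuals is an assumption, and a simple linear autoregression stands in for the gradient-boosted XGB-AR model; this is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

def nmae(y_true, y_pred):
    # Normalized MAE. Dividing by the mean absolute actual value is one
    # common convention; the paper's exact normalizer is an assumption here.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs(y_true - y_pred)) / np.mean(np.abs(y_true))

def nrmse(y_true, y_pred):
    # Normalized RMSE with the same assumed normalizer, giving a unitless
    # score that can be compared across wells with different rates.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2)) / np.mean(np.abs(y_true))

def recursive_forecast(history, horizon, lags=12):
    # Recursive multi-step autoregression: fit a model on lagged values,
    # then feed each prediction back in as an input for the next step.
    # A least-squares linear AR model is used as a stand-in for XGBoost.
    h = list(map(float, history))
    X = np.array([h[i:i + lags] for i in range(len(h) - lags)])
    y = np.array(h[lags:])
    coef, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)
    preds = []
    for _ in range(horizon):
        yhat = float(np.r_[h[-lags:], 1.0] @ coef)
        preds.append(yhat)
        h.append(yhat)  # prediction becomes input for the next step
    return np.array(preds)
```

In the per-well protocol the abstract describes, the last `horizon` months of each well would be held out, `recursive_forecast` (or a trained model) would predict them from the preceding history, and `nmae`/`nrmse` would score the hold-out window.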
This work is one of the first to comprehensively compare TSFMs with classical and deep learning models in the oil and gas domain. Our results underscore the promise of foundation models for long-term, multivariate forecasting, while also revealing key limitations, such as inconsistent support for exogenous variables, that must be addressed for broader industrial adoption.