Feb 23, 2026 | arXiv:2602.19683

GPU-Resident Gaussian Process Regression Leveraging Asynchronous Tasks with HPX

AI Summary

This paper presents a GPU-resident Gaussian Process Regression (GPR) prediction pipeline implemented within the HPX-based GPRat library to address the cubic complexity bottleneck of exact GP solvers. The implementation utilizes tiled algorithms and CUDA libraries to exploit massive parallelism for linear algebra operations on the GPU. Results demonstrate significant speedups compared to the CPU implementation, achieving up to 4.6x speedup for GP prediction and outperforming cuSOLVER by up to 11% for large datasets when using multiple CUDA streams with HPX.

Key Contribution

A GPU-accelerated Gaussian process regression pipeline that uses HPX for asynchronous task management and, with multiple CUDA streams, outperforms cuSOLVER by up to 11% on large datasets.

Abstract

Gaussian processes (GPs) are a widely used regression tool, but the cubic complexity of exact solvers limits their scalability. To address this challenge, we extend the GPRat library by incorporating a fully GPU-resident GP prediction pipeline. GPRat is an HPX-based library that combines task-based parallelism with an intuitive Python API. We implement tiled algorithms for the GP prediction using optimized CUDA libraries, thereby exploiting massive parallelism for linear algebra operations. We evaluate the optimal number of CUDA streams and compare the performance of our GPU implementation to the existing CPU-based implementation. Our results show the GPU implementation provides speedups for datasets larger than 128 training samples. We observe speedups of up to 4.3 for the Cholesky decomposition itself and 4.6 for the GP prediction. Furthermore, combining HPX with multiple CUDA streams allows GPRat to match, and for large datasets, surpass cuSOLVER's performance by up to 11 percent.
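To make the "tiled algorithms" idea concrete, here is a minimal CPU sketch of a right-looking tiled Cholesky decomposition in plain Python. It is an illustration under stated assumptions, not GPRat's implementation: the function names (`potrf`, `trsm`, `update`, `split`, `join_lower`, `tiled_cholesky`) are hypothetical, named after the usual BLAS/LAPACK routines, and the matrix size is assumed to be a multiple of the tile size. In a GPU-resident version each tile operation would presumably map to a CUDA library call, and the calls within one inner loop touch different tiles, so a task runtime could schedule them concurrently.

```python
import math


def potrf(a):
    """Unblocked Cholesky of one SPD tile; returns the lower-triangular factor."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l


def trsm(l, b):
    """Solve X * L^T = B for X, where L is lower triangular."""
    n = len(l)
    x = [[0.0] * n for _ in b]
    for r, row in enumerate(b):
        for j in range(n):
            s = row[j] - sum(x[r][k] * l[j][k] for k in range(j))
            x[r][j] = s / l[j][j]
    return x


def update(c, a, b):
    """Return C - A * B^T (a SYRK-style update when a is b, GEMM otherwise)."""
    inner = len(b[0])
    return [[c[i][j] - sum(a[i][k] * b[j][k] for k in range(inner))
             for j in range(len(c[0]))] for i in range(len(c))]


def split(a, t):
    """Cut a square matrix into a grid of t-by-t tiles (assumes len(a) % t == 0)."""
    m = len(a) // t
    return [[[row[j * t:(j + 1) * t] for row in a[i * t:(i + 1) * t]]
             for j in range(m)] for i in range(m)]


def join_lower(tiles, t):
    """Reassemble the lower-triangular tiles into one dense matrix."""
    m = len(tiles)
    out = [[0.0] * (m * t) for _ in range(m * t)]
    for i in range(m):
        for j in range(i + 1):
            for r in range(t):
                for c in range(t):
                    out[i * t + r][j * t + c] = tiles[i][j][r][c]
    return out


def tiled_cholesky(tiles):
    """Right-looking tiled Cholesky; only tiles[i][j] with j <= i are used."""
    m = len(tiles)
    for k in range(m):
        # Factor the diagonal tile.
        tiles[k][k] = potrf(tiles[k][k])
        # Triangular solves for the panel below it (independent of each other).
        for i in range(k + 1, m):
            tiles[i][k] = trsm(tiles[k][k], tiles[i][k])
        # Update the trailing submatrix (each tile update is independent).
        for i in range(k + 1, m):
            for j in range(k + 1, i + 1):
                tiles[i][j] = update(tiles[i][j], tiles[i][k], tiles[j][k])
    return tiles
```

A caller would split a symmetric positive definite kernel matrix `K` with `split(K, t)`, run `tiled_cholesky` on the result, and reassemble the factor with `join_lower`. The resulting factor `L` satisfies `L * L^T = K` up to rounding and is the usual starting point for the triangular solves in GP prediction.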

Distributed Systems & Hardware | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
