The OpenFOAM HPC Technical Committee organized the first OpenFOAM HPC Challenge (OHC-1) to benchmark OpenFOAM's performance on modern hardware and to compare hardware-constrained submissions with software-optimized ones on a common RANS case. The challenge drew 237 submissions across the hardware and software tracks. Analysis of the hardware track revealed a Pareto front balancing time- and energy-to-solution and showed that on-package HBM dominates single-node CPU performance, while GPU ports and selective-memory optimizations in the software track outperformed the best hardware-track results. The challenge provides a snapshot of OpenFOAM's current capabilities and identifies promising avenues for future optimization.
Software-track submissions, led by full GPU ports and selective-memory optimizations, achieved up to 28% lower energy per iteration and up to 72% shorter minimum time per iteration than the best hardware-track results.
The first OpenFOAM HPC Challenge (OHC-1) was organised by the OpenFOAM HPC Technical Committee (HPCTC) to collect a snapshot of OpenFOAM's computational performance on contemporary production hardware and to compare hardware-constrained submissions with software-track optimisations. Participants ran a common incompressible steady-state RANS case, the open-closed cooling DrivAer (occDrivAer) configuration, on prescribed meshes, submitting either with the reference setup (hardware track) or with modified solvers, decomposition strategies, or accelerator offloading (software track). In total, 237 valid datapoints were submitted by 12 contributors: 175 in the hardware track and 62 in the software track. The hardware track covered 25 distinct CPU models across AMD, Intel, and ARM families, with runs spanning from single-node configurations up to 256 nodes (32768 CPU cores). Wall-clock times ranged from 7.8 minutes to 65.7 hours and reported energy-to-solution from 2.1 to 236.9 kWh. Analysis of the hardware track identified a Pareto front of optimal balance between time- and energy-to-solution, and revealed that on-package high-bandwidth memory (HBM) dominates single-node performance for next-generation CPUs. Software-track submissions achieved up to 28% lower energy per iteration, 17% higher maximum performance per node, and 72% shorter minimum time per iteration than the best hardware-track results, with full GPU ports and selective-memory optimisations leading the performance range. This manuscript describes the challenge organisation, the case setup and metrics, and presents the main findings from both tracks together with an outlook for future challenges.
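The Pareto front mentioned above can be illustrated with a minimal sketch. It assumes each submission is reduced to a (time-to-solution, energy-to-solution) pair and that lower is better on both axes. The function name `pareto_front` and the sample values are hypothetical and are not taken from the challenge data; the manuscript's actual analysis procedure may differ.

```python
from typing import List, Tuple

# Hypothetical representation of one submission:
# (time_to_solution_hours, energy_to_solution_kwh); lower is better for both.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point beats on both time and energy."""
    front: List[Point] = []
    best_energy = float("inf")
    # Sort by time (ties broken by energy), then sweep: a point is kept only
    # if it uses strictly less energy than every faster-or-equal point seen.
    for time_h, energy_kwh in sorted(set(points)):
        if energy_kwh < best_energy:
            front.append((time_h, energy_kwh))
            best_energy = energy_kwh
    return front


# Made-up illustrative values, not OHC-1 results.
samples = [(1.0, 50.0), (2.0, 30.0), (2.0, 40.0), (3.0, 35.0), (4.0, 20.0)]
print(pareto_front(samples))
```

In this sketch, a run that is both slower and more energy-hungry than some other run is dropped, so the remaining points trace the trade-off between finishing quickly and consuming little energy.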