May 6, 2026 · arXiv:2605.04842

Communication Offloading on SmartNIC DPUs: A Quantitative Approach

Jacob Wahlgren, Andong Hu, Roger Pearce, Maya Gokhale, Ivy Peng

AI Summary

This paper investigates the feasibility of using SmartNIC DPUs to offload asynchronous "fire-and-forget" communication tasks from the host CPU. The authors design and implement Buddy, a communication offloading engine that runs on Nvidia BlueField-3 DPUs and generic x86 CPUs. Results across five applications show that offloading communication to the DPU yields up to 1.55x speedup for host-dominated workloads. The evaluation also measures a 625x increase in DRAM traffic caused by the DPU's lack of Direct Cache Access support, which the authors identify as a critical gap for future SmartNIC designs.

Key Contribution

Offloading communication to SmartNIC DPUs speeds up host-dominated workloads by up to 1.55x, but the absence of Direct Cache Access support on the DPU increases DRAM traffic by 625x.

Abstract

SmartNIC Data Processing Units (DPUs) offer a promising solution for saving high-end CPU resources by offloading tasks to programmable cores near the network interface. In this work, we explore the feasibility of SmartNIC DPUs in supporting an asynchronous communication model called "fire-and-forget", particularly its core message routing service. We design a communication offloading engine called Buddy that decouples communication tasks from the application process. Buddy runs flexibly on SmartNIC DPUs such as the Nvidia BlueField-3 DPU and on generic x86 CPUs. Our evaluation across five applications identifies the memory-to-communication ratio as a key predictor of offloading performance. Host-dominated workloads, such as Quicksilver and Sparse Matrix Transpose, achieved up to 1.55x speedup with communication offloaded to the DPU. We further identify a 625x increase in DRAM traffic due to the absence of Direct Cache Access support on the DPU, highlighting a critical need in future SmartNIC designs.
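To make the "fire-and-forget" model concrete, here is a minimal Python sketch of the general idea: the application hands a message to a separate routing engine and returns immediately, without waiting for delivery. This is not Buddy's API. The names `OffloadEngine`, `send`, `drain`, and `run_demo` are hypothetical, and a background thread stands in for the DPU cores purely for illustration.

```python
import queue
import threading


class OffloadEngine:
    """Hypothetical stand-in for a communication offloading engine.

    A background thread plays the role of the offload cores; this is an
    illustrative assumption, not the design described in the paper.
    """

    def __init__(self, num_ranks):
        self.outbox = queue.Queue()
        # One in-memory mailbox per destination rank.
        self.mailboxes = [[] for _ in range(num_ranks)]
        self.worker = threading.Thread(target=self._route, daemon=True)
        self.worker.start()

    def send(self, dest, payload):
        # Fire-and-forget: enqueue the message and return immediately.
        # The caller never blocks on delivery or waits for a reply.
        self.outbox.put((dest, payload))

    def _route(self):
        # Message routing service: move each message to its destination.
        while True:
            item = self.outbox.get()
            if item is None:  # shutdown sentinel
                self.outbox.task_done()
                break
            dest, payload = item
            self.mailboxes[dest].append(payload)
            self.outbox.task_done()

    def drain(self):
        # Block until every message enqueued so far has been routed.
        self.outbox.join()

    def shutdown(self):
        self.outbox.put(None)
        self.worker.join()


def run_demo():
    engine = OffloadEngine(num_ranks=3)
    for i in range(6):
        engine.send(i % 3, i)  # application continues without waiting
    engine.drain()
    engine.shutdown()
    return engine.mailboxes


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the application only pays the cost of an enqueue per message; routing runs elsewhere. The explicit `drain` call is the one point where the application synchronizes with the engine.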

Architecture Design (Transformers, SSMs, MoE) · Distributed Systems & Hardware

Citation Metrics

Citations: 0

Influential citations: 0

References: 19

Year: 2026

Venue: N/A
