Apr 21, 2026 · arXiv:2604.19494

DPC: A Distributed Page Cache over CXL

Shai Bergman, Zhe Yang, Julien Eudine, Giorgio Negro, Onur Mutlu, Arash Tavakkol, Ji Zhang

AI Summary

This paper introduces Distributed Page Cache (DPC), an OS-level page cache that uses CXL 3.0 memory semantics to treat a cluster's DRAM as a single cache budget. DPC enforces a single-copy invariant at page granularity, avoiding the data redundancy and coherence overhead of traditional per-node page caches. Evaluated on a CXL-based emulation framework, DPC achieves speedups of up to 12.4X (5.6X geometric mean) across data-sharing workloads.

Key Contribution

By treating a cluster's DRAM as a single cache budget, DPC reduces data redundancy and coherence overhead, achieving speedups of up to 12.4X.

Abstract

Modern distributed file systems rely on uncoordinated, per-node page caches that replicate hot data locally across the cluster. While ensuring fast local access, this architecture underutilizes aggregate cluster DRAM capacity through massive data redundancy and incurs prohibitive coherence overhead via heavyweight, lock-based protocols. In this paper, we focus on the design of a distributed page cache that treats the entire cluster's main memory as a single cache budget while preserving standard file-system interfaces and semantics. We present Distributed Page Cache (DPC), an OS-level, distributed page cache built on top of Compute Express Link (CXL) 3.0 memory semantics. DPC enforces a single-copy invariant at page granularity: each file page has exactly one owner node holding the sole resident DRAM copy, and other nodes access it via CXL-based remote mappings rather than creating replicas of the page. DPC is implemented end-to-end on a CXL-based emulation framework that models multi-host CXL 3.0 memory fabrics, enabling detailed evaluation in the absence of widespread hardware. Across real-world and representative data-sharing workloads, DPC delivers speedups of up to 12.4X, with a geometric-mean speedup of 5.6X.
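The single-copy invariant described in the abstract can be illustrated with a toy model. The sketch below is not DPC's implementation: the class `SingleCopyDirectory`, the helper `dram_pages_used`, the `(inode, page index)` key, and the first-touch ownership policy are all hypothetical names and assumptions chosen for illustration. It only shows the bookkeeping idea that each page has one owner and that other nodes record a remote mapping instead of a replica, and it counts how many DRAM pages an access trace would occupy with and without replication.

```python
from dataclasses import dataclass, field

# Hypothetical page identity: (inode number, page index within the file).
PageKey = tuple[int, int]


@dataclass
class SingleCopyDirectory:
    """Toy cluster-wide directory: every cached page has exactly one owner node."""

    owner_of: dict[PageKey, int] = field(default_factory=dict)
    remote_maps: dict[PageKey, set[int]] = field(default_factory=dict)

    def access(self, node: int, key: PageKey) -> str:
        owner = self.owner_of.get(key)
        if owner is None:
            # Assumed policy: the first node to touch a page becomes its owner.
            self.owner_of[key] = node
            self.remote_maps[key] = set()
            return "fill"
        if owner == node:
            return "local"
        # Non-owners record a remote mapping; no second DRAM copy is created.
        self.remote_maps[key].add(node)
        return "remote"


def dram_pages_used(trace: list[tuple[int, PageKey]], single_copy: bool) -> int:
    """Count resident DRAM pages across the cluster for a (node, page) trace."""
    if single_copy:
        return len({key for _, key in trace})  # one copy per distinct page
    return len(set(trace))  # per-node caches: one copy per (node, page) pair


directory = SingleCopyDirectory()
results = [directory.access(node, (7, 0)) for node in (0, 0, 1)]

# Four nodes each read the same three pages of one file.
trace = [(node, (7, page)) for node in range(4) for page in range(3)]
replicated = dram_pages_used(trace, single_copy=False)
deduplicated = dram_pages_used(trace, single_copy=True)
```

In this toy trace, per-node caching holds a copy for every (node, page) pair, while the single-copy model holds one copy per distinct page. The sketch says nothing about access latency, eviction, or ownership migration, which the abstract does not detail.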

Architecture Design (Transformers, SSMs, MoE) · Distributed Systems & Hardware

Citation Metrics

Citations: 0

Influential citations: 0

References: 65

Year: 2026

Venue: N/A
