Search papers, labs, and topics across Lattice.
This paper introduces pHNSW, an algorithm-hardware co-optimized solution to accelerate HNSW search by applying PCA filtering to reduce data dimensionality and computational load. The method reduces neighbor access volume and distance calculation complexity by projecting data into a lower-dimensional PCA space before the HNSW search. Experimental results demonstrate that pHNSW achieves significant QPS improvements (14.47x-21.37x on CPU, 5.37x-8.46x on GPU) and reduces energy consumption (up to 57.4%) compared to standard HNSW.
PCA filtering and custom hardware can supercharge HNSW, yielding up to 21x speedups and halving energy consumption for approximate nearest neighbor search.
Hierarchical Navigable Small World (HNSW) has demonstrated impressive accuracy and low latency for high-dimensional nearest neighbor searches. However, its high computational demands and irregular, large-volume data access patterns present significant challenges to search efficiency. To address these challenges, we introduce pHNSW, an algorithm-hardware co-optimized solution that accelerates HNSW through Principal Component Analysis (PCA) filtering. On the algorithm side, we apply PCA filtering to reduce the dimensionality of the dataset, thereby lowering the volume of neighbor access and decreasing the computational load for distance calculations. On the hardware side, we design the pHNSW processor with custom instructions to optimize search throughput and energy efficiency. In the experiments, we synthesized the pHNSW processor RTL design with a 65nm technology node and evaluated it using DDR4 and HBM1.0 DRAM standards. The results show that pHNSW boosts Queries per Second (QPS) by 14.47x-21.37x on a CPU and 5.37x-8.46x on a GPU, while reducing energy consumption by up to 57.4% compared to standard HNSW implementation.