This paper introduces RecFlash, a recommendation inference accelerator that uses NAND flash memory-based in-storage computing (ISC) to address the inefficiency caused by irregular memory access patterns in recommendation systems. Using a frequency-based data remapping algorithm, RecFlash reduces latency and energy consumption by up to 81% and 91.9%, respectively, compared with the existing NAND flash-based ISC architecture. These findings highlight the potential of in-storage computing for real-time processing of large-scale user data in recommendation systems.
RecFlash cuts recommendation inference latency by up to 81% and energy consumption by up to 91.9% through frequency-based data remapping in NAND flash memory.
Recommendation systems have become widely used for a variety of personalized suggestion tasks, but the ever-increasing volume of user data makes real-time processing difficult. A NAND flash memory-based in-storage computing (ISC) scheme is a promising candidate among the various acceleration approaches: flash memory typically offers a larger capacity than other memory types, so it can efficiently handle the large amount of user data needed for recommendation inference services. However, unlike other neural network applications, where data is fetched sequentially from memory, recommendation systems exhibit irregular, random memory access patterns. As a result, most of the data loaded from the NAND flash array into the page buffer goes unused, leaving a large portion of the internal bandwidth underutilized and degrading inference acceleration for recommendation tasks. In this paper, we propose RecFlash, a fast recommendation inference accelerator that combines a data remapping algorithm with NAND flash-based ISC. The experimental results show that our proposed method reduces latency and energy consumption by up to 81% and 91.9%, respectively, compared with the existing NAND flash-based ISC architecture.
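The page-buffer underutilization described above can be illustrated with a toy model. The sketch below is not RecFlash's actual algorithm, whose details are not given here. It assumes that embedding vectors are stored at fixed slots, that a flash page holds `vectors_per_page` of them, and that remapping is derived offline from a profiled access trace. All names (`frequency_remap`, `pages_touched`, `total_page_reads`) and all sizes are hypothetical, chosen only to show why packing frequently accessed vectors into the same pages reduces the number of page reads.

```python
from collections import Counter


def pages_touched(batch, slot_of, vectors_per_page):
    """Count the distinct flash pages read to serve one batch of lookups."""
    return len({slot_of[idx] // vectors_per_page for idx in batch})


def total_page_reads(trace, slot_of, vectors_per_page):
    """Sum page reads over every batch in the access trace."""
    return sum(pages_touched(b, slot_of, vectors_per_page) for b in trace)


def frequency_remap(trace, num_vectors):
    """Assign slots so that the most frequently accessed vectors are adjacent.

    Assumption: a simple sort by access count, with ties broken by the
    original index so the result is deterministic.
    """
    counts = Counter(idx for batch in trace for idx in batch)
    order = sorted(range(num_vectors), key=lambda i: (-counts[i], i))
    return {idx: slot for slot, idx in enumerate(order)}


# Toy setup (hypothetical sizes): 16 vectors, 4 vectors per page.
NUM_VECTORS = 16
VECTORS_PER_PAGE = 4

# Irregular accesses: the hot vectors 0, 5, 10, 15 sit on four different pages.
trace = [[0, 5, 10, 15], [0, 5, 10], [5, 15, 3]]

identity = {i: i for i in range(NUM_VECTORS)}
remapped = frequency_remap(trace, NUM_VECTORS)

print("page reads, original layout:", total_page_reads(trace, identity, VECTORS_PER_PAGE))
print("page reads, remapped layout:", total_page_reads(trace, remapped, VECTORS_PER_PAGE))
```

In this toy trace the original layout needs 10 page reads and the remapped layout needs 4, because the four hot vectors end up on a single page. A real design would also have to account for access patterns that drift over time and for the cost of rewriting flash pages, which this sketch ignores.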