This paper introduces AccelCIM, a framework for systematic dataflow exploration in SRAM compute-in-memory (CIM) accelerators that considers both CIM macro configurations and macro-array organizations. AccelCIM uses cycle-accurate architectural simulation and post-layout power, performance, and area (PPA) analysis for rigorous design evaluation. Applied to LLM applications, it yields practical insights for designing efficient CIM accelerators and addresses a limitation of prior work, which often assumes that the full model resides on-chip.
Unlocking the potential of compute-in-memory accelerators for LLMs requires carefully navigating a complex dataflow design space, and AccelCIM provides the first systematic framework to do so.
SRAM-based compute-in-memory (CIM) offers high computational density and energy efficiency for deep neural network (DNN) accelerators, but its limited capacity causes on/off-chip data movement overhead for large DNN models. Existing CIM accelerator studies typically assume that DNN models fit entirely on-chip, leaving efficient dataflow design largely unexplored. This paper introduces AccelCIM, a systematic dataflow exploration framework for SRAM CIM accelerators, which addresses two key limitations of prior work. (1) It formulates a systematic dataflow design space spanning CIM macro configurations and macro-array organizations. (2) It introduces rigorous design evaluation using cycle-accurate architectural simulation and post-layout PPA analysis. We conduct an extensive design space exploration and apply AccelCIM to representative LLM applications, providing practical insights for the principled design of CIM accelerators.
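
To make the capacity problem concrete, the following is a minimal sketch of a design space that crosses macro shapes with macro-array organizations. It is not AccelCIM's model or API: the `DesignPoint` fields, the helper names, and the example shapes are all hypothetical, and the cost metric (number of weight tiles loaded to cover a weight matrix once) is a deliberately crude stand-in for the cycle-accurate simulation and post-layout PPA analysis described above.

```python
from dataclasses import dataclass
from itertools import product
from math import ceil


@dataclass(frozen=True)
class DesignPoint:
    """One hypothetical point in a toy CIM design space (all fields are assumptions)."""
    macro_rows: int  # weight rows held by one CIM macro
    macro_cols: int  # weight columns held by one CIM macro
    array_rows: int  # macros tiled along the input (K) dimension
    array_cols: int  # macros tiled along the output (N) dimension

    @property
    def tile_k(self) -> int:
        # Rows of the weight matrix resident on-chip at once.
        return self.macro_rows * self.array_rows

    @property
    def tile_n(self) -> int:
        # Columns of the weight matrix resident on-chip at once.
        return self.macro_cols * self.array_cols

    @property
    def capacity(self) -> int:
        # Total number of weights the macro array can hold.
        return self.tile_k * self.tile_n


def weight_tile_loads(point: DesignPoint, k: int, n: int) -> int:
    """Count the weight tiles fetched from off-chip to cover a K x N matrix once."""
    return ceil(k / point.tile_k) * ceil(n / point.tile_n)


def enumerate_design_space(macro_shapes, array_shapes):
    """Cross every macro shape with every macro-array organization."""
    return [
        DesignPoint(mr, mc, ar, ac)
        for (mr, mc), (ar, ac) in product(macro_shapes, array_shapes)
    ]


if __name__ == "__main__":
    points = enumerate_design_space(
        macro_shapes=[(64, 64), (128, 32)],  # illustrative values only
        array_shapes=[(4, 4), (2, 8)],       # illustrative values only
    )
    k, n = 4096, 11008  # an illustrative weight-matrix shape
    for p in points:
        print(p, "capacity:", p.capacity, "tile loads:", weight_tile_loads(p, k, n))
```

In this toy model every design point has the same total capacity, yet the tile counts can differ because of how each tile shape divides the matrix. A realistic evaluation would also have to account for input reuse, partial-sum handling, and bandwidth, which this sketch ignores.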