University of Bologna
HBM-PIM reaches 14.9 GFLOP/s of matrix multiplication throughput using a novel reduction-free outer-product dataflow, despite lacking native reduction support.
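
To illustrate why an outer-product formulation needs no reduction step, here is a minimal sketch in plain Python. It is the generic textbook outer-product matrix multiply, not the paper's implementation. The function name `matmul_outer_product` is hypothetical, and the mapping of the inner loops onto HBM-PIM hardware is not shown.

```python
def matmul_outer_product(a, b):
    """Compute C = A @ B as a sum of rank-1 outer products.

    Illustrative sketch only (hypothetical helper, not the paper's kernel).
    For each k, column k of A and row k of B form an outer product that is
    accumulated elementwise into C. Every update to c[i][j] is an independent
    multiply-accumulate, so no sum across the elements of a vector is needed.
    """
    m, inner, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for k in range(inner):
        col = [a[i][k] for i in range(m)]  # column k of A
        row = b[k]                         # row k of B
        for i in range(m):
            for j in range(n):
                # Elementwise accumulate: no cross-element reduction.
                c[i][j] += col[i] * row[j]
    return c


if __name__ == "__main__":
    print(matmul_outer_product([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

By contrast, the usual inner-product formulation computes each output element as a dot product, which ends with a sum over the elements of a vector. That final sum is the reduction the outer-product ordering avoids.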