Multi-chiplet architectures can unlock significant speedups and memory savings for low-batch MoE inference by dynamically scheduling expert computations across high-bandwidth die-to-die links.
Securing DNN accelerators doesn't have to break the bank: this co-design framework slashes memory overhead by 87% while boosting performance by 12%.