Fusing kernels in SwiGLU MLP blocks slashes memory bandwidth bottlenecks, yielding up to 13.2% speedups on H100 GPUs during agentic LLM inference.
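
To make the idea of fusion concrete, here is a minimal pure-Python sketch, assuming the standard SwiGLU MLP form `down(silu(gate(x)) * up(x))`. It is a toy illustration, not the kernel the summary refers to: the function names (`swiglu_unfused`, `swiglu_fused`) and the count of materialized intermediate buffers are hypothetical stand-ins for the extra memory round trips that an unfused GPU implementation would make.

```python
import math


def silu(v):
    # SiLU (swish) activation: v * sigmoid(v)
    return v / (1.0 + math.exp(-v))


def matvec(w, x):
    # Dense matrix-vector product; w is a list of rows.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def swiglu_unfused(x, w_gate, w_up, w_down):
    # Each step materializes a full intermediate vector, analogous to
    # separate kernels that each write to and read back from memory.
    gate = matvec(w_gate, x)
    up = matvec(w_up, x)
    act = [silu(g) for g in gate]
    hidden = [a * u for a, u in zip(act, up)]
    buffers = 4  # gate, up, act, hidden
    return matvec(w_down, hidden), buffers


def swiglu_fused(x, w_gate, w_up, w_down):
    # Gate projection, up projection, activation, and product are computed
    # per hidden unit in one pass; only `hidden` is materialized.
    hidden = []
    for g_row, u_row in zip(w_gate, w_up):
        g = sum(wi * xi for wi, xi in zip(g_row, x))
        u = sum(wi * xi for wi, xi in zip(u_row, x))
        hidden.append(silu(g) * u)
    buffers = 1  # hidden only
    return matvec(w_down, hidden), buffers


if __name__ == "__main__":
    # Illustrative values only (assumption): 2 inputs, 3 hidden units.
    x = [0.5, -1.0]
    w_gate = [[0.1, 0.2], [0.3, -0.4], [-0.5, 0.6]]
    w_up = [[0.7, -0.8], [0.9, 1.0], [-1.1, 1.2]]
    w_down = [[0.2, -0.1, 0.4], [0.3, 0.5, -0.6]]

    out_a, buf_a = swiglu_unfused(x, w_gate, w_up, w_down)
    out_b, buf_b = swiglu_fused(x, w_gate, w_up, w_down)
    same = all(math.isclose(a, b) for a, b in zip(out_a, out_b))
    print("outputs match:", same)
    print("intermediate buffers (unfused vs fused):", buf_a, buf_b)
```

Both functions compute the same result. The only difference is how many intermediate vectors are written out along the way, which is the kind of memory traffic that fusion aims to cut. The sketch says nothing about actual GPU speedups.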