Imperial College London
Fusing kernels in SwiGLU MLP blocks reduces memory-bandwidth bottlenecks, yielding speedups of up to 13.2% on H100 GPUs during agentic LLM inference.