Forget scaling compute – the future of AI hinges on a 1000x leap in energy efficiency via tight AI+Hardware co-design over the next decade.
FlashAttention-4 breaks through attention bottlenecks on Blackwell GPUs, reaching up to 71% hardware utilization and 2.7x speedups over Triton, driven by innovations such as software-emulated softmax and asynchronous MMA pipelines.