Virginia Tech
Automating CUTLASS kernel synthesis and auto-tuning yields 2.79x speedups on real models such as MiniGPT, with an LLM rewriting the PyTorch code.