Train billion-parameter LLMs on a single H100 GPU, no AdamW required, using a memory-efficient orthogonal transformation method.