Sequential recommendation models can achieve near-perfect scaling efficiency in distributed training, cutting wasted GPU cycles by up to 90%.
Forget quadratic complexity: ULTRA-HSTU achieves 21x faster inference and 4-8% gains in engagement metrics for large-scale recommendation by co-designing input sequences, sparse attention, and model topology.