Kuaishou Inc.
A2Gen transforms short-video recommendation by treating user actions as dynamic sequences, yielding substantial improvements in user engagement metrics.
Learnable summary tokens that compress context make sub-linear attention possible while fully retaining long-range dependencies.
Unlike traditional recommender systems, generative recommendation models such as OneRec-V2 can be quantized to FP8 with near-lossless accuracy, unlocking significant latency and throughput improvements.