Kling Team, Kuaishou Technology
SVD-Attention reduces the cost of attention from quadratic to linear for recommendation tasks by exploiting the inherent low-rank structure of user behavior sequences, while keeping softmax attention.
FlashEvaluator slashes the computational cost of evaluating multiple sequences in Generator-Evaluator frameworks while boosting accuracy by enabling direct cross-sequence comparisons.
Structurally re-parameterizing the feature-fusion matrix multiplication accelerates ranking models losslessly, sidestepping the accuracy drop common in lightweighting and distillation.