Search papers, labs, and topics across Lattice.
University of Michigan
2
0
3
Attention bottlenecks in long-context decoding? SANTA slashes memory bandwidth demands by stochastically sampling value vectors, achieving 1.5x speedups without sacrificing accuracy.
Transformers can directly solve quadratic programs and leverage covariance matrices for superior decision-making, outperforming traditional "predict-then-optimize" methods in portfolio construction.