This work was supported by the Scientific Research Innovation Capability Support Project for Young Faculty, the Fundamental Research Funds for the Central Universities, and the Talent Fund of Beijing Jiaotong University. Shan Sha, Shenglong Zhou, Xin Wang, and Lingchen Kong are with the School of Mathematics and Statistics, Beijing Jiaotong University, Beijing, China. E-mail: {shansha, shlzhou, xinwang2, lchkong}@bjtu.edu.cn. Geoffrey Ye Li is with the Department of Electrical and Electronic Engineering, Faculty of Engineering, Imperial College London, London, U.K. E-mail: geoffrey.li@imperial.ac.uk. Corresponding author: Shenglong Zhou.
Vision-language-action (VLA) models can now be efficiently quantized *without* retraining, in some cases even surpassing full-precision performance, thanks to a new post-training method that carefully calibrates quantization scales across attention and output heads.
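To make the idea of scale calibration concrete, the sketch below shows a generic symmetric per-channel post-training quantization routine in NumPy. This is *not* the paper's method: the function names (`calibrate_scales`, `quantize`, `dequantize`), the absmax calibration rule, and the int8 bit-width are all illustrative assumptions; the actual method additionally tunes scales across attention and output heads.

```python
import numpy as np

def calibrate_scales(w: np.ndarray, n_bits: int = 8) -> np.ndarray:
    """Absmax calibration: one scale per output channel (row).

    Illustrative stand-in for the paper's calibration step.
    """
    qmax = 2 ** (n_bits - 1) - 1          # 127 for int8
    absmax = np.abs(w).max(axis=1)        # per-row dynamic range
    return np.where(absmax > 0, absmax / qmax, 1.0)

def quantize(w: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Round weights to signed integers using per-row scales."""
    return np.round(w / scales[:, None]).astype(np.int8)

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Map integer codes back to approximate float weights."""
    return q.astype(np.float32) * scales[:, None]

# Toy weight matrix standing in for one attention/output head.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 16)).astype(np.float32)
s = calibrate_scales(w)
w_hat = dequantize(quantize(w, s), s)
err = float(np.abs(w - w_hat).max())      # bounded by max(s) / 2
```

Because the scale is derived per channel rather than per tensor, rows with small dynamic range keep fine resolution instead of being dominated by the largest outlier in the whole matrix; this is the kind of effect head-wise calibration exploits.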