Information Systems Technology and Design, Singapore University of Technology and Design, Singapore; Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore
Forget slow matrix multiplies: this new quantization method lets you run LLaMA-2-7B at 3-bit precision using only bit shifts and additions, beating heavier methods like AWQ.