Information Systems Technology and Design, Singapore University of Technology and Design, Singapore; Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore
Forget slow matrix multiplies: this new quantization method lets you run LLaMA-2-7B in 3-bit with only bit shifts and additions, beating heavier methods like AWQ.
LLM code generation benchmarks are likely overestimating model capabilities: adversarial test suite scaling reveals substantial weaknesses in even state-of-the-art models.