OpenPangu-7B inference on NPUs gets a substantial speedup from a custom speculative decoding scheme.