Institute of Computing Technology, Chinese Academy of Sciences
Solving massive optimization problems just got a whole lot faster: SDSL-Solver achieves up to 97x speedups over PARDISO by distributing sparse linear system solves across multiple nodes.
Squeezing intermediate tensors with FP8 quantization and adaptive transforms can nearly double the throughput of tensor-parallel LLM training without sacrificing accuracy.