Search papers, labs, and topics across Lattice.
Sino-German Joint Software Institute, Beihang University
3
0
5
RATrain achieves a remarkable 1.35脳 speedup in training large language models on bandwidth-limited supercomputers, challenging the assumption that high-bandwidth environments are necessary for optimal performance.
LLM inference on supercomputers doesn't have to be a bottleneck: THInfer achieves up to 84% higher throughput than A800 GPUs by co-designing hardware-aware kernels and a communication-optimized pipeline.
You can achieve near-optimal data retention in machine unlearning scenarios without needing to know individual user privacy preferences, sidestepping GDPR compliance challenges.