A fault in one GPU process no longer needs to crash them all: this paper introduces mechanisms for fault-resilient NVIDIA MPS, enabling more robust multi-tenant GPU clusters.
Forget independent LoRA tuning jobs: ALTO co-optimizes them for a 13.8x speedup without sacrificing adapter quality.