Search papers, labs, and topics across Lattice.
University of Cambridge
2
0
5
FoMoE shatters the full-replica barrier, enabling efficient LLM training across weakly connected data centers with unprecedented communication and memory savings.
LLMs can be made more reliable and efficient by adaptively focusing uncertainty quantification on relevant semantic themes, cutting inference time by 60% while improving factuality correlation.