A new family of sparse Mixture-of-Experts models, Arcee Trinity, achieves stable training at scale thanks to a novel MoE load balancing strategy (SMEBU).
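The teaser doesn't say how SMEBU itself works. For context only, here is a minimal sketch of the standard load-balancing auxiliary loss popularized by Switch Transformer (Fedus et al., 2021), the common baseline that MoE balancing strategies like this build on; the function name and signature are illustrative, not Arcee's API, and this is not a description of SMEBU.

```python
import torch

def load_balancing_loss(router_logits: torch.Tensor,
                        num_experts: int,
                        top_k: int = 1) -> torch.Tensor:
    """Switch-Transformer-style auxiliary loss (illustrative, not SMEBU).

    router_logits: (num_tokens, num_experts) raw gate scores.
    Returns a scalar that is minimized when tokens are spread
    evenly across experts, discouraging routing collapse.
    """
    probs = torch.softmax(router_logits, dim=-1)       # (T, E) routing probabilities
    top_idx = probs.topk(top_k, dim=-1).indices        # experts each token is dispatched to
    dispatch = torch.zeros_like(probs).scatter_(-1, top_idx, 1.0)
    frac_tokens = dispatch.mean(dim=0)                 # f_i: fraction of tokens routed to expert i
    frac_probs = probs.mean(dim=0)                     # P_i: mean routing probability for expert i
    return num_experts * torch.sum(frac_tokens * frac_probs)
```

The product f_i * P_i is minimized under a uniform assignment, so adding this term (scaled by a small coefficient) to the training loss nudges the router toward balanced expert utilization, which is the general problem a strategy like SMEBU targets.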