Ditch the learned router: a global scheduler for Mixture-of-Experts models unlocks state-of-the-art multi-domain learning by explicitly optimizing dataset-to-expert assignments.