Amazon, Arizona State University
Expert upcycling scales mixture-of-experts (MoE) models with 32% less compute by duplicating and specializing existing experts, challenging the need to train massive MoEs from scratch.