Forget fixed routing: DynaMoE adjusts expert activation for each token and each layer, improving parameter efficiency and convergence stability in Mixture-of-Experts (MoE) models.