ReaLB achieves 1.29x faster multimodal MoE inference by dynamically adjusting expert precision, showing that real-time adaptation can offset modality-induced load imbalance.