HKUST (GZ), HKUST
ReaLB achieves 1.29x faster multimodal MoE inference by dynamically adjusting expert precision, showing that real-time adaptation can overcome modality-induced load imbalances.
LLMs that ace code generation often fail to grasp intended program semantics, as evidenced by a stark performance decline when generating executable behavioral specifications on the new CodeSpecBench benchmark.