On-device LLMs can achieve state-of-the-art performance with significantly reduced computational cost by leveraging a carefully designed Mixture-of-Experts architecture, challenging the assumption that dense models are always superior for mobile deployment.