Escape vendor lock-in and unlock faster VLM inference on edge devices with EdgeFM, an open-source framework that beats proprietary toolchains by up to 49%.
Quantizing rollouts in LLM RL pipelines introduces a training-inference gap; QaRL closes it, yielding a +5.5 gain on math problems.
Achieve near-lossless 2-bit LLMs with a novel quantization-aware training scheme that progressively reduces precision and intelligently handles outlier channels.