Forget brute-force context windows: a small vision-language model can compress hour-long videos below theoretical limits by intelligently prioritizing relevant content.
Forget agents and world models – the future of computing could be learned directly from I/O traces, turning the model itself into the computer.
Scale up offline policy training for diffusion LLMs without breaking the bank: dTRPO slashes trajectory computation costs while boosting performance by up to 9.6% on STEM tasks.
Forget exotic attention mechanisms – MobileLLM-Flash achieves up to 1.8x faster LLM prefill on mobile CPUs by smartly pruning and adapting existing architectures for on-device use.
Forget scaling laws: this work shows you can get SOTA reasoning from sub-billion-parameter models with *less* data, if you're smart about curation and resampling.