MLRA unlocks 2.8x faster LLM decoding by enabling efficient tensor parallelism for latent attention, sidestepping the memory traffic bottlenecks that plague existing methods.
The largest open-source image generation model to date, HunyuanImage 3.0, achieves state-of-the-art performance using a Mixture-of-Experts architecture and a native Chain-of-Thought schema.