Hybrid Mamba-Transformer LLMs achieve 4x faster time-to-first-token and 1.4x higher throughput with a new disaggregated accelerator architecture tailored to the prefill and decode phases.