Search papers, labs, and topics across Lattice.
2
0
5
5
Achieve full-attention accuracy with 10x operator speedup and 4.7x throughput improvement in long-context LLM inference by overlapping KV cache transfers with computation.
LongCat-Next shatters the language-centric paradigm by unifying text, vision, and audio into a single autoregressive model with minimal modality-specific design, finally reconciling understanding and generation in discrete vision modeling.