MAIS&NLPR, CASIA, UCAS
Forget slow attention: FlashPrefill achieves a staggering 27x speedup in long-context prefilling by instantly discovering and thresholding sparse attention patterns.
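The blurb above mentions discovering and thresholding sparse attention patterns. A minimal sketch of the general idea — computing attention weights, then zeroing out entries below a threshold so they can be skipped — is shown below. This is an illustration only: FlashPrefill's actual pattern-discovery algorithm is not described here, and the function name and the 0.01 threshold are assumptions.

```python
# Hedged sketch of threshold-based sparse attention (NOT FlashPrefill's
# actual algorithm): compute softmax attention, keep only weights above a
# threshold, and renormalize the survivors.
import numpy as np

def sparse_attention(q, k, v, threshold=0.01):
    """Dense softmax attention with low-weight entries pruned to zero."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                       # (Tq, Tk) logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    mask = weights >= threshold                         # sparse pattern
    weights = np.where(mask, weights, 0.0)
    weights /= weights.sum(axis=-1, keepdims=True)      # renormalize survivors
    return weights @ v, mask

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((64, 32)) for _ in range(3))
out, mask = sparse_attention(q, k, v)
print(out.shape, mask.mean())  # mask.mean() = fraction of entries attended
```

In a real kernel the payoff comes from skipping the masked blocks entirely rather than multiplying by zero; this sketch only shows the thresholding step that defines the sparse pattern.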
Forget codebook indices: BitDance uses binary diffusion to predict high-entropy binary tokens, achieving SOTA image generation with a fraction of the parameters and a massive speedup.
A single tokenizer, UniWeTok, now handles both high-fidelity image reconstruction and complex semantic understanding for multimodal LLMs, outperforming existing methods with far less training data.