Achieve nearly 3x faster LLM inference by intelligently splitting the workload between edge devices and the cloud, without any training.
Multilingual embeddings just got smaller and faster: F2LLM-v2 models outperform larger counterparts while supporting over 200 languages.