Lattice AI Research

Research focus

Inference & Quantization (2) · Natural Language Processing (1) · Architecture Design (Transformers, SSMs, MoE) (1) · Training Efficiency & Optimization (1)

Frequent co-authors

Shiwen Shan (1) · Yintong Huo (1) · Qihang Fan (1) · Qihang Fan (1)

Papers (2)

May 25, 2026·also SMU

CelerLog: Fast Log Parsing via Dynamic Routing

LLMs aren't always needed: CelerLog shows you can get SOTA log parsing with a hybrid approach that's up to 18x faster and cuts token costs by 94%.

Shiwen Shan, Yintong Huo, Zhiying Wu

Inference & Quantization · Natural Language Processing

Mar 6, 2026·also SYSU

FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling

Forget slow attention: FlashPrefill achieves a staggering 27x speedup in long-context prefilling by instantly discovering and thresholding sparse attention patterns.

Qihang Fan, Qihang Fan, Zhiying Wu +3

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Training Efficiency & Optimization

Zhiying Wu