Forget fancy quantization schemes – a simple token-wise INT4 quantization with Hadamard rotation is all you need to nearly match FP16 accuracy in LLM serving, without sacrificing throughput.
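To see why a Hadamard rotation helps plain INT4, here is a minimal NumPy sketch (not the paper's implementation; the dimensions, outlier magnitude, and symmetric 4-bit scheme are illustrative assumptions). LLM activations typically have a few outlier channels that blow up the per-token scale; an orthonormal Hadamard rotation spreads that energy across all channels, so the same 16 quantization levels cover the data far more finely.

```python
import numpy as np

def hadamard(n):
    # Sylvester construction (n must be a power of two), normalized
    # so that H @ H.T == I and the rotation is exactly invertible.
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def int4_roundtrip(x):
    # Symmetric per-token INT4: one scale per row, levels in [-7, 7].
    scale = np.abs(x).max(axis=-1, keepdims=True) / 7.0
    q = np.clip(np.round(x / scale), -7, 7)
    return q * scale  # dequantize for error measurement

d = 256  # hidden size (illustrative)
rng = np.random.default_rng(0)
x = rng.normal(size=(8, d))
x[:, 0] *= 50.0  # inject an outlier channel, as seen in LLM activations

H = hadamard(d)
plain_err = np.abs(int4_roundtrip(x) - x).mean()
# Rotate, quantize in the rotated basis, rotate back.
rot_err = np.abs(int4_roundtrip(x @ H) @ H.T - x).mean()
print(plain_err, rot_err)  # rotation yields a much smaller error
```

Because the rotation is orthonormal it can be fused into adjacent weight matrices at no runtime cost, which is why this style of scheme preserves throughput.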
Sparse queries offer a surprisingly effective and efficient alternative to dense representations for image-to-3D generation, achieving comparable fidelity with less input-view bias.
Diffusion language models can now match autoregressive quality, thanks to a clever trick that forces them to agree with themselves.