Search papers, labs, and topics across Lattice.
Quantizing rollouts in LLM RL pipelines introduces a training-inference gap that QaRL closes, yielding a +5.5 point gain on math benchmarks.
Achieve near-lossless 2-bit LLMs with a novel quantization-aware training scheme that progressively reduces precision and intelligently handles outlier channels.
Control both multi-subject identity and multi-granularity motion in video generation with DreamVideo-Omni, a framework that uses latent identity reinforcement learning to avoid identity degradation.
For process reward modeling in biological reasoning, strategic data curation with a dual-consensus approach outperforms brute-force training on large noisy datasets.
LLMs can now achieve better memory coherence and response fidelity thanks to MemFly's information bottleneck approach to on-the-fly memory optimization.
MLLMs can be made significantly safer in multi-turn dialogues with a new framework combining cold-start refusal training and turn-aware policy optimization, cutting attack success rate by 10%.