Xiangjun Fan

Stronger coding agents can achieve higher success rates while requiring fewer user interventions, reshaping our understanding of effective coding assistance.

Yifan Wu, Zhuokai Zhao, Songlin Li +7

Code Generation & Program Synthesis · Eval Frameworks & Benchmarks · Tool Use & Agents

Jun 18, 2026

3w ago · also DeepCybo

Structuring and Tokenizing Distributed User Interest Context for Generative Recommendation

G2Rec captures user interest prototypes more accurately than existing methods, enabling generative recommendation systems to operate without ground-truth user interests.

Ruizhong Qiu, Yinglong Xia, Dongqi Fu +6

Recommendation & Information Retrieval

Jun 17, 2026

Meta AI · 3w ago · also DeepMind

SAGE-OPD: Selective Agent-Guided Intervention for Multi-Turn On-Policy Distillation

Selective teacher intervention in multi-turn training can boost agent performance by over 13% by mitigating the impact of early errors.

Yifan Wu, Jiayi Liu, Xiangjun Fan +1

RLHF & Preference Learning · Training Efficiency & Optimization

Jun 12, 2026

Meta AI · Jun 12, 2026 · also NYU

RepFusion: Leveraging Multimodal Priors for Denoising in Representation Space

RepFusion reveals that multimodal large language models can dramatically enhance denoising in text-to-image systems, outperforming traditional denoising methods.

Xichen Pan, Aashu Singh, Satya Narayan Shukla +3

Multimodal Models

May 31, 2026

Meta AI · May 31, 2026

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification

Chunk-level semantic verification in OmniOPD yields a +28.64% boost in math performance over traditional OPD, challenging the reliance on token-level logit matching.

Yuhang Zhou, Yifan Wu, Mingyi Wang +4

Inference & Quantization · RLHF & Preference Learning · Training Efficiency & Optimization

Apr 6, 2026

Yuhang Zhou +6 · Apr 6, 2026 · also Meta AI

Synthetic Sandbox for Training Machine Learning Engineering Agents

On-policy RL for machine learning engineering agents is now practical, thanks to a synthetic sandbox that slashes execution time by 13x while boosting performance by up to 67%.

Yuhang Zhou, Lizhu Zhang, Yifan Wu +4

Code Generation & Program Synthesis · Eval Frameworks & Benchmarks · Tool Use & Agents

Mar 31, 2026

Iordanis Fostiropoulos +7 · Mar 31, 2026 · also UIUC

GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification

LLMs still struggle to accurately infer user interests from interaction histories, especially when dealing with diverse engagement signals – a critical gap for effective personalization.

Iordanis Fostiropoulos, Muhammad Azhar, Abdalaziz Sawwan +5

Eval Frameworks & Benchmarks · Natural Language Processing · Recommendation & Information Retrieval

Mar 19, 2026

Arushi Rai +6 · Mar 19, 2026 · also Meta AI

TARo: Token-level Adaptive Routing for LLM Test-time Alignment

Achieve significant reasoning gains in frozen LLMs (+22.4%) without retraining by adaptively routing reward model guidance at the token level during inference.

Arushi Rai, Qiang Zhang, Hanqing Zeng +4

Inference & Quantization · Reasoning & Chain-of-Thought · RLHF & Preference Learning

Xiangjun Fan

Papers (9)