Le Sun

Papers on Lattice

Total citations

Topics

h-index

Publication activitypapers/week, last 8 weeks

Research focus

Eval Frameworks & Benchmarks (5)Code Generation & Program Synthesis (2)Tool Use & Agents (2)Multimodal Models (2)

Frequent co-authors

Boxi Cao (7)Hongyu Lin (7)Xianpei Han (7)Yaojie Lu (6)

Papers (7)

May 28, 2026

3d ago·also iscas.ac.cn

LiteCoder-Terminal: Scaling Long-Horizon Terminal Environments for Learning Language Agents

Forget scraping – this work shows you can generate high-quality, executable terminal environments from scratch to train language agents that outperform models trained on scraped data.

Xiaoxuan Peng, Kaiqi Zhang, Kai Zhang +6

Code Generation & Program Synthesis Data Curation & Synthetic Data Tool Use & Agents

May 25, 2026

Zhuoqun Li +156d ago·also CAS, Kuaishou

MetaphorVU: Towards Metaphorical Video Understanding

MLLMs can't grasp metaphors in videos, revealing a surprising gap in their high-order cognitive abilities compared to humans.

Zhuoqun Li, Boxi Cao, Guiping Jiang +13

Computer Vision Eval Frameworks & Benchmarks Multimodal Models

Apr 30, 2026

ScaleBox: Enabling High-Fidelity and Scalable Code Verification for Large Language Models

LLMs trained with ScaleBox, a new high-fidelity code verification system, substantially outperform those trained with heuristic matching, suggesting current RLHF methods are bottlenecked by verification quality.

Jiasheng Zheng, Xin Zheng, Boxi Cao +9

Code Generation & Program Synthesis Distributed Systems & Hardware Eval Frameworks & Benchmarks

Apr 22, 2026

All Languages Matter: Understanding and Mitigating Language Bias in Multilingual RAG

Multilingual RAG systems are systematically suppressing "answer-critical" documents in non-English languages, crippling their ability to leverage global knowledge.

Guozhao Mo, Yafei Shi, Boxi Cao +6

Constitutional AI & AI Ethics Natural Language Processing Recommendation & Information Retrieval

Apr 18, 2026

Beyond Text-Dominance: Understanding Modality Preference of Omni-modal Large Language Models

Forget text-dominance: Today's Omni-modal LLMs surprisingly favor visual inputs, creating new challenges for cross-modal reasoning.

Xinru Yan, Boxi Cao, Yaojie Lu +5

Eval Frameworks & Benchmarks Multimodal Models

Apr 9, 2026

Apr 9, 2026·also Kuaishou

Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces

LLMs exhibit a "Utopian bias" when simulating human behavior, converging towards an unrealistic "positive average person" and failing to capture individual differences and long-tail behaviors.

Jiawei Chen, Ruoxi Xu, Boxi Cao +12

Eval Frameworks & Benchmarks Tool Use & Agents World Models & Planning

Mar 10, 2026

Mar 10, 2026·also CMU ML

Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards

LLMs trained with reinforcement learning from verifiable rewards (RLVR) become overconfident in incorrect answers, but a simple fix—decoupling reasoning and calibration objectives—can restore proper calibration without sacrificing accuracy.

Zhengzhao Ma, Zheng Ma, Xueru Wen +7

Eval Frameworks & Benchmarks Reasoning & Chain-of-Thought RLHF & Preference Learning

Search

Le Sun

Publication activitypapers/week, last 8 weeks

Research focus

Frequent co-authors

Papers (7)