Xiaoxi Li

Publication activity (papers/week, last 8 weeks)

Research focus

Tool Use & Agents (6), Eval Frameworks & Benchmarks (3), Reasoning & Chain-of-Thought (3), RLHF & Preference Learning (2)

Frequent co-authors

Zhicheng Dou (5), Jiajie Jin (4), Yuyang Hu (4), Guanting Dong (3)

Papers (9)

Jul 16, 2026

Rubrics on Trial: Evolving Rubrics from a Single Query via Synthetic Pairwise Evidence

Evolving rubrics from a single query can dramatically enhance LLM evaluation by eliminating reliance on external annotations and improving answer quality discrimination.

Haocheng Yang, Hao Yang, Licheng Pan +7

Eval Frameworks & Benchmarks, RLHF & Preference Learning

Jun 15, 2026

VeriGraph: Towards Verifiable Data-Analytic Agents

Explicit evidence graphs in VeriGraph enable LLMs to achieve 87.61% claim grounding, transforming how we verify AI-generated conclusions.

Jiajie Jin, Wenle Liao, Yuyang Hu +3

Reasoning & Chain-of-Thought, Scalable Oversight & Alignment Theory

Jun 10, 2026 · also Microsoft Research

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

Arbor's innovative approach to autonomous research enables a cumulative learning process that outperforms existing models by over 2.5 times in real-world tasks.

Jiajie Jin, Yuyang Hu, Guanting Dong +8

Scientific Discovery & Drug Design, Tool Use & Agents

May 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

LLM agents are shockingly vulnerable to multi-stage "trojan" attacks that inject malicious instructions into their workspace, achieving near-perfect success rates where standard prompt injection defenses fail.

Jiejun Tan, Zhicheng Dou, Yuyang Hu +3

Red-Teaming & Adversarial Robustness, Tool Use & Agents

May 23, 2026

AgentFugue: Agent Scaling for Long-Horizon Tasks through Collective Reasoning

Scaling out peer agents with a shared reasoning hub, AgentFugue, unlocks a new dimension of capability gains in long-horizon tasks, proving that collective reasoning is more than just parallel compute.

Yuyang Hu, Hongjin Qian, Shuting Wang +4

Eval Frameworks & Benchmarks, Reasoning & Chain-of-Thought, Tool Use & Agents

Apr 20, 2026 · also RUC

Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence

Agent-World reveals that self-evolving environments can dramatically boost agent performance, outperforming established models by leveraging dynamic task synthesis.

Junting Lu, Wanjun Zhong, Longxiang Liu +11

Data Curation & Synthetic Data, Tool Use & Agents, World Models & Planning

Mar 29, 2026

MuSEAgent: A Multimodal Reasoning Agent with Stateful Experiences

Forget trajectory-level rollouts: MuSEAgent learns faster and reasons better by distilling past interactions into reusable, state-aware decision experiences.

Shijian Wang, Jiarui Jin, Runhao Fu +11

Multimodal Models, Reasoning & Chain-of-Thought, Tool Use & Agents

Mar 19, 2026

CausalRM: Causal-Theoretic Reward Modeling for RLHF from Observational User Feedbacks

Observational user feedback, often dismissed as too noisy and biased, can actually power effective RLHF with the right causal modeling, achieving a 49.2% gain on WildGuardMix.

Hao Wang, Licheng Pan, Zhichao Chen +6

RLHF & Preference Learning

Feb 26, 2026 · also Tsinghua AI, Didi Voyager Labs, Gaoling AI, SEU +3

OmniGAIA: Towards Native Omni-Modal AI Agents

Current multimodal models are stuck in bi-modal interactions, but OmniGAIA and OmniAtlas offer a path towards truly omni-modal AI assistants capable of reasoning and tool use across video, audio, and images.

Xiaoxi Li, Wenxiang Jiao +9

Eval Frameworks & Benchmarks, Multimodal Models, Tool Use & Agents
