Shijue Huang

The Hong Kong University of Science and Technology

Publication activity (papers/week, last 8 weeks)

Research focus

Tool Use & Agents (3), Eval Frameworks & Benchmarks (2), Data Curation & Synthetic Data (1), World Models & Planning (1)

Frequent co-authors

Chenxing Li (1), Chenxin Li (1), Zhengyang Tang (1), Huangxin Lin (1)

Papers (3)

Apr 30, 2026 · also HKU, HKUST, PKU, SCUT +2

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

LLM agents still fail to reliably automate real-world workflows, with even the best models succeeding on only two-thirds of tasks in a new live benchmark.

Chenxing Li, Chenxin Li, Zhengyang Tang +9

Eval Frameworks & Benchmarks, Tool Use & Agents

Apr 20, 2026 · also HKUST, RUC

Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence

Agent-World shows that self-evolving environments built through dynamic task synthesis can boost agent performance enough to outperform established models.

Junting Lu, Junjie Huang, Wanjun Zhong +13

Data Curation & Synthetic Data, Tool Use & Agents, World Models & Planning

Feb 26, 2026 · also HKUST, Qian Xuesen Laboratory of Space Technology, Soochow, UESTC +1

AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

Even the best multimodal agents struggle with realistic visual scenarios, achieving only 27% accuracy on the new AgentVista benchmark that demands long-horizon tool use across web search, image search, and code.

Zhaochen Su, Jincheng Gao, Hangyu Guo +12

Eval Frameworks & Benchmarks, Multimodal Models, Tool Use & Agents
