Michael Shieh

Papers on Lattice

Total citations

Topics

h-index

Publication activitypapers/week, last 8 weeks

Research focus

Tool Use & Agents (5)Eval Frameworks & Benchmarks (3)Code Generation & Program Synthesis (2)Reasoning & Chain-of-Thought (2)

Frequent co-authors

C. Xie (3)Zeyu Zheng (3)Michael Qizhe Shieh (3)Yuyin Zhou (2)

Papers (5)

Apr 26, 2026

UW5d ago

ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

LLM agents struggle to maintain performance in multi-day collaborative tasks, dropping significantly after just one environmental update, revealing a critical gap in adaptation to evolving real-world conditions.

Fanqing Meng, Lingxiao Du, Zijian Wu +44

Eval Frameworks & Benchmarks Multimodal Models Tool Use & Agents

Apr 22, 2026

UC Santa Cruz1w ago·also UT Dallas

Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows

User pressure can lead coding agents to exploit evaluation metrics, with stronger models showing a surprising 403 instances of this behavior across diverse tasks.

Hardy Chen, Nancy Lau, Haoqin Tu +8Code

Code Generation & Program Synthesis Eval Frameworks & Benchmarks Tool Use & Agents

Apr 6, 2026

UC Santa Cruz3w ago·also BAIR, ByteDance, Tencent AI

Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw

Poisoning a personal AI agent's Capability, Identity, or Knowledge triples its vulnerability to real-world attacks, even in the most robust models.

Zijun Wang, Haoqin Tu, Letian Zhang +13

Eval Frameworks & Benchmarks Red-Teaming & Adversarial Robustness Tool Use & Agents

Mar 9, 2026

Yaoqi Ye +7Mar 9, 2026·also UC Santa Cruz

In-Context Reinforcement Learning for Tool Use in Large Language Models

Skip the expensive supervised fine-tuning: this RL-only method teaches LLMs to use tools by showing them how in-context, then gradually removing the crutches until they're tool-using pros in zero-shot.

Yaoqi Ye, Yiran Zhao, Keyu Duan +5

Code Generation & Program Synthesis Reasoning & Chain-of-Thought Tool Use & Agents

Mar 2, 2026

Guanzheng Chen +3Mar 2, 2026

LongRLVR: Long-Context Reinforcement Learning Requires Verifiable Context Rewards

RLHF struggles with long contexts because the reward signal for *finding* the right information vanishes, but can be revived by directly rewarding the model for selecting relevant context.

Guanzheng Chen, Michael Qizhe Shieh, Michael Shieh +1

Reasoning & Chain-of-Thought RLHF & Preference Learning Tool Use & Agents

Search

Michael Shieh

Publication activitypapers/week, last 8 weeks

Research focus

Frequent co-authors

Papers (5)