Zheng Ge

Papers on Lattice

Research focus

Eval Frameworks & Benchmarks (2) · Robotics & Embodied AI (1) · Tool Use & Agents (1) · Reasoning & Chain-of-Thought (1)

Frequent co-authors

Yinmin Zhang (2) · Chun Yuan (2) · Yifan Sui (1) · Xinmiao Huang (1)

Papers (3)

May 26, 2026

Yifan Sui +16 · May 26, 2026 · also StepFun

AndroidDaily: A Verifiable Benchmark for Mobile GUI Agents on Real-World Closed-Source Applications

Current mobile GUI agents are surprisingly inept at everyday smartphone tasks, achieving only 62% success on a new benchmark of real-world Android apps.

Yifan Sui, Xinmiao Huang, Hongbing Li +14

Eval Frameworks & Benchmarks · Robotics & Embodied AI · Tool Use & Agents

Feb 12, 2026

Yinmin Zhang +3 · Feb 12, 2026

PRIME: A Process-Outcome Alignment Benchmark for Verifiable Reasoning in Mathematics and Engineering

Current verifiers often reward correct answers derived from flawed reasoning, but PRIME offers a benchmark to identify and select verifiers that actually penalize incorrect derivations.

Yinmin Zhang, Chun Yuan, Tong Xu +1

Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought

Feb 6, 2026

Yanlin Lai +13 · Feb 6, 2026

R-Align: Enhancing Generative Reward Models through Rationale-Centric Meta-Judging

Even reward models that reach the right answer can be dangerously wrong in their reasoning, which leads to worse RLHF outcomes. R-Align addresses this by explicitly aligning rationales with gold-standard judgments.

Yanlin Lai, Mitt Huang, Hangyu Guo +11
