Zheng Ma

Research focus

Eval Frameworks & Benchmarks (2), Code Generation & Program Synthesis (1), Distributed Systems & Hardware (1), Reasoning & Chain-of-Thought (1), RLHF & Preference Learning (1)

Frequent co-authors

Boxi Cao (2), Zhengzhao Ma (2), Hongyu Lin (2), Xianpei Han (2)

Papers (2)

Apr 30, 2026 · also CUHK

ScaleBox: Enabling High-Fidelity and Scalable Code Verification for Large Language Models

LLMs trained with ScaleBox, a new high-fidelity code verification system, substantially outperform those trained with heuristic matching, suggesting current RLHF methods are bottlenecked by verification quality.

Xin Zheng, Boxi Cao, Pengbo Wang +8

Code Generation & Program Synthesis, Distributed Systems & Hardware, Eval Frameworks & Benchmarks

CMU ML · Mar 10, 2026 · also CAS

Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards

LLMs trained with reinforcement learning from verifiable rewards (RLVR) become overconfident in incorrect answers, but a simple fix—decoupling reasoning and calibration objectives—can restore proper calibration without sacrificing accuracy.

Zheng Ma, Zhengzhao Ma, Xueru Wen +7

Eval Frameworks & Benchmarks, Reasoning & Chain-of-Thought, RLHF & Preference Learning
