Mohammad Beigi

UC Davis

Papers on Lattice

Total citations

Topics

Publication activity (papers/week, last 8 weeks)

Research focus

Reasoning & Chain-of-Thought (1), RLHF & Preference Learning (1)

Frequent co-authors

Ming Jin (1), Lifu Huang (1)

Papers (1)

Jun 8, 2026

Also Virginia Tech

Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization

PRIME reveals a crucial precursor to reward hacking that can predict and adapt to misalignment before it manifests, offering a new lens on alignment risks in RL systems.

Mohammad Beigi, Ming Jin, Lifu Huang

Reasoning & Chain-of-Thought, RLHF & Preference Learning
