Siba Smarak Panigrahi

École polytechnique fédérale de Lausanne (EPFL)

Papers on Lattice

Total citations

Topics

h-index

Publication activity (papers/week, last 8 weeks)

Research focus

Reasoning & Chain-of-Thought (1), RLHF & Preference Learning (1)

Frequent co-authors

Artyom Gadetsky (1), M. Kodryan (1), Hangyu Guo (1), Maria Brbic (1)

Papers (1)

May 11, 2026

Unsupervised Process Reward Models

Forget expensive human annotations: this unsupervised method trains reward models that steer LLM reasoning just as well as, or even better than, their supervised counterparts.

Artyom Gadetsky, M. Kodryan, Siba Smarak Panigrahi +2

Reasoning & Chain-of-Thought, RLHF & Preference Learning
