Rohin Shah

Google DeepMind

Papers on Lattice

Total citations

Topics

Research focus

Reasoning & Chain-of-Thought (1), RLHF & Preference Learning (1), Scalable Oversight & Alignment Theory (1)

Frequent co-authors

Max Kaufmann (1)

Papers (1)

Mar 31, 2026

DeepMind

Aligned, Orthogonal or In-conflict: When can we safely optimize Chain-of-Thought?

Training LLMs to optimize for conflicting objectives between the final output and the reasoning process can significantly degrade the monitorability of Chain-of-Thought, making oversight more difficult.

Max Kaufmann and Rohin Shah

Reasoning & Chain-of-Thought, RLHF & Preference Learning, Scalable Oversight & Alignment Theory
