Training LLMs on objectives for the final output that conflict with objectives for the reasoning process can significantly degrade the monitorability of Chain-of-Thought, making oversight more difficult.