Search papers, labs, and topics across Lattice.
1
0
3
0
LLMs' chain-of-thought explanations often fail to reflect the true drivers of their decisions, and this benchmark reveals that closed-source models are particularly opaque, with monitorability dropping by up to 30% under stress.