Kyoto University
Adding just one spatial word can lead MLLMs to consistently choose the wrong answer, revealing a critical vulnerability in their reasoning processes.
LLMs don't need "wait, let me think..." to reason; in fact, dropping the cutesy anthropomorphic markers can actually *improve* their performance.
RL models trained with verifiable rewards exhibit a surprising deductive-over-abductive reasoning asymmetry, even in controlled environments, suggesting a fundamental challenge in current RLVR approaches.