NII LLMC
Adding just one spatial word can lead MLLMs to consistently choose the wrong answer, revealing a critical vulnerability in their reasoning processes.
RL models trained with verifiable rewards exhibit a surprising deductive-over-abductive reasoning asymmetry, even in controlled environments, suggesting a fundamental challenge in current RLVR approaches.