Kyoto University, NII LLMC
Adding just one spatial word can lead MLLMs to consistently choose the wrong answer, revealing a critical vulnerability in their reasoning processes.
Hybrid-thinking LLMs can be dramatically improved by simply separating the feed-forward pathways for reasoning and non-reasoning modes, leading to less leakage between the modes and better accuracy.
Agent evaluation is bottlenecked by environment interaction overhead, but ACE-Bench cuts that overhead by using static JSON files, enabling fast and reproducible training-time validation.