University of Chinese Academy of Sciences
Multimodal models forget how to see and reason after SFT, but PRISM realigns them before RL, boosting performance by up to 6%.
LLMs that dominate in strategic reasoning often choke in real-time zero-sum games, revealing a critical strategy-execution gap that current benchmarks miss.