Tencent AI Lab
RL fine-tuning can *hurt* reasoning performance when your base LLM is already too good, unless you force it to explore more diverse solutions.
Current multimodal math models struggle with visual interpretation, symbol alignment, and consistent reasoning, highlighting the need for a unified "Perception-Alignment-Reasoning" framework.