Multimodal LLMs get a serious reasoning boost from Durian, a difficulty-aware normalization that tames the instability caused by extreme samples and noisy rewards.