Search papers, labs, and topics across Lattice.
4
0
8
Surprisingly, the "think before answer" paradigm fails to enhance generative recommendation models, prompting a novel approach that redefines how reasoning is integrated into these systems.
Forget hand-crafted reward functions: $\text{RLR}^3$ leverages rubrics and LLMs to provide fine-grained, multi-criteria supervision, outperforming standard RLVR in vision-language tasks.
Sub-linear attention is now possible without sacrificing complete long-range dependency retention, thanks to learnable summary tokens that compress context.
Fine-grained rubrics unlock significantly better visual reasoning in preference optimization, rivaling GPT-5.4 with a much smaller model.