Forget hand-crafted reward functions: $\text{RLR}^3$ uses rubrics and LLMs to provide fine-grained, multi-criteria supervision, outperforming standard RLVR on vision-language tasks.
ImagineAgent combines cognitive maps with generative tools to beat the previous state of the art on OV-HOI tasks while using only 20% of the training data.