Social intelligence may require more than just reasoning power: a 7B model trained with SAVOIR beats GPT-4o and Claude-3.5-Sonnet on social interaction tasks.
STRATAGEM reveals that selectively reinforcing reasoning trajectories can dramatically enhance a model's ability to transfer reasoning skills across diverse tasks, especially in complex mathematical scenarios.
Large vision-language models (LVLMs) improve by 18.7% when RLHF training focuses on the few tokens that actually depend on visual input.
Unlock 2x faster reinforcement learning by distilling group feedback into actionable language refinements that guide exploration.