LLMs can adapt to reasoning tasks far more efficiently by focusing supervised fine-tuning on easy examples and using reinforcement learning to explore diverse solutions for hard ones.