A novel Group Fine-Tuning (GFT) framework overcomes the instability and reward sparsity of supervised fine-tuning (SFT), leading to better LLM policies.