A 4B-parameter small language model (SLM) can now rival the performance of frontier agents in complex tool-use environments, thanks to a novel reinforcement fine-tuning framework that teaches it to strategically acquire context and execute actions.