A 4B-parameter small language model (SLM) can now rival frontier agent performance in complex tool-use environments, thanks to a novel reinforcement finetuning framework that teaches it to strategically acquire context and execute actions.
Agentic LLMs can be taught to refuse harmful actions with up to 50% greater success, even zero-shot across diverse models and tasks, by explicitly learning when *not* to act.