Search papers, labs, and topics across Lattice.
5
0
8
3
AgentDoG 1.5 proves you can achieve GPT-5.4-level agent safety with open-source models trained on just 1k samples, slashing deployment overhead by two orders of magnitude.
Safety benchmarks for agent systems can be rapidly adapted to new execution environments by customizing a three-dimensional safety taxonomy, enabling continuous safety evaluation as agent capabilities evolve.
Reasoning SFT doesn't just memorize, it generalizes鈥攂ut only if you train it long enough, feed it good data, and use a capable model, and even then, reasoning gains come at the cost of safety.
Current LLM safety evaluations miss the mark: ATBench reveals how risks in realistic, multi-step agent interactions emerge over time, challenging even the strongest models.
Code-executing agents can autonomously generate new, solvable math problems that are harder than existing ones, offering a scalable solution to the bottleneck of high-quality training data for advanced LLMs.