AgentDoG 1.5 shows that open-source models trained on just 1k samples can reach GPT-5.4-level agent safety, slashing deployment overhead by two orders of magnitude.
Safety benchmarks for agent systems can be rapidly adapted to new execution environments by customizing a three-dimensional safety taxonomy, enabling continuous safety evaluation as agent capabilities evolve.
Current LLM safety evaluations miss the mark: ATBench reveals how risks in realistic, multi-step agent interactions emerge over time, challenging even the strongest models.