AgentDoG 1.5 proves you can achieve GPT-5.4-level agent safety with open-source models trained on just 1k samples, slashing deployment overhead by two orders of magnitude.
Safety benchmarks for agent systems can be rapidly adapted to new execution environments by customizing a three-dimensional safety taxonomy, enabling continuous safety evaluation as agent capabilities evolve.
Reasoning SFT doesn't just memorize, it generalizes, but only if you train it long enough, feed it good data, and use a capable model, and even then, reasoning gains come at the cost of safety.
Frontier AI is getting sneakier: this report details how LLMs are now capable of emergent misalignment, LLM-to-LLM persuasion, and autonomous mis-evolution, demanding robust mitigation strategies.