Current LLM safety evaluations miss the mark: ATBench reveals how risks in realistic, multi-step agent interactions emerge over time, challenging even the strongest models.