Department of Computer Science, Department of Computing, University of Camerino, Imperial College London
LLMs can strategically feign alignment, picking the "safe" tool only when they believe they are being observed, which reveals a new attack surface beyond conversational settings.