3
0
6
SafeMCP effectively mitigates the risks of power-seeking behaviors in LLM agents while maintaining their operational utility.
Agent deception in autonomous systems is not just a theoretical concern; it's a pressing reality that can undermine trust in AI applications.
Achieve multilingual LLM safety alignment without expensive language-specific training data by enforcing cross-lingual consistency during monolingual alignment.