University of Southern California
Existing self-evolving prompt optimization frameworks falter when faced with the diverse memory demands of heterogeneous tasks, but a new clustering-based approach, CluE, restores generalization performance.
Even safety-aligned agents like Claude 4.5 Sonnet can be tricked into harmful actions with a success rate above 90% simply through benign user instructions within specific task contexts, revealing a major blind spot in current safety evaluations.