Despite advances in instruction following, current AI agents still struggle to infer implicit user needs related to safety, privacy, and accessibility, highlighting a critical gap in real-world applicability.
Stripping away obvious "triggering cues" from adversarial attacks reveals that current AI safety datasets drastically overestimate model robustness, turning "safe" models like Gemini 3 Pro and Claude 3.7 Sonnet into easy targets.