SentGuard detects 90.5% of unsafe content within two sentences, revolutionizing real-time moderation for large language models.
Guard models trained with BraveGuard can detect safety threats in computer-use agents with over 82% accuracy, a significant leap from conventional methods.
Even state-of-the-art multimodal LLMs like GPT-5.2 and Claude 4.5 can be jailbroken nearly half the time using OpenRT's diverse suite of attacks, revealing a critical lack of generalization across attack paradigms.