Lattice AI Research

Research focus

Red-Teaming & Adversarial Robustness (2) · Natural Language Processing (1) · Eval Frameworks & Benchmarks (1) · RLHF & Preference Learning (1) · Scalable Oversight & Alignment Theory (1)

Frequent co-authors

Yulin Chen (2) · Yufei He (2) · Bryan Hooi (2) · Yuexin Li (1)

Papers (3)

May 28, 2026

AliMark: Enhancing Robustness of Sentence-Level Watermarking Against Text Paraphrasing

Sentence-level watermarks can now survive aggressive paraphrasing attacks like sentence splitting and merging, thanks to a new alignment-based approach.

Yuexin Li, Wenjie Qu, Linyu Wu +5

Natural Language Processing · Red-Teaming & Adversarial Robustness

May 25, 2026

NUS · also NTU

When In-Distribution Gains Fail: Evaluating Weak-to-Strong Reward Models under Preference Shift

Weak-to-strong reward models can ace the test but still fail in the real world, revealing a hidden brittleness in current preference learning approaches.

Khoi Le, Tri Cao, Phong Nguyen +4

Eval Frameworks & Benchmarks · RLHF & Preference Learning · Scalable Oversight & Alignment Theory

Apr 14, 2026

WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents

A dedicated guard agent, trained via reasoning-intensive methods, can effectively neutralize prompt injection attacks in web-navigating agents without sacrificing performance.

Yulin Chen, Tri Cao, Haoran Li +6

Multimodal Models · Red-Teaming & Adversarial Robustness · Tool Use & Agents

Tri Cao
