Raymond Douglas

Papers on Lattice

Total citations

Topics

Publication activity (papers/week, last 8 weeks)

Research focus

Red-Teaming & Adversarial Robustness (2) · Constitutional AI & AI Ethics (1) · Eval Frameworks & Benchmarks (1) · Interpretability & Mechanistic Interp (1)

Frequent co-authors

Y. Bengio (1) · Yoshua Bengio (1) · Stephen Clare (1)

Papers (2)

Feb 24, 2026

Mila · 2w ago · also BAIR, Ben-Gurion University of the Negev, Dept. of ECE & ASRI, ELLIS +2

International AI Safety Report 2026

A global consensus on AI safety risks and capabilities has emerged from a panel of 100+ independent experts, representing a landmark effort in international collaboration.

Y. Bengio, Yoshua Bengio, Stephen Clare +1746

Constitutional AI & AI Ethics · Eval Frameworks & Benchmarks · Red-Teaming & Adversarial Robustness

Feb 23, 2026

Theia Pearson-Vogel +3 · 3w ago

Latent Introspection: Models Can Detect Prior Concept Injections

LLMs may already possess surprisingly strong self-awareness of concept manipulation, detectable via mechanistic interpretability techniques, even when they deny it in their outputs.

Theia Pearson-Vogel, Martin Vanek, Raymond Douglas +1

Interpretability & Mechanistic Interp · Red-Teaming & Adversarial Robustness
