Katherine M. Collins

Research focus

Eval Frameworks & Benchmarks (2), Constitutional AI & AI Ethics (1), Red-Teaming & Adversarial Robustness (1), Scalable Oversight & Alignment Theory (1)

Frequent co-authors

Kelsey R. Allen (2), Sasha Robinson (1), Kerem Oktar (1), Ilia Sucholutsky (1)

Papers (2)

Feb 24, 2026

Sasha Robinson +4 · 2w ago

Under the Influence: Quantifying Persuasion and Vigilance in Large Language Models

Even when explicitly warned about potential deception, LLMs can still be persuaded to make incorrect decisions, highlighting a critical gap between task performance and vigilance.

Sasha Robinson, Kerem Oktar, Katherine M. Collins +2

Constitutional AI & AI Ethics, Eval Frameworks & Benchmarks, Red-Teaming & Adversarial Robustness

Feb 19, 2026

MIT CSAIL · 3w ago · also BAIR

AI Gamestore: Scalable, Open-Ended Evaluation of Machine General Intelligence with Human Games

VLMs are nowhere near human-level general intelligence: they score less than 10% of human performance across a diverse set of human-designed games, especially struggling with world-model learning, memory, and planning.

Lance Ying, Ryan Truong +20

Eval Frameworks & Benchmarks, Scalable Oversight & Alignment Theory, Tool Use & Agents
