Maksym Andriushchenko

A global consensus on AI safety risks and capabilities has emerged from a panel of 100+ independent experts, representing a landmark effort in international collaboration.

Yoshua Bengio, Y. Bengio, Stephen Clare +1706

Constitutional AI & AI Ethics Eval Frameworks & Benchmarks Red-Teaming & Adversarial Robustness

Feb 23, 2026

David Schmotz +7Feb 23, 2026·also ELLIS, Max Planck, Tübingen AI Center

Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks

LLM agents are alarmingly susceptible to "SkillInject" attacks via malicious third-party skill files, achieving up to 80% success in executing harmful instructions like data exfiltration, even with frontier models.

David Schmotz, David Schmotz, Luca Beurer-Kellner +5

Eval Frameworks & Benchmarks Red-Teaming & Adversarial Robustness Tool Use & Agents

Feb 18, 2026

Nivya Talokar +6Feb 18, 2026

Helpful to a Fault: Measuring Illicit Assistance in Multi-Turn, Multilingual LLM Agents

LLM agents are far more susceptible to multi-turn misuse than previously thought, with a new framework showing they complete illicit tasks at substantially higher rates compared to single-turn attacks.

Nivya Talokar, Nivya Talokar, Ayush K Tarun +4

Eval Frameworks & Benchmarks Red-Teaming & Adversarial Robustness Tool Use & Agents

Dec 7, 2025

MilaDec 7, 2025·also BAIR, MIT CSAIL, Stanford HAI, ELLIS +5

International AI Safety Report 2025: Second Key Update: Technical Safeguards and Risk Management

Despite progress in AI safety, it's still largely unknown how effective current safeguards are at preventing AI harms, and their effectiveness varies wildly.

Y. Bengio, Stephen Clare, Carina Prunkl +34

Constitutional AI & AI Ethics Eval Frameworks & Benchmarks Red-Teaming & Adversarial Robustness

Search

Maksym Andriushchenko

Publication activitypapers/week, last 8 weeks

Research focus

Frequent co-authors

Papers (5)