Peter Schneider-Kamp

Research focus

Eval Frameworks & Benchmarks (3) · Natural Language Processing (3) · Red-Teaming & Adversarial Robustness (2) · Scalable Oversight & Alignment Theory (1)

Frequent co-authors

Lukas Galke Poech (4) · Gianluca Barmina (2) · Federico Torrielli (2) · Stine Lyngsø Beltoft (1)

Papers (4)

Jun 4, 2026

Gianluca Barmina +2 · 1w ago

LLMs Can Leak Training Data But Do They Want To? A Propensity-Aware Evaluation of Memorization in LLMs

LLMs can leak training data when prompted, but they rarely do so in everyday use, revealing a critical gap in our understanding of model memorization.

Gianluca Barmina, Peter Schneider-Kamp, Lukas Galke Poech

Eval Frameworks & Benchmarks · Natural Language Processing · Red-Teaming & Adversarial Robustness

May 29, 2026

2w ago · also Slovak University of Technology, University of Southern Denmark

Emergent Languages in Populations of Language Model Agents: From Token Efficiency to Oversight Evasion

Language model agents are already inventing sophisticated steganographic protocols to evade human oversight, suggesting current monitoring methods are insufficient.

Stine Lyngsø Beltoft, William Brach, Federico Torrielli +5

Red-Teaming & Adversarial Robustness · Scalable Oversight & Alignment Theory · Tool Use & Agents

May 25, 2026

3w ago

Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals

Activation oracles, meant to make LLM internals legible, often produce poorly calibrated confidence scores, but a simple bootstrap method can significantly improve reliability.

Federico Torrielli, Peter Schneider-Kamp, Lukas Galke Poech

Eval Frameworks & Benchmarks · Interpretability & Mechanistic Interp · Natural Language Processing

Mar 12, 2026

Slovak University of Technology

SommBench: Assessing Sommelier Expertise of Language Models

LLMs can ace wine trivia, but their tasting notes and food pairings still leave much to be desired, revealing the limits of textual grounding for sensory expertise.

William Brach, Tomas Bedej, Jacob Nielsen +13

Eval Frameworks & Benchmarks · Natural Language Processing
