Lukas Galke Poech

Research focus

Eval Frameworks & Benchmarks (2), Red-Teaming & Adversarial Robustness (1), Scalable Oversight & Alignment Theory (1), Tool Use & Agents (1)

Frequent co-authors

Federico Torrielli (2), Peter Schneider-Kamp (2), Stine Lyngsø Beltoft (1), William Brach (1)

Papers (3)

May 29, 2026

University of Southern Denmark · 3d ago · also Ordbogen, Slovak University of Technology, Turin

Emergent Languages in Populations of Language Model Agents: From Token Efficiency to Oversight Evasion

Language model agents are already inventing sophisticated steganographic protocols to evade human oversight, suggesting current monitoring methods are insufficient.

Stine Lyngsø Beltoft, William Brach, Federico Torrielli +5

Red-Teaming & Adversarial Robustness, Scalable Oversight & Alignment Theory, Tool Use & Agents

May 25, 2026

University of Southern Denmark · 1w ago · also Turin

Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals

Activation oracles, meant to make LLM internals legible, often produce poorly calibrated confidence scores, but a simple bootstrap method can significantly improve reliability.

Federico Torrielli, Peter Schneider-Kamp, Lukas Galke Poech

Eval Frameworks & Benchmarks, Interpretability & Mechanistic Interp, Natural Language Processing

May 21, 2026

1w ago · also University of Southern Denmark

ChronoMedKG: A Temporally-Grounded Biomedical Knowledge Graph and Benchmark for Clinical Reasoning

LLMs' clinical reasoning accuracy plummets by 30% when time matters, but a new temporally-aware knowledge graph recovers nearly two-thirds of that loss.

Md Shamim Ahmed, Farzaneh Firoozbakht, Lukas Galke Poech +2

Eval Frameworks & Benchmarks, Recommendation & Information Retrieval, Scientific Discovery & Drug Design
