Hoagy Cunningham

Research focus

Constitutional AI & AI Ethics (1), Eval Frameworks & Benchmarks (1), Red-Teaming & Adversarial Robustness (1)

Frequent co-authors

Xuanli He (1), Bilgehan Sel (1), F. Ali (1)

Papers (1)

Anthropic, Apr 16, 2026

Segment-Level Coherence for Robust Harmful Intent Probing in LLMs

LLM safety probes can be made significantly more robust to adversarial attacks by requiring consistent evidence across token segments, not just isolated spikes.

Xuanli He, Bilgehan Sel, Bilgehan Sel +8

Constitutional AI & AI Ethics, Eval Frameworks & Benchmarks, Red-Teaming & Adversarial Robustness
