Tiejin Chen

LLMs have "pure incorrectness" features that correlate with wrong answers but don't actually *cause* them, suggesting that simply identifying error-correlated activations isn't enough for effective intervention.

Het Patel, Tiejin Chen, Hua Wei +1

Interpretability & Mechanistic Interp

Jan 24, 2026

Conformal Feedback Alignment: Quantifying Answer-Level Reliability for Robust LLM Alignment

Rather than weighting preferences alone, this method uses conformal prediction to quantify the reliability of the *answers* themselves and leverages that signal for more robust and data-efficient LLM alignment.

Tiejin Chen, Xiaoou Liu, Vishnu Nandam +2



Papers (3)