Arizona State University
MLLMs can leak sensitive information from images, exposing new privacy risks that traditional models do not face.
LLMs have "pure incorrectness" features that correlate with wrong answers but don't actually *cause* them, suggesting that simply identifying error-correlated activations isn't enough for effective intervention.
Rather than weighting preferences alone, this method uses conformal prediction to directly quantify and leverage the reliability of the *answers* themselves, leading to more robust and data-efficient LLM alignment.