Search papers, labs, and topics across Lattice.
1
0
2
Interpretability methods often fail to improve over black-box prompting when models are uncooperative, suggesting current techniques may be more about elicitation than revealing internal mechanisms.