Search papers, labs, and topics across Lattice.
Institute for the Foundations of Learning and Data, Fraunhofer Heinrich-Hertz-Institut, Fraunhofer Heinrich-Hertz-Institut Technische Universit盲t Berlin BIFOLD -Berlin
1
0
2
Activation steering turns interpretability into a hands-on debugging tool, but watch out for unintended consequences and limited generalization.