Harvard University
LLMs' harmful outputs stem from a surprisingly compact and unified set of weights, suggesting a fundamental, addressable structure underlying even emergent misalignment.