Sparse autoencoders, hyped as a key interpretability tool, may not be learning much more than random feature sets, casting doubt on their ability to decompose model internals.