Forget probing transformer block outputs: the *real* out-of-distribution (OOD) performance gains in ViTs come from selectively probing either feedforward network activations or self-attention outputs, depending on the severity of the distribution shift.