Achieve more efficient reasoning in Transformers without increasing test-time cost by using training-only techniques that guide attention and dynamically adjust sharpness.