The University of Queensland
Forget scaling and RLHF: carefully selecting internal attention signals from the right layers lets a zero-shot 8B model match a 14B RL-trained re-ranker on complex reasoning tasks.