RL unlocks genuinely new tool-use capabilities in LLMs: it enables compositional strategies that go beyond what re-sampling alone can achieve, challenging the notion that RL only improves reliability.