Microsoft
By explicitly prompting for reflection on failure, ERL unlocks up to 81% better performance in complex RL tasks and 11% gains in tool-using reasoning.