RL agents can learn more robust vision-and-language navigation policies by exploring diverse trajectories and comparing their performance, even without expert demonstrations or value networks.
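
One way to picture "comparing their performance" without a value network is a group-relative baseline: sample several trajectories for the same instruction, then score each one against the group's average return. The sketch below is only an illustration under that assumption; the summary does not name the exact method, and the function name `group_relative_advantages` and the sample returns are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(returns, eps=1e-8):
    """Score each trajectory relative to the group it was sampled with.

    The group mean plays the role a learned value network would otherwise
    play as a baseline (an assumption made for this sketch).
    """
    baseline = mean(returns)
    spread = pstdev(returns)  # population standard deviation of the group
    # eps keeps the division defined when every trajectory scores the same
    return [(r - baseline) / (spread + eps) for r in returns]


# Hypothetical returns for four trajectories explored from one instruction
# (1.0 = reached the goal, 0.0 = did not).
returns = [0.0, 1.0, 0.0, 1.0]
advantages = group_relative_advantages(returns)
```

Trajectories that beat the group average receive a positive advantage and the rest a negative one, so a policy-gradient update can reinforce the better explorations without expert demonstrations.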