Institute of Information Science, Academia Sinica, Taiwan
AlphaZero learns much faster when it focuses on replaying the moments it *almost* got right, improving by an average of 83 Elo across games.