By weighting Q-learning updates based on action similarity, QSIM tames overestimation in multi-agent RL, leading to more stable and effective learning.
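To make the idea concrete, here is a minimal single-agent, tabular sketch of a similarity-weighted TD target. It is not QSIM's actual formulation, which the summary above does not spell out. The function names, the softmax weighting, the `temperature` parameter, and the use of index distance between discrete actions as the similarity measure are all assumptions made for illustration. The point it shows is that replacing the hard max over next-state Q-values with a weighted average concentrated on actions near the greedy one can never exceed the max-based target, which is one way overestimation can be damped.

```python
import math


def similarity_weights(actions, greedy, temperature=1.0):
    """Softmax weights that favor actions close to the greedy action.

    Assumption: actions are ordinal integers, so |a - greedy| is a
    meaningful (hypothetical) dissimilarity measure.
    """
    scores = [-abs(a - greedy) / temperature for a in actions]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def max_target(q_next, reward, gamma=0.99):
    """Standard Q-learning target: reward + gamma * max_a Q(s', a)."""
    return reward + gamma * max(q_next)


def similarity_weighted_target(q_next, reward, gamma=0.99, temperature=1.0):
    """Illustrative target: reward + gamma * sum_a w(a) * Q(s', a).

    The weights w(a) come from similarity to the greedy action, so the
    bootstrap value is a convex combination of next-state Q-values.
    """
    actions = list(range(len(q_next)))
    greedy = max(actions, key=lambda a: q_next[a])
    weights = similarity_weights(actions, greedy, temperature)
    expected = sum(w * q for w, q in zip(weights, q_next))
    return reward + gamma * expected


if __name__ == "__main__":
    q_next = [1.0, 3.0, 2.0]  # made-up next-state Q-values
    print("max target:     ", max_target(q_next, reward=0.5))
    print("weighted target:", similarity_weighted_target(q_next, reward=0.5))
```

In this sketch a small `temperature` recovers the ordinary max target, while a larger one spreads weight over neighboring actions and lowers the bootstrap value.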