School of Software, Shandong University, Jinan, China
Ditch multi-step sampling and regularization-coefficient tuning: VGM$^2$P achieves state-of-the-art offline MARL performance with a simple, efficient flow-based policy guided by global advantage values.