University of Basel, Switzerland
Forget tedious hyperparameter tuning: adaptive optimizers like DP-SignSGD and DP-Adam maintain performance across privacy levels, unlike DP-SGD, whose optimal learning rate plummets as privacy tightens.
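As a rough illustration of why a sign-based private update is insensitive to the learning-rate scale, here is a minimal sketch of one DP-SignSGD step under the usual clip-then-noise recipe; the function name, arguments, and defaults are assumptions for illustration, not necessarily the paper's exact formulation:

```python
import numpy as np

def dp_signsgd_step(params, grads, lr=0.01, C=1.0, sigma=1.0, rng=None):
    """One hypothetical DP-SignSGD step.

    params: parameter vector, shape [dim]
    grads:  per-example gradients, shape [batch, dim]
    C:      per-example L2 clipping norm; sigma: noise multiplier
    """
    rng = rng or np.random.default_rng()
    # Clip each per-example gradient to L2 norm at most C.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    clipped = grads * np.minimum(1.0, C / np.maximum(norms, 1e-12))
    # Sum and add Gaussian noise calibrated to C (Gaussian mechanism).
    noisy = clipped.sum(axis=0) + rng.normal(0.0, sigma * C, size=params.shape)
    # Sign step: the noisy gradient's magnitude is discarded, so increasing
    # the noise (more privacy) does not force the step size toward zero the
    # way it does for plain DP-SGD.
    return params - lr * np.sign(noisy)
```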
AdamW's decoupled weight decay prevents Neural Collapse, challenging the assumption that this phenomenon is universal across optimization methods.
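For context, a minimal sketch of the decoupled update AdamW performs, with illustrative names and defaults: the decay term `wd * theta` is applied directly to the parameters, outside the adaptive rescaling, whereas Adam with L2 regularization would fold `wd * theta` into `grad` before the moment updates:

```python
import numpy as np

def adamw_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
               eps=1e-8, wd=1e-2):
    # Adam moment estimates on the raw gradient (no L2 term added here).
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)  # bias correction, t = 1, 2, ...
    v_hat = v / (1 - b2 ** t)
    # Decoupled weight decay: applied alongside, not through, the
    # adaptive term m_hat / (sqrt(v_hat) + eps).
    theta = theta - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * theta)
    return theta, m, v
```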