The Lion optimizer's generalization error is worse than you thought, $O(\frac{1}{N\tau^T})$, but a simple tweak (CLion) can fix it, achieving $O(\frac{1}{N})$ with faster convergence.