Search papers, labs, and topics across Lattice.
1
0
2
3
Multipass SGD can suffer from suboptimal generalization if the preconditioner misaligns the geometry of the population risk curvature and gradient noise, leading to a worse effective dimension.