Even in a minimal neural network setting, gradient descent's convergence to a robust solution is provably, and surprisingly, slow, scaling as $\Theta(1/\ln(t))$.