Search papers, labs, and topics across Lattice.
The paper introduces StrassenNet, a neural network architecture designed to learn fast matrix multiplication algorithms by searching for low-rank decompositions of the matrix-multiplication tensor. By training StrassenNet on 2x2 matrix multiplication, the network consistently rediscovers Strassen's algorithm, converging to a rank-7 tensor. Experiments on 3x3 multiplication suggest that a rank of 23 may be the smallest effective rank, outperforming models with ranks 19-22, and preliminary results extend the method to border-rank decompositions, aligning with known bounds.
A neural net, StrassenNet, suggests the true rank of 3x3 matrix multiplication may be 23, potentially overturning decades of assumptions.
Fast matrix multiplication can be described as searching for low-rank decompositions of the matrix--multiplication tensor. We design a neural architecture, \textsc{StrassenNet}, which reproduces the Strassen algorithm for $2\times 2$ multiplication. Across many independent runs the network always converges to a rank-$7$ tensor, thus numerically recovering Strassen's optimal algorithm. We then train the same architecture on $3\times 3$ multiplication with rank $r\in\{19,\dots,23\}$. Our experiments reveal a clear numerical threshold: models with $r=23$ attain significantly lower validation error than those with $r\le 22$, suggesting that $r=23$ could actually be the smallest effective rank of the matrix multiplication tensor $3\times 3$. We also sketch an extension of the method to border-rank decompositions via an $\varepsilon$--parametrisation and report preliminary results consistent with the known bounds for the border rank of the $3\times 3$ matrix--multiplication tensor.