In underdetermined linear least-squares problems (more unknowns than equations), gradient descent initialized at zero converges to the minimum Euclidean norm solution even though no explicit regularizer is applied: each gradient is a linear combination of the rows of the data matrix, so the iterates never leave its row space, and the row-space solution that interpolates the data is exactly the minimum-norm one.
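A minimal sketch of this effect, assuming a random 3×8 Gaussian system: plain gradient descent on the squared error, started at zero, ends up at the same point as the explicit minimum-norm (pseudoinverse) solution.

```python
import numpy as np

# Assumed toy setup: 3 equations, 8 unknowns (underdetermined),
# minimized via gradient descent on f(x) = 0.5 * ||Ax - b||^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 8))
b = rng.standard_normal(3)

x = np.zeros(8)   # zero init keeps every iterate in the row space of A
lr = 0.01         # step size below 2 / lambda_max(A^T A) for this A
for _ in range(20000):
    x -= lr * A.T @ (A @ x - b)   # gradient step; gradient is A^T(Ax - b)

# Explicit minimum Euclidean norm solution for comparison.
x_min_norm = np.linalg.pinv(A) @ b

print(np.allclose(A @ x, b, atol=1e-6))        # x interpolates the data
print(np.allclose(x, x_min_norm, atol=1e-6))   # and matches the min-norm solution
```

With a nonzero initialization the same procedure instead converges to the interpolating solution closest to the starting point, which is why the zero start matters.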
Generalization theory for deep learning seeks to explain why overparameterized networks can fit (interpolate) the training data yet still perform well on unseen data; such implicit biases of gradient descent, as in the linear case above, are one proposed mechanism.