Grokking is the phenomenon in which a model's test performance improves abruptly, long after it has already memorized the training set.
Deep learning generalization theory tries to explain why overparameterized networks can fit (interpolate) training data yet still perform well on new data.
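This interpolate-yet-generalize behavior can be illustrated outside of deep networks. The sketch below uses a common toy setting from the generalization-theory literature: minimum-norm least squares in an overparameterized linear model. All names and parameter choices here are illustrative assumptions, not anything from the definition above; the point is only that a model with more parameters than samples can fit the training data exactly and still predict better than a trivial baseline on fresh data.

```python
import numpy as np

# Illustrative sketch (not a neural network): minimum-norm least squares in an
# overparameterized linear model, a standard toy setting in generalization theory.
rng = np.random.default_rng(0)
n_train, n_features = 20, 100          # more parameters than training samples
w_true = np.zeros(n_features)
w_true[:5] = 1.0                       # assumed sparse ground-truth signal

X = rng.normal(size=(n_train, n_features))
y = X @ w_true                         # noiseless labels, for clarity

# For an underdetermined system, np.linalg.lstsq returns the minimum-norm
# solution, which interpolates the training data exactly.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
train_err = float(np.max(np.abs(X @ w_hat - y)))   # ~0: perfect fit

# Despite interpolating, the min-norm solution does better on fresh data than
# the trivial zero predictor: it equals the projection of w_true onto the row
# space of X, so part of the true signal is recovered.
X_test = rng.normal(size=(2000, n_features))
test_mse = float(np.mean((X_test @ w_hat - X_test @ w_true) ** 2))
baseline = float(np.mean((X_test @ w_true) ** 2))  # MSE of predicting zero
```

Running this shows `train_err` at machine precision while `test_mse` stays below `baseline`: the model has "memorized" the training set yet still carries predictive signal, which is the puzzle the theory aims to explain.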