Grokking is the phenomenon in which a model's test performance improves abruptly, long after it has already memorized (perfectly fit) the training set.
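A minimal sketch of the kind of experiment where grokking is typically observed: a small PyTorch MLP trained on modular addition with strong weight decay, logging train and test accuracy over many epochs. The architecture, hyperparameters, and 50/50 split below are illustrative assumptions, not a canonical recipe.

```python
# Illustrative grokking-style experiment: train accuracy saturates early,
# while test accuracy may jump much later in training.
import torch
import torch.nn as nn

P = 97  # modulus; the dataset is every pair (a, b) with label (a + b) mod P
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P

perm = torch.randperm(len(pairs))
split = len(pairs) // 2  # 50% train / 50% test (illustrative choice)
train_idx, test_idx = perm[:split], perm[split:]

embed = nn.Embedding(P, 64)
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, P))
params = list(embed.parameters()) + list(model.parameters())
opt = torch.optim.AdamW(params, lr=1e-3, weight_decay=1.0)  # strong weight decay
loss_fn = nn.CrossEntropyLoss()

def forward(idx):
    x = embed(pairs[idx])                 # (n, 2, 64)
    return model(x.flatten(start_dim=1))  # concatenate the two token embeddings

for epoch in range(20000):
    opt.zero_grad()
    loss = loss_fn(forward(train_idx), labels[train_idx])
    loss.backward()
    opt.step()
    if epoch % 1000 == 0:
        with torch.no_grad():
            tr = (forward(train_idx).argmax(-1) == labels[train_idx]).float().mean()
            te = (forward(test_idx).argmax(-1) == labels[test_idx]).float().mean()
        print(f"epoch {epoch}: train acc {tr.item():.3f}, test acc {te.item():.3f}")
```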
Generalization bounds are theoretical guarantees that upper-bound a model's error on unseen data in terms of its training error and some measure of model complexity; for deep neural networks, the challenge is finding bounds that remain non-vacuous despite the very large parameter count.
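As a worked example of the classical style of bound, here is the standard Hoeffding-plus-union-bound result for a finite hypothesis class:

```latex
% Finite-class generalization bound: with probability at least
% 1 - \delta over an i.i.d. sample of size m, every h in a finite
% hypothesis class \mathcal{H} satisfies
\[
  L(h) \;\le\; \widehat{L}(h)
    + \sqrt{\frac{\ln\lvert\mathcal{H}\rvert + \ln(1/\delta)}{2m}},
\]
% where L(h) is the true risk and \widehat{L}(h) the empirical risk.
% Complexity measures based on counting parameters (|H|, VC dimension)
% are enormous for deep nets, which is why bounds of this form are
% typically vacuous there.
```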
PAC-Bayes provides high-probability generalization bounds for randomized predictors by comparing a data-dependent posterior Q to a fixed, data-independent prior P through the KL divergence KL(Q||P).
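One common form of the McAllester-style PAC-Bayes bound is shown below; the exact constants differ slightly across statements in the literature.

```latex
% McAllester-style PAC-Bayes bound: fix a prior P before seeing the
% data; then with probability at least 1 - \delta over an i.i.d. sample
% of size m, simultaneously for ALL posteriors Q,
\[
  \mathbb{E}_{h \sim Q}\!\left[L(h)\right]
    \;\le\; \mathbb{E}_{h \sim Q}\!\left[\widehat{L}(h)\right]
    + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\!\left(2\sqrt{m}/\delta\right)}{2m}}.
\]
% Because the bound holds for all Q simultaneously, Q may depend on the
% training data; only the prior P must be data-independent.
```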
Deep learning generalization theory tries to explain why overparameterized networks can fit (interpolate) the training data exactly yet still perform well on new data.
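A toy sketch of this kind of benign interpolation, under assumed settings (random ReLU features with a minimum-norm least-squares fit; the target function and sizes are illustrative): the model fits 30 training points exactly using 2000 features, yet still achieves low test error.

```python
# Benign interpolation on a toy problem: a heavily overparameterized
# linear model in random ReLU features interpolates the training data
# (near-zero train MSE) while still tracking the target on test data.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_feat = 30, 2000  # 2000 features >> 30 training points

x_train = rng.uniform(-3, 3, n_train)
y_train = np.sin(x_train)
x_test = rng.uniform(-3, 3, 500)
y_test = np.sin(x_test)

# Fixed random ReLU features: phi(x)_j = max(0, w_j * x + b_j)
w = rng.normal(size=n_feat)
b = rng.normal(size=n_feat)
feats = lambda x: np.maximum(0.0, np.outer(x, w) + b)

# On an underdetermined system, lstsq returns the minimum-norm solution,
# i.e. the "simplest" interpolant among all exact fits of the train set.
coef, *_ = np.linalg.lstsq(feats(x_train), y_train, rcond=None)

print("train MSE:", np.mean((feats(x_train) @ coef - y_train) ** 2))
print("test  MSE:", np.mean((feats(x_test) @ coef - y_test) ** 2))
```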