Generalization bounds aim to explain why deep neural networks can perform well on unseen data despite having many parameters.
The Lottery Ticket Hypothesis (LTH) states that a large dense neural network contains small sparse subnetworks ("winning tickets") that, when trained in isolation from their original initialization, can match the accuracy of the full model.
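The procedure typically used to find such subnetworks is magnitude pruning followed by a reset (rewind) to the saved initialization. Below is a minimal sketch of that idea on a toy weight matrix rather than a real network; the "training" step is simulated noise, and the function name `magnitude_mask` and the 80% sparsity level are illustrative assumptions, not from the source.

```python
import numpy as np

# Toy sketch of the lottery-ticket procedure: train, prune by
# magnitude, then rewind the surviving weights to their initial values.
rng = np.random.default_rng(0)

w_init = rng.normal(size=(8, 8))                   # saved initialization
# Stand-in for training: in practice this would be SGD on real data.
w_trained = w_init + rng.normal(scale=0.5, size=w_init.shape)

def magnitude_mask(weights, sparsity):
    """Keep the largest-magnitude weights; zero out a `sparsity` fraction."""
    k = int(weights.size * sparsity)               # number of weights to prune
    threshold = np.sort(np.abs(weights), axis=None)[k]
    return (np.abs(weights) >= threshold).astype(weights.dtype)

mask = magnitude_mask(w_trained, sparsity=0.8)     # prune ~80% of weights
ticket = mask * w_init                             # subnetwork rewound to its init

print(f"kept {mask.mean():.0%} of weights")
```

In the full LTH experiments the `ticket` (mask plus original initialization) is then retrained from scratch, and pruning is usually applied iteratively over several train-prune-rewind rounds rather than in one shot as here.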