Groups
Category
L2 regularization (also called ridge or weight decay) adds a penalty proportional to the sum of squared weights to discourage large parameters.
The Lottery Ticket Hypothesis (LTH) says that inside a large dense neural network there exist small sparse subnetworks that, when trained in isolation from their original initialization, can reach comparable accuracy to the full model.