L2 regularization (also called ridge or weight decay) adds a penalty proportional to the sum of squared weights to discourage large parameters.
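To make the penalty concrete, here is a minimal NumPy sketch of ridge-penalized least squares with gradient descent; the function names (`l2_penalized_loss`, `l2_penalized_grad`) and the data are illustrative, not from the text. The `2 * lam * w` term in the gradient is what shrinks weights each step, which is why the method is also called weight decay.

```python
import numpy as np

def l2_penalized_loss(w, X, y, lam):
    """Mean squared error plus the L2 penalty lam * ||w||^2."""
    residual = X @ w - y
    return np.mean(residual ** 2) + lam * np.sum(w ** 2)

def l2_penalized_grad(w, X, y, lam):
    """Gradient of the penalized loss; 2*lam*w pulls weights toward zero."""
    n = len(y)
    return (2.0 / n) * X.T @ (X @ w - y) + 2.0 * lam * w

# Hypothetical demo data: linear model plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true + 0.1 * rng.normal(size=100)

w = np.zeros(5)
lr, lam = 0.1, 0.01
for _ in range(200):
    w -= lr * l2_penalized_grad(w, X, y, lam)

print(w)                                   # close to w_true, slightly shrunk
print(l2_penalized_loss(w, X, y, lam))     # penalized training loss
```

Larger values of `lam` trade training fit for smaller weights; setting `lam = 0` recovers ordinary least squares.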
Statistical learning theory explains why a model that fits training data can still predict well on unseen data by relating true risk to empirical risk plus a complexity term.
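One standard instance of this decomposition, stated here as a sketch for the simple case of a finite hypothesis class (via Hoeffding's inequality and a union bound): with probability at least $1 - \delta$ over $n$ i.i.d. training samples, uniformly over all $h \in \mathcal{H}$,

```latex
% True risk bounded by empirical risk plus a complexity term,
% for a finite hypothesis class H and n i.i.d. samples:
\[
  R(h) \;\le\; \widehat{R}(h) \;+\; \sqrt{\frac{\ln \lvert\mathcal{H}\rvert + \ln(1/\delta)}{2n}}
\]
% R(h): true (expected) risk; \widehat{R}(h): empirical (training) risk.
% The square-root term is the complexity penalty: it grows with the
% size of the hypothesis class and shrinks as 1/sqrt(n).
```

The complexity term makes the trade-off explicit: a richer class $\mathcal{H}$ can fit the training data better but pays a larger gap between empirical and true risk, while more data $n$ shrinks that gap.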