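Minimum Description Length (MDL) picks the model that compresses the data best: it minimizes L(M) + L(D|M), the number of bits needed to describe the model plus the number of bits needed to describe the data given that model.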
Early stopping halts training when the validation loss stops improving, preventing overfitting and saving compute.
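A minimal sketch of a patience-based early-stopping loop follows. The names train_one_epoch, validation_loss, patience, and min_delta are hypothetical and chosen only for illustration; the patience rule is one common way to decide that the validation loss has "stopped improving", not the only one.

```python
from typing import Callable, Tuple


def train_with_early_stopping(
    train_one_epoch: Callable[[], None],
    validation_loss: Callable[[], float],
    max_epochs: int = 100,
    patience: int = 3,
    min_delta: float = 0.0,
) -> Tuple[int, float]:
    """Train until the validation loss fails to improve for `patience` epochs.

    Returns (best_epoch, best_loss). Both callables are placeholders for
    whatever the real training loop does.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch in range(max_epochs):
        train_one_epoch()
        loss = validation_loss()
        if loss < best_loss - min_delta:
            # Improvement: remember it and reset the counter.
            # A real loop would usually checkpoint the model here.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation loss has stalled; stop training

    return best_epoch, best_loss


# Simulated validation losses: improvement stalls after epoch 2.
simulated = iter([1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 0.6])
best_epoch, best_loss = train_with_early_stopping(
    lambda: None, lambda: next(simulated), patience=3
)
print(best_epoch, best_loss)  # 2 0.7
```

In this simulated run the final value 0.6 is never reached: training stops after three epochs without improvement, which is the compute saving and also the risk of choosing too small a patience.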
Dropout randomly turns off (zeros) some neurons during training to prevent the network from memorizing the training data.
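The sketch below shows one way to apply dropout to a list of activations. It assumes the "inverted dropout" variant, in which surviving activations are scaled by 1/(1-p) during training so that nothing needs to change at inference time; that scaling is an assumption of this example, not something stated above. The function name and parameters are illustrative.

```python
import random
from typing import List, Optional


def dropout(
    activations: List[float],
    p: float,
    training: bool = True,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Zero each activation with probability p (training only).

    Survivors are divided by (1 - p) so the expected value of each
    output matches the input (inverted dropout, assumed here).
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # At inference time dropout is a no-op.
        return list(activations)
    if rng is None:
        rng = random.Random()
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


hidden = [1.0, 2.0, 3.0, 4.0]
print(dropout(hidden, p=0.5, rng=random.Random(0)))  # some entries zeroed, the rest doubled
print(dropout(hidden, p=0.5, training=False))        # unchanged at inference time
```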
Empirical Risk Minimization (ERM) chooses a model that minimizes the average loss on the training data.
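As a small illustration, the sketch below runs ERM over a finite hypothesis class of one-dimensional threshold classifiers with 0-1 loss. The data set, the candidate thresholds, and the helper names are made up for the example.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[float, int]  # (input x, label y)


def zero_one_loss(prediction: int, label: int) -> float:
    return 0.0 if prediction == label else 1.0


def empirical_risk(
    h: Callable[[float], int],
    data: List[Example],
    loss: Callable[[int, int], float],
) -> float:
    """Average loss of hypothesis h over the training data."""
    return sum(loss(h(x), y) for x, y in data) / len(data)


# Toy training set and a finite class of threshold classifiers (illustrative).
data: List[Example] = [(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 1)]
hypotheses: Dict[float, Callable[[float], int]] = {
    t: (lambda x, t=t: int(x >= t)) for t in (0.5, 1.5, 2.5)
}

# ERM: pick the hypothesis with the smallest average training loss.
best_threshold = min(
    hypotheses, key=lambda t: empirical_risk(hypotheses[t], data, zero_one_loss)
)
print(best_threshold)  # 1.5
```

Note that ERM only looks at training loss; a low empirical risk says nothing by itself about performance on unseen data.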
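VC dimension is the size of the largest set of points that a hypothesis class can shatter, that is, on which it can realize every possible labeling. (Counting how many distinct labelings the class realizes on sets of a given size is the growth function, a related but different quantity.)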
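A set of n points is shattered when the hypothesis class realizes all 2^n labelings on it, and the VC dimension is the size of the largest set that can be shattered. The brute-force check below makes this concrete for one-dimensional threshold classifiers h_t(x) = 1 if x >= t, else 0. The helper names and the finite grid of thresholds are assumptions of the sketch.

```python
from typing import Callable, List, Sequence, Set, Tuple


def realized_labelings(
    points: Sequence[float],
    hypotheses: Sequence[Callable[[float], int]],
) -> Set[Tuple[int, ...]]:
    """All distinct 0/1 labelings the hypotheses produce on the points."""
    return {tuple(h(x) for x in points) for h in hypotheses}


def is_shattered(
    points: Sequence[float],
    hypotheses: Sequence[Callable[[float], int]],
) -> bool:
    """True if every one of the 2^n labelings is realized (points distinct)."""
    return len(realized_labelings(points, hypotheses)) == 2 ** len(points)


# Threshold classifiers on a small grid of thresholds (illustrative).
thresholds: List[Callable[[float], int]] = [
    (lambda x, t=t: int(x >= t)) for t in (0.5, 1.5, 2.5)
]

print(is_shattered([1.0], thresholds))             # True
print(is_shattered([1.0, 2.0], thresholds))        # False
print(sorted(realized_labelings([1.0, 2.0], thresholds)))
# [(0, 0), (0, 1), (1, 1)]  -- the labeling (1, 0) is never produced
```

A single point can be shattered but no pair can, because a threshold never labels the smaller point 1 and the larger point 0. This is consistent with thresholds on the line having VC dimension 1.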
The bias-variance tradeoff explains how prediction error splits into bias squared, variance, and irreducible noise for squared loss.
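Written out, the decomposition takes the following form. The notation is assumed for this sketch: the data follow y = f(x) + ε with E[ε] = 0 and Var(ε) = σ², the noise is independent of the training set D, and \hat f_D is the predictor trained on D.

```latex
\mathbb{E}_{D,\varepsilon}\!\left[\big(y - \hat f_D(x)\big)^2\right]
= \underbrace{\big(\mathbb{E}_D[\hat f_D(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\!\left[\big(\hat f_D(x) - \mathbb{E}_D[\hat f_D(x)]\big)^2\right]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The cross terms vanish because the noise has zero mean and is independent of D, which is why the split is exact for squared loss and does not carry over unchanged to other losses.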