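Minimum Description Length (MDL) picks the model that compresses the data best by minimizing L(M) + L(D|M), where L(M) is the number of bits needed to describe the model and L(D|M) is the number of bits needed to describe the data once the model is known.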
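As a rough illustration of the two-part cost, the sketch below compares two hypothetical coin models on a made-up sequence of flips. The names `data_bits` and `total_bits`, the candidate models, and the 8-bit charge for storing the bias are all assumptions chosen for the example, not a standard encoding.

```python
import math

def data_bits(flips, p_heads):
    """L(D|M): bits needed to encode the flips with a Bernoulli(p_heads) code."""
    return sum(-math.log2(p_heads if f else 1.0 - p_heads) for f in flips)

def total_bits(flips, p_heads, model_bits):
    """Two-part description length L(M) + L(D|M)."""
    return model_bits + data_bits(flips, p_heads)

# Illustrative data: 90 heads (1) and 10 tails (0).
flips = [1] * 90 + [0] * 10

# Each candidate is (p_heads, L(M) in bits). The fair coin has no parameter
# to store; the biased coin is charged an assumed 8 bits for its bias.
candidates = {
    "fair": (0.5, 0.0),
    "biased": (0.9, 8.0),
}

scores = {name: total_bits(flips, p, bits) for name, (p, bits) in candidates.items()}
best = min(scores, key=scores.get)  # model with the shortest total description
print(scores, best)
```

On this data the biased model wins because the bits saved on the data outweigh the cost of storing its parameter. On a very short sequence the fair coin would win instead, since the extra parameter would not pay for itself.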
Cross-entropy loss measures how well predicted probabilities match the true labels by penalizing confident wrong predictions heavily.
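A minimal sketch of the loss for a single example, assuming the label is given as a class index and the predicted probability of the true class is strictly positive. The function name `cross_entropy` and the probability vectors are made up for illustration.

```python
import math

def cross_entropy(probs, true_index):
    """Negative log of the probability assigned to the true class."""
    return -math.log(probs[true_index])

# The true class is index 0 in all three cases.
confident_right = cross_entropy([0.90, 0.05, 0.05], 0)  # about 0.105
unsure = cross_entropy([0.40, 0.30, 0.30], 0)           # about 0.916
confident_wrong = cross_entropy([0.01, 0.98, 0.01], 0)  # about 4.605
print(confident_right, unsure, confident_wrong)
```

The loss grows without bound as the probability of the true class approaches zero, which is why confident wrong predictions dominate the total.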
Softmax turns arbitrary real-valued scores (logits) into probabilities that sum to one.
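One way to sketch this in plain Python is shown below. Subtracting the largest logit before exponentiating is a common numerical-stability trick and does not change the result; the function name and example logits are illustrative.

```python
import math

def softmax(logits):
    """Map real-valued scores to probabilities that sum to one."""
    m = max(logits)                            # shift so the largest exponent is 0
    exps = [math.exp(z - m) for z in logits]   # all values are now in (0, 1]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

Larger logits receive larger probabilities, and the shift keeps inputs such as `[1000.0, 1000.0]` from overflowing.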
Cross-entropy measures how well a proposed distribution Q predicts outcomes actually generated by a true distribution P.
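For discrete distributions this is usually written as follows, where H(P) is the entropy of P and D_KL is the Kullback-Leibler divergence. The symbols other than P and Q are notation introduced here for illustration.

```latex
H(P, Q) = -\sum_{x} P(x) \log Q(x) = H(P) + D_{\mathrm{KL}}(P \,\|\, Q)
```

Since the divergence term is never negative, H(P, Q) is smallest when Q equals P, in which case it reduces to H(P).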