Minimum Description Length (MDL) picks the model that compresses the data best by minimizing L(M) + L(D|M).
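A toy sketch of that comparison for a coin-flip dataset. The two-part accounting here (zero parameter cost for the fixed fair-coin model, roughly 0.5·log2(n) bits to transmit one fitted parameter) is an illustrative convention, not a canonical MDL coder:

```python
import math

def code_length_bits(data, p_heads):
    # L(D|M): ideal code length of the data under a Bernoulli(p_heads) model
    return sum(-math.log2(p_heads if x == 1 else 1 - p_heads) for x in data)

data = [1] * 9 + [0] * 1  # 9 heads, 1 tail

# Model A: fair coin, nothing to fit, so L(M) = 0 bits
mdl_fair = 0 + code_length_bits(data, 0.5)

# Model B: biased coin; charge ~0.5 * log2(n) bits to encode the fitted parameter
p_hat = sum(data) / len(data)
mdl_biased = 0.5 * math.log2(len(data)) + code_length_bits(data, p_hat)

# MDL picks the model with the smaller total L(M) + L(D|M);
# here the biased model compresses the skewed data better despite its parameter cost.
```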
Rényi entropy generalizes Shannon entropy by measuring uncertainty with a tunable emphasis on common versus rare outcomes.
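A minimal sketch for the order-α formula H_α(P) = log2(Σ p_i^α) / (1 − α), α ≠ 1; the distribution and α values are arbitrary examples:

```python
import math

def renyi_entropy(p, alpha):
    # H_alpha(P) = 1/(1 - alpha) * log2(sum_i p_i^alpha), for alpha != 1
    return math.log2(sum(pi ** alpha for pi in p)) / (1 - alpha)

p = [0.7, 0.2, 0.1]
h_min = renyi_entropy(p, 50)              # large alpha: dominated by the most likely outcome
h_collision = renyi_entropy(p, 2)         # alpha = 2: collision entropy
h_near_shannon = renyi_entropy(p, 1.001)  # alpha -> 1 recovers Shannon entropy
```

Raising α shifts emphasis toward common outcomes, so H_α is non-increasing in α.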
An f-divergence measures how different two probability distributions P and Q are by averaging a convex function f of the density ratio p(x)/q(x) under Q.
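A sketch of the general recipe D_f(P‖Q) = Σ q(x) f(p(x)/q(x)), instantiated with two standard choices of f; the example distributions are arbitrary:

```python
import math

def f_divergence(p, q, f):
    # D_f(P||Q) = sum_x q(x) * f(p(x)/q(x)), for a convex f with f(1) = 0
    return sum(qx * f(px / qx) for px, qx in zip(p, q))

p = [0.5, 0.5]
q = [0.9, 0.1]

kl = f_divergence(p, q, lambda t: t * math.log2(t))  # f(t) = t log t  -> KL divergence
tv = f_divergence(p, q, lambda t: abs(t - 1) / 2)    # f(t) = |t-1|/2  -> total variation
```

Different convex f recover different named divergences from the same template.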
The Maximum Entropy Principle picks the probability distribution with the greatest uncertainty (entropy) that still satisfies the facts you know (constraints).
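A sketch for a loaded die whose only known fact is its mean: the maximum-entropy solution has the Gibbs form p_i ∝ exp(λ·i), and λ can be found by bisection since the mean is increasing in λ. The face set and bracketing interval are illustrative assumptions:

```python
import math

def maxent_die(target_mean):
    # Among all distributions on faces 1..6 with the given mean, the
    # maximum-entropy one is p_i proportional to exp(lam * i); find lam by bisection.
    def mean(lam):
        w = [math.exp(lam * i) for i in range(1, 7)]
        z = sum(w)
        return sum(i * wi for i, wi in zip(range(1, 7), w)) / z

    lo, hi = -5.0, 5.0  # assumed bracket wide enough for the target mean
    for _ in range(100):
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [math.exp(lam * i) for i in range(1, 7)]
    z = sum(w)
    return [wi / z for wi in w]

p = maxent_die(4.5)  # least-committal distribution for a die averaging 4.5
```

With the unconstrained mean 3.5 the method recovers the uniform distribution, as the principle demands.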
Rate-distortion theory tells you the minimum number of bits per symbol needed to represent data while keeping average distortion below a target D.
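For a Bernoulli(p) source under Hamming (bit-flip) distortion the function has a known closed form, R(D) = H(p) − H(D) for 0 ≤ D ≤ min(p, 1−p) and 0 beyond; a sketch with arbitrary example values:

```python
import math

def h2(x):
    # binary entropy in bits
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def rate_distortion_bernoulli(p, D):
    # R(D) = H(p) - H(D) for 0 <= D <= min(p, 1-p); zero rate suffices beyond that
    if D >= min(p, 1 - p):
        return 0.0
    return h2(p) - h2(D)

r = rate_distortion_bernoulli(0.5, 0.11)  # ~0.5 bits/symbol if 11% bit errors are tolerable
```

Tolerating more distortion buys a lower rate; at D = 0 the rate is the full entropy H(p).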
The Information Bottleneck (IB) principle formalizes the tradeoff between compressing an input X and preserving information about a target Y using the objective min over p(t|x) of I(X;T) − β I(T;Y).
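A sketch that evaluates the IB objective for two hand-picked deterministic encoders on a toy joint p(x, y). The joint, the encoders, and β are illustrative assumptions; a real IB solver optimizes over stochastic encoders p(t|x) rather than comparing fixed ones:

```python
import math
from collections import defaultdict

def mutual_information(joint):
    # I(A;B) in bits from a dict {(a, b): probability}
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

# Toy joint p(x, y): X uniform on {0,1,2,3}, Y matches x % 2 with probability 0.9
p_xy = {(x, y): 0.25 * (0.9 if y == x % 2 else 0.1)
        for x in range(4) for y in (0, 1)}

def ib_objective(encode, beta):
    # Evaluate I(X;T) - beta * I(T;Y) for a deterministic encoder t = encode(x)
    p_xt, p_ty = defaultdict(float), defaultdict(float)
    for (x, y), p in p_xy.items():
        t = encode(x)
        p_xt[(x, t)] += p
        p_ty[(t, y)] += p
    return mutual_information(dict(p_xt)) - beta * mutual_information(dict(p_ty))

# The parity encoder compresses X to 1 bit yet keeps everything relevant to Y,
# so it scores lower (better) than the identity encoder at this beta.
obj_parity = ib_objective(lambda x: x % 2, beta=5.0)
obj_identity = ib_objective(lambda x: x, beta=5.0)
```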
Cross-entropy measures how well a proposed distribution Q predicts outcomes actually generated by a true distribution P.
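A minimal sketch of H(P, Q) = −Σ p(x) log2 q(x), the expected code length when encoding outcomes from P with a code built for Q; the distributions are arbitrary examples:

```python
import math

def cross_entropy(p, q):
    # H(P, Q) = -sum_x p(x) * log2 q(x), in bits
    return -sum(px * math.log2(qx) for px, qx in zip(p, q) if px > 0)

p = [0.5, 0.25, 0.25]  # true distribution
q = [1/3, 1/3, 1/3]    # proposed model
h_pq = cross_entropy(p, q)
h_pp = cross_entropy(p, p)  # equals the Shannon entropy H(P)
```

H(P, Q) ≥ H(P) always, with equality only when the model matches the truth; the gap is the KL divergence.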
KL divergence measures how much information is lost when using model Q to approximate the true distribution P.
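A minimal sketch of D_KL(P‖Q) = Σ p(x) log2(p(x)/q(x)); the distributions are arbitrary examples:

```python
import math

def kl_divergence(p, q):
    # D_KL(P || Q) = sum_x p(x) * log2(p(x) / q(x)), in bits
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

p = [0.5, 0.25, 0.25]  # true distribution
q = [1/3, 1/3, 1/3]    # approximating model
d = kl_divergence(p, q)  # extra bits per symbol paid for using Q instead of P
```

D_KL is always nonnegative, zero only when P = Q, and equals cross-entropy H(P, Q) minus entropy H(P).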
Mutual Information (MI) measures how much knowing one random variable reduces uncertainty about another.
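A minimal sketch computing I(X;Y) = Σ p(x,y) log2(p(x,y) / (p(x)p(y))) from a joint table; the noisy-channel joint is an arbitrary example:

```python
import math

# Joint distribution of a noisy channel: Y flips X with probability 0.1
p_xy = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
p_x = {0: 0.5, 1: 0.5}  # marginal of X
p_y = {0: 0.5, 1: 0.5}  # marginal of Y

mi = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items())
# I(X;Y) = H(Y) - H(Y|X): observing X removes about half a bit of uncertainty about Y
```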
Shannon entropy quantifies the average uncertainty or information content of a random variable in bits when using base-2 logarithms.
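A minimal sketch of H(X) = −Σ p(x) log2 p(x); the coin examples are arbitrary:

```python
import math

def shannon_entropy(p):
    # H(X) = -sum_x p(x) * log2 p(x), in bits (terms with p = 0 contribute nothing)
    return -sum(px * math.log2(px) for px in p if px > 0)

h_fair = shannon_entropy([0.5, 0.5])     # 1 bit: maximal uncertainty for two outcomes
h_biased = shannon_entropy([0.9, 0.1])   # < 1 bit: a predictable coin carries less information
h_certain = shannon_entropy([1.0, 0.0])  # 0 bits: no uncertainty at all
```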