📡 Information Theory

Entropy, divergence, and mutual information — measuring uncertainty and information in learning systems.

10 concepts

Intermediate (6)

📚 Theory · Intermediate

Shannon Entropy

Shannon entropy quantifies the average uncertainty, or information content, of a random variable; with base-2 logarithms it is measured in bits.

#shannon entropy · #information gain · #mutual information · +12
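A minimal NumPy sketch of this definition (the function name and example values are illustrative, not from the card):

```python
import numpy as np

def shannon_entropy(p):
    """Average uncertainty of a discrete distribution p, in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # by convention, 0 * log(0) contributes 0
    return -np.sum(p * np.log2(p))

print(shannon_entropy([0.5, 0.5]))  # 1.0 bit: a fair coin is maximally uncertain
print(shannon_entropy([0.9, 0.1]))  # ~0.47 bits: a biased coin is more predictable
```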
📚 Theory · Intermediate

KL Divergence

KL divergence measures how much information is lost when using model Q to approximate the true distribution P.

#kl divergence · #relative entropy · #cross-entropy · +12
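A small NumPy sketch of the discrete case (names and numbers are illustrative):

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(P || Q) in bits; assumes q(x) > 0 wherever p(x) > 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0  # terms with p(x) = 0 contribute nothing
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

p = [0.5, 0.5]  # true distribution
q = [0.9, 0.1]  # model approximating it
print(kl_divergence(p, q))  # ~0.74 extra bits per symbol when coding with Q
print(kl_divergence(q, p))  # note the asymmetry: D_KL(Q||P) is a different number
```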
📚 Theory · Intermediate

Cross-Entropy

Cross-entropy measures how well a proposed distribution Q predicts outcomes actually generated by a true distribution P.

#cross-entropy · #entropy · #kl divergence · +12
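A sketch tying cross-entropy to the two cards above via H(P, Q) = H(P) + D_KL(P || Q) (illustrative code, not from the card):

```python
import numpy as np

def cross_entropy(p, q):
    """H(P, Q) in bits: expected code length under P when coding with Q."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return -np.sum(p[mask] * np.log2(q[mask]))

p = [0.5, 0.5]
q = [0.9, 0.1]
print(cross_entropy(p, p))  # 1.0 bit: equals H(P) when the model is exact
print(cross_entropy(p, q))  # ~1.74 bits: the excess over H(P) is exactly D_KL(P||Q)
```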
📚 Theory · Intermediate

Mutual Information

Mutual Information (MI) measures how much knowing one random variable reduces uncertainty about another.

#mutual information · #entropy · #kl divergence · +12
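A sketch computing I(X;Y) from a small joint table, using I(X;Y) = D_KL(p(x,y) || p(x)p(y)); the table values are made up for illustration:

```python
import numpy as np

def mutual_information(joint):
    """I(X;Y) in bits from a joint probability table p(x, y)."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)  # marginal p(x)
    py = joint.sum(axis=0, keepdims=True)  # marginal p(y)
    mask = joint > 0
    return np.sum(joint[mask] * np.log2(joint[mask] / (px * py)[mask]))

# Perfectly correlated bits: observing X removes all uncertainty about Y.
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))  # 1.0 bit
# Independent bits: observing X tells you nothing about Y.
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # 0.0 bits
```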
∑ Math · Intermediate

Rényi Entropy & Divergence

Rényi entropy generalizes Shannon entropy by measuring uncertainty with a tunable emphasis on common versus rare outcomes.

#renyi entropy · #renyi divergence · #shannon entropy · +12
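A sketch of the order-α family; as α → 1 it recovers Shannon entropy (the example distribution is illustrative):

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1), in bits."""
    p = np.asarray(p, dtype=float)
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

p = [0.7, 0.2, 0.1]
print(renyi_entropy(p, 0.5))    # alpha < 1: rare outcomes weigh more
print(renyi_entropy(p, 2.0))    # alpha = 2: collision entropy, favors common outcomes
print(renyi_entropy(p, 0.999))  # near alpha = 1: approaches Shannon entropy (~1.16 bits)
```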
📚 Theory · Intermediate

Minimum Description Length (MDL)

Minimum Description Length (MDL) picks the model that compresses the data best by minimizing L(M) + L(D|M): the code length of the model plus the code length of the data encoded with the model's help.

#minimum description length · #mdl · #bic · +12
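A minimal two-part-code sketch for choosing a polynomial degree, approximating L(M) by the (k/2) log2(n) parameter cost (the BIC form of MDL) and L(D|M) by the Gaussian code length of the residuals; the synthetic data and constants are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.1, x.size)  # true degree: 2

def description_length(degree):
    """Approximate L(M) + L(D|M) in bits for a degree-d polynomial fit."""
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    sigma2 = max(resid.var(), 1e-12)
    n, k = x.size, degree + 1
    L_data = 0.5 * n * np.log2(2 * np.pi * np.e * sigma2)  # L(D|M)
    L_model = 0.5 * k * np.log2(n)                          # L(M)
    return L_model + L_data

print(min(range(8), key=description_length))  # picks the true degree (2) here
```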

Advanced (4)

📚 Theory · Advanced

Information Bottleneck

The Information Bottleneck (IB) principle formalizes the tradeoff between compressing an input X and preserving information about a target Y, using the objective min_{p(t|x)} I(X;T) − β I(T;Y).

#information bottleneck · #mutual information · #kl divergence · +12
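A sketch that just evaluates the IB objective for a fixed stochastic encoder on a toy joint p(x, y); the distribution, encoder, and β are made-up illustrations (a full solver would optimize over p(t|x)):

```python
import numpy as np

def mi(joint):
    """Mutual information in bits from a joint probability table."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return np.sum(joint[mask] * np.log2(joint[mask] / (pa * pb)[mask]))

# Toy joint p(x, y): 3 inputs, 2 labels.
p_xy = np.array([[0.30, 0.05],
                 [0.05, 0.30],
                 [0.15, 0.15]])
p_x = p_xy.sum(axis=1)

def ib_objective(p_t_given_x, beta):
    """I(X;T) - beta * I(T;Y) for an encoder p(t|x), with T-X-Y Markov."""
    p_xt = p_x[:, None] * p_t_given_x  # joint p(x, t)
    p_ty = p_t_given_x.T @ p_xy        # joint p(t, y) = sum_x p(t|x) p(x, y)
    return mi(p_xt) - beta * mi(p_ty)

# Encoder with two clusters: x=0 and x=1 get their own cluster, x=2 splits.
encoder = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [0.5, 0.5]])
print(ib_objective(encoder, beta=2.0))
```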
📚 Theory · Advanced

Rate-Distortion Theory

Rate–distortion theory tells you the minimum number of bits per symbol needed to represent data while keeping average distortion below a target D.

#rate-distortion · #mutual information · #blahut-arimoto · +12
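A compact Blahut-Arimoto sketch that traces one point on the R(D) curve for a given trade-off parameter β (the source, distortion matrix, and β are illustrative):

```python
import numpy as np

def blahut_arimoto(p_x, dist, beta, iters=200):
    """One point on the rate-distortion curve at Lagrange multiplier beta.
    Returns (rate in bits/symbol, average distortion)."""
    m = dist.shape[1]
    q = np.full(m, 1.0 / m)  # output marginal q(x_hat), refined each sweep
    for _ in range(iters):
        w = q * np.exp(-beta * dist)       # unnormalized p(x_hat | x)
        w /= w.sum(axis=1, keepdims=True)
        q = p_x @ w                        # re-estimate the output marginal
    D = np.sum(p_x[:, None] * w * dist)            # average distortion
    R = np.sum(p_x[:, None] * w * np.log2(w / q))  # I(X; X_hat) in bits
    return R, D

# Fair binary source with Hamming distortion: theory gives R(D) = 1 - H(D).
p_x = np.array([0.5, 0.5])
dist = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
print(blahut_arimoto(p_x, dist, beta=3.0))
```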
📚 Theory · Advanced

Maximum Entropy Principle

The Maximum Entropy Principle picks the probability distribution with the greatest uncertainty (entropy) that still satisfies the facts you know (constraints).

#maximum entropy principle · #jaynes · #exponential family · +12
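A sketch of the classic finite case: among all distributions on {0, ..., 5} with a fixed mean, the maximum-entropy one is the exponential-family distribution p(x) ∝ exp(λx), with λ solved numerically (the support and target mean are made up):

```python
import numpy as np
from scipy.optimize import brentq

xs = np.arange(6)    # support {0, ..., 5}
target_mean = 1.5    # the known constraint E[X] = 1.5

def mean_at(lam):
    """Mean of the exponential-family distribution p(x) proportional to exp(lam * x)."""
    w = np.exp(lam * xs)
    return (w / w.sum()) @ xs

# Solve for the Lagrange multiplier lam that makes the constraint hold.
lam = brentq(lambda l: mean_at(l) - target_mean, -10.0, 10.0)
p = np.exp(lam * xs)
p /= p.sum()
print(np.round(p, 4))  # maximum-entropy distribution matching the constraint
print(p @ xs)          # confirms E[X] = 1.5
```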
∑ Math · Advanced

f-Divergences

An f-divergence measures how different two probability distributions P and Q are by averaging a convex function f of the density ratio p(x)/q(x) under Q.

#f-divergence · #csiszar divergence · #kullback–leibler · +11
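A sketch of the general recipe, D_f(P || Q) = E_Q[f(p(x)/q(x))], instantiated for three classic choices of f (the distributions are illustrative; the KL instance is in nats because f uses the natural log):

```python
import numpy as np

def f_divergence(p, q, f):
    """D_f(P || Q) = E_Q[ f(p(x)/q(x)) ] for convex f with f(1) = 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum(q * f(p / q))

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])

kl   = lambda t: t * np.log(t)          # Kullback-Leibler (in nats)
tv   = lambda t: 0.5 * np.abs(t - 1.0)  # total variation distance
chi2 = lambda t: (t - 1.0) ** 2         # chi-squared divergence

for name, f in [("KL", kl), ("TV", tv), ("chi^2", chi2)]:
    print(name, f_divergence(p, q, f))
```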