How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts (11)

Groups

Linear Algebra (15) · Calculus & Differentiation (10) · Optimization (14) · Probability Theory (12) · Statistics for ML (9) · Information Theory (10) · Convex Optimization (7) · Numerical Methods (6) · Graph Theory for Deep Learning (6) · Topology for ML (5) · Differential Geometry (6) · Measure Theory & Functional Analysis (6) · Random Matrix Theory (5) · Fourier Analysis & Signal Processing (9) · Sampling & Monte Carlo Methods (10) · Deep Learning Theory (12) · Regularization Theory (11) · Attention & Transformer Theory (10) · Generative Model Theory (11) · Representation Learning (10) · Reinforcement Learning Mathematics (9) · Variational Methods (8) · Loss Functions & Objectives (10) · Sequence & Temporal Models (8) · Geometric Deep Learning (8)

📚 Theory · Intermediate

Knowledge Distillation Loss

Knowledge distillation loss blends standard hard-label cross-entropy with a soft match to the teacher's temperature-softened output distribution.

#knowledge distillation#kd loss#temperature scaling+12
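
A minimal NumPy sketch of this combined objective, assuming an illustrative distillation weight alpha and temperature T (both hypothetical values): the loss is alpha times the hard-label cross-entropy plus (1 − alpha) times T² times the KL divergence between the temperature-softened teacher and student distributions.

```python
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, true_label, T=4.0, alpha=0.5):
    """alpha * hard-label CE + (1 - alpha) * T^2 * KL(teacher_T || student_T)."""
    p_student = softmax(student_logits)
    hard_ce = -np.log(p_student[true_label])           # cross-entropy with the one-hot label

    pt = softmax(teacher_logits, T)                     # softened teacher distribution
    ps = softmax(student_logits, T)                     # softened student distribution
    soft_kl = np.sum(pt * (np.log(pt) - np.log(ps)))    # KL(teacher || student)

    # The T^2 factor keeps the soft-target term on the same gradient scale as the hard term.
    return alpha * hard_ce + (1 - alpha) * (T ** 2) * soft_kl

print(kd_loss(np.array([1.0, 2.0, 0.5]), np.array([0.8, 3.0, 0.2]), true_label=1))
```
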
📚 Theory · Intermediate

Variational Inference

Variational Inference (VI) turns Bayesian inference into an optimization problem by choosing a simple family q(z) to approximate an intractable posterior p(z|x).

#variational inference#elbo#kl divergence+12
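
A toy sketch with a single binary latent variable, small enough to enumerate exactly (the prior and likelihood values below are made up): it checks the identity log p(x) = ELBO(q) + KL(q ‖ p(z|x)), which is why maximizing the ELBO pushes q toward the true posterior.

```python
import numpy as np

# Toy model with binary latent z and one observed x (values are illustrative).
p_z = np.array([0.6, 0.4])           # prior p(z)
p_x_given_z = np.array([0.2, 0.9])   # likelihood of the observed x under each z

p_xz = p_z * p_x_given_z             # joint p(x, z) at the observed x
log_px = np.log(p_xz.sum())          # exact log evidence
posterior = p_xz / p_xz.sum()        # exact posterior p(z | x)

def elbo(q):
    return np.sum(q * (np.log(p_xz) - np.log(q)))   # E_q[log p(x,z) - log q(z)]

def kl(q, p):
    return np.sum(q * (np.log(q) - np.log(p)))

q = np.array([0.5, 0.5])             # a crude approximate posterior
print(log_px, elbo(q) + kl(q, posterior))   # identical: log p(x) = ELBO + KL
print(elbo(q) <= log_px)                    # True: ELBO is a lower bound
```
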
📚 Theory · Intermediate

Maximum Likelihood & Generative Models

Maximum Likelihood Estimation (MLE) picks parameters that make the observed data most probable under a chosen probabilistic model.

#maximum likelihood#generative models#naive bayes+12
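
A small sketch for a Bernoulli model with made-up data: the log-likelihood k·log θ + (n − k)·log(1 − θ) is maximized at the sample frequency k/n, which a coarse grid search confirms.

```python
import numpy as np

data = np.array([1, 0, 1, 1, 0, 1, 1, 0])      # illustrative coin flips
n, k = len(data), data.sum()

def log_likelihood(theta):
    return k * np.log(theta) + (n - k) * np.log(1 - theta)

thetas = np.linspace(0.01, 0.99, 999)
best = thetas[np.argmax(log_likelihood(thetas))]
print(best, k / n)    # grid maximizer matches the closed-form MLE k/n
```
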
📚 Theory · Intermediate

Cross-Entropy

Cross-entropy measures how well a proposed distribution Q predicts outcomes actually generated by a true distribution P.

#cross-entropy#entropy#kl divergence+12
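
A minimal sketch of cross-entropy as a classification loss, using made-up predicted probabilities: with one-hot labels, H(P, Q) per example reduces to the negative log probability the model assigns to the true class.

```python
import numpy as np

# Predicted class probabilities for 3 examples (illustrative values)
Q = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]])
labels = np.array([0, 1, 2])                 # true classes

# One-hot P makes the cross-entropy equal to -log Q[true class] for each example
loss = -np.log(Q[np.arange(len(labels)), labels]).mean()
print(loss)
```
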
📚 Theory · Intermediate

KL Divergence

KL divergence measures how much information is lost when using model Q to approximate the true distribution P.

#kl divergence#relative entropy#cross-entropy+12
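
A short sketch over two made-up discrete distributions: D_KL(P‖Q) = Σ P·log(P/Q) is non-negative, zero only when the distributions match, and, unlike a distance, asymmetric.

```python
import numpy as np

def kl(p, q):
    return np.sum(p * (np.log(p) - np.log(q)))

P = np.array([0.5, 0.4, 0.1])    # "true" distribution (illustrative)
Q = np.array([0.3, 0.3, 0.4])    # model distribution (illustrative)

print(kl(P, Q), kl(Q, P))        # both >= 0, but generally unequal (asymmetry)
print(kl(P, P))                  # 0 when the distributions coincide
```
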
📚 Theory · Intermediate

Concentration Inequalities

Concentration inequalities give high-probability bounds that random outcomes stay close to their expectations, even without knowing the full distribution.

#concentration inequalities#hoeffding inequality#chernoff bound+12
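
A quick simulation sketch of Hoeffding's inequality for i.i.d. variables bounded in [0, 1] (the sample size, threshold, and trial count are arbitrary choices): the empirical probability that the sample mean deviates from its expectation by at least t stays below 2·exp(−2nt²).

```python
import numpy as np

rng = np.random.default_rng(0)
n, t, trials = 100, 0.1, 20_000
p = 0.5                                        # Bernoulli(0.5) samples lie in [0, 1]

means = rng.binomial(n, p, size=trials) / n    # sample means over many repetitions
empirical = np.mean(np.abs(means - p) >= t)
hoeffding = 2 * np.exp(-2 * n * t ** 2)

print(empirical, hoeffding)                    # empirical frequency <= Hoeffding bound
```
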
📚 Theory · Intermediate

ELBO (Evidence Lower Bound)

The Evidence Lower Bound (ELBO) is a tractable lower bound on the log evidence log p(x) that enables learning and inference in latent variable models like VAEs.

#elbo#variational inference#vae+12
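
A sketch of the VAE-style two-term form, ELBO = E_q[log p(x|z)] − KL(q(z|x) ‖ p(z)), for a one-dimensional, fully Gaussian model where every term has a closed form (all parameter values below are made up). Because the model is conjugate, the exact log evidence is also available, so the bound can be checked directly.

```python
import numpy as np

x, s2 = 1.5, 0.5          # observed x and likelihood variance (illustrative)
m, v = 0.8, 0.4           # mean/variance of the Gaussian q(z|x) (illustrative)

# Model: p(z) = N(0, 1), p(x|z) = N(z, s2), q(z|x) = N(m, v)

# E_q[log N(x; z, s2)] with z ~ N(m, v), in closed form
recon = -0.5 * np.log(2 * np.pi * s2) - ((x - m) ** 2 + v) / (2 * s2)
# KL(N(m, v) || N(0, 1)) in closed form
kl = 0.5 * (v + m ** 2 - 1 - np.log(v))
elbo = recon - kl

# Exact log evidence: marginally x ~ N(0, 1 + s2)
log_px = -0.5 * np.log(2 * np.pi * (1 + s2)) - x ** 2 / (2 * (1 + s2))
print(elbo, log_px, elbo <= log_px)    # the ELBO sits below the exact log evidence
```
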
📚 Theory · Intermediate

Mutual Information

Mutual Information (MI) measures how much knowing one random variable reduces uncertainty about another.

#mutual information#entropy#kl divergence+12
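
A small sketch computing MI from a made-up 2×2 joint distribution as the KL divergence between the joint and the product of its marginals; for an independent joint the same formula returns 0.

```python
import numpy as np

# Joint distribution P(X, Y) over two binary variables (illustrative values)
P_xy = np.array([[0.30, 0.10],
                 [0.15, 0.45]])

P_x = P_xy.sum(axis=1, keepdims=True)     # marginal over X
P_y = P_xy.sum(axis=0, keepdims=True)     # marginal over Y

# I(X;Y) = KL( P(X,Y) || P(X)P(Y) )
mi = np.sum(P_xy * np.log(P_xy / (P_x * P_y)))
print(mi)                                  # > 0: knowing X reduces uncertainty about Y

independent = P_x * P_y                    # same marginals, no dependence
print(np.sum(independent * np.log(independent / (P_x * P_y))))   # 0.0
```
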
📚 Theory · Intermediate

KL Divergence (Kullback-Leibler Divergence)

Kullback–Leibler (KL) divergence measures how one probability distribution P allocates probability mass differently from a reference distribution Q.

#kl divergence#kullback-leibler#cross-entropy+12
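
A sketch that checks a Monte Carlo estimate of the KL divergence between two univariate Gaussians (the parameters below are arbitrary) against the closed form log(σ₂/σ₁) + (σ₁² + (μ₁ − μ₂)²)/(2σ₂²) − 1/2.

```python
import numpy as np

mu1, s1 = 0.0, 1.0        # P = N(mu1, s1^2)  (illustrative parameters)
mu2, s2 = 1.0, 1.5        # Q = N(mu2, s2^2)

def log_normal_pdf(x, mu, s):
    return -0.5 * np.log(2 * np.pi * s ** 2) - (x - mu) ** 2 / (2 * s ** 2)

rng = np.random.default_rng(0)
x = rng.normal(mu1, s1, size=200_000)                       # samples from P
mc_kl = np.mean(log_normal_pdf(x, mu1, s1) - log_normal_pdf(x, mu2, s2))

closed = np.log(s2 / s1) + (s1 ** 2 + (mu1 - mu2) ** 2) / (2 * s2 ** 2) - 0.5
print(mc_kl, closed)       # the Monte Carlo estimate agrees closely with the closed form
```
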
📚 Theory · Intermediate

Shannon Entropy

Shannon entropy quantifies the average uncertainty or information content of a random variable in bits when using base-2 logarithms.

#shannon entropy#information gain#mutual information+12
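
A tiny sketch in bits (base-2 logs), with made-up distributions: a fair coin carries 1 bit of uncertainty, a biased coin carries less, and the uniform distribution maximizes entropy for a fixed number of outcomes.

```python
import numpy as np

def entropy_bits(p):
    p = p[p > 0]                      # convention: 0 * log 0 = 0
    return -np.sum(p * np.log2(p))

print(entropy_bits(np.array([0.5, 0.5])))        # 1.0 bit: fair coin
print(entropy_bits(np.array([0.9, 0.1])))        # ~0.47 bits: a biased coin is more predictable
print(entropy_bits(np.array([0.25] * 4)))        # 2.0 bits: uniform over 4 outcomes
```
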
📚 Theory · Intermediate

Information Theory

Information theory quantifies uncertainty and information using measures like entropy, cross-entropy, KL divergence, and mutual information.

#entropy#cross-entropy#kl divergence+12
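
A consolidated sketch, with made-up distributions, showing how the quantities above fit together: cross-entropy decomposes as H(P, Q) = H(P) + D_KL(P‖Q), and mutual information can be written in terms of entropies as I(X;Y) = H(X) + H(Y) − H(X, Y).

```python
import numpy as np

def H(p):                     # entropy in nats
    return -np.sum(p * np.log(p))

def cross_entropy(p, q):
    return -np.sum(p * np.log(q))

def kl(p, q):
    return np.sum(p * (np.log(p) - np.log(q)))

P = np.array([0.5, 0.3, 0.2])   # "true" distribution (illustrative)
Q = np.array([0.4, 0.4, 0.2])   # model distribution (illustrative)

# Decomposition: H(P, Q) = H(P) + KL(P || Q)
print(cross_entropy(P, Q), H(P) + kl(P, Q))

# Mutual information from entropies: I(X;Y) = H(X) + H(Y) - H(X, Y)
P_xy = np.array([[0.3, 0.1],
                 [0.2, 0.4]])   # illustrative joint distribution
print(H(P_xy.sum(axis=1)) + H(P_xy.sum(axis=0)) - H(P_xy.flatten()))
```
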