How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts (12)

Groups

📐 Linear Algebra (15) · 📈 Calculus & Differentiation (10) · 🎯 Optimization (14) · 🎲 Probability Theory (12) · 📊 Statistics for ML (9) · 📡 Information Theory (10) · 🔺 Convex Optimization (7) · 🔢 Numerical Methods (6) · 🕸 Graph Theory for Deep Learning (6) · 🔵 Topology for ML (5) · 🌐 Differential Geometry (6) · ∞ Measure Theory & Functional Analysis (6) · 🎰 Random Matrix Theory (5) · 🌊 Fourier Analysis & Signal Processing (9) · 🎰 Sampling & Monte Carlo Methods (10) · 🧠 Deep Learning Theory (12) · 🛡️ Regularization Theory (11) · 👁️ Attention & Transformer Theory (10) · 🎨 Generative Model Theory (11) · 🔮 Representation Learning (10) · 🎮 Reinforcement Learning Mathematics (9) · 🔄 Variational Methods (8) · 📉 Loss Functions & Objectives (10) · ⏱️ Sequence & Temporal Models (8) · 💎 Geometric Deep Learning (8)

📚 Theory · Intermediate

Multi-Task Loss Balancing

Multi-task loss balancing aims to automatically set each task's weight so that no single loss dominates training.

#multi-task learning #uncertainty weighting #homoscedastic uncertainty +12
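A minimal sketch of the uncertainty-weighting approach named in the tags, assuming PyTorch and a simplified form of Kendall et al.'s homoscedastic-uncertainty loss; the class name `UncertaintyWeightedLoss` and the exact formula are illustrative, not this site's implementation.

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Combine per-task losses with learnable homoscedastic uncertainty.

    Each task i gets a learnable log-variance s_i; the combined loss is
    sum_i exp(-s_i) * L_i + s_i, so tasks with noisy (high-variance) losses
    are automatically down-weighted.  (Simplified from Kendall et al., 2018.)
    """

    def __init__(self, num_tasks: int):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = torch.zeros((), device=self.log_vars.device)
        for s, loss in zip(self.log_vars, task_losses):
            total = total + torch.exp(-s) * loss + s
        return total
```

The log-variances are optimized alongside the model weights, so the balance between tasks adapts as their loss scales change during training.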
📚 Theory · Intermediate

Knowledge Distillation Loss

Knowledge distillation loss blends standard hard-label cross-entropy with a soft match to the teacher's temperature-softened output distribution.

#knowledge distillation #kd loss #temperature scaling +12
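A minimal sketch of the blended objective described above, assuming PyTorch; the `alpha` mixing weight and the T² rescaling follow the common Hinton-style recipe, and the function name is illustrative.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Hard-label term: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # Soft term: match the teacher's temperature-softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # T^2 keeps the soft term's gradients on the same scale as the hard term
    return alpha * hard + (1.0 - alpha) * soft
```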
📚 Theory · Intermediate

RLHF Mathematics

RLHF turns human preferences between two model outputs into training signals using a probabilistic model of choice.

#rlhf #bradley-terry #pairwise comparisons +11
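A minimal sketch of the Bradley-Terry preference loss commonly used to train RLHF reward models, assuming PyTorch; the reward scores are assumed to come from some reward model not shown here.

```python
import torch.nn.functional as F

def preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry: P(chosen beats rejected) = sigmoid(r_chosen - r_rejected).

    Maximizing that probability over human-labelled pairs is equivalent to
    minimizing the negative log-sigmoid of the reward margin.
    """
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()
```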
📚 Theory · Intermediate

Metric Learning

Metric learning automatically learns a distance function so that similar items end up close together and dissimilar items far apart in a feature space.

#metric learning #mahalanobis distance #contrastive loss +12
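A minimal sketch of the contrastive loss named in the tags (the pairwise Hadsell-style formulation), assuming PyTorch embeddings and a 0/1 same-label indicator; the names and margin value are illustrative.

```python
import torch.nn.functional as F

def contrastive_loss(z1, z2, same_label, margin=1.0):
    """Pull embeddings of matching pairs together; push non-matching pairs
    apart until they are at least `margin` away from each other."""
    d = F.pairwise_distance(z1, z2)                       # Euclidean distance per pair
    pos = same_label * d.pow(2)                           # similar pairs: shrink distance
    neg = (1.0 - same_label) * F.relu(margin - d).pow(2)  # dissimilar pairs: enforce margin
    return 0.5 * (pos + neg).mean()
```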
📚 Theory · Intermediate

Stochastic Depth

Stochastic Depth randomly drops whole residual layers during training while keeping the full network at inference time.

#stochastic depth #resnet #residual block +12
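A minimal sketch of a residual block with stochastic depth, assuming PyTorch; following Huang et al., the residual branch is dropped at random during training and scaled by its survival probability at inference. The class name and default survival probability are illustrative.

```python
import torch
import torch.nn as nn

class StochasticDepthBlock(nn.Module):
    """Wrap a residual branch so it is randomly skipped while training."""

    def __init__(self, branch: nn.Module, survival_prob: float = 0.8):
        super().__init__()
        self.branch = branch
        self.p = survival_prob

    def forward(self, x):
        if self.training:
            if torch.rand(()).item() < self.p:
                return x + self.branch(x)   # branch survives this forward pass
            return x                        # entire residual branch dropped
        # At inference the branch always runs, scaled by its survival probability.
        return x + self.p * self.branch(x)
```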
📚 Theory · Intermediate

Early Stopping

Early stopping halts training when the validation loss stops improving, preventing overfitting and saving compute.

#early stopping #validation loss #patience +11
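A minimal patience-based tracker along the lines described above, in plain Python; the class and attribute names are chosen for illustration.

```python
class EarlyStopping:
    """Signal a stop when validation loss fails to improve for `patience` checks."""

    def __init__(self, patience: int = 5, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_checks = 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best - self.min_delta:
            self.best = val_loss        # new best: reset the counter
            self.bad_checks = 0
        else:
            self.bad_checks += 1        # no improvement this check
        return self.bad_checks >= self.patience
```

Call `should_stop(val_loss)` once per validation pass and break out of the training loop when it returns True.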
📚 Theory · Intermediate

Label Smoothing

Label smoothing replaces a hard one-hot target with a slightly softened distribution to reduce model overconfidence.

#label smoothing #cross-entropy #softmax +12
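A minimal sketch of cross-entropy against smoothed targets, assuming PyTorch; eps = 0.1 is just the conventional default, and the function name is illustrative.

```python
import torch
import torch.nn.functional as F

def label_smoothing_cross_entropy(logits, labels, eps=0.1):
    """Cross-entropy against targets that put 1 - eps (plus eps/K) on the true
    class and eps/K on every other class, instead of a hard one-hot vector."""
    num_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    targets = torch.full_like(log_probs, eps / num_classes)
    targets.scatter_(-1, labels.unsqueeze(-1), 1.0 - eps + eps / num_classes)
    return -(targets * log_probs).sum(dim=-1).mean()
```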
📚 Theory · Intermediate

Dropout

Dropout randomly turns off (zeros) some neurons during training to prevent the network from memorizing the training data.

#dropout #inverted dropout #bernoulli mask +12
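A minimal sketch of inverted dropout with a Bernoulli mask, assuming PyTorch tensors; in practice one would use `nn.Dropout`, this just makes the mechanics explicit.

```python
import torch

def inverted_dropout(x, p=0.5, training=True):
    """Zero each activation with probability p and rescale the survivors by
    1 / (1 - p), so the expected activation is unchanged and inference can
    simply use the full network with no correction."""
    if not training or p == 0.0:
        return x
    mask = (torch.rand_like(x) > p).float()   # Bernoulli keep-mask
    return x * mask / (1.0 - p)
```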
📚 Theory · Intermediate

Spectral Normalization

Spectral normalization rescales a weight matrix so its largest singular value (spectral norm) is at most a target value, typically 1.

#spectral normalization #spectral norm #singular value +12
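A minimal sketch of spectral normalization for a 2-D weight matrix, assuming PyTorch; it estimates the largest singular value with a few power-iteration steps rather than a full SVD (the built-in `torch.nn.utils.spectral_norm` maintains this estimate incrementally instead).

```python
import torch

def spectral_normalize(W, n_iters=20, eps=1e-12):
    """Rescale W so its largest singular value (spectral norm) is ~1."""
    u = torch.randn(W.size(0))
    for _ in range(n_iters):                  # power iteration on W W^T
        v = torch.mv(W.t(), u)
        v = v / (v.norm() + eps)
        u = torch.mv(W, v)
        u = u / (u.norm() + eps)
    sigma = torch.dot(u, torch.mv(W, v))      # estimated spectral norm
    return W / sigma
```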
📚 Theory · Intermediate

Empirical Risk Minimization

Empirical Risk Minimization (ERM) chooses a model that minimizes the average loss on the training data.

#empirical risk minimization #expected risk #loss function +12
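A minimal worked example of ERM with NumPy: choosing linear-regression weights by gradient descent on the average squared loss over the training sample. All names and hyperparameters are illustrative.

```python
import numpy as np

def erm_linear_regression(X, y, lr=0.1, steps=500):
    """Minimize the empirical risk (mean squared error on the training set)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        residual = X @ w - y                  # errors on the training sample
        grad = 2.0 * X.T @ residual / len(y)  # gradient of the average loss
        w -= lr * grad
    return w
```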
📚 Theory · Intermediate

ELBO (Evidence Lower Bound)

The Evidence Lower Bound (ELBO) is a tractable lower bound on the log evidence log p(x) that enables learning and inference in latent variable models like VAEs.

#elbo #variational inference #vae +12
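A minimal sketch of a per-batch ELBO for a standard VAE, assuming PyTorch, a Bernoulli decoder (so the reconstruction term is binary cross-entropy) and a diagonal-Gaussian encoder; maximizing this quantity is the usual VAE training objective under those assumptions.

```python
import torch
import torch.nn.functional as F

def gaussian_elbo(x, x_recon, mu, log_var):
    """ELBO = E_q[log p(x|z)] - KL(q(z|x) || N(0, I)), averaged over the batch."""
    # Reconstruction term: log-likelihood under a Bernoulli decoder.
    recon = -F.binary_cross_entropy(x_recon, x, reduction="none").sum(dim=-1)
    # Closed-form KL between the diagonal-Gaussian posterior and a standard normal.
    kl = 0.5 * (log_var.exp() + mu.pow(2) - 1.0 - log_var).sum(dim=-1)
    return (recon - kl).mean()
```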
📚 Theory · Intermediate

Bias-Variance Tradeoff

The bias-variance tradeoff explains how prediction error splits into bias squared, variance, and irreducible noise for squared loss.

#bias variance tradeoff #mse decomposition #polynomial regression +12
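A small NumPy simulation of the decomposition described above: refit the same model class on many independent noisy samples and measure how its error at a test point splits into bias² and variance. The sine target, noise level, and polynomial degree are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
true_f = lambda x: np.sin(2 * np.pi * x)
noise_std = 0.2

# Refit a low-capacity (degree-1) polynomial on many independent training sets
# and record its prediction at one test point.
x_test, preds = 0.3, []
for _ in range(2000):
    x = rng.uniform(0.0, 1.0, 20)
    y = true_f(x) + rng.normal(0.0, noise_std, 20)
    preds.append(np.polyval(np.polyfit(x, y, deg=1), x_test))

preds = np.array(preds)
bias_sq = (preds.mean() - true_f(x_test)) ** 2   # systematic error of the model class
variance = preds.var()                           # sensitivity to the training sample
print(f"bias^2 = {bias_sq:.4f}  variance = {variance:.4f}  noise = {noise_std**2:.4f}")
```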