🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
📝Daily Log🎯Prompts🧠Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts15

Groups

📐Linear Algebra15📈Calculus & Differentiation10🎯Optimization14🎲Probability Theory12📊Statistics for ML9📡Information Theory10🔺Convex Optimization7🔢Numerical Methods6🕸Graph Theory for Deep Learning6🔵Topology for ML5🌐Differential Geometry6∞Measure Theory & Functional Analysis6🎰Random Matrix Theory5🌊Fourier Analysis & Signal Processing9🎰Sampling & Monte Carlo Methods10🧠Deep Learning Theory12🛡️Regularization Theory11👁️Attention & Transformer Theory10🎨Generative Model Theory11🔮Representation Learning10🎮Reinforcement Learning Mathematics9🔄Variational Methods8📉Loss Functions & Objectives10⏱️Sequence & Temporal Models8💎Geometric Deep Learning8

Category

🔷All∑Math⚙️Algo🗂️DS📚Theory

Level

AllBeginnerIntermediate
📚TheoryIntermediate

Multi-Task Loss Balancing

Multi-task loss balancing aims to automatically set each task’s weight so that no single loss dominates training.

#multi-task learning#uncertainty weighting#homoscedastic uncertainty+12
📚TheoryIntermediate

Knowledge Distillation Loss

Knowledge distillation loss blends standard hard-label cross-entropy with a soft distribution match from a teacher using a temperature parameter.

#knowledge distillation
12
Advanced
Filtering by:
#regularization
#kd loss
#temperature scaling
+12
📚TheoryIntermediate

RLHF Mathematics

RLHF turns human preferences between two model outputs into training signals using a probabilistic model of choice.

#rlhf#bradley-terry#pairwise comparisons+11
📚TheoryIntermediate

Metric Learning

Metric learning is about automatically learning a distance function so that similar items are close and dissimilar items are far in a feature space.

#metric learning#mahalanobis distance#contrastive loss+12
📚TheoryIntermediate

Stochastic Depth

Stochastic Depth randomly drops whole residual layers during training while keeping the full network at inference time.

#stochastic depth#resnet#residual block+12
📚TheoryIntermediate

Early Stopping

Early stopping halts training when the validation loss stops improving, preventing overfitting and saving compute.

#early stopping#validation loss#patience+11
📚TheoryIntermediate

Label Smoothing

Label smoothing replaces a hard one-hot target with a slightly softened distribution to reduce model overconfidence.

#label smoothing#cross-entropy#softmax+12
📚TheoryIntermediate

Dropout

Dropout randomly turns off (zeros) some neurons during training to prevent the network from memorizing the training data.

#dropout#inverted dropout#bernoulli mask+12
📚TheoryIntermediate

Spectral Normalization

Spectral normalization rescales a weight matrix so its largest singular value (spectral norm) is at most a target value, typically 1.

#spectral normalization#spectral norm#singular value+12
📚TheoryIntermediate

Empirical Risk Minimization

Empirical Risk Minimization (ERM) chooses a model that minimizes the average loss on the training data.

#empirical risk minimization#expected risk#loss function+12
∑MathIntermediate

Maximum A Posteriori (MAP) Estimation

Maximum A Posteriori (MAP) estimation chooses the parameter value with the highest posterior probability after seeing data.

#map estimation#posterior mode#bayesian inference+12
∑MathIntermediate

Multivariate Gaussian Distribution

A multivariate Gaussian (normal) distribution models a vector of real-valued variables with a bell-shaped probability hill in many dimensions.

#multivariate normal#gaussian distribution#covariance matrix+11