How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts (18)

⚙️ Algorithm · Intermediate

Sharpness-Aware Minimization (SAM)

Sharpness-Aware Minimization (SAM) trains models to perform well even when their weights are slightly perturbed, seeking flatter minima that generalize better.

#sharpness-aware minimization#sam optimizer#robust optimization+11
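To make the "slightly perturbed weights" idea concrete, here is a minimal sketch of one SAM-style update on a toy quadratic loss. It assumes the common two-step form (move to an approximate worst-case point within an L2 ball of radius `rho`, then descend using the gradient taken there). The names `grad`, `sam_step`, `lr`, and `rho` are illustrative, not taken from any particular library.

```python
import math

def grad(w):
    """Gradient of the toy loss f(w) = sum(w_i^2), chosen only for illustration."""
    return [2.0 * wi for wi in w]

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM-style update: perturb toward higher loss, then descend."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)  # zero gradient gives no perturbation direction; keep w
    # Step 1: approximate worst-case weights within an L2 ball of radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Step 2: gradient at the perturbed point, applied to the original weights.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

w = [1.0, -2.0]
w_next = sam_step(w, grad)
```

Each update needs two gradient evaluations, which is the main extra cost compared with a plain gradient step.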
∑ Math · Intermediate

Cross-Entropy Loss

Cross-entropy loss measures how well predicted probabilities match the true labels by penalizing confident wrong predictions heavily.

#cross-entropy#logistic regression#binary cross-entropy#softmax+11
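A small sketch of the "penalizing confident wrong predictions heavily" claim, assuming a single example with an integer class label and a list of predicted probabilities. The function name `cross_entropy_loss` is illustrative.

```python
import math

def cross_entropy_loss(probs, label):
    """Negative log of the probability assigned to the true class."""
    return -math.log(probs[label])

# The true class is 0 in all three cases.
confident_right = cross_entropy_loss([0.9, 0.1], 0)  # about 0.105
unsure = cross_entropy_loss([0.5, 0.5], 0)           # about 0.693
confident_wrong = cross_entropy_loss([0.1, 0.9], 0)  # about 2.303
```

The loss grows without bound as the probability on the true class approaches zero, which is why a confident mistake costs far more than an unsure guess.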
📚 Theory · Intermediate

RLHF Mathematics

RLHF turns human preferences between two model outputs into training signals using a probabilistic model of choice.

#rlhf#bradley-terry#pairwise comparisons+11
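One way to sketch the probabilistic model of choice is the Bradley-Terry form named in the tags: the probability that one output is preferred is a sigmoid of the difference of two scalar rewards. The scalar rewards and the function names below are assumptions made for illustration.

```python
import math

def preference_prob(r_chosen, r_rejected):
    """Bradley-Terry probability that the chosen output beats the rejected one."""
    return 1.0 / (1.0 + math.exp(-(r_chosen - r_rejected)))

def preference_loss(r_chosen, r_rejected):
    """Negative log-likelihood of the observed human preference."""
    return -math.log(preference_prob(r_chosen, r_rejected))

tie = preference_prob(1.0, 1.0)         # equal rewards give probability 0.5
agrees = preference_loss(2.0, 0.0)      # reward ranking matches the human label
disagrees = preference_loss(0.0, 2.0)   # reward ranking contradicts the label
```

Minimizing this loss over many labeled pairs pushes the reward of preferred outputs above the reward of rejected ones.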
📚 Theory · Advanced

Transfer Learning Theory

Transfer learning theory studies when and why a model trained on a source distribution will work on a different target distribution.

#transfer learning#domain adaptation#hΔh-divergence+12
∑ Math · Intermediate

Elastic Net Regularization

Elastic Net regularization combines L1 (Lasso) and L2 (Ridge) penalties to produce models that are both sparse and stable.

#elastic net#lasso#ridge regression+12
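A minimal sketch of the combined penalty, assuming two separate strengths, one for the L1 term and one for the L2 term. The parameter names `l1_strength` and `l2_strength` are illustrative; libraries often use a different parametrization, such as a single strength plus a mixing ratio.

```python
def elastic_net_penalty(weights, l1_strength, l2_strength):
    """Weighted sum of the L1 norm and the squared L2 norm of the weights."""
    l1 = sum(abs(w) for w in weights)   # Lasso part: encourages exact zeros
    l2 = sum(w * w for w in weights)    # Ridge part: shrinks large weights
    return l1_strength * l1 + l2_strength * l2

penalty = elastic_net_penalty([1.0, -2.0, 0.0], l1_strength=0.5, l2_strength=0.25)
# 0.5 * 3.0 + 0.25 * 5.0 = 2.75
```

This value is added to the data-fitting loss during training.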
∑ Math · Intermediate

L2 Regularization (Ridge/Weight Decay)

L2 regularization (also called ridge or weight decay) adds a penalty proportional to the sum of squared weights to discourage large parameters.

#l2 regularization#ridge regression#weight decay+12
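A short sketch of how the penalty acts during training, assuming plain gradient descent on a data loss plus `lam * sum(w^2)`. The function names and the parameters `lr` and `lam` are illustrative.

```python
def l2_penalty(weights, lam):
    """Penalty proportional to the sum of squared weights."""
    return lam * sum(w * w for w in weights)

def sgd_step_with_l2(weights, data_grads, lr, lam):
    """Gradient step on data_loss + lam * sum(w^2); the penalty adds 2 * lam * w."""
    return [w - lr * (g + 2.0 * lam * w) for w, g in zip(weights, data_grads)]

# With zero data gradient, each weight shrinks by the factor (1 - 2 * lr * lam).
shrunk = sgd_step_with_l2([1.0, -2.0], [0.0, 0.0], lr=0.1, lam=0.5)
```

The shrinkage toward zero on every step is where the name "weight decay" comes from in this plain gradient descent setting.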
📚 Theory · Intermediate

Grokking & Delayed Generalization

Grokking is when a model suddenly starts to generalize well long after it has already memorized the training set.

#grokking#delayed generalization#weight decay+12
⚙️ Algorithm · Advanced

Hamiltonian Monte Carlo (HMC)

Hamiltonian Monte Carlo (HMC) uses gradients of the log-density to propose long-distance moves that still land in high-probability regions.

#hamiltonian monte carlo#hmc#mcmc+11
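The following is a minimal sketch of HMC for a one-dimensional standard normal target, assuming a leapfrog integrator followed by a Metropolis accept/reject step. The target density, step size, and function names are illustrative choices, not a tuned sampler.

```python
import math
import random

def neg_log_density(q):
    """Potential energy U(q) for a standard normal target (illustrative choice)."""
    return 0.5 * q * q

def grad_neg_log_density(q):
    """Gradient of U(q); this is where the log-density gradient enters."""
    return q

def leapfrog(q, p, step_size, n_steps):
    """Simulate Hamiltonian dynamics: half momentum step, full steps, half step."""
    p = p - 0.5 * step_size * grad_neg_log_density(q)
    for i in range(n_steps):
        q = q + step_size * p
        if i < n_steps - 1:
            p = p - step_size * grad_neg_log_density(q)
    p = p - 0.5 * step_size * grad_neg_log_density(q)
    return q, p

def hmc_sample(n_samples, step_size=0.1, n_steps=20, seed=0):
    rng = random.Random(seed)
    q = 0.0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample momentum each iteration
        h_old = neg_log_density(q) + 0.5 * p * p
        q_new, p_new = leapfrog(q, p, step_size, n_steps)
        h_new = neg_log_density(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integrator's small energy error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples

samples = hmc_sample(200)
```

Because leapfrog nearly conserves the total energy, a proposal can travel far from the current point and still be accepted with high probability.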
⚙️ Algorithm · Advanced

Natural Gradient Method

Natural gradient scales the ordinary gradient by the inverse Fisher information matrix to account for the geometry of probability distributions.

#natural gradient#fisher information#empirical fisher+12
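As a sketch of the scaling, consider a Bernoulli model with a single parameter `theta`, where the Fisher information is a scalar and its inverse is a plain division. The model choice and function names are assumptions for illustration.

```python
def bernoulli_grad(theta, sample_mean):
    """Gradient of the average Bernoulli log-likelihood with respect to theta."""
    return (sample_mean - theta) / (theta * (1.0 - theta))

def bernoulli_fisher(theta):
    """Fisher information of a single Bernoulli(theta) observation."""
    return 1.0 / (theta * (1.0 - theta))

def natural_gradient_step(theta, sample_mean, lr):
    """Ascent step using the gradient divided by the Fisher information."""
    return theta + lr * bernoulli_grad(theta, sample_mean) / bernoulli_fisher(theta)

# In this model the natural gradient simplifies to (sample_mean - theta),
# so a step with lr = 1.0 lands on the sample mean.
theta_next = natural_gradient_step(theta=0.2, sample_mean=0.7, lr=1.0)
```

The ordinary gradient blows up near `theta = 0` or `theta = 1`, while the Fisher-scaled step stays well behaved, which illustrates what "accounting for geometry" buys in this simple case.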
📚 Theory · Intermediate

Cross-Entropy

Cross-entropy measures how well a proposed distribution Q predicts outcomes actually generated by a true distribution P.

#cross-entropy#entropy#kl divergence+12
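A small numeric sketch of the definition for discrete distributions, which also checks the standard identity that cross-entropy equals the entropy of P plus the KL divergence from P to Q. The example distributions are made up, and the code assumes Q is positive wherever P is.

```python
import math

def entropy(p):
    """H(P) = -sum p_i log p_i, in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """H(P, Q) = -sum p_i log q_i, in nats."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL(P || Q) = sum p_i log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.2, 0.1]  # true distribution (illustrative)
q = [0.5, 0.3, 0.2]  # proposed distribution (illustrative)
gap = cross_entropy(p, q) - entropy(p)  # equals kl_divergence(p, q)
```

The gap is never negative and is zero only when Q matches P, so a worse predictor Q always pays extra on average.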
∑ Math · Intermediate

Maximum A Posteriori (MAP) Estimation

Maximum A Posteriori (MAP) estimation chooses the parameter value with the highest posterior probability after seeing data.

#map estimation#posterior mode#bayesian inference+12
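A minimal sketch for coin-flip data, assuming a Bernoulli likelihood with a Beta(a, b) prior, so the posterior is Beta(a + successes, b + failures) and its mode has a closed form. The prior choice and the function name `bernoulli_map` are illustrative.

```python
def bernoulli_map(successes, trials, a=2.0, b=2.0):
    """Posterior mode under a Beta(a, b) prior.

    Valid when a + successes > 1 and b + (trials - successes) > 1.
    """
    return (successes + a - 1.0) / (trials + a + b - 2.0)

flat_prior = bernoulli_map(7, 10, a=1.0, b=1.0)     # uniform prior: 7 / 10
mild_prior = bernoulli_map(7, 10, a=2.0, b=2.0)     # 8 / 12, pulled toward 0.5
```

With a uniform prior the MAP estimate coincides with the maximum likelihood estimate; a prior centered on 0.5 pulls the estimate toward 0.5.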
∑ Math · Intermediate

Maximum Likelihood Estimation (MLE)

Maximum Likelihood Estimation (MLE) chooses parameters that make the observed data most probable under a chosen model.

#maximum likelihood#log-likelihood#bernoulli mle+12
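As a sketch, the Bernoulli case from the tags can be checked numerically: the closed-form estimate (successes divided by trials) should agree with a brute-force search over the log-likelihood. The data (7 successes in 10 trials) and the grid resolution are made up for illustration.

```python
import math

def bernoulli_log_likelihood(theta, successes, trials):
    """Log-probability of the observed counts under Bernoulli(theta)."""
    failures = trials - successes
    return successes * math.log(theta) + failures * math.log(1.0 - theta)

def bernoulli_mle(successes, trials):
    """Closed-form maximizer of the Bernoulli log-likelihood."""
    return successes / trials

# Brute-force check on a grid of candidate values strictly between 0 and 1.
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=lambda t: bernoulli_log_likelihood(t, 7, 10))
closed_form = bernoulli_mle(7, 10)
```

Working with the log-likelihood rather than the likelihood keeps the numbers in a safe range and turns products over observations into sums.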