How I Study AI - Learn AI Papers & Lectures the Easy Way

Concepts (356)

Groups

Linear Algebra (15)
Calculus & Differentiation (10)
Optimization (14)
Probability Theory (12)
Statistics for ML (9)
Information Theory (10)
Convex Optimization (7)
Numerical Methods (6)
Graph Theory for Deep Learning (6)
Topology for ML (5)
Differential Geometry (6)
Measure Theory & Functional Analysis (6)
Random Matrix Theory (5)
Fourier Analysis & Signal Processing (9)
Sampling & Monte Carlo Methods (10)
Deep Learning Theory (12)
Regularization Theory (11)
Attention & Transformer Theory (10)
Generative Model Theory (11)
Representation Learning (10)
Reinforcement Learning Mathematics (9)
Variational Methods (8)
Loss Functions & Objectives (10)
Sequence & Temporal Models (8)
Geometric Deep Learning (8)

Math · Intermediate

Expectation, Variance & Moments

Expectation is the long-run average value of a random variable and acts like the balance point of its distribution.

#expectation #variance #moments +12
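As a small illustration of the "balance point" idea, the sketch below computes the expectation and variance of a discrete random variable directly from its probabilities. The helper name `mean_and_variance` and the fair-die example are assumptions chosen for illustration, not part of the original card.

```python
def mean_and_variance(values, probs):
    """Expectation and variance of a discrete random variable."""
    # Expectation: probability-weighted average of the values.
    mean = sum(v * p for v, p in zip(values, probs))
    # Variance: probability-weighted average squared distance from the mean.
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    return mean, var


# Hypothetical example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6
mean, var = mean_and_variance(faces, probs)
print(mean, var)  # roughly 3.5 and 35/12
```

The mean of 3.5 is not a face of the die, which is typical: the expectation is where the distribution balances, not necessarily a value the variable can take.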
Math · Intermediate

Random Variables & Distributions

A random variable maps uncertain outcomes to numbers and is described by a distribution that assigns likelihoods to values or ranges.

#random variable #pmf #pdf +12
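A minimal sketch of the mapping described above, assuming a hypothetical experiment of rolling two fair dice: the random variable sends each outcome (a pair of rolls) to a number (their sum), and the PMF collects the probability of each value.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered pairs of die rolls.
outcomes = list(product(range(1, 7), repeat=2))

# Random variable X maps each outcome (a, b) to the number a + b.
counts = Counter(a + b for a, b in outcomes)

# PMF: probability of each value of X, kept exact with Fraction.
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

print(pmf[7])             # 1/6
print(sum(pmf.values()))  # 1
```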
Math · Intermediate

Probability Axioms & Rules

Kolmogorov's axioms define probability as a measure on events: non-negativity, normalization, and countable additivity.

#kolmogorov axioms#probability measure#sample space+12
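Written out, the three axioms might look like the following. The symbols $\Omega$ (sample space) and $\mathcal{F}$ (collection of events) are standard notation assumed here; the card itself does not fix any notation.

```latex
\begin{align*}
&\text{(1) Non-negativity:}       && P(A) \ge 0 \quad \text{for every event } A \in \mathcal{F}, \\
&\text{(2) Normalization:}        && P(\Omega) = 1, \\
&\text{(3) Countable additivity:} && P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i)
   \quad \text{for pairwise disjoint } A_1, A_2, \dots
\end{align*}
```

One familiar rule follows directly: since $A$ and $A^c$ are disjoint and their union is $\Omega$, axioms (2) and (3) give $P(A) + P(A^c) = 1$, so $P(A^c) = 1 - P(A)$.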
Theory · Intermediate

Loss Landscape Analysis

A loss landscape is the "terrain" of a model's loss as you move through parameter space; valleys are good solutions and peaks are bad ones.

#loss landscape #sharpness #hessian eigenvalues +12
Theory · Intermediate

Weight Initialization Strategies

Weight initialization sets the starting values of neural network parameters so signals and gradients neither explode nor vanish as they pass through layers.

#xavier #glorot #he +12
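The tags name Xavier (Glorot) and He initialization. A minimal pure-Python sketch of the commonly stated formulas is given below; the function names, layer sizes, and the list-of-lists weight layout are assumptions made for illustration.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform: U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def he_normal(fan_in, fan_out, rng):
    """He normal: N(0, std^2), std = sqrt(2 / fan_in), often paired with ReLU."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)]
            for _ in range(fan_in)]


rng = random.Random(0)                    # fixed seed so the sketch is repeatable
w_xavier = xavier_uniform(256, 128, rng)  # hypothetical 256 -> 128 layer
w_he = he_normal(256, 128, rng)
```

Both schemes scale the spread of the weights by the layer width, which is what keeps the variance of activations and gradients roughly stable from layer to layer.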
Algorithm · Intermediate

Gradient Clipping & Normalization

Gradient clipping limits how large gradient values or their overall magnitude can become during optimization to prevent exploding updates.

#gradient clipping #clipping by norm #clipping by value +12
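The two variants named in the tags, clipping by value and clipping by norm, can be sketched as follows. The function names and the flat-list gradient representation are assumptions for illustration; deep learning frameworks provide their own equivalents.

```python
import math


def clip_by_value(grads, limit):
    """Clamp each gradient component to the interval [-limit, limit]."""
    return [max(-limit, min(limit, g)) for g in grads]


def clip_by_norm(grads, max_norm):
    """Rescale the whole gradient if its L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


grads = [3.0, -4.0]                         # L2 norm is 5
clipped_norm = clip_by_norm(grads, 1.0)     # approximately [0.6, -0.8]
clipped_value = clip_by_value(grads, 1.0)   # [1.0, -1.0]
```

Clipping by norm preserves the direction of the gradient and only shortens it, while clipping by value can change the direction because each component is clamped independently.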
Math · Intermediate

Lagrange Multipliers & Constrained Optimization

Lagrange multipliers let you optimize a function while strictly satisfying equality constraints by introducing auxiliary variables (the multipliers).

#lagrange multipliers #constrained optimization #kkt conditions +11
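A short worked example may help. The problem below (maximize $xy$ subject to $x + y = 1$) is a hypothetical instance chosen for illustration.

```latex
\begin{align*}
\text{maximize } f(x, y) &= xy \quad \text{subject to } g(x, y) = x + y - 1 = 0, \\
\mathcal{L}(x, y, \lambda) &= xy - \lambda\,(x + y - 1), \\
\frac{\partial \mathcal{L}}{\partial x} &= y - \lambda = 0, \qquad
\frac{\partial \mathcal{L}}{\partial y} = x - \lambda = 0, \qquad
\frac{\partial \mathcal{L}}{\partial \lambda} = -(x + y - 1) = 0, \\
\Rightarrow\quad x &= y = \lambda = \tfrac{1}{2}, \qquad f\big(\tfrac{1}{2}, \tfrac{1}{2}\big) = \tfrac{1}{4}.
\end{align*}
```

The multiplier $\lambda$ is the auxiliary variable: setting all partial derivatives of $\mathcal{L}$ to zero enforces the constraint and the stationarity condition at the same time.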
Algorithm · Intermediate

Learning Rate Schedules

Learning rate schedules control how fast a model learns over time by changing the learning rate across iterations or epochs.

#learning rate schedules #step decay #cosine annealing +12
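Two of the schedules named in the tags, step decay and cosine annealing, can be sketched as plain functions of the step index. The function names, default decay factor, and parameter names are assumptions for illustration.

```python
import math


def step_decay(base_lr, step, drop_every, factor=0.5):
    """Multiply the learning rate by `factor` once every `drop_every` steps."""
    return base_lr * factor ** (step // drop_every)


def cosine_annealing(base_lr, step, total_steps, min_lr=0.0):
    """Decay smoothly from base_lr to min_lr along half a cosine wave."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rate at a few points of a hypothetical 100-step run.
for step in (0, 25, 50, 100):
    print(step, step_decay(0.1, step, drop_every=10), cosine_annealing(0.1, step, 100))
```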
Algorithm · Intermediate

Adam & Adaptive Methods

Adam is an optimization algorithm that combines momentum (first moment) with RMSProp-style adaptive learning rates (second moment).

#adam #adaptive methods #rmsprop +12
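The combination of first and second moments can be sketched for a single scalar parameter as below. The hyperparameter defaults follow commonly used values, while the function name, the step count, and the quadratic test objective are assumptions for illustration.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function with Adam, given its gradient function."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # first moment: momentum-style average
        v = beta2 * v + (1 - beta2) * g * g    # second moment: RMSProp-style average
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Hypothetical objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)  # should be close to 3
```

Dividing by the square root of the second moment makes the step size roughly independent of the gradient's scale, which is the "adaptive learning rate" part of the method.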
Algorithm · Intermediate

Momentum Methods

Momentum methods add an exponentially weighted memory of past gradients to make descent steps smoother and faster, especially in ravines and ill-conditioned problems.

#momentum #heavy-ball #polyak momentum +12
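A minimal heavy-ball sketch on an ill-conditioned quadratic (a "ravine") is shown below. The objective, step size, and momentum coefficient are hypothetical choices for illustration.

```python
def momentum_descent(grad_fn, x0, lr=0.02, beta=0.9, steps=300):
    """Gradient descent with heavy-ball (Polyak) momentum."""
    x = list(x0)
    velocity = [0.0] * len(x)
    for _ in range(steps):
        g = grad_fn(x)
        # Velocity is an exponentially weighted sum of past gradients.
        velocity = [beta * v + gi for v, gi in zip(velocity, g)]
        x = [xi - lr * v for xi, v in zip(x, velocity)]
    return x


def ravine_grad(p):
    """Gradient of f(x, y) = 0.5 * (x^2 + 25 * y^2): steep in y, shallow in x."""
    return [p[0], 25.0 * p[1]]


x_final = momentum_descent(ravine_grad, [10.0, 1.0])
print(x_final)  # both coordinates should be near 0
```

Along the shallow direction the gradients keep the same sign, so the velocity builds up and progress speeds up; along the steep direction successive gradients alternate in sign and partly cancel, which damps the zig-zag.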
Algorithm · Intermediate

Stochastic Gradient Descent (SGD)

Stochastic Gradient Descent (SGD) updates model parameters using small random subsets (mini-batches) of data, making each update cheaper and more memory-efficient than a full pass over the dataset.

#stochastic gradient descent #mini-batch #random shuffling +12
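The mini-batch update with random shuffling can be sketched on a tiny linear regression. The synthetic noise-free data, the function name, and the hyperparameters are assumptions for illustration.

```python
import random


def sgd_linear_fit(xs, ys, lr=0.05, batch_size=4, epochs=200, seed=0):
    """Fit y = w * x + b by mini-batch SGD on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)                      # new random order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            errors = [(w * xs[i] + b) - ys[i] for i in batch]
            # Gradients of the batch-mean squared error with respect to w and b.
            grad_w = 2.0 * sum(e * xs[i] for e, i in zip(errors, batch)) / len(batch)
            grad_b = 2.0 * sum(errors) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [i / 10.0 for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(w, b)  # should be close to 2 and 1
```

Each update touches only `batch_size` examples, so its cost does not grow with the dataset size; the price is a noisier gradient estimate than a full-batch step would give.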
Algorithm · Intermediate

Gradient Descent

Gradient descent is a simple, repeatable way to move downhill on a loss surface by stepping in the opposite direction of the gradient.

#gradient descent #batch gradient descent #learning rate +12
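The "step against the gradient" rule can be sketched in a few lines. The one-dimensional objective and the learning rate below are hypothetical choices for illustration.

```python
def gradient_descent(grad_fn, x0, lr=0.1, steps=100):
    """Repeatedly step in the direction opposite to the gradient."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_fn(x)
    return x


# Hypothetical objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)  # should be very close to 3
```

For this objective each step multiplies the distance to the minimum by 1 - 2 * lr = 0.8, so the iterate converges geometrically; a learning rate above 1.0 would make that factor exceed 1 in magnitude and the iterates would diverge.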